2. HBase: Integrating Phoenix for SQL-like Operations on HBase

I. Phoenix Overview

1. Introduction
Phoenix can be understood as a SQL query engine for HBase. It is a project that was open-sourced by salesforce.com and later donated to Apache. It is essentially Java middleware: just as developers use JDBC to access relational databases, they can use Phoenix to access the NoSQL database HBase.

The tables and data you operate on through Phoenix are stored in HBase; Phoenix only needs to be associated with an HBase cluster, after which the tool can be used to read and write that data.

In fact, Phoenix is best regarded as a tool that replaces HBase's native syntax. Although you can connect to Phoenix from Java via JDBC and operate on HBase that way, it is not suitable for OLTP in a production environment. Online transaction processing requires low latency, and while Phoenix applies some optimizations when querying HBase, the latency is still not small. So Phoenix is mainly used for OLAP, where query results are computed and then stored.

II. Deployment

Basic environment:
hadoop
hbase
zookeeper
This is the environment left over from the previous HBase deployment; please set it up according to that article, as it is not repeated here.
Next we use Phoenix as middleware to operate on data in the HBase cluster. Download the Phoenix version matching your HBase version; apache-phoenix-4.14.2-HBase-1.3-bin.tar.gz is used here, deployed on bigdata121.
Unpack the package:

tar zxf  apache-phoenix-4.14.2-HBase-1.3-bin.tar.gz -C /opt/modules/
mv /opt/modules/apache-phoenix-4.14.2-HBase-1.3-bin /opt/modules/phoenix-4.14.2-HBase-1.3-bin

Configure environment variables:

vim /etc/profile.d/phoenix.sh
#!/bin/bash
export PHOENIX_HOME=/opt/modules/phoenix-4.14.2-HBase-1.3-bin
export PATH=$PATH:${PHOENIX_HOME}/bin

source /etc/profile.d/phoenix.sh

Copy HBase's conf/hbase-site.xml to /opt/modules/phoenix-4.14.2-HBase-1.3-bin/bin:

cp ${HBASE_HOME}/conf/hbase-site.xml /opt/modules/phoenix-4.14.2-HBase-1.3-bin/bin

Then copy the Phoenix jars that HBase depends on into HBase's lib directory. Note: they must be copied to every HBase node.

cd /opt/modules/phoenix-4.14.2-HBase-1.3-bin
cp phoenix-4.14.2-HBase-1.3-server.jar phoenix-core-4.14.2-HBase-1.3.jar ${HBASE_HOME}/lib/

# Copy to the other two hbase nodes
scp phoenix-4.14.2-HBase-1.3-server.jar phoenix-core-4.14.2-HBase-1.3.jar bigdata122:${HBASE_HOME}/lib/
scp phoenix-4.14.2-HBase-1.3-server.jar phoenix-core-4.14.2-HBase-1.3.jar bigdata123:${HBASE_HOME}/lib/

Start the Phoenix command line to test whether HBase can be connected:

sqlline.py <zookeeper quorum address>
# For example:
sqlline.py bigdata121,bigdata122,bigdata123:2181

Note that Phoenix is effectively a plugin library for HBase, not an independent component. After the plugin jars are in place, restart HBase; Phoenix can then be used to connect to and operate on HBase.

III. Basic Usage Commands

List the tables

!table

Create table

create table "student"(
id integer not null primary key,
name varchar);
If the table name is not quoted, it is converted to upper case by default; if it is quoted, the case is preserved. The same rule applies when the table name is used in later commands.
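As a small illustration of this rule, here is a hypothetical helper (not part of Phoenix) for building identifiers the way Phoenix treats them: quoted identifiers keep their case (embedded double quotes are doubled, SQL-standard style), while unquoted ones are folded to upper case.

```java
public class PhoenixIdentifiers {
    // Hypothetical helper: wrap an identifier in double quotes so Phoenix
    // preserves its case; embedded double quotes are doubled (SQL standard).
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Unquoted identifiers are folded to upper case, so
    // `select * from student` really queries the table STUDENT.
    static String fold(String identifier) {
        return identifier.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(quote("student")); // stays lower case when quoted
        System.out.println(fold("student"));  // what Phoenix sees when unquoted
    }
}
```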

Delete table

drop table "test"

insert data

upsert into test values(1,'Andy');
When inserting values, use single quotation marks for strings, not double quotation marks, or an error will be reported. Also remember that the statement is upsert, not insert.

Query data

select * from "test";
The usage is basically the same as the standard SQL SELECT.

Delete data

delete from "test" where id=2

Modify table structure

Add a field: alter table "student" add address varchar
Delete a field: alter table "student" drop column address

Create mapping table

Create the table in the HBase shell:
create 'fruit','info','account'
# insert data
put 'fruit','1001','info:name','apple'
put 'fruit','1001','info:color','red'
put 'fruit','1001','info:price','10'
put 'fruit','1001','account:sells','20'
put 'fruit','1002','info:name','orange'
put 'fruit','1002','info:color','orange'
put 'fruit','1002','info:price','8'
put 'fruit','1002','account:sells','100'

Create the mapping table in Phoenix. Note that it must have the same name as the HBase table:
create view "fruit"(
"ROW" varchar primary key,
"info"."name" varchar,
"info"."color" varchar,
"info"."price" varchar,
"account"."sells" varchar
);

Then the data in HBase can be seen from Phoenix.

IV. Connecting to Phoenix with JDBC

Add the dependency to the Maven project's pom.xml:

<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>4.14.2-HBase-1.3</version>
</dependency>

Code:

package PhoenixTest;

import java.sql.*;

public class PhoenixConnTest {
    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        //Loading the jdbc driver class of phoenix
        Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        //Build connection string
        String url = "jdbc:phoenix:bigdata121,bigdata122,bigdata123:2181";
        //Create connection
        Connection connection = DriverManager.getConnection(url);
        //Create session
        Statement statement = connection.createStatement();
        //Execute the sql statement. Note that since the table name needs double quotation marks, add \ for escape
        boolean execute = statement.execute("select * from \"fruit\"");
        if (execute) {
            //Get the returned execution result and print it
            ResultSet resultSet = statement.getResultSet();
            while (resultSet.next()) {
                System.out.println(resultSet.getString("name"));
            }
        }
        statement.close();
        connection.close();

    }
}

Problems encountered when connecting to Phoenix with JDBC

Exception in thread "main" com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: com.lmax.disruptor.dsl.Disruptor.<init>(Lcom/lmax/disruptor/EventFactory;ILjava/util/concurrent/ThreadFactory;Lcom/lmax/disruptor/dsl/ProducerType;Lcom/lmax/disruptor/WaitStrategy;)V
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2254)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4788)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:241)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:147)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at PhoenixTest.PhoenixConnTest.main(PhoenixConnTest.java:11)
Caused by: java.lang.NoSuchMethodError: com.lmax.disruptor.dsl.Disruptor.<init>(Lcom/lmax/disruptor/EventFactory;ILjava/util/concurrent/ThreadFactory;Lcom/lmax/disruptor/dsl/ProducerType;Lcom/lmax/disruptor/WaitStrategy;)V
    at org.apache.phoenix.log.QueryLoggerDisruptor.<init>(QueryLoggerDisruptor.java:72)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.<init>(ConnectionQueryServicesImpl.java:414)
    at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:248)
    at org.apache.phoenix.jdbc.PhoenixDriver$3.call(PhoenixDriver.java:241)
    at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4791)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
    ... 8 more

First, look at this line:

java.lang.NoSuchMethodError: com.lmax.disruptor.dsl.Disruptor.<init>(...)

It says that this Disruptor constructor does not exist at runtime. Searching in the IDE shows that the method does exist in the sources, so I searched Baidu: disruptor is a package that both HBase and Phoenix depend on. When a method exists at compile time but is missing at runtime, experience suggests that the wrong version of the dependency is on the classpath, making some methods incompatible. So I tried a newer version of the disruptor package: on Maven I chose 3.3.7 (the default pulled in was 3.3.0) and added it to pom.xml:

<dependency>
    <groupId>com.lmax</groupId>
    <artifactId>disruptor</artifactId>
    <version>3.3.7</version>
</dependency>

Then rerun the program, and it works normally. So the problem was indeed the version of the disruptor package.

V. Bugs When Combining Phoenix with HBase

First of all, columns in an HBase column family have no concept of type: everything is stored directly as binary, and HBase itself can only interpret values as strings. But when we create a table in Phoenix with regular SQL, the fields do have types. This mismatch leads to the following bugs.
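A minimal, standard-library-only Java sketch of this point: the same logical value 10 occupies very different bytes depending on whether it was written as a string or as a typed 4-byte integer, and only the former is printable text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BytesDemo {
    // Bytes the HBase shell stores for put '...','info:price','10':
    // the literal string "10", i.e. the two characters '1' and '0'.
    static byte[] stringBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes a typed 4-byte big-endian integer would occupy instead.
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        byte[] asString = stringBytes("10"); // [0x31, 0x30] -> printable text
        byte[] asInt = intBytes(10);         // [0x00, 0x00, 0x00, 0x0A] -> not printable
        System.out.println(asString.length + " bytes vs " + asInt.length + " bytes");
    }
}
```

A shell that assumes string data can render the first form but prints the second as escape sequences, which is exactly the "garbled" output described below.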
1. HBase displays garbled data
This happens when the table was created in Phoenix with non-string field types such as int or double. Inserting data from Phoenix with upsert and then viewing it from Phoenix with select works fine. However, when scan is used in the HBase shell to view the table, all the non-string fields display as garbled bytes. This is actually expected: HBase cannot interpret non-string types, so it displays the raw binary directly. I have no fix for this.
The best approach is simply not to query the data from the HBase side, only from Phoenix. In this case even the column family, rowkey, and column names may display as strange characters in HBase, which I cannot fully explain.

2. Phoenix displays abnormal values (not garbled)
A table with data already exists in HBase, and a mapping table is then created in Phoenix. Viewing the data on the HBase side is normal, but through Phoenix the non-string fields (int, double, and so on) display as abnormal numbers. For example:

select * from "fruit";
+-------+---------+---------+--------------+--------+
|  ROW  |  name   |  color  |    price     | sells  |
+-------+---------+---------+--------------+--------+
| 1001  | apple   | red     | -1322241488  | null   |
| 1002  | orange  | orange  | -1204735952  | null   |
+-------+---------+---------+--------------+--------+

The reason is simple: HBase tables store no data type information, so when Phoenix forcibly parses the raw bytes as other data types, they naturally do not match and the displayed values are wrong.

price and sells are clearly numeric, but what Phoenix shows is not their real values. The solution: when creating the mapping table in Phoenix, define all the field types as varchar (string).
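The mismatch can be sketched with a plain-Java model of a sortable integer encoding. Phoenix is commonly described as storing INTEGER as four big-endian bytes with the sign bit flipped so byte order matches numeric order; treat that as an assumption here, not the actual Phoenix implementation. Bytes written under one convention and read under another produce exactly the kind of meaningless numbers shown above.

```java
public class PhoenixIntEncoding {
    // Sketch of a sortable integer encoding: 4-byte big-endian with the
    // sign bit flipped. Assumed to resemble Phoenix's INTEGER format.
    static byte[] encode(int v) {
        int flipped = v ^ 0x80000000; // flip sign bit for sortability
        return new byte[]{
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),  (byte) flipped
        };
    }

    static int decode(byte[] b) {
        int v = ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
              | ((b[2] & 0xFF) << 8)  |  (b[3] & 0xFF);
        return v ^ 0x80000000; // flip the sign bit back
    }

    public static void main(String[] args) {
        byte[] ten = encode(10); // 0x80 0x00 0x00 0x0A: none are printable ASCII
        System.out.println(decode(ten)); // round-trips only under the same convention
    }
}
```

Reading these bytes as text gives garbage (bug 1), and reading raw string bytes through a numeric decoder gives nonsense numbers (bug 2), which is why defining mapping-table columns as varchar sidesteps both problems.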


Added by zubinkasad on Tue, 05 Nov 2019 02:24:38 +0200