Hive [Environment Setup 02]: Configuring and Using HiveServer2/beeline with hive-3.1.2

Hive ships with two built-in services, HiveServer and HiveServer2, both of which let clients written in various programming languages connect to Hive. HiveServer, however, cannot handle concurrent requests from multiple clients, which is why HiveServer2 was introduced. HiveServer2 (HS2) allows remote clients to submit requests to Hive and retrieve the results from various programming languages, and it supports concurrent access by multiple clients as well as authentication.

HS2 is a single process that bundles several services, including the Thrift-based Hive service (over TCP or HTTP) and a Jetty web server for the Web UI.

HiveServer2 comes with its own CLI tool, beeline, a JDBC client based on SQLLine. Because HiveServer2 is now the focus of Hive development and maintenance, beeline is recommended over the Hive CLI. The rest of this article mainly covers how to configure and use beeline.

1. Modify Hadoop configuration

Modify the core-site.xml configuration file of the Hadoop cluster and add the following properties, which allow the Hadoop root user to act as a proxy for users in any group, connecting from any host. For the installation of Hadoop, please refer to Hadoop 3.1.3 stand alone installation and deployment in Linux Environment.

<property>
	<name>hadoop.proxyuser.root.hosts</name>
	<value>*</value>
</property>
<property>
	<name>hadoop.proxyuser.root.groups</name>
	<value>*</value>
</property>

This step is needed because Hadoop introduced a user impersonation (proxy user) mechanism after version 2.0. Hadoop does not allow an upper-layer system such as Hive to pass the actual user straight through to the Hadoop layer. Instead, the upper-layer system connects as a superuser proxy, which performs operations on Hadoop on behalf of the actual user. This prevents arbitrary clients from operating on Hadoop at will. If this step is skipped, an AuthorizationException may be thrown when connecting later.

For more on Hadoop's proxy user mechanism, please refer to the user agent mechanism of Hadoop or Superusers Acting On Behalf Of Other Users.
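Before moving on, it can help to confirm that both proxy-user properties actually made it into core-site.xml. The following is only a minimal sketch: the function name check_proxyuser is hypothetical, the path in the usage comment assumes the standard etc/hadoop layout under the /usr/local/hadoop-3.1.3 install directory seen in this environment, and the check is a plain text match rather than real XML parsing.

```shell
# Hypothetical helper: report whether a core-site.xml declares both
# hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups.
# Plain grep match: it expects <name>...</name> on one line with no
# extra whitespace inside the tags, and it does not inspect <value>.
# Usage: check_proxyuser <path-to-core-site.xml> [user]
check_proxyuser() {
    conf_file=$1
    user=${2:-root}
    for suffix in hosts groups; do
        if ! grep -q "<name>hadoop\.proxyuser\.${user}\.${suffix}</name>" "$conf_file"; then
            echo "missing: hadoop.proxyuser.${user}.${suffix}"
            return 1
        fi
    done
    echo "ok: proxy user settings for ${user} found"
}

# Example (assumed path, adjust to your installation):
# check_proxyuser /usr/local/hadoop-3.1.3/etc/hadoop/core-site.xml root
```

Keep in mind that Hadoop generally needs to be restarted before edits to core-site.xml take effect.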

2. Start HiveServer2

Since the environment variables were configured earlier, HiveServer2 can be started directly in the background:

# nohup hiveserver2 &

When started in the foreground instead, the console shows:

[root@tcloud ~]# hiveserver2
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/java/bin:/usr/local/mysql/bin:/usr/local/zookeeper/bin:/usr/local/zookeeper/sbin:/usr/local/hadoop-3.1.3/bin:/usr/local/hadoop-3.1.3/sbin:/usr/local/spark/bin:/usr/local/hive/bin:/root/bin)
2021-08-03 15:03:17: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 2353799a-5285-44fd-a212-9c882d547099
Hive Session ID = 9349415b-0e9a-469f-9044-fb32914675d6
Hive Session ID = 2d98aa22-0ac6-474d-ac87-e7d4db10d309
Hive Session ID = a1047f0b-d61d-4e3c-8a05-d764dcc86957
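HiveServer2 may need some time after launch before it accepts connections, so a script that starts it in the background and connects right away can fail. The sketch below polls the Thrift port until it opens. It assumes the default port 10000 (the one used in the JDBC URL in the next step) and relies on bash's /dev/tcp pseudo-device, so it is not portable to other shells. The name wait_for_port is hypothetical.

```shell
# Hypothetical helper: wait until host:port accepts TCP connections.
# Requires bash, because /dev/tcp/<host>/<port> is a bash feature.
# Usage: wait_for_port <host> <port> [timeout-seconds]
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Open the connection in a subshell so the descriptor is closed
        # again as soon as the probe finishes.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example (host name taken from the shell prompt above):
# wait_for_port tcloud 10000 120 && echo "HiveServer2 is accepting connections"
```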

3. Use beeline

Use the following command to enter the beeline interactive command line. If "Connected to" appears in the output, the connection succeeded.

# beeline -u jdbc:hive2://tcloud:10000 -n root
# Connected to: Apache Hive (version 1.2.2)
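Besides the interactive session, beeline can run a single statement and exit via its -e option, which is handy in scripts. The following is a rough sketch rather than a definitive recipe: the function names hs2_url and hs2_query are hypothetical, and the defaults (port 10000, database default) are assumptions that should be adjusted to the actual HiveServer2 settings.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL.
# Usage: hs2_url <host> [port] [database]
hs2_url() {
    host=$1
    port=${2:-10000}
    db=${3:-default}
    printf 'jdbc:hive2://%s:%s/%s\n' "$host" "$port" "$db"
}

# Hypothetical helper: run one statement non-interactively.
# -u is the JDBC URL, -n the user name, -e the statement to execute.
# Usage: hs2_query <host> <user> <sql>
hs2_query() {
    beeline -u "$(hs2_url "$1")" -n "$2" -e "$3"
}

# Example (needs a running HiveServer2):
# hs2_query tcloud root 'SHOW DATABASES;'
```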

Keywords: Big Data hive Data Warehouse

Added by greenber on Sun, 02 Jan 2022 03:01:29 +0200