Big data: installation details of Hive

What is Hive?

  1. Open-sourced by Facebook, originally to compute statistics over massive volumes of structured logs;
  2. A data warehouse tool built on Hadoop: it maps structured data files stored in HDFS to tables and provides SQL-like queries. Under the hood, queries run as MapReduce (MR) jobs;
  3. In essence, Hive translates HQL into MR programs.
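To make the HQL-to-MR idea concrete, here is a minimal shell sketch that writes a small HQL script and shows how it could be handed to the Hive CLI. The table name access_log, its columns, and the file path are hypothetical and only serve as illustration; the GROUP BY query is the kind of statement that becomes a MapReduce job when the execution engine is MapReduce.

```shell
#!/usr/bin/env bash
# Write a small HQL script to the given path.
# The table access_log and its columns are made up for illustration.
write_sample_hql() {
  local out_file="$1"
  cat > "$out_file" <<'EOF'
-- Describe tab-separated log files as a table
CREATE TABLE IF NOT EXISTS access_log (ip STRING, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- An aggregation like this is compiled into a MapReduce job
SELECT url, COUNT(*) AS hits FROM access_log GROUP BY url;
EOF
}

# Usage (requires a working Hive installation):
#   write_sample_hql /tmp/sample.hql
#   hive -f /tmp/sample.hql
```

The script file is only generated here; running it with hive -f needs the installation described in the rest of this article.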

Preparation

  • Java 1.7 or above (mine is JDK 1.8); Hive 1.2 and later no longer run on older Java versions
  • Hadoop 2.0 or above (mine is 2.8.4)
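Before installing, it can help to check the prerequisites from the shell. The sketch below is one possible way to do it: the helper names java_major_version and detect_java_version are my own, and the minimum version in the usage comment is a placeholder to adjust to whatever your Hive release requires.

```shell
#!/usr/bin/env bash
# Turn a Java version string into its major version number.
# Old scheme: "1.8.0_181" gives 8. New scheme: "11.0.2" gives 11.
java_major_version() {
  local version="$1"
  local first="${version%%.*}"
  if [ "$first" = "1" ]; then
    local rest="${version#1.}"
    echo "${rest%%.*}"
  else
    echo "$first"
  fi
}

# Read the version string that the local java binary reports.
# `java -version` prints to stderr, hence the 2>&1.
detect_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Usage:
#   required=7   # adjust to the minimum your Hive release needs
#   v="$(detect_java_version)"
#   [ "$(java_major_version "$v")" -ge "$required" ] || echo "Java $v is too old"
#   hadoop version | head -n 1
```

The parsing is kept separate from the detection so that the version logic can be checked without Java being installed.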

Installation process

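# 1. Download and unpack the installation package
cd /usr/local
wget http://archive.apache.org/dist/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
tar -zxvf apache-hive-1.2.1-bin.tar.gz
# Rename the unpacked directory so it matches the HIVE_HOME used below
mv apache-hive-1.2.1-bin hive-1.2.1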

 

# 2. Configure environment variables

vi /etc/profile
#Add content:
#Hive
export HIVE_HOME=/usr/local/hive-1.2.1
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profile
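If you prefer not to edit /etc/profile by hand, the same change can be scripted. This is a minimal sketch, assuming a helper I call append_once (not part of any tool); it adds a line only when that exact line is missing, so running it twice does not duplicate the exports.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not there yet.
append_once() {
  local file="$1" line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (as root, with the paths used in this article):
#   append_once /etc/profile 'export HIVE_HOME=/usr/local/hive-1.2.1'
#   append_once /etc/profile 'export PATH=$PATH:$HIVE_HOME/bin'
#   source /etc/profile
```

The single quotes in the usage lines keep $PATH and $HIVE_HOME unexpanded, so they are written literally into the profile.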

 

# 3. Configuration file
# conf/hive-env.sh

cd /usr/local/hive-1.2.1/conf
cp hive-env.sh.template hive-env.sh

vi hive-env.sh
# Add the following lines:
# Hadoop and Hive
export HADOOP_HOME=/usr/local/hadoop-2.8.4
export HIVE_CONF_DIR=/usr/local/hive-1.2.1/conf

 

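# 4. Configure hive-site.xml

# Optional: keep a readable copy of the defaults for reference
cp hive-default.xml.template hive-default.xml
# Create hive-site.xml as a new file; it holds only the settings that override the defaults
vi hive-site.xml

## File content: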
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
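The same file can also be generated from the shell, which is handy when the host, user, or password differ between machines. The following is a sketch only: write_hive_site is a hypothetical helper, and it writes the values verbatim, so a JDBC URL with several parameters would need its & written as &amp;.

```shell
#!/usr/bin/env bash
# Write a minimal hive-site.xml with the four metastore connection properties.
# Arguments: output file, JDBC URL, database user, database password.
# Values are written verbatim, so they must not contain raw &, < or >.
write_hive_site() {
  local out_file="$1" url="$2" user="$3" password="$4"
  cat > "$out_file" <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>${url}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${user}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${password}</value>
  </property>
</configuration>
EOF
}

# Usage with the values from this article:
#   write_hive_site /usr/local/hive-1.2.1/conf/hive-site.xml \
#     'jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true' hive hive
```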

 

Install and configure MySQL

Here we store the Hive metadata in a MySQL database instead of the embedded Derby database that ships with Hive.

1. Install MySQL. For CentOS 7, see: "centos7 install MySQL"

## Install MySQL
cd /usr/local
wget http://dev.mysql.com/get/Downloads/MySQL-5.5/MySQL-5.5.48-1.linux2.6.x86_64.rpm-bundle.tar
# The bundle is a plain tar archive, so no -z flag
tar -xvf MySQL-5.5.48-1.linux2.6.x86_64.rpm-bundle.tar
yum install perl
# Remove the conflicting preinstalled database package before installing the new RPMs
rpm -e [Original database] --nodeps
rpm -ivh MySQL-server-5.5.48-1.linux2.6.x86_64.rpm
rpm -ivh MySQL-client-5.5.48-1.linux2.6.x86_64.rpm
service mysql start
/usr/bin/mysql_secure_installation

The MySQL installation did not succeed for me, so I replaced it with MariaDB. Hive starts without problems so far; I do not know yet whether issues will show up elsewhere.

## Install MariaDB
## The CentOS 7 repositories no longer ship Oracle's MySQL; they provide MariaDB, a MySQL fork, instead:
yum install mariadb

## Running systemctl start mariadb then fails with:
Failed to start mariadb.service: Unit mariadb.service failed to load: No such file or directory

## The mariadb service is not found because the installation is incomplete: the mariadb package alone does not contain the server. List the available mariadb packages:
yum search mariadb

## Install the missing packages:
yum install mariadb-embedded mariadb-libs mariadb-bench mariadb mariadb-server

## Then start mariadb, and enable it if it should also start at boot:
systemctl start mariadb
systemctl enable mariadb

## Alternatively, install every mariadb package in one go with a wildcard (quoted so the shell does not expand it):
yum install 'mariadb*'
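The "unit not found" error above can be caught before calling systemctl by checking whether the server package is installed at all. A minimal sketch, assuming a helper I call has_package that filters the output of rpm -qa; the package name is matched as a prefix followed by a version number, so mariadb-libs does not count as mariadb-server.

```shell
#!/usr/bin/env bash
# Read package names (one per line, as printed by `rpm -qa`) from stdin and
# succeed if one of them is the given package followed by a version number.
has_package() {
  local name="$1"
  grep -q -- "^${name}-[0-9]"
}

# Usage on CentOS 7:
#   if rpm -qa | has_package mariadb-server; then
#     systemctl start mariadb
#   else
#     yum install -y mariadb-server
#   fi
```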

The relationship between MySQL and MariaDB is a story of its own; see "Why did CentOS 7 abandon MySQL and use MariaDB instead?"

2. Download the MySQL JDBC driver package, mysql-connector-java-5.1.46.tar.gz, and copy the driver jar into Hive's lib directory

cd /usr/local
tar -zxvf mysql-connector-java-5.1.46.tar.gz
cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar  /usr/local/hive-1.2.1/lib
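A missing driver jar only shows up later as a failure when Hive tries to reach the metastore, so a quick check after copying can save time. This sketch uses a hypothetical helper, find_jdbc_jar, which looks for a connector jar in the directory you pass in.

```shell
#!/usr/bin/env bash
# Print the first MySQL connector jar found in the given lib directory.
# Returns non-zero if none is present.
find_jdbc_jar() {
  local lib_dir="$1" jar
  for jar in "$lib_dir"/mysql-connector-java-*.jar; do
    if [ -f "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  return 1
}

# Usage:
#   find_jdbc_jar /usr/local/hive-1.2.1/lib || echo "JDBC driver jar is missing"
```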

3. Start MySQL and log in to the mysql shell:

mysql -uroot -p

4. Create a new hive database:

create database hive;

 

5. Allow the hive user to access MySQL:

grant all on *.* to hive@localhost identified by 'hive';
flush privileges;
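The database setup can also be run non-interactively. The sketch below is a variation, not the exact statements above: the hypothetical helper metastore_sql prints SQL that limits the grant to the metastore database instead of *.*, which is usually enough for Hive. It assumes MySQL 5.x or MariaDB, where GRANT ... IDENTIFIED BY is still accepted.

```shell
#!/usr/bin/env bash
# Print the SQL that creates the metastore database and a user limited to it.
# Arguments: database name, user name, password (plain values without quotes).
metastore_sql() {
  local db="$1" user="$2" password="$3"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db};
GRANT ALL ON ${db}.* TO '${user}'@'localhost' IDENTIFIED BY '${password}';
FLUSH PRIVILEGES;
EOF
}

# Usage (prompts for the MySQL root password):
#   metastore_sql hive hive hive | mysql -uroot -p
```

Printing the SQL first and piping it in a second step makes it easy to review the statements before they touch the server.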

 

6. Start Hive

Start the Hadoop cluster first, then start Hive:

start-dfs.sh
start-yarn.sh
hive

If startup succeeds, the hive> prompt appears.
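To confirm that Hadoop is really up before launching Hive, the output of jps can be checked for the expected daemons. This is a sketch under one assumption: a single-node setup where NameNode, DataNode, ResourceManager and NodeManager all run on the same machine. On a multi-node cluster the list in the hypothetical helper missing_daemons would need adjusting per host.

```shell
#!/usr/bin/env bash
# Read `jps` output from stdin (lines like "1234 NameNode") and print every
# expected Hadoop daemon that is missing. Succeeds only if none is missing.
missing_daemons() {
  local running daemon missing=0
  running="$(awk '{print $2}')"
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    if ! printf '%s\n' "$running" | grep -qx -- "$daemon"; then
      echo "$daemon"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on a single-node setup:
#   jps | missing_daemons || echo "start the daemons listed above first"
```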

References:

https://www.zhihu.com/question/41832866

https://blog.csdn.net/eclothy/article/details/52733891

https://blog.csdn.net/liyifan687/article/details/80103285

Keywords: Big Data hive MySQL MariaDB RPM

Added by calbolino on Tue, 10 Dec 2019 20:06:18 +0200