Installation and deployment of HBase

1, Deploy the prerequisite environment

First deploy the distributed high-availability version of Hadoop, that is, ZooKeeper + Hadoop, as described here:

https://www.cnblogs.com/live41/p/15483192.html

*The server names and directory layout used here are the same as in the linked article, namely:

c1:192.168.100.105
c2:192.168.100.110
c3:192.168.100.115
c4:192.168.100.120

The ZooKeeper, Hadoop and HBase folders are all placed in the /home/ directory.

*Although HBase ships with a built-in ZooKeeper, the built-in instance is usually disabled in favor of an independently deployed one, because other processes also need ZooKeeper and this avoids maintaining two ensembles.
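The hostnames c1 to c4 must resolve on every machine. Here is a minimal sketch for checking a hosts file against the plan above. The function name missing_hosts and its arguments are illustrative, not part of any tool.

```shell
# Print each expected hostname that has no entry in the given hosts file.
# missing_hosts is a hypothetical helper written for this walkthrough.
missing_hosts() {
    hosts_file=$1; shift            # remaining arguments are hostnames
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the IP
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$name"
    done
}
```

Running `missing_hosts /etc/hosts c1 c2 c3 c4` on each machine should print nothing if all four names are present.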

 

2, Download HBase and configure environment variables

*Perform the following steps on every machine.

1. Download

http://hbase.apache.org/downloads.html

Download the binary package, such as hbase-2.4.6-bin.tar.gz

 

2. Upload to the server and extract

(1) As mentioned above, the HBase folder is placed under /home, so the final path is /home/hbase. Run the following in /home:

tar -xvf hbase-2.4.6-bin.tar.gz

(2) Rename the folder (optional, purely for tidiness)

mv hbase-2.4.6 hbase
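Since these two steps are repeated on every machine, they can be wrapped in one small function. The following is only a sketch: extract_hbase is a made-up name, and it assumes the archive contains a single top-level folder such as hbase-2.4.6.

```shell
# Extract an HBase binary archive into a parent directory and rename the
# versioned folder to plain "hbase". extract_hbase is a hypothetical helper.
extract_hbase() {
    tarball=$1      # path to the downloaded archive
    parent=$2       # target parent directory, /home in this article
    # Name of the top-level folder inside the archive, e.g. hbase-2.4.6
    top=$(tar -tf "$tarball" | head -n 1 | cut -d/ -f1) || return 1
    [ -n "$top" ] || return 1
    # Do not clobber an existing installation
    if [ -e "$parent/hbase" ]; then
        echo "refusing to overwrite $parent/hbase" >&2
        return 1
    fi
    tar -xf "$tarball" -C "$parent" && mv "$parent/$top" "$parent/hbase"
}
```

It would be called with the path of the uploaded archive as the first argument and /home as the second.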

 

3. Configure environment variables

vim ~/.bashrc

Add the following to it:

export PATH=$PATH:/home/hbase/bin

Reload the environment variables:

source ~/.bashrc
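If you prefer not to edit the file by hand on four machines, the line can be appended with a script. This is a sketch that assumes HBase lives in /home/hbase as planned in section 1. The function name add_hbase_to_path is illustrative.

```shell
# Append the HBase bin directory to PATH in a shell rc file, but only if
# the exact line is not already there. add_hbase_to_path is hypothetical.
add_hbase_to_path() {
    rcfile=${1:-"$HOME/.bashrc"}
    # Single quotes keep $PATH literal so it is expanded at login time
    line='export PATH=$PATH:/home/hbase/bin'
    # -F fixed string, -x whole-line match, -q quiet
    grep -qFx "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}
```

Calling it twice leaves a single entry, so it is safe to rerun. Run `source ~/.bashrc` afterwards as usual.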

 

 

3, Configure HBase

*Perform these steps on machine c1 first, then synchronize the configuration files to the other machines with scp.

1. Configure hbase-env.sh

vim /home/hbase/conf/hbase-env.sh

Add the following lines, or uncomment them if they already exist:

export JAVA_HOME=/usr/bin/java1.8.0
export HBASE_CLASSPATH=/home/hbase/conf
export HBASE_MANAGES_ZK=false
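HBase's scripts launch the JVM from $JAVA_HOME/bin/java, so it is worth confirming that the directory you set really contains that binary. A small sketch, with check_java_home being a made-up helper name:

```shell
# Succeed only if the given directory looks like a usable JAVA_HOME,
# that is, it contains an executable bin/java. Hypothetical helper.
check_java_home() {
    dir=$1
    if [ -x "$dir/bin/java" ]; then
        echo "ok: $dir/bin/java"
    else
        echo "no executable at $dir/bin/java" >&2
        return 1
    fi
}
```

Pass it the same value you wrote into hbase-env.sh. If it fails, adjust JAVA_HOME to the actual JDK installation directory on your machine.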

 

2. Configure hbase-site.xml

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://ns6/hbase</value>  <!-- ns6 must match the dfs.nameservices value in hdfs-site.xml -->
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>c1:2181,c2:2181,c3:2181,c4:2181</value>
        <!-- If this is written as <value>c1,c2,c3,c4</value>, the hbase.zookeeper.property.clientPort property must also be configured -->
    </property>
    <property>
        <name>hbase.master</name>
        <value>60000</value> <!-- In HBase HA mode, only the port needs to be configured -->
    </property>
    <!-- <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property> -->
</configuration>
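A mismatch between hbase.rootdir and the HDFS nameservice is an easy mistake to make, so here is a sketch of a check. It assumes dfs.nameservices holds a single name (ns6 in this article). The function name is illustrative.

```shell
# Succeed if the hbase.rootdir value points at the given HDFS nameservice.
# rootdir_uses_nameservice is a hypothetical helper; it assumes a single
# nameservice rather than a comma-separated list.
rootdir_uses_nameservice() {
    rootdir=$1        # value of hbase.rootdir, e.g. hdfs://ns6/hbase
    nameservice=$2    # value of dfs.nameservices, e.g. ns6
    case $rootdir in
        "hdfs://$nameservice"/*|"hdfs://$nameservice") return 0 ;;
        *) return 1 ;;
    esac
}
```

On a machine where Hadoop is on the PATH, the second argument can be read from the live configuration with `hdfs getconf -confKey dfs.nameservices`.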

 

3. Configure regionservers

vim /home/hbase/conf/regionservers

Add the following content. If the file contains a localhost line, delete it first.

c1
c2
c3
c4

These hostnames must resolve through the hosts file configured earlier.

 

4. Copy the configuration files to the other nodes

scp /home/hbase/conf/* c2:/home/hbase/conf/
scp /home/hbase/conf/* c3:/home/hbase/conf/
scp /home/hbase/conf/* c4:/home/hbase/conf/
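The copy can also be written as a loop. One detail to watch: a glob of the form `*.*` only matches names containing a dot, so it would skip the regionservers file, while `*` matches everything. The sketch below uses a made-up function name and a DRY_RUN variable that exists only in this example.

```shell
# Copy every file in the local conf directory to the same path on each
# target host. Set DRY_RUN=1 to print the commands instead of running them.
# sync_hbase_conf and DRY_RUN are hypothetical names for this sketch.
sync_hbase_conf() {
    conf_dir=$1; shift              # remaining arguments are target hosts
    for host in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "scp $conf_dir/* $host:$conf_dir/"
        else
            scp "$conf_dir"/* "$host:$conf_dir/" || return 1
        fi
    done
}
```

For the cluster in this article the call would be `sync_hbase_conf /home/hbase/conf c2 c3 c4`, run on c1.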

 

 

4, Start HBase

1. On the master node

start-hbase.sh

 

2. On the standby node

The standby HMaster must be started manually; otherwise only one HMaster is running and registered in ZooKeeper. Starting a standby is optional.

hbase-daemon.sh start master
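To confirm that the daemons came up, the JDK's jps tool lists the running Java processes as "pid name" pairs. Here is a small sketch for checking that listing. The function name jps_has is illustrative.

```shell
# Succeed if the jps listing read from standard input contains a process
# with exactly the given name in its second column. Hypothetical helper.
jps_has() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

For example, `jps | jps_has HMaster && echo "HMaster is running"` on the master and standby nodes, and the same with HRegionServer on the nodes listed in regionservers.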

 

 

Appendix: hbase-site.xml parameter description

  • hbase.rootdir

This is the directory shared by all RegionServers, where HBase persists its data. The HDFS address in hbase.rootdir must be consistent with fs.defaultFS in Hadoop's core-site.xml: the same IP address or domain name and port, or, in an HA environment, the dfs.nameservices name, in which case ZooKeeper-based failover decides which NameNode is active.

  • hbase.cluster.distributed

The operation mode of HBase: false means standalone mode and true means distributed mode. If false, HBase and ZooKeeper run in the same JVM.

  • hbase.master

If only a single HMaster is used, hbase.master needs to be set to master:60000 (hostname:60000).

If multiple HMasters are used, only the port 60000 needs to be given, because the election of the active master is handled by ZooKeeper.

  • hbase.tmp.dir

A temporary folder on the local file system. It can be changed to a more persistent directory, since /tmp is cleared on restart.

  • hbase.zookeeper.quorum

The ZooKeeper ensemble to connect to. List all ZooKeeper hosts in hbase.zookeeper.quorum, separated by commas. The default value is localhost, which cannot work in a distributed deployment.

  • hbase.zookeeper.property.dataDir

This parameter sets where ZooKeeper stores its snapshots. The default location is under /tmp, which is cleared on restart. Because the author's ZooKeeper is installed independently, the path here points to the location set by dataDir in $ZOOKEEPER_HOME/conf/zoo.cfg.

  • hbase.zookeeper.property.clientPort

Indicates the port on which the client connects to ZooKeeper.

  • zookeeper.session.timeout

The ZooKeeper session timeout. HBase passes this value to the ZooKeeper cluster as its suggested maximum session timeout.

  • hbase.regionserver.restart.on.zk.expire

When a RegionServer's ZooKeeper session expires, the RegionServer restarts instead of aborting.
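As a small illustration of the hbase.zookeeper.quorum format described above, the host:port list can be generated rather than typed by hand. This is a sketch only. The function name zk_quorum is made up for the example.

```shell
# Build a comma-separated host:port list suitable for the
# hbase.zookeeper.quorum value. zk_quorum is a hypothetical helper.
zk_quorum() {
    port=$1; shift                  # remaining arguments are hostnames
    quorum=
    for host in "$@"; do
        # Prepend a comma only when the list is already non-empty
        quorum="${quorum:+$quorum,}$host:$port"
    done
    printf '%s\n' "$quorum"
}
```

For the four machines in this article, `zk_quorum 2181 c1 c2 c3 c4` produces the value used in section 3.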

 

Keywords: Big Data

Added by boonika on Mon, 01 Nov 2021 13:08:25 +0200