1, Deploy the prerequisite environment
First deploy the distributed high availability version of Hadoop, namely ZooKeeper+Hadoop.
https://www.cnblogs.com/live41/p/15483192.html
* The server names and directory layout are the same as in the link above, namely:
c1:192.168.100.105
c2:192.168.100.110
c3:192.168.100.115
c4:192.168.100.120
The ZooKeeper, Hadoop and HBase folders are all placed in the /home/ directory.
* Although HBase ships with a built-in ZooKeeper, the built-in one is usually disabled in favor of an independently deployed ensemble (other processes also need ZooKeeper, and this avoids maintaining two sets).
2, Download HBase and configure environment variables
* Perform the following steps on every machine.
1. Download
http://hbase.apache.org/downloads.html
Download the bin package, such as hbase-2.4.6-bin.tar.gz
2. Upload to the server and extract
(1) As mentioned above, the HBase folder goes under /home, i.e. /home/hbase. From /home, run:
tar -xvf hbase-2.4.6-bin.tar.gz
(2) Rename the folder (optional, purely for tidiness)
mv hbase-2.4.6 hbase
3. Configure environment variables
vim ~/.bashrc
Add the following to it:
export PATH=$PATH:/home/hbase/bin
Reload the environment variables:
source ~/.bashrc
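Because this step is repeated on every machine, it can help to make it idempotent. The following is a minimal sketch, assuming HBase lives in /home/hbase as planned above; the function name add_hbase_path is made up for this example.

```shell
# Append the HBase PATH line to a shell rc file, but only if it is not there yet.
# add_hbase_path is a hypothetical helper name, not part of HBase.
add_hbase_path() {
  rc_file="$1"
  # Single quotes keep $PATH literal so it is expanded when the rc file is sourced.
  line='export PATH=$PATH:/home/hbase/bin'
  # -x: match the whole line, -F: treat the pattern as a fixed string.
  grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Usage (run on each machine):
#   add_hbase_path ~/.bashrc && . ~/.bashrc
```

Running it a second time leaves the file unchanged, so it is safe to include in a setup script.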
3, Configure HBase
* Perform these steps on machine c1 first, then copy the configuration files to the other machines with scp.
1. Configure hbase-env.sh
vim /home/hbase/conf/hbase-env.sh
Add the following lines, or uncomment and edit the existing ones:
export JAVA_HOME=/usr/bin/java1.8.0
export HBASE_CLASSPATH=/home/hbase/conf
export HBASE_MANAGES_ZK=false
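The "add or uncomment" edit can also be scripted. This is only a sketch under some assumptions: set_export is a hypothetical helper, the variable appears at most as a plain or commented `export NAME=...` line, and the value contains no `|` or `&` characters (they would confuse the sed substitution).

```shell
# Set "export NAME=VALUE" in a file: rewrite an existing (possibly commented)
# line in place, or append a new line if none exists.
# set_export is a hypothetical helper name.
set_export() {
  file="$1"; name="$2"; value="$3"
  pattern="^[[:space:]]*#?[[:space:]]*export[[:space:]]+$name="
  if grep -Eq "$pattern" "$file"; then
    # Write to a temp file and move it back, to avoid relying on "sed -i".
    sed -E "s|$pattern.*|export $name=$value|" "$file" > "$file.tmp" \
      && mv "$file.tmp" "$file"
  else
    printf 'export %s=%s\n' "$name" "$value" >> "$file"
  fi
}

# Usage:
#   set_export /home/hbase/conf/hbase-env.sh HBASE_MANAGES_ZK false
#   set_export /home/hbase/conf/hbase-env.sh HBASE_CLASSPATH /home/hbase/conf
```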
2. Configure hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://ns6/hbase</value>
    <!-- ns6 corresponds to the dfs.nameservices property in hdfs-site.xml -->
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>c1:2181,c2:2181,c3:2181,c4:2181</value>
    <!-- If written as <value>c1,c2,c3,c4</value>, the hbase.zookeeper.property.clientPort property must also be configured -->
  </property>
  <property>
    <name>hbase.master</name>
    <value>60000</value>
    <!-- In HBase HA mode, only the port needs to be configured -->
  </property>
  <!--
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  -->
</configuration>
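After editing, it can be useful to read a value back from the file to catch typos. The helper below is a rough sketch, not a real XML parser: get_prop is a hypothetical name, it assumes each `<name>` and `<value>` element sits on its own line, and it does not skip commented-out properties.

```shell
# Print the <value> that follows <name>PROP</name> in a Hadoop-style XML file.
# get_prop is a hypothetical helper; it relies on one element per line.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character "<value>" and 8-character "</value>" tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Usage:
#   get_prop /home/hbase/conf/hbase-site.xml hbase.rootdir
#   get_prop /home/hbase/conf/hbase-site.xml hbase.zookeeper.quorum
```

For anything beyond a quick sanity check, a proper XML tool is the safer choice.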
3. Configure regionservers
vim /home/hbase/conf/regionservers
Add the following content. If there is a localhost, delete it first.
c1
c2
c3
c4
These are the hostnames defined in the hosts file, as set up in the prerequisite deployment.
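Instead of editing by hand, the file can be written in one step, which also removes any leftover localhost entry because the file is overwritten. A minimal sketch; write_regionservers is a hypothetical helper name.

```shell
# Overwrite a regionservers file with one hostname per line.
# write_regionservers is a hypothetical helper name.
write_regionservers() {
  file="$1"; shift
  # printf repeats the format for every remaining argument.
  printf '%s\n' "$@" > "$file"
}

# Usage:
#   write_regionservers /home/hbase/conf/regionservers c1 c2 c3 c4
```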
4. Copy the configuration file to other nodes
scp /home/hbase/conf/* c2:/home/hbase/conf/
scp /home/hbase/conf/* c3:/home/hbase/conf/
scp /home/hbase/conf/* c4:/home/hbase/conf/
* Use * rather than *.* here: the regionservers file has no extension, so *.* would not match it.
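With more nodes, the copy is easier to maintain as a loop. The sketch below assumes password-less SSH from c1 to the other nodes; sync_hbase_conf and the SCP override variable are names invented for this example. It uses the `*` glob so that files without an extension, such as regionservers, are copied as well.

```shell
# Copy every file in the conf directory to the same path on each given host.
# sync_hbase_conf and the SCP variable are hypothetical names for this sketch.
# Set SCP="echo scp" to print the commands instead of running them.
sync_hbase_conf() {
  conf_dir="$1"; shift
  for host in "$@"; do
    # ${SCP:-scp} is left unquoted on purpose so "echo scp" splits into two words.
    ${SCP:-scp} "$conf_dir"/* "$host:$conf_dir/" || return 1
  done
}

# Usage (dry run first, then for real):
#   SCP="echo scp" sync_hbase_conf /home/hbase/conf c2 c3 c4
#   sync_hbase_conf /home/hbase/conf c2 c3 c4
```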
4, Start HBase
1. At the master node
start-hbase.sh
2. At the standby node
The standby master has to be started manually; otherwise only one HMaster will be running. You can also choose not to start a standby at all.
hbase-daemon.sh start master
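To confirm that the daemons came up, the JDK's jps tool lists running Java processes, and HBase shows up there as HMaster and HRegionServer. The small filter below is a sketch; has_daemon is a hypothetical helper that reads jps output on standard input.

```shell
# Succeed if the given class name appears in `jps` output read from stdin.
# has_daemon is a hypothetical helper name.
has_daemon() {
  # jps prints "<pid> <name>" per line; compare the second field exactly.
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage:
#   jps | has_daemon HMaster       && echo "master is running"
#   jps | has_daemon HRegionServer && echo "region server is running"
```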
Appendix: hbase-site.xml parameter description
- hbase.rootdir
The directory shared by all RegionServers, where HBase persists its data. The HDFS address in hbase.rootdir must match fs.defaultFS in Hadoop's core-site.xml (same IP address or domain name, and same port). In an HA environment, use the nameservice name from dfs.nameservices instead; which NameNode is active is then decided through ZooKeeper.
- hbase.cluster.distributed
The operation mode of HBase: false means stand-alone mode and true means distributed mode. If false, HBase and ZooKeeper run in the same JVM.
- hbase.master
If only a single HMaster is used, set hbase.master to master:60000 (hostname:port).
If multiple HMasters are used, only the port 60000 needs to be given, because the election of the active master is handled by ZooKeeper.
- hbase.tmp.dir
Temporary folder on the local file system. It can be changed to a more persistent directory (/tmp is cleared on restart).
- hbase.zookeeper.quorum
The ZooKeeper ensemble to connect to. List all ZooKeeper hosts in this parameter, separated by commas. The default value is localhost, which cannot be used in a distributed deployment.
- hbase.zookeeper.property.dataDir
Sets where ZooKeeper stores its snapshots. The default location is under /tmp, so it is cleared on restart. Because the author's ZooKeeper is installed independently, the path here points to the location set by dataDir in $ZOOKEEPER_HOME/conf/zoo.cfg.
- hbase.zookeeper.property.clientPort
Indicates the port on which the client connects to ZooKeeper.
- zookeeper.session.timeout
The ZooKeeper session timeout. HBase passes this value to the ZooKeeper ensemble as the suggested maximum timeout for a session.
- hbase.regionserver.restart.on.zk.expire
When a RegionServer's ZooKeeper session expires, the RegionServer restarts instead of aborting.
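The consistency requirement described under hbase.rootdir can be checked mechanically. The sketch below only does the string handling: rootdir_authority is a hypothetical helper that extracts the host:port or nameservice part of an hdfs:// URI, which can then be compared with what Hadoop reports.

```shell
# Print the authority part of an hdfs:// URI:
#   hdfs://ns6/hbase     -> ns6
#   hdfs://c1:9000/hbase -> c1:9000
# rootdir_authority is a hypothetical helper name.
rootdir_authority() {
  rest="${1#hdfs://}"          # drop the scheme prefix
  printf '%s\n' "${rest%%/*}"  # drop everything from the first "/"
}

# Usage on a cluster node (HA setup, so compare with dfs.nameservices):
#   [ "$(rootdir_authority hdfs://ns6/hbase)" = "$(hdfs getconf -confKey dfs.nameservices)" ] \
#     && echo "hbase.rootdir matches dfs.nameservices"
```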