Environment:
1 core, 4 GB memory, 128 GB disk; three machines with the same configuration
192.168.242.131 192.168.242.132 192.168.242.133
——Linux CentOS 7 x64 as the system platform
——JDK, the runtime environment for the Java-based components
——Hadoop; the HBase data storage layer depends on HDFS
——ZooKeeper, for monitoring and coordination
Other dependencies:
sudo yum install -y net-tools
sudo yum install -y vim
sudo yum install -y wget
sudo yum install -y lrzsz
sudo yum install -y pcre pcre-devel
sudo yum install -y zlib zlib-devel
sudo yum install -y openssl openssl-devel
sudo yum install -y unzip
sudo yum install -y libtool
sudo yum install -y gcc-c++
sudo yum install -y telnet
sudo yum install -y tree
sudo yum install -y nano
sudo yum install -y psmisc
sudo yum install -y rsync
sudo yum install -y ntp
The JDK is installed directly with yum:
sudo yum install -y java-1.8.0-openjdk-devel.x86_64
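To confirm the installation and locate the exact JDK path (needed later for JAVA_HOME), the following quick check is one option; the version-specific directory will differ per machine:
# Verify the JDK installation
java -version
# One way to resolve the real install path behind the yum symlinks
readlink -f $(which javac) | sed 's:/bin/javac::'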
Any ZooKeeper version will do. Installation reference:
https://www.cnblogs.com/mindzone/p/15468883.html
Deployment differs between Hadoop 3 and Hadoop 2; this article covers the cluster deployment of Hadoop 2 separately.
The HBase version must match the Hadoop version; this is a common deployment pitfall.
HBase 1.3.1 works fine with Hadoop 2.7.2.
Hadoop 2.7.2 installation
Machine 1 (131) downloads the tarball:
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
Extract to the specified directory
mkdir -p /opt/module
tar -zxvf hadoop-2.7.2.tar.gz -C /opt/module/
Configure environment variables for Hadoop and JDK
vim /etc/profile
Append the variables at the end (do this on the other machines as well):
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

# JAVA_HOME - check your own JDK version here; do not just copy-paste,
# use find / -name java to locate it
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
export PATH=$PATH:$JAVA_HOME/bin
Make variables effective immediately:
source /etc/profile
Then test whether the variables took effect:
hadoop version
On success, Hadoop's version information is displayed:
[root@localhost ~]# hadoop version
Hadoop 2.7.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41
Compiled by jenkins on 2016-01-26T00:08Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/module/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar
Back up the configuration files:
# Back up the configuration files
cd /opt/module/hadoop-2.7.2/etc/hadoop/
cp -r core-site.xml core-site.xml.bak
cp -r hadoop-env.sh hadoop-env.sh.bak
cp -r hdfs-site.xml hdfs-site.xml.bak
cp -r mapred-env.sh mapred-env.sh.bak
cp -r mapred-site.xml mapred-site.xml.bak
cp -r yarn-env.sh yarn-env.sh.bak
cp -r yarn-site.xml yarn-site.xml.bak
core-site.xml
Declare the NameNode (master) address:
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- appoint HDFS in NameNode Address of --> <property> <name>fs.defaultFS</name> <value>hdfs://192.168.242.131:9000</value> </property> <!-- appoint Hadoop Storage directory of files generated at run time --> <property> <name>hadoop.tmp.dir</name> <value>/opt/module/hadoop-2.7.2/data/tmp</value> </property> </configuration>
hadoop-env.sh
Just declare the JDK location
# The java implementation to use.
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
hdfs-site.xml
Define the replication factor and the SecondaryNameNode address:
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>dfs.replication</name> <value>3</value> </property> <!-- appoint Hadoop Secondary name node host configuration --> <property> <name>dfs.namenode.secondary.http-address</name> <value>192.168.242.133:50090</value> </property> </configuration>
mapred-env.sh
Declare JDK path
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
mapred-site.xml (if it does not exist yet, copy it from mapred-site.xml.template first)
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- appoint MR Run in Yarn upper --> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> </configuration>
yarn-env.sh can be left unchanged; the script picks up $JAVA_HOME directly.
yarn-site.xml
Specify the ResourceManager address:
<?xml version="1.0"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <configuration> <!-- Site specific YARN configuration properties --> <!-- Reducer How to get data --> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <!-- appoint YARN of ResourceManager Address of --> <property> <name>yarn.resourcemanager.hostname</name> <value>192.168.242.132</value> </property> </configuration>
Configure cluster node address
vim /opt/module/hadoop-2.7.2/etc/hadoop/slaves
List all the machine addresses, one per line; do not leave extra spaces or blank lines.
192.168.242.131
192.168.242.132
192.168.242.133
Then distribute Hadoop to the remaining machines
# With the xsync script
xsync /opt/module/hadoop-2.7.2

# Without the xsync script, copy with scp
scp -r /opt/module/hadoop-2.7.2 root@192.168.242.132:/opt/module/
scp -r /opt/module/hadoop-2.7.2 root@192.168.242.133:/opt/module/
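Before continuing, it can be worth confirming that the copy landed and that the environment variables also work on the other machines; a minimal check, assuming passwordless SSH is set up and /etc/profile has already been updated there as described above:
# Run from machine 1
ssh root@192.168.242.132 "ls /opt/module/ && source /etc/profile && hadoop version"
ssh root@192.168.242.133 "ls /opt/module/ && source /etc/profile && hadoop version"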
HDFS needs to be formatted for the first startup
hdfs namenode -format
If you need to format again, clear the data in the data directory first
rm -rf /opt/module/hadoop-2.7.2/data
hdfs namenode -format
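A quick way to confirm the format succeeded is to check that the NameNode metadata directory was created (its location follows from the hadoop.tmp.dir configured above):
# Should contain files such as VERSION, fsimage_* and seen_txid
ls /opt/module/hadoop-2.7.2/data/tmp/dfs/name/current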
Cluster deployment completed!
Hadoop cluster startup:
# Machine 1 starts HDFS
$HADOOP_HOME/sbin/start-dfs.sh

# Machine 2 starts YARN
$HADOOP_HOME/sbin/start-yarn.sh
Output on machine 1:
[root@192 ~]# $HADOOP_HOME/sbin/start-dfs.sh
Starting namenodes on [192.168.242.131]
192.168.242.131: starting namenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-root-namenode-192.168.242.131.out
192.168.242.131: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-root-datanode-192.168.242.131.out
192.168.242.133: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-root-datanode-192.168.242.133.out
192.168.242.132: starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-root-datanode-192.168.242.132.out
Starting secondary namenodes [192.168.242.133]
192.168.242.133: starting secondarynamenode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-192.168.242.133.out
[root@192 ~]#
Output on machine 2:
[root@192 ~]# $HADOOP_HOME/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-root-resourcemanager-192.168.242.132.out
192.168.242.133: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-root-nodemanager-192.168.242.133.out
192.168.242.131: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-root-nodemanager-192.168.242.131.out
192.168.242.132: starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-root-nodemanager-192.168.242.132.out
[root@192 ~]#
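With this configuration the daemons should be distributed as follows; running jps on each machine is a quick sanity check (the expected process lists below follow from the configuration above):
# Run on every machine
jps
# Expected, roughly:
# 192.168.242.131: NameNode, DataNode, NodeManager
# 192.168.242.132: ResourceManager, DataNode, NodeManager
# 192.168.242.133: SecondaryNameNode, DataNode, NodeManager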
Keep ZooKeeper running:
[root@192 ~]# zk-cluster status
---------- zookeeper 192.168.242.131 state ------------
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.7.0/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
---------- zookeeper 192.168.242.132 state ------------
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.7.0/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
---------- zookeeper 192.168.242.133 state ------------
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.7.0/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
[root@192 ~]#
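The zk-cluster command above is the custom script from the ZooKeeper article referenced earlier; without it, the same check can be done machine by machine (installation path assumed from the output above):
# Run on each machine
/opt/module/apache-zookeeper-3.7.0/bin/zkServer.sh status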
Hbase cluster installation:
Machine 1 downloads the tarball:
wget https://dlcdn.apache.org/hbase/stable/hbase-2.4.9-bin.tar.gz
Unpack to the specified directory
tar -zxvf hbase-2.4.9-bin.tar.gz -C /opt/module/
Back up the configuration files
cp -r /opt/module/hbase-2.4.9/conf/hbase-env.sh /opt/module/hbase-2.4.9/conf/hbase-env.sh.bak
cp -r /opt/module/hbase-2.4.9/conf/hbase-site.xml /opt/module/hbase-2.4.9/conf/hbase-site.xml.bak
cp -r /opt/module/hbase-2.4.9/conf/regionservers /opt/module/hbase-2.4.9/conf/regionservers.bak
hbase-env.sh
Append environment variable
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64
export HBASE_MANAGES_ZK=false
hbase-site.xml
1. Note that the port in hbase.rootdir must match the HDFS port in Hadoop's core-site.xml (in this article it is 9000; Ctrl+F for 9000 to verify).
2. Set the ZooKeeper dataDir to the path you actually configured for ZooKeeper; otherwise the HBase shell cannot find ZooKeeper at runtime (a quick check for this follows the snippet below).
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.242.131:9000/HBase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<!-- Changed after 0.98; earlier versions did not have this port property. The default port is 60000 -->
<property>
    <name>hbase.master.port</name>
    <value>16000</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.242.131,192.168.242.132,192.168.242.133</value>
</property>
<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <!-- Specify the ZooKeeper data directory -->
    <value>/opt/module/apache-zookeeper-3.7.0/zk-data</value>
</property>
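As a quick check for point 2 above, the dataDir configured here should match the one ZooKeeper actually uses (zoo.cfg path assumed from the installation above):
# The value should match hbase.zookeeper.property.dataDir above
grep dataDir /opt/module/apache-zookeeper-3.7.0/conf/zoo.cfg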
regionservers
No spaces or blank lines
192.168.242.131
192.168.242.132
192.168.242.133
Soft link Hadoop configuration file
ln -s /opt/module/hadoop-2.7.2/etc/hadoop/core-site.xml /opt/module/hbase-2.4.9/conf/core-site.xml
ln -s /opt/module/hadoop-2.7.2/etc/hadoop/hdfs-site.xml /opt/module/hbase-2.4.9/conf/hdfs-site.xml
At this point the HBase installation on machine 1 is complete; now distribute it to the remaining machines:
# With the xsync script
xsync /opt/module/hbase-2.4.9

# Without the xsync script, copy with scp
scp -r /opt/module/hbase-2.4.9 root@192.168.242.132:/opt/module/
scp -r /opt/module/hbase-2.4.9 root@192.168.242.133:/opt/module/
Server time synchronization:
Without time synchronization, the HBase shell reports an error when checking status.
Error description: the master node is reported as uninitialized. This does not block use, but the cause is inconsistent time across the cluster.
Reference:
https://blog.csdn.net/renzhewudi77/article/details/86301395
The solution is to synchronize the time across the cluster.
Use machine 1 as the unified time standard and make it the time server.
Install the ntp service (run the command once even if it is already installed):
sudo yum install -y ntp
First edit the ntp configuration on machine 1:
vim /etc/ntp.conf
Main contents:
# Change to your own network segment. For example, my network segment is 242
# (authorize all machines on the 192.168.1.0-192.168.1.255 segment to query and synchronize time from this machine)
# restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
# is changed to:
restrict 192.168.242.0 mask 255.255.255.0 nomodify notrap

# Comment out the lines below (a cluster inside a LAN should not use external Internet time)
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
# They become:
# server 0.centos.pool.ntp.org iburst
# server 1.centos.pool.ntp.org iburst
# server 2.centos.pool.ntp.org iburst
# server 3.centos.pool.ntp.org iburst

# Use the local clock as a fallback time source (when this node loses its network connection,
# it can still act as the time server and provide time synchronization for the other nodes in the cluster).
# Add the two lines below, e.g. after "# Enable writing of statistics records." / "#statistics clockstats cryptostats loopstats peerstats":
server 127.127.1.0
fudge 127.127.1.0 stratum 10
Modify the system ntpd configuration:
vim /etc/sysconfig/ntpd
Add configuration item:
# Add the following (synchronize the hardware clock with the system time)
SYNC_HWCLOCK=yes
The rest is managing the ntpd service:
# View ntp status
service ntpd status
# Start the service
service ntpd start
# Enable start on boot
chkconfig ntpd on
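To verify that ntpd on machine 1 is serving time, list its time sources; with the configuration above, the local clock LOCAL(0) should appear:
ntpq -p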
The other machines only need a scheduled task that synchronizes with the time server.
# Create a scheduled task (run on every machine except machine 1; machine 1 acts as the time server)
crontab -e
# In the crontab editor, write the line below
# (the other machines synchronize with the time server once every 10 minutes)
*/10 * * * * /usr/sbin/ntpdate 192.168.242.131
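Before relying on the cron job, a one-off manual sync from one of the other machines confirms that machine 1 is reachable as a time server (ntpdate fails if ntpd is already running locally on that machine):
# Run on machine 2 or 3
/usr/sbin/ntpdate 192.168.242.131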
Start and stop Hbase cluster
/opt/module/hbase-2.4.9/bin/start-hbase.sh
/opt/module/hbase-2.4.9/bin/stop-hbase.sh
Startup:
[root@192 ~]# /opt/module/hbase-1.3.1/bin/start-hbase.sh
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
starting master, logging to /opt/module/hbase-1.3.1/bin/../logs/hbase-root-master-192.168.242.131.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
192.168.242.133: starting regionserver, logging to /opt/module/hbase-1.3.1/bin/../logs/hbase-root-regionserver-192.168.242.133.out
192.168.242.132: starting regionserver, logging to /opt/module/hbase-1.3.1/bin/../logs/hbase-root-regionserver-192.168.242.132.out
192.168.242.131: starting regionserver, logging to /opt/module/hbase-1.3.1/bin/../logs/hbase-root-regionserver-192.168.242.131.out
192.168.242.133: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
192.168.242.133: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
192.168.242.133: OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
192.168.242.132: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
192.168.242.132: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
192.168.242.132: OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
192.168.242.131: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
192.168.242.131: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
192.168.242.131: OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
[root@192 ~]#
Access the HBase shell and check whether it works normally.
# Open the shell
/opt/module/hbase-2.4.9/bin/hbase shell
View status
status
Output information:
hbase(main):002:0> status
1 active master, 0 backup masters, 3 servers, 1 dead, 1.0000 average load
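Beyond status, a minimal smoke test in the same shell confirms that reads and writes actually go through HBase and HDFS; the table and column-family names here are just examples:
create 'smoke_test', 'cf'
put 'smoke_test', 'row1', 'cf:msg', 'hello'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'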
Web UI address:
http://192.168.242.131:16010
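If a browser is not handy, a quick curl from any machine verifies that the Master web UI is reachable (an HTTP 200 means it is up):
curl -s -o /dev/null -w "%{http_code}\n" http://192.168.242.131:16010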