Hadoop HDFS 07: HDFS-HA cluster configuration & ZK cluster configuration & YARN-HA configuration
(1) Cluster planning
| hadoop102 | hadoop103 | hadoop104 |
|---|---|---|
| ZK | ZK | ZK |
| JournalNode | JournalNode | JournalNode |
| NameNode | NameNode | |
| DataNode | DataNode | DataNode |
| ResourceManager | ResourceManager | |
| NodeManager | NodeManager | NodeManager |
(2) Configure the ZooKeeper cluster
- Official website: https://archive.apache.org/dist/
- Unzip the ZK installation package
  cmd+shift+p to enter sftp mode and drag the installation package over
[user02@hadoop102 software]$ tar -zxvf zookeeper-3.4.9.tar.gz -C /opt/module/
[user02@hadoop102 software]$ cd /opt/module/zookeeper-3.4.9/
[user02@hadoop102 zookeeper-3.4.9]$ mkdir -p zkData
- Configure the zoo.cfg file
[user02@hadoop102 zookeeper-3.4.9]$ mv ./conf/zoo_sample.cfg ./conf/zoo.cfg
[user02@hadoop102 conf]$ vim zoo.cfg
# Add the following configuration
dataDir=/opt/module/zookeeper-3.4.9/zkData
######cluster####
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
server.4=hadoop104:2888:3888
server.2=hadoop102:2888:3888 explained:
2: a number identifying this as server number 2
hadoop102: the hostname (IP address) of the server
2888: the port this server uses to exchange information with the Leader of the cluster
3888: if the Leader of the cluster goes down, a port is needed to hold a new election and choose a new Leader; this is the port the servers use to communicate with each other during that election
In cluster mode a file named myid is created in the dataDir directory. It contains a single number, e.g. 2 for server 2. When ZooKeeper starts, it reads this file and compares the value against the server entries in zoo.cfg to determine which server it is.
- ZK cluster operation
- Create a myid file in the /opt/module/zookeeper-3.4.9/zkData directory
[user02@hadoop102 zkData]$ touch myid
[user02@hadoop102 zkData]$ vim myid
# Add the number 2, corresponding to server.2
- Copy to the other nodes and change the myid content to 3 and 4 respectively
[user02@hadoop102 module]$ xsync zookeeper-3.4.9/
- Start ZK on each node (single-point start)
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh start
[user02@hadoop103 zookeeper-3.4.9]$ bin/zkServer.sh start
[user02@hadoop104 zookeeper-3.4.9]$ bin/zkServer.sh start
- View status
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
[user02@hadoop103 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
[user02@hadoop104 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
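To avoid logging into every machine, the same status check can be run from one node over SSH. A minimal sketch, assuming passwordless SSH between the nodes is already set up (xsync relies on it too):

```
# Query zkServer.sh status on all three nodes from hadoop102
for host in hadoop102 hadoop103 hadoop104; do
    echo "===== $host ====="
    ssh $host "/opt/module/zookeeper-3.4.9/bin/zkServer.sh status"
done
```

Exactly one node should report Mode: leader; the other two report Mode: follower.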
(3) Configure HDFS-HA cluster
- Official website: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
- Copy hadoop
[user02@hadoop104 module]$ mkdir ha
[user02@hadoop104 module]$ cp -r hadoop-2.7.2/ /opt/module/ha
- Configure hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
- Configure core-site.xml

<!-- Specify the NameNode address in HDFS -->
<!-- Assemble the addresses of the two NameNodes into one cluster (nameservice) -->
<property>
    <name>fs.defaultFS</name>
    <!-- hdfs://hadoop102:9000 -->
    <value>hdfs://mycluster</value>
</property>
<!-- Specify the directory where Hadoop stores files generated at run time -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/ha/hadoop-2.7.2/data/tmp</value>
</property>
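Because fs.defaultFS now points at the logical nameservice mycluster instead of a single host, clients stop naming a specific NameNode in their paths. A small illustration (it only works once the HA cluster is running; the paths are just examples):

```
# The client resolves mycluster to whichever NameNode is currently active
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /test
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs dfs -ls hdfs://mycluster/
```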
- Configure hdfs-site.xml

<!-- Logical name of the fully distributed cluster (nameservice) -->
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>
<!-- The NameNode nodes in the cluster -->
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop102:8020</value>
</property>
<!-- HTTP address of nn1 -->
<property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop102:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop103:8020</value>
</property>
<!-- HTTP address of nn2 -->
<property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop103:50070</value>
</property>
<!-- Where the NameNode metadata (edits) is stored on the JournalNodes -->
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop102:8485;hadoop103:8485;hadoop104:8485/mycluster</value>
</property>
<!-- Fencing mechanism, so that only one NameNode can respond at a time -->
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<!-- Passwordless SSH login is required when using the sshfence mechanism -->
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/user02/.ssh/id_rsa</value>
</property>
<!-- JournalNode server storage directory -->
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/module/ha/hadoop-2.7.2/data/jn</value>
</property>
<!-- Disable permission checking -->
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
<!-- Access proxy class: how clients find the Active NameNode and fail over automatically -->
<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
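sshfence only works if the NameNode machines can log into each other with the key configured above. A quick check before starting the cluster (a sketch, assuming the key pair for user02 already exists on both hosts):

```
# From hadoop102, reaching hadoop103 must not prompt for a password
[user02@hadoop102 ~]$ ssh -i /home/user02/.ssh/id_rsa user02@hadoop103 hostname
hadoop103
# And the reverse direction
[user02@hadoop103 ~]$ ssh -i /home/user02/.ssh/id_rsa user02@hadoop102 hostname
hadoop102
```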
- Distribute the configuration to the other nodes
[user02@hadoop104 module]$ xsync ./ha
(4) Start HDFS-HA cluster (single point)
- Start the journalnode service on each JournalNode node
  Note: start with the sbin directory under the HA installation; there are two HDFS installations on the machines (the original one and the HA copy), so make sure the right one is used.
[user02@hadoop102 ~]$ cd /opt/module/ha/hadoop-2.7.2/
[user02@hadoop103 ~]$ cd /opt/module/ha/hadoop-2.7.2/
[user02@hadoop104 ~]$ cd /opt/module/ha/hadoop-2.7.2/
[user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
[user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
[user02@hadoop104 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
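After the starts succeed, jps on each node should show a JournalNode process (the PIDs below are only illustrative):

```
[user02@hadoop102 hadoop-2.7.2]$ jps
3978 JournalNode
4010 Jps
```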
- Format [nn1] and start it (format once on one node, then synchronize on the other)
  Delete the data and logs folders first
[user02@hadoop102 hadoop-2.7.2]$ rm -rf data/ logs/
[user02@hadoop103 hadoop-2.7.2]$ rm -rf data/ logs/
[user02@hadoop104 hadoop-2.7.2]$ rm -rf data/ logs/
Format nn1
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs namenode -format
Start nn1
[user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
- On [nn2], synchronize nn1's metadata and start it
[user02@hadoop103 hadoop-2.7.2]$ bin/hdfs namenode -bootstrapStandby
[user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
- View the web pages hadoop102:50070 and hadoop103:50070; both NameNodes are in standby status
- Start the DataNodes on every node
[user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
[user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
[user02@hadoop104 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
- Switch [nn1] to Active
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn1
- Check whether it is Active
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn1
active
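nn2 should still report standby, since only one NameNode can be Active at a time:

```
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn2
standby
```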
(5) Configure HDFS-HA automatic failover
- Background
  After kill -9 on the hadoop102 NameNode, trying to switch nn2 to active reports a connection-refused error. With one NameNode down the two cannot communicate, so the hadoop102 NameNode must be started again (it comes back as standby) before the switch can be made. (This is manual switching.)
- Configuration
1) hdfs-site.xml
```
<!-- Enable HDFS-HA automatic failover -->
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
```
[user02@hadoop102 hadoop]$ xsync ./hdfs-site.xml
2) core-site.xml
```
<!-- ZooKeeper quorum for HDFS-HA automatic failover -->
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
```
[user02@hadoop102 hadoop]$ xsync ./core-site.xml
- Startup
1) Stop all HDFS services
```
sbin/stop-dfs.sh
```
2) Start the ZK cluster (bin/zkServer.sh start)
```
[user02@hadoop102 ~]$ cd /opt/module/zookeeper-3.4.9/
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[user02@hadoop102 zookeeper-3.4.9]$ jps
3351 Jps
3326 QuorumPeerMain
# Start all three nodes the same way
```
3) Initialize the HA state in ZK (bin/hdfs zkfc -formatZK)
```
[user02@hadoop102 zookeeper-3.4.9]$ cd /opt/module/ha/hadoop-2.7.2/
[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs zkfc -formatZK
```
/opt/module/zookeeper-3.4.9/bin/zkServer.sh status shows Mode: follower
4) Start the HDFS services (sbin/start-dfs.sh)
```
[user02@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
Starting namenodes on [hadoop102 hadoop103]
hadoop102: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-namenode-hadoop102.out
hadoop103: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-namenode-hadoop103.out
hadoop103: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop103.out
hadoop102: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop102.out
hadoop104: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop104.out
Starting journal nodes [hadoop102 hadoop103 hadoop104]
hadoop104: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop104.out
hadoop102: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop102.out
hadoop103: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop103.out
Starting ZK Failover Controllers on NN hosts [hadoop102 hadoop103]
hadoop103: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-zkfc-hadoop103.out
hadoop102: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-zkfc-hadoop102.out
```
5) Start the DFSZKFailoverController on each NameNode node; whichever machine starts it first has the Active NameNode
```
sbin/hadoop-daemon.sh start zkfc
```
Open /opt/module/zookeeper-3.4.9/bin/zkCli.sh and run ls /; there is now an extra hadoop-ha znode
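The hadoop-ha znode mentioned above can be inspected with zkCli.sh (a sketch; the exact listing may differ slightly):

```
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[mycluster]
```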
- Verification
Visit http://hadoop102:50070/dfshealth.html#tab-overview ---active
Visit http://hadoop103:50070/dfshealth.html#tab-overview ---standby
After the active NN is killed, the standby switches to active immediately.
1) Kill the Active NameNode process
```
kill -9 <namenode process id>
```
2) Disconnect the Active NameNode machine from the network
```
service network stop
```
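The switch can also be confirmed from the command line. A sketch, assuming nn1 was the Active NameNode that got killed:

```
[user02@hadoop103 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn2
active
# Querying nn1 now fails with a connection error, because its NameNode process is gone
[user02@hadoop103 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn1
```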
- Processes on each node
```
[user02@hadoop102 sbin]$ jps
3424 QuorumPeerMain  -------- ZK cluster process
3666 NameNode
3779 DataNode
3978 JournalNode
4170 DFSZKFailoverController
4284 Jps
[user02@hadoop103 bin]$ jps
3586 JournalNode
3491 DataNode
3412 NameNode
3326 QuorumPeerMain
3806 Jps
5188 DFSZKFailoverController
[user02@hadoop104 bin]$ jps
3553 Jps
3475 JournalNode
3380 DataNode
3305 QuorumPeerMain
```
(6) YARN-HA configuration
- Specific configuration
- yarn-site.xml

<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- How the Reducer gets data -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Specify the address of the YARN ResourceManager -->
    <!--
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop103</value>
    </property>
    -->
    <!-- yarn-ha -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop102</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop103</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop102:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop103:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
    </property>
    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- Store the ResourceManager state information in ZooKeeper -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <!-- Log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <!-- Retain logs for 7 days -->
    <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
- Distribute with xsync
- Start HDFS
- Start the journalnode service on the three journalnodes
sbin/hadoop-daemon.sh start journalnode
- On nn1, format namenode1 and start it
  First run rm -rf ./data ./logs
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
- On nn2, synchronize the metadata information of nn1
bin/hdfs namenode -bootstrapStandby
- Start nn2
sbin/hadoop-daemon.sh start namenode
- Start all DataNodes
sbin/hadoop-daemon.sh start datanode
- Switch nn1 to active
# If automatic failover is not configured:
bin/hdfs haadmin -transitionToActive nn1
# If automatic failover is configured: start the zkfc service on both NN nodes; whichever starts first becomes active
sbin/hadoop-daemon.sh start zkfc
- Start YARN
- Execute on hadoop102
sbin/start-yarn.sh
- Execute on hadoop103
sbin/yarn-daemon.sh start resourcemanager
- View service status
bin/yarn rmadmin -getServiceState rm1
From the web page hadoop102:8088 you cannot tell which ResourceManager is active and which is standby. After killing one of the RM processes, the other one switches to active.
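Checking both ResourceManagers from the command line makes the state and the switch visible. A sketch, assuming rm1 on hadoop102 is currently the active one:

```
[user02@hadoop102 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm1
active
[user02@hadoop102 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm2
standby
# After killing the active ResourceManager process, rerunning the commands shows the other RM as active
```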
(7) Port number
| Purpose | Port | Property |
|---|---|---|
| SecondaryNameNode HTTP address | 50090 | dfs.namenode.secondary.http-address |
| NameNode HTTP address | 50070 | dfs.namenode.http-address.mycluster.nn1 |
| NameNode RPC address | 9000 or 8020 | dfs.namenode.rpc-address.mycluster.nn1 |
| YARN web address | 8088 | yarn.resourcemanager.webapp.address.rm1 |
| ZooKeeper | 2181 | ha.zookeeper.quorum |
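A quick way to confirm from any node that the relevant ports are listening (a sketch; assumes nc is installed and checks only the ports this HA cluster uses on hadoop102, i.e. nn1 HTTP/RPC, rm1 web UI, and ZK):

```
for p in 50070 8020 8088 2181; do
    nc -zv hadoop102 $p
done
```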