Hadoop HDFS 07: HDFS-HA cluster configuration & ZooKeeper cluster configuration & YARN-HA configuration


(1) Cluster planning

| hadoop102       | hadoop103       | hadoop104   |
| --------------- | --------------- | ----------- |
| ZK              | ZK              | ZK          |
| JournalNode     | JournalNode     | JournalNode |
| NameNode        | NameNode        |             |
| DataNode        | DataNode        | DataNode    |
| ResourceManager | ResourceManager |             |
| NodeManager     | NodeManager     | NodeManager |
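
The planning above assumes that hadoop102, hadoop103 and hadoop104 resolve to each other on every node. A minimal /etc/hosts sketch (the IP addresses here are placeholders for illustration, not taken from this setup; substitute your own):

```
# Hypothetical IP addresses for illustration only; use the real addresses of your machines.
# The same three entries should exist on every node (requires root).
sudo tee -a /etc/hosts <<'EOF'
192.168.10.102 hadoop102
192.168.10.103 hadoop103
192.168.10.104 hadoop104
EOF
```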

(2) Configure the ZooKeeper cluster

  1. Official website: https://archive.apache.org/dist/

  2. Unzip ZK installation package

    Cmd+Shift+P enters SFTP mode; drag the installation package into the window to upload it

[user02@hadoop102 software]$ tar -zxvf zookeeper-3.4.9.tar.gz -C /opt/module/
[user02@hadoop102 software]$ cd /opt/module/zookeeper-3.4.9/
[user02@hadoop102 zookeeper-3.4.9]$ mkdir -p zkData
  3. Configure the zoo.cfg file
[user02@hadoop102 zookeeper-3.4.9]$ mv ./conf/zoo_sample.cfg ./conf/zoo.cfg
[user02@hadoop102 conf]$ vim zoo.cfg 
# Add the following configuration
dataDir=/opt/module/zookeeper-3.4.9/zkData
######cluster####
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
server.4=hadoop104:2888:3888

server.2=hadoop102:2888:3888 explained:

2: a number identifying this as the second server

hadoop102: the hostname (or IP address) of that server

2888: the port this server uses to exchange information with the Leader of the cluster

3888: the port used for leader election; if the current Leader goes down, the servers communicate with each other over this port to elect a new Leader

In cluster mode, each server has a myid file in its dataDir directory. The file contains a single number, which is the value after "server." (here 2, i.e. the second server). When ZooKeeper starts, it reads this file and matches the number against the server.X entries in zoo.cfg to determine which server it is.
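
The myid values described above (2, 3 and 4) can also be written in one pass from hadoop102; a minimal sketch, assuming passwordless SSH and that the zookeeper-3.4.9 directory has already been synced to all three nodes (the manual per-node steps follow below):

```
# Write the matching myid on each node; assumes passwordless SSH and identical paths.
for entry in hadoop102:2 hadoop103:3 hadoop104:4; do
  host=${entry%%:*}; id=${entry##*:}
  ssh "$host" "echo $id > /opt/module/zookeeper-3.4.9/zkData/myid"
done
```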

  4. ZooKeeper cluster operation
    1. Create a myid file in the /opt/module/zookeeper-3.4.9/zkData directory
[user02@hadoop102 zkData]$ touch myid
[user02@hadoop102 zkData]$ vim myid
# Add the number 2 corresponding to the server
2
    2. Copy to the other nodes and change the myid content to 3 and 4 respectively
[user02@hadoop102 module]$ xsync zookeeper-3.4.9/
    3. Start ZooKeeper on each node (single-point start)
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh start
[user02@hadoop103 zookeeper-3.4.9]$ bin/zkServer.sh start
[user02@hadoop104 zookeeper-3.4.9]$ bin/zkServer.sh start
    4. View status
[user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower

[user02@hadoop103 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader

[user02@hadoop104 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
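
The same start/status commands can also be run over SSH instead of logging in to each machine; a sketch, assuming passwordless SSH and identical install paths (sourcing /etc/profile guards against JAVA_HOME not being set in a non-interactive shell):

```
# Check (or start, by swapping "status" for "start") ZooKeeper on all three nodes.
ZK_HOME=/opt/module/zookeeper-3.4.9
for host in hadoop102 hadoop103 hadoop104; do
  echo "==== $host ===="
  ssh "$host" "source /etc/profile; $ZK_HOME/bin/zkServer.sh status"
done
```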


(3) Configure HDFS-HA cluster

  1. Official website: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

  2. Copy Hadoop into an ha directory

    [user02@hadoop104 module]$ mkdir ha
    [user02@hadoop104 module]$ cp -r hadoop-2.7.2/ /opt/module/ha
    
  3. Configure hadoop-env.sh

    export JAVA_HOME=/opt/module/jdk1.8.0_144
    
  4. Configure core-site.xml

            <!-- Specify the NameNode address for HDFS -->
            <!-- Assemble the addresses of the two NameNodes into one cluster -->
            <property>
                    <name>fs.defaultFS</name>
                    <!--hdfs://hadoop102:9000-->
                    <value>hdfs://mycluster</value>
            </property>
            <!-- Specify the directory where Hadoop stores files generated at run time -->
            <property>
                    <name>hadoop.tmp.dir</name>
                    <value>/opt/module/ha/hadoop-2.7.2/data/tmp</value>
            </property>
    
  5. Configure hdfs-site.xml

            <!-- Logical name of the fully distributed cluster (nameservice) -->
            <property>
              <name>dfs.nameservices</name>
              <value>mycluster</value>
            </property>
    
            <!-- NameNodes in the cluster -->
            <property>
              <name>dfs.ha.namenodes.mycluster</name>
              <value>nn1,nn2</value>
            </property>
    
            <!-- RPC address of nn1 -->
            <property>
               <name>dfs.namenode.rpc-address.mycluster.nn1</name>
               <value>hadoop102:8020</value>
            </property>
    
            <!-- HTTP address of nn1 -->
            <property>
              <name>dfs.namenode.http-address.mycluster.nn1</name>
              <value>hadoop102:50070</value>
            </property>
    
            <!-- RPC address of nn2 -->
            <property>
              <name>dfs.namenode.rpc-address.mycluster.nn2</name>
              <value>hadoop103:8020</value>
            </property>
    
            <!-- HTTP address of nn2 -->
            <property>
              <name>dfs.namenode.http-address.mycluster.nn2</name>
              <value>hadoop103:50070</value>
            </property>
    
            <!-- Specify where the NameNode metadata (edits) is stored on the JournalNodes -->
            <property>
              <name>dfs.namenode.shared.edits.dir</name>
              <value>qjournal://hadoop102:8485;hadoop103:8485;hadoop104:8485/mycluster</value>
            </property>
    
            <!-- Configure the fencing method, so that only one NameNode can respond at any time -->
             <property>
               <name>dfs.ha.fencing.methods</name>
               <value>sshfence</value>
             </property>
    
            <!-- Passwordless SSH login is required by the sshfence method -->
           <property>
             <name>dfs.ha.fencing.ssh.private-key-files</name>
             <value>/home/user02/.ssh/id_rsa</value>
           </property>
    
            <!-- Declare the JournalNode storage directory -->
            <property>
              <name>dfs.journalnode.edits.dir</name>
              <value>/opt/module/ha/hadoop-2.7.2/data/jn</value>
            </property>
    
            <!--Turn off permission check-->
            <property>
              <name>dfs.permissions.enabled</name>
              <value>false</value>
            </property>
    
            <!-- Failover proxy provider: the class HDFS clients use to find the Active NameNode -->
            <property>
              <name>dfs.client.failover.proxy.provider.mycluster</name>
              <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
            </property>
    
  6. Distribute the configuration to the other nodes

    [user02@hadoop104 module]$ xsync ./ha
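
After distributing, it is worth confirming that every node actually picked up the HA settings; a minimal sketch using hdfs getconf (assumes passwordless SSH and the paths used above):

```
# Print two key HA settings from each node's copy of the configuration.
HA_HOME=/opt/module/ha/hadoop-2.7.2
for host in hadoop102 hadoop103 hadoop104; do
  echo "==== $host ===="
  ssh "$host" "$HA_HOME/bin/hdfs getconf -confKey fs.defaultFS; \
               $HA_HOME/bin/hdfs getconf -confKey dfs.ha.namenodes.mycluster"
done
```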
    

(4) Start HDFS-HA cluster (single point)

  1. Start the journalnode service on each JournalNode node

    Note: there are now two Hadoop installations on each machine, so cd into the ha copy (/opt/module/ha/hadoop-2.7.2) and run its sbin scripts

    [user02@hadoop102 ~]$ cd /opt/module/ha/hadoop-2.7.2/
    [user02@hadoop103 ~]$ cd /opt/module/ha/hadoop-2.7.2/
    [user02@hadoop104 ~]$ cd /opt/module/ha/hadoop-2.7.2/
    [user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
    [user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
    [user02@hadoop104 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start journalnode
    
  2. Format [nn1] and start it (format once on nn1, then synchronize to nn2)

Delete the data and logs folders first

[user02@hadoop102 hadoop-2.7.2]$ rm -rf data/ logs/
[user02@hadoop103 hadoop-2.7.2]$ rm -rf data/ logs/
[user02@hadoop104 hadoop-2.7.2]$ rm -rf data/ logs/

Format nn1

[user02@hadoop102 hadoop-2.7.2]$ bin/hdfs namenode -format

Start nn1

[user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
  3. On [nn2], synchronize the metadata from nn1 and start it

[user02@hadoop103 hadoop-2.7.2]$ bin/hdfs namenode -bootstrapStandby

[user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode

  4. View the web pages hadoop102:50070 and hadoop103:50070; both NameNodes are in standby status

  5. Start all the DataNodes

    [user02@hadoop102 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
    [user02@hadoop103 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
    [user02@hadoop104 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
    
  6. Switch [nn1] to Active

    [user02@hadoop102 hadoop-2.7.2]$ bin/hdfs haadmin -transitionToActive nn1
    
  7. Check whether it is Active

    [user02@hadoop102 hadoop-2.7.2]$ bin/hdfs haadmin -getServiceState nn1
    active
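
Both NameNodes can be checked in one go; a small sketch (expected at this point: nn1 active, nn2 standby):

```
# Report the HA state of both NameNodes.
cd /opt/module/ha/hadoop-2.7.2
for nn in nn1 nn2; do
  printf '%s: ' "$nn"
  bin/hdfs haadmin -getServiceState "$nn"
done
```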
    

(5) Configure HDFS-HA automatic failover

  1. Background

          With manual HA only: if you kill -9 the NameNode on hadoop102 and then try to switch nn2 to Active, the switch is refused with a connection error, because with one NameNode down the other cannot confirm its state. You must first bring the hadoop102 NameNode back up (as Standby) before switching again. Automatic failover removes this manual step.
    
  2. Configuration

          1)  hdfs-site.xml
    
          ```
                  <!-- Enable HDFS-HA automatic failover -->
                  <property>
                    <name>dfs.ha.automatic-failover.enabled</name>
                    <value>true</value>
                  </property>
          ```
    
          [user02@hadoop102 hadoop]$ xsync ./hdfs-site.xml 
    
          2) core-site.xml
    
          ```
                  <!-- ZooKeeper quorum for HDFS-HA automatic failover -->
                  <property>
                    <name>ha.zookeeper.quorum</name>
                    <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
                  </property>
          ```
    
          [user02@hadoop102 hadoop]$ xsync ./core-site.xml
    
  3. Startup

          1) Stop all HDFS services
    
          ```
          sbin/stop-dfs.sh
          ```
    
          2) Start the ZooKeeper cluster
    
          bin/zkServer.sh  start
    
          ```
          [user02@hadoop102 ~]$ cd /opt/module/zookeeper-3.4.9/
          [user02@hadoop102 zookeeper-3.4.9]$ bin/zkServer.sh  start
          ZooKeeper JMX enabled by default
          Using config: /opt/module/zookeeper-3.4.9/bin/../conf/zoo.cfg
          Starting zookeeper ... STARTED
          [user02@hadoop102 zookeeper-3.4.9]$ jps
          3351 Jps
          3326 QuorumPeerMain
          
          # Start ZooKeeper on hadoop103 and hadoop104 as well (all three nodes must be running)
          ```
    
          3) Initialize the HA state in ZooKeeper
    
          bin/hdfs zkfc -formatZK
    
          ```
          [user02@hadoop102 zookeeper-3.4.9]$ cd /opt/module/ha/hadoop-2.7.2/
          [user02@hadoop102 hadoop-2.7.2]$ bin/hdfs zkfc -formatZK
          ```
    
          Running /opt/module/zookeeper-3.4.9/bin/zkServer.sh status should still display Mode: follower (or leader)
    
          4) Start the HDFS services
    
          sbin/start-dfs.sh
    
          ```
          [user02@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
          Starting namenodes on [hadoop102 hadoop103]
          hadoop102: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-namenode-hadoop102.out
          hadoop103: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-namenode-hadoop103.out
          hadoop103: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop103.out
          hadoop102: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop102.out
          hadoop104: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-datanode-hadoop104.out
          Starting journal nodes [hadoop102 hadoop103 hadoop104]
          hadoop104: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop104.out
          hadoop102: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop102.out
          hadoop103: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-journalnode-hadoop103.out
          Starting ZK Failover Controllers on NN hosts [hadoop102 hadoop103]
          hadoop103: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-zkfc-hadoop103.out
          hadoop102: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-user02-zkfc-hadoop102.out
          ```
    
          
    
          5) Start the DFSZKFailoverController on each NameNode node; the NameNode on whichever machine starts it first becomes the Active NameNode
    
          ```
          sbin/hadoop-daemon.sh start zkfc
          ```
    
          Open the ZooKeeper client: /opt/module/zookeeper-3.4.9/bin/zkCli.sh

          Run ls / and an additional hadoop-ha znode will be listed
    
  4. Verification

          Visit http://hadoop102:50070/dfshealth.html#tab-overview : shows active

          Visit http://hadoop103:50070/dfshealth.html#tab-overview : shows standby

          After the Active NameNode is killed, the Standby switches to Active immediately (a small state-polling sketch appears after the process listing below).
    
          1) Kill the Active NameNode process
    
          ```
          kill -9 <NameNode process id>
          ```
    
          2) Disconnect the Active NameNode's machine from the network
    
          ```
          service network stop
          ```
    
  5. Processes on each node

          ```
          [user02@hadoop102 sbin]$ jps
          3424 QuorumPeerMain--------ZK Cluster process
          3666 NameNode
          3779 DataNode
          3978 JournalNode
          4170 DFSZKFailoverController
          4284 Jps
          
          [user02@hadoop103 bin]$ jps
          3586 JournalNode
          3491 DataNode
          3412 NameNode
          3326 QuorumPeerMain
          3806 Jps
          5188 DFSZKFailoverController
          
          [user02@hadoop104 bin]$ jps
          3553 Jps
          3475 JournalNode
          3380 DataNode
          3305 QuorumPeerMain
          ```
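
To watch the failover happen, it helps to poll both NameNode states while killing the Active NameNode in another terminal; a minimal sketch (Ctrl-C to stop):

```
# Poll the HA state of both NameNodes once per second.
cd /opt/module/ha/hadoop-2.7.2
while true; do
  for nn in nn1 nn2; do
    # Hide the connection error printed while a NameNode is down.
    state=$(bin/hdfs haadmin -getServiceState "$nn" 2>/dev/null || echo unreachable)
    printf '%s=%s  ' "$nn" "$state"
  done
  echo
  sleep 1
done
```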
    

(6) YARN-HA configuration

  1. Specific configuration

    1. yarn-site.xml
    <configuration>
    
    <!-- Site specific YARN configuration properties -->
            <!-- How the Reducer obtains data -->
            <property>
                    <name>yarn.nodemanager.aux-services</name>
                    <value>mapreduce_shuffle</value>
            </property>
    
            <!-- Specify the YARN ResourceManager address (single-RM setting, commented out in favour of HA) -->
            <!-- <property>
                    <name>yarn.resourcemanager.hostname</name>
                    <value>hadoop103</value>
            </property> -->
    
            <!-- yarn-ha -->
            <property>
              <name>yarn.resourcemanager.ha.enabled</name>
              <value>true</value>
            </property>
            <property>
              <name>yarn.resourcemanager.cluster-id</name>
              <value>cluster-yarn1</value>
            </property>
            <property>
              <name>yarn.resourcemanager.ha.rm-ids</name>
              <value>rm1,rm2</value>
            </property>
            <property>
              <name>yarn.resourcemanager.hostname.rm1</name>
              <value>hadoop102</value>
            </property>
            <property>
              <name>yarn.resourcemanager.hostname.rm2</name>
              <value>hadoop103</value>
            </property>
            <property>
              <name>yarn.resourcemanager.webapp.address.rm1</name>
              <value>hadoop102:8088</value>
            </property>
            <property>
              <name>yarn.resourcemanager.webapp.address.rm2</name>
              <value>hadoop103:8088</value>
            </property>
            <property>
              <name>yarn.resourcemanager.zk-address</name>
              <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
            </property>
    
            <!-- Enable ResourceManager automatic recovery -->
            <property>
                    <name>yarn.resourcemanager.recovery.enabled</name>
                    <value>true</value>
            </property>
    
            <!-- Store the ResourceManager state information in ZooKeeper -->
            <property>
                    <name>yarn.resourcemanager.store.class</name>
                    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
            </property>
    
            <!-- Enable log aggregation -->
            <property>
             <name>yarn.log-aggregation-enable</name>
             <value>true</value>
            </property>
    
            <!-- Retain logs for 7 days -->
            <property>
             <name>yarn.nodemanager.log.retain-seconds</name>
             <value>604800</value>
            </property>
    </configuration>
    
    2. Distribute with xsync
  2. Start HDFS

    1. Start the journalnode service on the three journalnodes
    sbin/hadoop-daemon.sh start journalnode
    
    2. On nn1, format the NameNode and start it

    First run rm -rf ./data ./logs

    bin/hdfs namenode -format
    sbin/hadoop-daemon.sh start namenode
    
    3. On nn2, synchronize the metadata information from nn1
    bin/hdfs namenode -bootstrapStandby
    
    4. Start nn2
    sbin/hadoop-daemon.sh start namenode
    
    5. Start all DataNodes
    sbin/hadoop-daemon.sh start datanode
    
    6. Switch nn1 to Active
    # If automatic failover is not configured, switch manually:
    bin/hdfs haadmin -transitionToActive nn1
    # If automatic failover is configured, start the zkfc service on both NameNode nodes instead; whichever starts first becomes Active:
    sbin/hadoop-daemon.sh start zkfc
    
  3. Start YARN

    1. Execute on hadoop102
    sbin/start-yarn.sh
    
    2. Execute on hadoop103
    sbin/yarn-daemon.sh start resourcemanager
    
    3. View service status
    bin/yarn rmadmin -getServiceState rm1
    

    The YARN web UI at hadoop102:8088 does not show directly which ResourceManager is Active and which is Standby (requests to the Standby are redirected to the Active). After killing one of the RM processes, the other one switches to Active.
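
Instead of guessing from the web UI, the ResourceManager states can be queried directly; a minimal sketch:

```
# Query both ResourceManagers; one should report "active", the other "standby".
cd /opt/module/ha/hadoop-2.7.2
for rm in rm1 rm2; do
  printf '%s: ' "$rm"
  bin/yarn rmadmin -getServiceState "$rm"
done
```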

(7) Port numbers

SecondaryNameNode HTTP address: 50090 (dfs.namenode.secondary.http-address)

NameNode HTTP address: 50070 (dfs.namenode.http-address.mycluster.nn1)

NameNode RPC address: 9000 or 8020 (dfs.namenode.rpc-address.mycluster.nn1)

YARN web UI address: 8088 (yarn.resourcemanager.webapp.address.rm1)

ZooKeeper client port: 2181 (ha.zookeeper.quorum)
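
A quick way to see which of these ports are actually listening on the local node; a sketch that assumes the ss utility is available (8485 is the JournalNode port from the qjournal URI above):

```
# Check whether each expected port has a listener on this node.
# Note: 50090 (SecondaryNameNode) will normally not be listening in an HA setup.
for port in 50070 8020 8088 2181 8485 50090; do
  if ss -ltn 2>/dev/null | grep -q ":$port "; then
    echo "port $port: listening"
  else
    echo "port $port: not listening"
  fi
done
```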
