The high-availability (HA) mechanism of Hadoop is only available in Hadoop 2.x. Its implementation depends on a distributed coordination component: ZooKeeper.
Brief introduction to ZooKeeper
ZooKeeper mainly provides distributed coordination services. Its main functions are: 1. storing and managing a small amount of data; 2. monitoring (watching) data nodes.
ZooKeeper roles: leader (responsible for data writes) and follower. The leader and followers are elected dynamically when the cluster starts.
ZooKeeper's job is to control master election and to coordinate operations in a distributed way.
ZooKeeper manages data in a file-tree structure. Each node in the tree is called a znode. A znode can hold a small amount of data (less than 1 MB) and can also have child nodes.
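The znode tree described above can be pictured with a small model. This is an illustrative sketch only, not the real ZooKeeper client API: every node lives at a slash-separated path, may hold a small payload (ZooKeeper rejects payloads over roughly 1 MB by default), and may have children.

```python
ZNODE_DATA_LIMIT = 1024 * 1024  # ZooKeeper's default payload cap is ~1 MB


class ZNode:
    def __init__(self, name, data=b""):
        if len(data) > ZNODE_DATA_LIMIT:
            raise ValueError("znode payload must stay under ~1 MB")
        self.name = name
        self.data = data
        self.children = {}  # child name -> ZNode

    def create(self, path, data=b""):
        """Create a child znode at a slash-separated path, e.g. 'ns1/active'."""
        node = self
        parts = path.strip("/").split("/")
        for part in parts[:-1]:
            node = node.children[part]  # parent znodes must already exist
        child = ZNode(parts[-1], data)
        node.children[parts[-1]] = child
        return child


root = ZNode("/")
root.create("hadoop-ha")
root.create("hadoop-ha/ns1", b"nn1=active")
print(root.children["hadoop-ha"].children["ns1"].data)  # b'nn1=active'
```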
Application scenarios of ZooKeeper:
- Unified naming service: e.g. Dubbo remote calls (webservice and RPC). Unified naming gives services running on multiple machines a single unified name.
- Unified configuration management: store the configuration data of all distributed applications in the ZooKeeper cluster.
- Cluster management: use ZooKeeper to implement dynamic master-node election.
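Dynamic master election is commonly built on ephemeral sequential znodes: each candidate creates a numbered node, the holder of the lowest number is the master, and when its node disappears (session lost) the next-lowest candidate takes over. The sketch below models that idea in plain Python; it is not the real ZooKeeper API.

```python
class Election:
    def __init__(self):
        self.seq = 0
        self.candidates = {}  # candidate name -> sequence number

    def join(self, name):
        # models create("/election/n_", EPHEMERAL_SEQUENTIAL)
        self.candidates[name] = self.seq
        self.seq += 1

    def leader(self):
        # the candidate holding the lowest sequence number is the master
        return min(self.candidates, key=self.candidates.get)

    def crash(self, name):
        # models the ephemeral znode vanishing when a session expires
        del self.candidates[name]


e = Election()
for nn in ("nn1", "nn2"):
    e.join(nn)
print(e.leader())  # nn1 joined first, so it becomes the master
e.crash("nn1")
print(e.leader())  # nn2 takes over automatically
```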
Implementation mechanism of the NN high-availability scheme
Problem: the role that serves client read/write requests is the NN, so once the NN goes down, the whole cluster stops serving. The NN+SN mechanism only makes the metadata reliable; it cannot make the service highly available.
Solution: add a second NN. Points to consider:
- Can both NNs accept client requests normally? No: at any time only one NN serves client requests (its state is active); the other is standby.
- The NN in standby state must be able to switch to active quickly and seamlessly, which requires the metadata on the two NNs to be consistent at all times. Metadata = fsimage + edits, so the main problem is keeping the edits consistent in real time. If every modification to the edits were synchronized between the two NNs immediately, the probability of operation failure would increase (the network transfer from acNN to stNN may fail). If edits were accumulated and synchronized in batches, and the NN went down before a synchronization, the metadata on the two NNs would diverge. Therefore neither NN should hold the edits alone; they are placed in a third-party application instead, which must itself be a highly available, highly reliable cluster. The third-party cluster Hadoop provides is qjournal (a cluster of JournalNodes). In this way the metadata on the two NNs stays fully consistent.
- How does the standby NN perceive that the active NN has failed? The NNs can register session-level (ephemeral) data with ZooKeeper, which disappears automatically when an NN goes offline. In Hadoop's design, a separate process, zkfc, runs on each NN to manage the state of the two NNs. ZooKeeper provides the coordination: each zkfc registers its NN's state data with the ZooKeeper cluster, watches the state data of the other NN, and updates its own NN's state when a switch is needed. How is the pseudo-death phenomenon handled, i.e. how is split-brain avoided during a state switch? When the standby NN (stNN) finds that the active NN (acNN) is abnormal, it does not switch state immediately. It first sends a remote command to acNN over SSH to kill all related processes there. If the command executes successfully on the acNN node, stNN receives a return code confirming the processes are dead, and only then switches state. If no result comes back within a predetermined time, stNN runs a custom script to make sure acNN is completely down (for example, by cutting its power), and then switches state.
- The concept of federation: a pair consisting of an acNN and an stNN forms one federation unit. This makes it possible to run multiple federations (independent namespaces) in a single physical cluster.
- Roles required for a minimal highly available Hadoop cluster: (NN1, zkfc1), (NN2, zkfc2), (zk1, zk2, zk3), (jn1, jn2, jn3), (DN1, nodemanager), (resourcemanager). At least three machines are required.
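The fencing sequence described above (try an SSH kill first, fall back to a custom script on timeout, switch state only afterwards) can be sketched as follows. The function names are hypothetical stand-ins, not Hadoop's actual classes; this is only a model of the decision logic under the stated assumptions.

```python
def fence_and_failover(ssh_kill, fallback_script):
    """ssh_kill() returns True when the remote kill is confirmed by a
    return code; fallback_script() must guarantee the old active is down
    (e.g. by cutting its power)."""
    try:
        confirmed = ssh_kill()  # sshfence: kill the NN processes over ssh
    except TimeoutError:
        confirmed = False       # no return code within the deadline
    if not confirmed:
        fallback_script()       # custom script, e.g. shell(...) power-off
    return "active"             # only now is it safe to take over


events = []

# Case 1: the ssh kill succeeds, so no fallback is needed.
state = fence_and_failover(lambda: True, lambda: events.append("power-off"))
print(state, events)  # active []


# Case 2: the ssh link is dead, so the fallback script runs before the switch.
def dead_link():
    raise TimeoutError


state = fence_and_failover(dead_link, lambda: events.append("power-off"))
print(state, events)  # active ['power-off']
```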
Building an HA Hadoop cluster
1. Configuration file changes
In hadoop-env.sh, set JAVA_HOME to the Java installation path on your machine.
Modification of core-site.xml
<configuration>
  <!-- Specify the nameservice of hdfs as ns1 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1/</value>
  </property>
  <!-- Specify the hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/app/hadoop-2.4.1/tmp</value>
  </property>
  <!-- Specify the zookeeper addresses -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>weekend05:2181,weekend06:2181,weekend07:2181</value>
  </property>
</configuration>
Modification of hdfs-site.xml
<configuration>
  <!-- Specify the nameservice of hdfs as ns1; must match core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <!-- ns1 has two NameNodes: nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>weekend01:9000</value>
  </property>
  <!-- http address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>weekend01:50070</value>
  </property>
  <!-- RPC address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>weekend02:9000</value>
  </property>
  <!-- http address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>weekend02:50070</value>
  </property>
  <!-- Where the NameNode's edits are stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://weekend05:8485;weekend06:8485;weekend07:8485/ns1</value>
  </property>
  <!-- Where each JournalNode stores data on its local disk -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/app/hadoop-2.4.1/journaldata</value>
  </property>
  <!-- Enable automatic failover of the NameNode -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Class that implements client-side failover -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Configure the fencing methods. Multiple methods are separated by
       newlines, i.e. one method per line; shell(...) takes the path of
       the script to run -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <!-- The sshfence method needs passwordless ssh -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <!-- Timeout for the sshfence method -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
Modification of mapred-site.xml
<configuration>
  <!-- Specify yarn as the MapReduce framework -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Modification of yarn-site.xml
<configuration>
  <!-- Enable RM high availability -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Specify the cluster id of the RM -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- Logical names of the RMs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Address of each RM -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>weekend03</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>weekend04</value>
  </property>
  <!-- Specify the zk cluster addresses -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>weekend05:2181,weekend06:2181,weekend07:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Modify the slaves file
The machines that run DN and NM are not listed in the XML configuration files; they are written in the plain-text slaves file. Both the script that starts HDFS and the script that starts YARN read this file (the two sets of machines need not be the same). The slaves file is used only by the startup scripts: a DN or NM that is not listed in slaves can still be started manually, one by one, and will successfully join the cluster after startup.
Set up the passwordless login mechanism
The node from which NN1 is started needs passwordless SSH (ssh-keygen and ssh-copy-id) to NN1, NN2, and all DN nodes; the node from which RM1 is started needs passwordless SSH to all NM nodes. RM2 still has to be started manually. NN1 and NN2 are configured for passwordless login to each other.
- Start the ZooKeeper cluster: execute zkServer.sh start on each ZooKeeper node.
- Manually start each JournalNode: execute hadoop-daemon.sh start journalnode on every node where a JournalNode is to be deployed.
- Format the HDFS namenode: first execute the format command on NN1, then either copy the formatted data from NN1 to NN2 or execute hdfs namenode -bootstrapStandby on NN2. This ensures the fsimage files on the two NNs are identical.
- Format zkfc: execute hdfs zkfc -formatZK, on one NN only (NN1).
- Start HDFS: execute start-dfs.sh, on one NN only (NN1).
- Start YARN: execute start-yarn.sh on RM1; the ResourceManager process on RM2 must be started manually with yarn-daemon.sh start resourcemanager.