1, ZooKeeper cluster operation
Objective: deploy ZooKeeper on the three nodes hadoop101, hadoop102 and hadoop103.
1.1 Extraction and installation
On hadoop101, extract the ZooKeeper installation package into the /usr/local/ directory.
Extract the archive:
[root@hadoop101 local]# tar -zxvf apache-zookeeper-3.6.3-bin.tar.gz
Delete the archive:
[root@hadoop101 local]# rm -rf apache-zookeeper-3.6.3-bin.tar.gz
Rename the extracted directory (the archive unpacks to apache-zookeeper-3.6.3-bin) so that it matches the paths used in the rest of this post:
[root@hadoop101 local]# mv apache-zookeeper-3.6.3-bin zookeeper-3.6.3
Enter the zookeeper-3.6.3/ directory:
[root@hadoop101 local]# cd zookeeper-3.6.3/
Create the zkData directory:
[root@hadoop101 zookeeper-3.6.3]# mkdir zkData
Enter the zkData/ directory:
[root@hadoop101 zookeeper-3.6.3]# cd zkData/
Create and edit myid. The file holds the unique ID that distinguishes this server from the others in the cluster (the three servers here use 1, 2 and 3). See Figure 1:
[root@hadoop101 zkData]# vim myid
Enter the conf/ directory:
[root@hadoop101 zkData]# cd ../conf/
Rename the sample configuration:
[root@hadoop101 conf]# mv zoo_sample.cfg zoo.cfg
Edit zoo.cfg (see Figure 2):
[root@hadoop101 conf]# vim zoo.cfg
Edit the IP mapping (see Figure 3):
[root@hadoop101 conf]# vim /etc/hosts
Figure 1: myid (the three servers use 1, 2 and 3 respectively)
Write only the number for that server into the file. Note: the file must contain no blank lines above or below the number and no spaces to its left or right.
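As a minimal sketch of the myid step, the helper below writes the ID with printf so that the file holds exactly one number followed by a newline, with no stray spaces or blank lines. The function name write_myid and the scratch directory are illustrative only; ZooKeeper itself just reads the file.

```shell
# write_myid DIR ID: write a ZooKeeper server ID into DIR/myid.
# Hypothetical helper for illustration, not part of ZooKeeper.
write_myid() {
    dir=$1
    id=$2
    # Reject empty values and anything containing a non-digit (spaces included).
    case $id in
        ''|*[!0-9]*) echo "myid must be a plain number: '$id'" >&2; return 1 ;;
    esac
    mkdir -p "$dir" && printf '%s\n' "$id" > "$dir/myid"
}

# Usage: write ID 1 into a scratch directory and show the result.
demo_dir=$(mktemp -d)
write_myid "$demo_dir" 1
cat "$demo_dir/myid"
```

On a real node the directory would be /usr/local/zookeeper-3.6.3/zkData, and the ID would be 1, 2 or 3 depending on the server.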
Figure 2: cluster zoo.cfg configuration (all three servers use the same zoo.cfg)
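Figure 2 is not reproduced in the text, so the following is only a sketch of what a three-node zoo.cfg typically looks like. The dataDir and clientPort values come from this post; tickTime, initLimit and syncLimit are the zoo_sample.cfg defaults; ports 2888 and 3888 are the conventional quorum and election ports and are an assumption here, as is the helper name render_zoo_cfg.

```shell
# render_zoo_cfg: print a three-node zoo.cfg to stdout.
# Assumption: ports 2888 (follower-to-leader traffic) and 3888 (election)
# are the conventional choices; the author's actual Figure 2 may differ.
render_zoo_cfg() {
    cat <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-3.6.3/zkData
clientPort=2181
server.1=hadoop101:2888:3888
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
EOF
}

render_zoo_cfg
```

The number after "server." has to match the myid written on that host, which is why the myid values must be unique.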
Figure 3: IP mapping in /etc/hosts (configured on all three servers)
1.2 Send ZooKeeper to the other servers using rsync (only one transfer is shown here)
Install rsync:
[root@hadoop101 local]# yum -y install rsync
Copy the configured ZooKeeper directory to the other machines:
[root@hadoop101 local]# rsync -r ./zookeeper-3.6.3 root@192.168.182.144:/usr/local
The authenticity of host '192.168.182.144 (192.168.182.144)' can't be established.
ECDSA key fingerprint is fc:cb:8b:5f:d1:6a:7b:f7:c9:e6:6d:99:5e:b6:b7:44.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.182.144' (ECDSA) to the list of known hosts.
root@192.168.182.144's password:
After copying, change zkData/myid on each target server to that server's own ID (see Figure 1).
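The copy also carries hadoop101's myid along, so each target needs its own ID afterwards. The sketch below only prints the commands one might run for the remaining two nodes; it does not execute them. It assumes the hostnames resolve through /etc/hosts (Figure 3) and that hadoop102 and hadoop103 take IDs 2 and 3; the function name plan_distribution is hypothetical.

```shell
# plan_distribution: print (not run) the commands that copy ZooKeeper to the
# other nodes and then give each node its own myid.
# Assumption: hadoop102 and hadoop103 use IDs 2 and 3, matching Figure 1.
plan_distribution() {
    src=/usr/local/zookeeper-3.6.3
    for pair in hadoop102:2 hadoop103:3; do
        host=${pair%%:*}   # text before the colon
        id=${pair##*:}     # text after the colon
        echo "rsync -a $src root@$host:/usr/local/"
        echo "ssh root@$host \"echo $id > $src/zkData/myid\""
    done
}

plan_distribution
```

The -a (archive) flag is used here instead of -r so that file permissions, such as the execute bit on the scripts in bin/, are preserved.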
1.3 Turn off the firewall, start the service and check its status
Turn off the firewall on all three servers:
systemctl stop firewalld
Start ZooKeeper on each of the three servers and check its status:
[root@hadoop101 zookeeper-3.6.3]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.6.3/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
# Only the first server is running at this point. Because this is a cluster, more than half of the nodes must be up before a Leader can be confirmed, so ignore the error reported on the first server and check the status again after starting the second one.
[root@hadoop101 zookeeper-3.6.3]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.6.3/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
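The "more than half" rule explains why the status check fails while only one of three servers is up. A small sketch of the arithmetic follows; the function names quorum_size and has_quorum are illustrative.

```shell
# quorum_size N: smallest number of servers that is more than half of N.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum RUNNING TOTAL: succeed when RUNNING servers are enough to
# elect a Leader in a cluster of TOTAL servers.
has_quorum() {
    [ "$1" -ge "$(quorum_size "$2")" ]
}

for running in 1 2 3; do
    if has_quorum "$running" 3; then
        echo "$running of 3 running: a Leader can be elected"
    else
        echo "$running of 3 running: no quorum yet"
    fi
done
```

For the three-node cluster in this post the quorum is 2, so the status command is expected to succeed once the second server has started.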
final result
2, Election mechanism (interview focus)
2.1 ZooKeeper election mechanism (first startup)
- SID: server ID. It uniquely identifies a machine in the ZooKeeper cluster. No two machines may share an SID, and it is the same value as myid.
- ZXID: transaction ID, used to identify a change of server state. At any given moment the ZXID values of the machines in the cluster may not be exactly the same, depending on how far each server has got in processing client update requests.
- Epoch: the number of each Leader's term. While there is no Leader, all votes in the same round carry the same logical clock value, and this value increases with each round of voting.
Election steps (example with five servers)
(1) Server 1 starts and initiates an election. Server 1 votes for itself. It now has one vote, which is less than the majority (3 of the 5 servers), so the election cannot complete and server 1 stays in the LOOKING state.
(2) Server 2 starts and another election is initiated. Servers 1 and 2 each vote for themselves and exchange votes. Server 1 finds that the myid of server 2 is larger than that of its current choice (server 1) and changes its vote to server 2. Server 1 now has 0 votes and server 2 has 2 votes. No server has a majority, so the election cannot complete and servers 1 and 2 stay LOOKING.
(3) Server 3 starts and initiates an election. Servers 1 and 2 both change their votes to server 3. The result is 0 votes for server 1, 0 votes for server 2 and 3 votes for server 3. Server 3 now has a majority and is elected Leader. Servers 1 and 2 change state to FOLLOWING and server 3 changes to LEADING.
(4) Server 4 starts and initiates an election. Servers 1, 2 and 3 are no longer LOOKING and do not change their votes. After the votes are exchanged, server 3 has 3 votes and server 4 has 1 vote. Server 4 defers to the majority, changes its vote to server 3 and changes state to FOLLOWING.
(5) Server 5 starts and, like server 4, becomes FOLLOWING.
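The five steps above can be replayed with a small simulation. This is a deliberately simplified model, not ZooKeeper's actual election code: it assumes all servers start with the same ZXID and epoch, so every LOOKING server ends up voting for the largest myid started so far. The function name simulate_first_start is hypothetical.

```shell
# simulate_first_start TOTAL: replay servers 1..TOTAL starting one at a time.
# Simplified model: equal ZXIDs everywhere, so the largest myid started so
# far collects the votes of every running server.
simulate_first_start() {
    total=$1
    quorum=$(( total / 2 + 1 ))
    leader=0
    sid=1
    while [ "$sid" -le "$total" ]; do
        if [ "$leader" -eq 0 ]; then
            # All $sid running servers now vote for server $sid.
            if [ "$sid" -ge "$quorum" ]; then
                leader=$sid
                echo "server $sid starts: $sid votes for server $sid, elected LEADING"
            else
                echo "server $sid starts: $sid votes for server $sid, need $quorum, still LOOKING"
            fi
        else
            # A Leader exists, so the newcomer simply follows it.
            echo "server $sid starts: leader $leader already exists, FOLLOWING"
        fi
        sid=$(( sid + 1 ))
    done
}

simulate_first_start 5
```

With five servers the third one to start becomes Leader, as in the walk-through; with three servers the same model elects server 2.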
2.2 ZooKeeper election mechanism (not the first startup)
(1) A server in the ZooKeeper cluster enters Leader election when either of the following happens:
• The server starts up and initializes.
• The server cannot maintain its connection to the Leader while running.
(2) When a machine enters the Leader election process, the cluster is in one of the following two states:
• A Leader already exists in the cluster.
In this case, when the machine tries to elect a Leader it is told who the current Leader is. It only needs to connect to the Leader machine and synchronize its state.
• There is no Leader in the cluster.
Suppose the cluster consists of five servers with SIDs 1, 2, 3, 4 and 5 and ZXIDs 8, 8, 8, 7 and 7, and the server with SID 3 is the Leader. At some point servers 3 and 5 fail, so a Leader election begins.
Votes cast by the machines with SIDs 1, 2 and 4:
(EPOCH,ZXID,SID ) | (EPOCH,ZXID,SID ) | (EPOCH,ZXID,SID ) |
---|---|---|
(1,8,1) | (1,8,2) | (1,7,4) |
Leader election rules: ① the vote with the larger EPOCH wins outright; ② if the EPOCHs are equal, the larger transaction ID (ZXID) wins; ③ if the transaction IDs are also equal, the larger server ID (SID) wins.
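The three rules amount to comparing votes field by field. As a rough sketch, the helper below sorts votes written as "EPOCH,ZXID,SID" numerically on each field in turn and takes the last one. The decimal notation follows the table above; real ZXIDs are 64-bit values, and the function name pick_leader is hypothetical.

```shell
# pick_leader VOTE...: each VOTE is "EPOCH,ZXID,SID" in decimal.
# Sorting numerically on field 1, then 2, then 3 applies rules 1 to 3 in
# order; the last line after sorting is the winning vote.
pick_leader() {
    printf '%s\n' "$@" | sort -t, -k1,1n -k2,2n -k3,3n | tail -n 1
}

# The three surviving servers from the example: prints 1,8,2
pick_leader 1,8,1 1,8,2 1,7,4
```

For the example, the epochs are all 1, servers 1 and 2 tie on ZXID 8, and server 2 has the larger SID, so server 2 becomes the new Leader.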
3, ZK cluster start/stop script
3.1 Note: ssh keys must be configured on the three servers
For configuring ssh, see: https://blog.csdn.net/weixin_56219549/article/details/122378045
3.2 Write the script
Create the executable script zk.sh on one of the servers where ZooKeeper is installed (here it is placed in the /usr/local/zookeeper-3.6.3/bin/ directory) with the following content:
[root@hadoop101 bin]# vim zk.sh
#!/bin/bash
# Run zkServer.sh start, stop or status on every node over ssh.
case $1 in
"start"){
    for i in hadoop101 hadoop102 hadoop103
    do
        echo "---------- zookeeper $i start-up ------------"
        ssh "$i" "/usr/local/zookeeper-3.6.3/bin/zkServer.sh start"
    done
};;
"stop"){
    for i in hadoop101 hadoop102 hadoop103
    do
        echo "---------- zookeeper $i stop it ------------"
        ssh "$i" "/usr/local/zookeeper-3.6.3/bin/zkServer.sh stop"
    done
};;
"status"){
    for i in hadoop101 hadoop102 hadoop103
    do
        echo "---------- zookeeper $i state ------------"
        ssh "$i" "/usr/local/zookeeper-3.6.3/bin/zkServer.sh status"
    done
};;
esac
3.3 Add execute permission to the script (once it is executable, the file name is shown in green in the ls listing)
[root@hadoop101 bin]# chmod u+x zk.sh
3.4 Configure JAVA_HOME on the servers and run the script
Add JAVA_HOME to ZooKeeper's zkEnv.sh on all servers, as shown in the figure below.
If it is not configured, the following error is reported:
JAVA_HOME is not set and java could not be found in PATH.
[root@hadoop101 bin]# ./zk.sh stop
---------- zookeeper hadoop101 stop it ------------
Error: JAVA_HOME is not set and java could not be found in PATH.
bash: line 1: stop: command not found
---------- zookeeper hadoop102 stop it ------------
Error: JAVA_HOME is not set and java could not be found in PATH.
bash: line 1: stop: command not found
---------- zookeeper hadoop103 stop it ------------
Error: JAVA_HOME is not set and java could not be found in PATH.
bash: line 1: stop: command not found
After configuration the script runs normally:
[root@hadoop101 bin]# ./zk.sh stop
---------- zookeeper hadoop101 stop it ------------
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.6.3/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
---------- zookeeper hadoop102 stop it ------------
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.6.3/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
---------- zookeeper hadoop103 stop it ------------
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.6.3/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
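The JAVA_HOME error above most likely appears because a command run through ssh uses a non-login shell, which usually does not load the profile files where JAVA_HOME is normally exported. The sketch below imitates the kind of lookup that leads to that message. It is a simplified stand-in written for this post, not a copy of zkEnv.sh, and the name resolve_java is hypothetical.

```shell
# resolve_java: print the java binary that would be used, or fail with the
# error message seen above. Simplified stand-in, not the real zkEnv.sh.
resolve_java() {
    if [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        # JAVA_HOME is set and points at a usable JDK.
        echo "$JAVA_HOME/bin/java"
    elif command -v java >/dev/null 2>&1; then
        # Fall back to whatever java is on PATH.
        command -v java
    else
        echo "Error: JAVA_HOME is not set and java could not be found in PATH." >&2
        return 1
    fi
}

resolve_java || echo "hint: export JAVA_HOME=/path/to/jdk in zkEnv.sh"
```

Exporting JAVA_HOME inside zkEnv.sh makes the first branch succeed regardless of how the shell was started. The path /path/to/jdk is a placeholder for the JDK location on your servers.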
4, Client command line operation
4.1 Command line syntax
Basic command syntax | Function description |
---|---|
help | Display all operation commands |
ls path | View the child nodes of a znode (can be watched). -w watches for changes in the child nodes; -s shows additional stat information |
create path [data] | Create a node. By default the node is persistent and has no sequence number. -s appends a sequence number to the name; -e creates an ephemeral (temporary) node, which disappears when the client session ends, for example on restart or timeout |
get path | Get the value of a node (can be watched). -w watches for changes to the node's content; -s shows additional stat information |
set path data | Set the value of a node |
stat path | View node status |
delete path | Delete a node |
deleteall path | Recursively delete a node and its children |
Start client
[root@hadoop101 bin]# ./zkCli.sh -server hadoop101:2181
Display all operation commands
[zk: hadoop101:2181(CONNECTED) 0] help
4.2 znode data information
View the children of the root znode:
[zk: hadoop101:2181(CONNECTED) 1] ls /
[zookeeper]
View detailed information for the node:
[zk: hadoop101:2181(CONNECTED) 2] ls -s /
[zookeeper]
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1
Attribute | Explanation |
---|---|
cZxid | zxid of the transaction that created the znode. Every change to ZooKeeper state produces a transaction ID (zxid), and zxids give a total order over all changes: each change has a unique zxid, and if zxid1 is less than zxid2, then zxid1 happened before zxid2. |
ctime | Time the znode was created, in milliseconds since 1970 |
mZxid | zxid of the transaction that last updated the znode |
mtime | Time the znode was last modified, in milliseconds since 1970 |
pZxid | zxid of the transaction that last changed the znode's children |
cversion | Number of changes to the znode's children |
dataVersion | Number of changes to the znode's data |
aclVersion | Number of changes to the znode's access control list |
ephemeralOwner | For an ephemeral node, the session id of the znode's owner; 0 if the node is not ephemeral |
dataLength | Length of the znode's data |
numChildren | Number of children of the znode |
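Since the stat output is plain "key = value" text, individual attributes are easy to pull out in a script. The sketch below reads such text on standard input; stat_field and is_ephemeral are hypothetical helper names, and the sample values are copied from outputs shown in this post.

```shell
# stat_field NAME: read "key = value" lines on stdin and print NAME's value.
stat_field() {
    awk -F' = ' -v key="$1" '$1 == key { print $2 }'
}

# is_ephemeral: succeed when the stat text on stdin has a non-zero
# ephemeralOwner, which means the node belongs to a client session.
is_ephemeral() {
    owner=$(stat_field ephemeralOwner)
    [ -n "$owner" ] && [ "$owner" != "0x0" ]
}

# Usage with two lines taken from a stat listing: prints 15
printf 'ephemeralOwner = 0x10000a4732a000a\ndataLength = 15\n' | stat_field dataLength
```

In practice the input would come from the output of a stat or get -s command saved to a file or piped from the client.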
4.3 Node types (persistent / ephemeral / with sequence number / without sequence number)
Persistent: the created node is not deleted after the client disconnects from the server.
Ephemeral: the created node is deleted automatically after the client disconnects from the server.
Note: when a znode is created with the sequence flag, a value is appended to its name. The sequence number is a monotonically increasing counter maintained by the parent node.
Note: in a distributed system, sequence numbers can be used to order all events globally, so clients can infer the order of events from them.
- Persistent node: after the client disconnects from ZooKeeper, the node still exists.
- Persistent sequential node: after the client disconnects from ZooKeeper, the node still exists, and ZooKeeper appends a sequence number to the node name.
- Ephemeral node: after the client disconnects from ZooKeeper, the node is deleted.
- Ephemeral sequential node: after the client disconnects from ZooKeeper, the node is deleted, and ZooKeeper appends a sequence number to the node name.
Create two ordinary nodes (persistent, without sequence number)
# A parent node must exist before a child node can be created under it; otherwise an error is reported
[zk: hadoop101:2181(CONNECTED) 0] create /presistent/one
Node does not exist: /presistent/one
[zk: hadoop101:2181(CONNECTED) 1] create /presistent "Create persistent node"
Created /presistent
[zk: hadoop101:2181(CONNECTED) 2] create /presistent/one "Create persistent node,Child node one"
Created /presistent/one
Note: a value can be assigned when the node is created.
Get the value of a node
-s shows the node details
[zk: hadoop101:2181(CONNECTED) 3] get /presistent
Create persistent node
[zk: hadoop101:2181(CONNECTED) 4] get /presistent/one
Create persistent node,Child node one
[zk: hadoop101:2181(CONNECTED) 5] get -s /presistent
Create persistent node
cZxid = 0x50000007e
ctime = Sat Jan 08 17:08:39 CST 2022
mZxid = 0x50000007e
mtime = Sat Jan 08 17:08:39 CST 2022
pZxid = 0x50000007f
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 18
numChildren = 1
[zk: hadoop101:2181(CONNECTED) 6] get -s /presistent/one
Create persistent node,Child node one
cZxid = 0x50000007f
ctime = Sat Jan 08 17:09:11 CST 2022
mZxid = 0x50000007f
mtime = Sat Jan 08 17:09:11 CST 2022
pZxid = 0x50000007f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 31
numChildren = 0
Create nodes with sequence numbers (persistent, with sequence number)
[zk: hadoop101:2181(CONNECTED) 7] create -s /presistent/serialnumber "lasting+Serial number"
Created /presistent/serialnumber0000000001
[zk: hadoop101:2181(CONNECTED) 8] create -s /presistent/serialnumber "lasting+Serial number 1"
Created /presistent/serialnumber0000000002
[zk: hadoop101:2181(CONNECTED) 9] create -s /presistent/serialnumbertwo "lasting+Serial number 2"
Created /presistent/serialnumbertwo0000000003
# Type the prefix and press the Tab key, not Enter (Enter reports an error), to list the matching node names
[zk: hadoop101:2181(CONNECTED) 10] get -s /presistent/serialnumber
serialnumber0000000001 serialnumber0000000002 serialnumbertwo0000000003
[zk: hadoop101:2181(CONNECTED) 11] get /presistent/serialnumber0000000001
lasting+Serial number
[zk: hadoop101:2181(CONNECTED) 12] get /presistent/serialnumber0000000002
lasting+Serial number 1
[zk: hadoop101:2181(CONNECTED) 13] get /presistent/serialnumbertwo0000000003
lasting+Serial number 2
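The sequence number is assigned by the parent node and is shared by all of its children, which is why serialnumbertwo received 0000000003. Under a parent that has had no children, numbering starts at 0 (see /dunzan/node0000000000 further down). /presistent already had one child (one) when the first sequential node was created, so numbering here started at 1, and each later child takes the next number.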
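The suffixes in the session above are 10-digit, zero-padded decimal counters appended to the name given to create -s. As a small illustration of the naming only (the counter itself is kept by the ZooKeeper server), the hypothetical helper sequential_name rebuilds the names seen in this post:

```shell
# sequential_name PREFIX COUNTER: append COUNTER as a 10-digit,
# zero-padded suffix, mirroring the node names shown in this post.
sequential_name() {
    printf '%s%010d\n' "$1" "$2"
}

sequential_name /presistent/serialnumber 1   # /presistent/serialnumber0000000001
sequential_name /dunzan/node 0               # /dunzan/node0000000000
```

Because the padding is fixed width, sorting the child names as plain strings also sorts them in creation order when they share a prefix.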
Create an ephemeral node without a sequence number
[zk: hadoop101:2181(CONNECTED) 14] create -e /ephemeral "Transient no node"
Created /ephemeral
[zk: hadoop101:2181(CONNECTED) 15] get /ephemeral
Transient no node
[zk: hadoop101:2181(CONNECTED) 16] get -s /ephemeral
Transient no node
cZxid = 0x500000087
ctime = Sat Jan 08 17:21:38 CST 2022
mZxid = 0x500000087
mtime = Sat Jan 08 17:21:38 CST 2022
pZxid = 0x500000087
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x10000a4732a000a
dataLength = 15
numChildren = 0
Modify the value with set
[zk: hadoop101:2181(CONNECTED) 17] set /ephemeral "Modify the node free value"
[zk: hadoop101:2181(CONNECTED) 18] get /ephemeral
Modify the node free value
Create ephemeral nodes with sequence numbers
# Child nodes cannot be created under an ephemeral node (ephemeral nodes cannot have children)
[zk: hadoop101:2181(CONNECTED) 19] create -e -s /ephemeral/node "Transient node"
Ephemerals cannot have children: /ephemeral/node
# Likewise, you cannot skip a missing parent node and create a child directly
[zk: hadoop101:2181(CONNECTED) 20] create -e -s /dunzan/node "Transient node"
Node does not exist: /dunzan/node
# Create a persistent node first, then create the ephemeral node under it
[zk: hadoop101:2181(CONNECTED) 21] create /dunzan
Created /dunzan
[zk: hadoop101:2181(CONNECTED) 22] create -e -s /dunzan/node "Temporary node with serial number"
Created /dunzan/node0000000000
[zk: hadoop101:2181(CONNECTED) 23] get /dunzan/node0000000000
Temporary node with serial number
[zk: hadoop101:2181(CONNECTED) 24] set /dunzan/node0000000000 "Modified node with serial number"
[zk: hadoop101:2181(CONNECTED) 25] get /dunzan/node0000000000
Modified node with serial number
View all nodes
[zk: hadoop101:2181(CONNECTED) 26] ls /
[dunzan, ephemeral, presistent, zookeeper]
Exit the current client and then restart it
[zk: hadoop101:2181(CONNECTED) 27] quit
# The ephemeral node /ephemeral is gone; the nodes that remain were created as persistent nodes
[zk: hadoop101:2181(CONNECTED) 0] ls /
[dunzan, presistent, zookeeper]
# The ephemeral child under /dunzan is gone as well
[zk: hadoop101:2181(CONNECTED) 1] ls /dunzan
[]
5, Listener principle
The client registers a watch on the nodes it cares about. When such a node changes (its data changes, it is deleted, or child nodes are added or removed), ZooKeeper notifies the client. This watch mechanism lets applications that listen on a node react quickly to any change in the data ZooKeeper stores.
5.1 Listener principle in detail
- First, there must be a main() thread.
- A ZooKeeper client is created in the main thread. This creates two more threads: one for network communication (connect) and one for listening (listener).
- The connect thread sends the registered watch event to ZooKeeper.
- ZooKeeper adds the registered watch event to its list of registered listeners.
- When ZooKeeper detects a data or path change, it sends a message to the listener thread.
- The listener thread then calls the process() method.
5.2 Common watches
1) Watch for changes to a node's data
get -w path
2) Watch for child nodes being added or removed
ls -w path
Watching for node value changes
(1) On the hadoop103 host, register a watch for data changes on the /presistent node
[zk: hadoop103:2181(CONNECTED) 0] get -w /presistent
Create persistent node
(2) On the hadoop101 host, modify the data of the /presistent node
[zk: hadoop101:2181(CONNECTED) 0] set /presistent "Modify persistent node"
(3) Observe that the hadoop103 host receives the data change notification
[zk: hadoop103:2181(CONNECTED) 1]
WATCHER::
WatchedEvent state:SyncConnected type:NodeDataChanged path:/presistent
Note: if you modify the value of /presistent on hadoop101 several more times, hadoop103 receives no further notifications. A watch that is registered once fires only once; to be notified again, you must register again.
Watching for child node changes (path changes)
(1) On the hadoop103 host, register a watch for changes to the children of the /presistent node
[zk: hadoop103:2181(CONNECTED) 0] ls -w /presistent
[one, serialnumber0000000001, serialnumber0000000002, serialnumbertwo0000000003]
(2) On the hadoop101 host, create a child node under /presistent
[zk: hadoop101:2181(CONNECTED) 1] create /presistent/three "Persistent node III"
Created /presistent/three
(3) Observe that the hadoop103 host receives the child node change notification
[zk: hadoop103:2181(CONNECTED) 1]
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/presistent
Note: a watch on child node changes is likewise registered once and fires once. To be notified multiple times, you need to register multiple times.
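The register-once, fire-once behaviour described in the two notes can be modelled in a few lines. This is a toy model of the bookkeeping only, with no networking and no real ZooKeeper client; the names WATCHES, register_watch and fire_event are all hypothetical.

```shell
WATCHES=""   # space-separated list of paths that currently have a watch

# register_watch PATH: remember that a client is watching PATH.
register_watch() {
    WATCHES="$WATCHES $1"
}

# fire_event PATH TYPE: if PATH is watched, print the event and drop the
# watch, so a second change to PATH produces no notification.
fire_event() {
    LAST_EVENT=""
    remaining=""
    for watched in $WATCHES; do
        if [ "$watched" = "$1" ]; then
            LAST_EVENT="WatchedEvent state:SyncConnected type:$2 path:$1"
        else
            remaining="$remaining $watched"
        fi
    done
    WATCHES=$remaining
    [ -n "$LAST_EVENT" ] && echo "$LAST_EVENT"
    return 0
}

register_watch /presistent
fire_event /presistent NodeDataChanged   # prints the event
fire_event /presistent NodeDataChanged   # prints nothing: the watch is spent
```

Calling register_watch again before the second change would make the second notification appear, which corresponds to running get -w or ls -w again in the client.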
5.3 Deleting and viewing nodes
[zk: hadoop101:2181(CONNECTED) 2] ls /
[dunzan, presistent, zookeeper]
[zk: hadoop101:2181(CONNECTED) 3] ls /presistent
[one, serialnumber0000000001, serialnumber0000000002, serialnumbertwo0000000003, three]
[zk: hadoop101:2181(CONNECTED) 4] delete /presistent/three
Recursive deletion
[zk: hadoop101:2181(CONNECTED) 5] ls /presistent
[one, serialnumber0000000001, serialnumber0000000002, serialnumbertwo0000000003]
[zk: hadoop101:2181(CONNECTED) 6] deleteall /presistent
[zk: hadoop101:2181(CONNECTED) 7] ls /
[dunzan, zookeeper]
View node status
[zk: hadoop101:2181(CONNECTED) 8] stat /zookeeper
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -2
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 2