Knowledge and use of Docker Swarm

Docker series
Chapter 3: Docker Swarm

Preface

This chapter builds on the previous two chapters on Docker and Docker Compose. If you have not read them yet, it is best to work through them in order before coming back here. The theory can feel dry, but basic theory is the foundation of practice; only once you understand it will the deeper meaning become clear.

1, Introduction to Docker Swarm

1.1. What is docker swarm?

  Docker Swarm is Docker's cluster management tool, developed almost entirely in Go. It pools several Docker hosts and abstracts them into a single virtual host. Docker Swarm exposes the standard Docker API for managing the Docker resources on those hosts, so any tool that already communicates with the Docker daemon can be scaled out to multiple hosts through Swarm.

1.2. Why Docker Swarm

   Whether you deploy a single service in a Docker container or orchestrate multiple services with Docker Compose, the operation is limited to one host. In practice, however, services are usually deployed as distributed clusters: a single host cannot support a whole distributed system, let alone avoid single points of failure. Even with multiple servers, containers on different hosts cannot communicate with each other out of the box. Docker Swarm provides the solution to this series of problems.

1.3. Docker Swarm features

  • Integration of cluster management into Docker Engine
    Using the built-in cluster management function, you can directly use Docker CLI to create and manage Swarm clusters without relying on other external software tools.
  • Decentralized design
    A Node is an instance of Docker Engine in the Swarm cluster. Nodes are divided into Managers and Workers. While a Swarm cluster is running, you can scale the cluster out or in without suspending or restarting cluster services, for example by adding Manager nodes or removing Worker nodes.
  • Declarative Service Model
    Docker Engine uses a declarative approach to let you define the desired state of the various services in your application stack. For example, you might describe an application that consists of a web front-end service, a message queuing service, and a database back-end.
  • Service expansion and capacity reduction
    For each service, you can declare the number of tasks to run. When you scale up or down, the Swarm Manager automatically adapts by adding or removing tasks to maintain the desired state.
  • Coordinate the consistency between the expected state and the actual state
    The Swarm Manager nodes in the cluster continuously monitor the cluster state and reconcile any difference between the actual state and the desired state. For example, if a service is set to run 10 replica containers and the server hosting 2 of those replicas crashes, the Manager creates two new replicas to replace the crashed ones and schedules them onto available Worker nodes, keeping the actual replica count consistent with the desired number.
  • Multi host network
    You can specify an overlay network for the services to be deployed on. When the service is initialized or updated, Swarm Manager automatically assigns an IP address (a virtual IP) to the containers on the specified overlay network.
  • Service discovery
    Swarm Manager assigns each service in the cluster a unique DNS name and load balances across its running containers. Through this DNS name you can query the containers running in the cluster.
  • load balancing
    Within Swarm, you can specify how to distribute service containers among nodes to achieve load balancing. If you want to use a load balancer outside the cluster, you can expose the Service Container port.
  • security policy
    Nodes within the Swarm cluster are required to use mutual TLS authentication, and communication between nodes in the cluster is securely encrypted. You can choose to use a self-signed root certificate or a custom root CA certificate.
  • Rolling update
    For service updates, you can roll out incremental deployments across multiple nodes. The Swarm Manager lets you set a delay between deployments to groups of nodes via the Docker CLI, giving flexible control over the update. If a service update fails, you can pause subsequent updates and roll back to the previous version.

1.4. Swarm key concepts

  • Swarm
    A cluster of Docker Engines. The SwarmKit embedded in Docker Engine is used for cluster management and scheduling. You can enable Swarm mode when Docker is initialized, or join an existing Swarm.
  • Node
    A single Docker Engine instance, also called a Docker node. You can run one or more nodes on a single physical machine or cloud server, or across multiple physical machines and cloud servers. On top of Docker Engine, a node can create and manage multiple Docker containers. When the cluster is first built, the first node becomes the Swarm Manager; by function, nodes are divided into Manager nodes and Worker nodes.
  • Manager Node
    Management node. A cluster requires at least one Manager node, which is responsible for managing the whole cluster: maintaining cluster state, scheduling services, and serving the Swarm mode HTTP API endpoints. For scheduling, a Task means starting a Docker container on some node in the Swarm cluster, and one or more such containers run on Worker nodes in the cluster. The Manager node is also responsible for container orchestration and cluster management functions, and for maintaining cluster state. By default, a Manager node also executes tasks as a Worker node; Swarm also supports configuring Manager nodes as dedicated management-only nodes.
  • Worker Node
    Work node. It receives the Tasks scheduled and assigned by a Manager node and starts Docker containers to run the specified services. A Worker node reports the execution status of its assigned Tasks back to the Manager node.
  • Service
    Service is an abstract concept: it is just a description of the desired state of an application service running on the Swarm cluster. Like a manifest, it describes the service name, which images to use, how many replicas to run, which networks the containers attach to, which ports are mapped, and so on. A service is the definition of the tasks executed on worker nodes. When creating a service, you must specify the container image to use; a service can run only one image, but it can run multiple containers from that image.
  • Stack
    A Stack is a group of services, similar to a docker-compose project. A stack is defined in a YAML file. By default, the services in one stack share one network and can reach each other, while being isolated from the networks of other stacks. A stack exists purely for convenient orchestration: the docker stack command operates on a whole stack at once, without handling services one by one.
  • Task
    Tasks are the atomic unit of scheduling in the cluster; each corresponds to a single Docker container running within a Service. Each Task is assigned a unique numeric ID within the Service, from 1 up to the replica count set on the Service, and each Task has a corresponding container. In effect, a Task is a Docker container.
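To make the Stack concept above concrete, here is a minimal sketch of a stack file (the service names, images, and port are illustrative assumptions, not taken from the original):

```yaml
# docker-stack.yml — hypothetical two-service stack; by default both
# services share the stack's own network and can reach each other by name
version: "3.7"
services:
  web:                  # illustrative service name
    image: nginx:stable
    ports:
      - "8080:80"
  cache:                # `web` can reach this service simply as `cache`
    image: redis:alpine
    deploy:
      replicas: 1
```

Each service here becomes a Swarm Service, and each running replica of it is one Task.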

1.5. Relationship among stacks, Services and Tasks

1.6. Difference between docker swarm and Docker Compose

  Docker Swarm, like Docker Compose, is an official Docker container orchestration tool. The difference is that Docker Compose can only create and manage multiple containers on a single server or host, whereas Docker Swarm can create container cluster services across multiple servers or hosts. For deploying microservices, Docker Swarm is the better choice.

1.7. Comparison between docker stack and Docker Compose

1.7.1. Introduction

  Docker Engine integrated Docker Swarm in version 1.12 and brought a new container orchestration command, docker stack. That is, you can use the docker stack command to create a stack of Docker containers from a docker-compose.yml file, without installing Docker Compose.

1.7.2. Similarities

   Both work in the same way: both use a yml file to define the container orchestration, and both can operate on the services, volumes, networks, etc. defined in that file.

# compose
docker-compose -f docker-compose.yml up
# stack
docker stack deploy -c docker-compose.yml <stackname>

1.7.3. Difference

  • Support composition version difference
    docker stack does not support compose files written to the v2 specification; the latest v3 version must be used. The docker-compose tool can still handle both v2 and v3.

    tool            Compose v1   Compose v2   Compose v3
    docker-compose  ✓            ✓            ✓
    docker stack    ×            ×            ✓
  • Instruction difference
    docker stack does not support the build instruction in a compose file; that is, images cannot be built by the stack command, which requires pre-built images. By contrast, docker compose can build images on the fly, which makes it better suited to iterative development, testing, and rapid prototyping. Both tools ignore certain instructions that are legal in a compose file; for details, see the notes under each instruction in the official documentation: https://docs.docker.com/compose/compose-file/compose-file-v3/

  • Birth difference
    Docker Compose is a Python project. It began as a Python project named Fig, which could parse a fig.yml and start a stack of Docker containers. The tool gradually evolved and was renamed docker-compose, but it remains a Python tool that sits on top of the Docker Engine: internally it uses the Docker API to start containers according to the specification, so Docker Compose must still be installed separately to be used with Docker. docker stack, by contrast, is supported natively by the Docker Engine, so no additional tool needs to be installed to start a stack of Docker containers.
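To illustrate the instruction difference above, consider this hypothetical compose fragment: docker-compose honors the build key and can build the image on the fly, while docker stack ignores it and requires the named image to already exist in a registry:

```yaml
version: "3.7"
services:
  app:
    build: ./app            # used by docker-compose; ignored by docker stack
    image: myrepo/app:1.0   # docker stack requires this image to be pre-built and pullable
```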

1.8. Swarm basic architecture

   Docker Engine 1.12 introduced Swarm mode, which lets you create a cluster of one or more Docker Engines, called a swarm. A swarm consists of one or more nodes: physical or virtual machines running Docker Engine 1.12 or later in Swarm mode.

  • Manager nodes
    Manager nodes use the Raft protocol to ensure data consistency in a distributed setting. To keep Manager nodes highly available, Docker recommends using an odd number of them, so that a failed Manager node can be recovered without shutting the cluster down. N Manager nodes can tolerate the simultaneous failure of (N-1)/2 of them while remaining available; for example, 3 Manager nodes can tolerate the failure of 1. This does not mean that adding more management nodes gives higher scalability or performance; generally the opposite is true. The official recommendation is no more than 7 Manager nodes per cluster.

  • Worker nodes
    The worker node does not participate in the Raft distributed state, does not make scheduling decisions, and does not provide services for the swarm mode HTTP API. Its only purpose is to execute the container. The worker nodes communicate through the control plane. This communication uses the gossip protocol and is asynchronous.
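The manager fault-tolerance rule above is just integer arithmetic; as a quick sanity check in plain shell (no Docker required):

```shell
#!/bin/sh
# Raft quorum: N managers tolerate floor((N-1)/2) simultaneous failures
for managers in 1 3 5 7; do
  tolerated=$(( (managers - 1) / 2 ))
  echo "$managers manager(s) tolerate $tolerated failure(s)"
done
```

This is also why even-sized manager sets add no fault tolerance: 3 and 4 managers both tolerate only 1 failure.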

1.9. Swarm network communication

   A Swarm-managed cluster establishes a unified network layer internally. A port you publish is not bound directly to one host; it is bound to the overlay network. For the cluster, it is as if the port belongs to the network itself, with a load-balancing policy forwarding requests to the containers.

2, Common commands

2.1. Managing profiles

docker config

# View created profiles
docker config ls
# Register an existing local file as a Docker config
docker config create <config name> <local file>
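Putting the two commands above together, a typical flow might look like this (the file and config names are hypothetical, and the sketch assumes you are on a Swarm manager node):

```shell
# Register a local file as a Swarm config (run on a manager)
echo "key = value" > app.conf            # hypothetical config content
docker config create my_app_conf app.conf

# Attach the config to a service; Swarm mounts it read-only in each container
docker service create --name web \
  --config source=my_app_conf,target=/etc/app/app.conf \
  nginx:stable

# Confirm the config is registered
docker config ls
```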

2.2. Managing Swarm nodes

docker node

# Viewing nodes in a cluster
docker node ls
# Demote manager role to worker
docker node demote <host name>
# Upgrade worker role to manager
docker node promote <host name>
# View node details (JSON format by default)
docker node inspect <host name>
	# View node information in human-readable format
	docker node inspect --pretty <host name>
# List the tasks running on one or more nodes; defaults to the current node
docker node ps
# Remove a node from the swarm
docker node rm <host name>
# Update a node
docker node update
	# Set the availability of a node ("active" normal | "pause" stop receiving new tasks | "drain" evacuate its own tasks)
	docker node update --availability
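A common use of the commands above is taking a node out of service for maintenance; a sketch, assuming a manager node and a hypothetical host named worker-1:

```shell
# Drain the node: Swarm reschedules its tasks onto other nodes
docker node update --availability drain worker-1

# Verify that no tasks remain scheduled on it
docker node ps worker-1

# ...perform maintenance on worker-1...

# Return the node to service
docker node update --availability active worker-1
```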

2.3. Manage sensitive data storage

# Manage sensitive data storage
docker secret

2.4. Management service stack

docker stack

# A service stack orchestrates a group of services; the file format is the same as Docker Compose
docker stack
	# Deploy from a .yml file
	docker stack deploy -c <file name>.yml <stack name>
	# List orchestrated stacks
	docker stack ls

2.5. Cluster management

docker swarm

# Initialize a swarm
docker swarm init
	# Specify the IP address the management node advertises
	docker swarm init --advertise-addr <manager IP address>
	# Force a new cluster, removing all manager identities other than the local node
	docker swarm init --force-new-cluster
	# Join a node to the swarm cluster; a node can join as a manager or as a worker
	docker swarm join
		# A joining node must authenticate with a join token
		docker swarm join-token
		# Print the worker join command again
		docker swarm join-token worker
	# Leave swarm
	docker swarm leave
	# Update the configuration of swarm cluster
	docker swarm update

2.6. Management services

docker service

# Create a service
docker service create
      # Number of replicas to create
      docker service create --replicas <number of replicas>
      # Specify the service name
      docker service create --name <name>
      # Delay between updates to each container
      docker service create --update-delay <seconds>
      # Number of containers updated concurrently; default 1
      docker service create --update-parallelism <number>
      # Action when a container update fails (pause | continue); default pause
      docker service create --update-failure-action <type>
      # Duration to monitor each container for failure after a rollback
      docker service create --rollback-monitor 20s
      # Maximum failure ratio tolerated during a rollback (e.g. .2 means 20%)
      docker service create --rollback-max-failure-ratio <ratio>
      # Add network
      docker service create --network <net name>
      # Mount a named volume into the container
      docker service create --mount type=volume,src=<volume name>,dst=<container directory>
      # Create a bind mount (read/write)
      docker service create --mount type=bind,src=<host directory>,dst=<container directory>
      # Create a bind mount (read-only)
      docker service create --mount type=bind,src=<host directory>,dst=<container directory>,readonly
      # Use DNS round-robin (dnsrr) as the load balancing mode
      docker service create --endpoint-mode dnsrr <service name>
      # Provide a Docker config to the container at the given path
      docker service create --config source=<docker config name>,target=<path in container>
      # Publish a port
      docker service create --publish <published port>:<container port> <service name>
# View service details (JSON format by default)
docker service inspect
      # View service information in human-readable format
      docker service inspect --pretty <service name>
# View service output (logs)
docker service logs
# List services
docker service ls
# List service task information
docker service ps    
      # View service startup information
      docker service ps <service name>
      # Show only tasks in the running state
      docker service ps -f "desired-state=running" <service name>
# Delete service
docker service rm
# Scale a service up or down
docker service scale
      # Set the number of replicas for a service
      docker service scale <service name>=<number of replicas>
# Update service related configuration
docker service update
      # Update the command arguments passed to the service's containers
      docker service update --args <arguments> <service name>
      # Update the service's container image version
      docker service update --image <new image version> <service name>
      # Roll back the service to the previous version
      docker service update --rollback <service name>
      # Add container network
      docker service update --network-add <net name> <service name>
      # Delete container network
      docker service update --network-rm <net name> <service name>
      # Add exposed port to service
      docker service update --publish-add <Exposed port>:<Container port> <service name>
      # Remove exposed ports
      docker service update --publish-rm <Exposed port>:<Container port> <service name>
      # Modify the load balancing mode to dnsrr
      docker service update --endpoint-mode dnsrr <service name>
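Combining the flags above, a typical rolling-update setup might look like this (the service name and image versions are illustrative, and the commands assume a Swarm manager node):

```shell
# Create the service with an update policy: one container at a time,
# 10s between containers, pause the rollout if an update fails
docker service create --name web --replicas 4 \
  --update-parallelism 1 --update-delay 10s \
  --update-failure-action pause \
  nginx:1.7.9

# Later, roll the whole service to a new image under that policy
docker service update --image nginx:stable web

# If something goes wrong, roll back to the previous version
docker service update --rollback web
```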

3, Cluster practice

3.1. Preparation

3.1.1. Prepare the machine

Prepare 3 machines with network connectivity between them (2 will do if 3 are not available; more is even better)

IP              role
192.168.61.129  manager
192.168.61.130  worker
192.168.61.131  worker

3.1.2. Open port

Before creating a cluster, the following ports need to be open. In a test environment with the firewall turned off, this can be ignored; in a production environment with the firewall on, the corresponding ports must be opened.

port  purpose
2377  TCP port 2377 for cluster management communication
7946  TCP and UDP port 7946 for communication among nodes
4789  TCP and UDP port 4789 for overlay network traffic

Attachment: turn off firewall

# Stop the firewall (effective immediately)
systemctl stop firewalld
# Disable the firewall so it stays off after reboot
systemctl disable firewalld
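Alternatively, if the firewall must stay on, the ports from the table above can be opened with firewalld (a sketch assuming firewalld is the firewall in use):

```shell
# Open the Swarm ports permanently
firewall-cmd --permanent --add-port=2377/tcp
firewall-cmd --permanent --add-port=7946/tcp
firewall-cmd --permanent --add-port=7946/udp
firewall-cmd --permanent --add-port=4789/tcp
firewall-cmd --permanent --add-port=4789/udp
# Reload the firewall so the rules take effect
firewall-cmd --reload
```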

3.2. Create management node

Create node

# Run the following command to create a node. The first node created is the management node
docker swarm init --advertise-addr 192.168.61.129

Remember the token printed after successful execution; this token is the credential other nodes use to join the cluster.

3.3. View node information

The commands for viewing node information can only be executed on a management node

# Option 1
docker node list
# Option 2
docker node ls

3.4. Add work node

# Execute the following command on each work node machine, where the token is the value generated when the management node was created in the previous step
docker swarm join --token SWMTKN-1-0a6fjic9st893u2ajmtkyjf2kxz882t2ti3f3r2vgjn87pbvrc-7l5k7wjkoahclx8zf55g0we27 192.168.61.129:2377

The worker nodes have joined successfully

Checking the node list again, you can see that all 3 machines are in the cluster

3.5. Add more management nodes

When the cluster needs multiple management nodes, execute the following command on an existing management node (it cannot be executed on a worker node) to obtain the manager join command, then run that command on the new machine that should join as a manager

# Execute the following command on the management node to obtain the token information added to the management node
docker swarm join-token manager

You can see the command for joining as a management node; copy it to the new machine and execute it

3.6. Exit node

# If you want to exit the node, you can execute the following command on the machine you want to exit
docker swarm leave

Check the node list on the management node: if the node's status shows Down, it has left successfully. To rejoin, execute the original worker join command again

3.7. Docker Service practice

3.7.1. Deploying the service

Here, an nginx service with three replicas is deployed across the three machines

# Execute the following command on the management node (the image is pulled automatically if not present locally)
docker service create --name nginx --replicas 3 -p 80:80 nginx:1.7.9

3.7.2. Viewing the service

# View the list of services (executed on the management node)
docker service ls
# View service details (executed on the management node)
docker service ps nginx

Viewing the service on a work node, you can see a running nginx container

The nginx service can be accessed via the IP of any node machine

3.7.3. Service coordination

Once the service cluster is running, Swarm manages the whole cluster toward the desired state. Take the cluster of three nginx replicas just created: if one of them goes down, Swarm automatically creates a new one to restore the number of tasks.
To simulate downtime, delete a container on one of the machines with docker rm -fv <container id>

Query the nginx service on the management node and you can see the newly created replacement task

3.7.4. Rolling upgrade

The version of a running nginx service can be upgraded with a rolling update, without stopping the service; here the original version 1.7.9 is upgraded to stable (or any other specified version)

# Perform a rolling upgrade on the management node
docker service update --image nginx:stable nginx

Viewing the service, you can see that the nginx version has been upgraded to stable

3.7.5. Dynamic capacity expansion

A service can be scaled dynamically across the node cluster while it is running; for example, scale the nginx service to 4 replicas

# Dynamic capacity expansion on the management node
docker service scale nginx=4

Looking at the running services again, you can see that the running nginx services have been expanded to 4

3.8. Docker Stack practice

Deploying and managing many services at scale is difficult; Docker Stack solves this problem easily. Like Docker Compose, it provides desired-state management, rolling upgrades, scaling, and health checks through a .yml configuration file.

3.8.1. Deploying web applications with docker stack

3.8.1.1. Compose and Docker compatibility

The specifications corresponding to the Compose configuration file version and docker version can be viewed through the official documents:
https://docs.docker.com/compose/compose-file/

Compose file format   Docker Engine release
3.8                   19.03.0+
3.7                   18.06.0+
3.6                   18.02.0+
3.5                   17.12.0+
3.4                   17.09.0+
3.3                   17.06.0+
3.2                   17.04.0+
3.1                   1.13.1+
3.0                   1.13.0+
2.4                   17.12.0+
2.3                   17.06.0+
2.2                   1.13.0+
2.1                   1.12.0+
2.0                   1.10.0+

3.8.1.2. Create docker-stack.yml

# file: docker-stack.yml
version: "3.7"
services:
  aloneService: # service name
    image: alonedockerhub/alone:1.0 # The image; use a public registry so that every node machine can pull it

    networks: # Network settings
      overlay:

    ports: # Port mapping [host: container]
      - "8088:8082"

    deploy: # Deployment settings
      mode: replicated
      replicas: 3

      restart_policy: # Restart strategy [condition, delay, maximum times, detection time]
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 30s

      update_config: # Upgrade configuration [concurrency, delay, failure handling, listening time, update rules]
        parallelism: 1
        delay: 5s
        failure_action: rollback
        monitor: 5s
        order: start-first

      resources: # Resource control [cpu,mem]
        limits:
          cpus: '0.2'
          memory: 1024M

networks: # Define network
  overlay:

3.8.1.3. Deploying the service with docker stack

Note: if a node cannot pull the image, the stack will not be created successfully everywhere. If a node machine cannot pull the image (for example, because the image exists only in the local repository of one node), that node will run nothing, and the replicas will crowd onto the nodes that can pull the image, resulting in uneven distribution. This is why the image name in the configuration file above should point to a public registry, so that every node machine can pull the image.

# Perform deployment on the management node machine
docker stack deploy -c <FILE_NAME>.yml <STACK_NAME>

3.8.1.4. View the services started by the stack

(1) View services

# Perform view on management node
docker service ls


(2) View a service's tasks

# View on management node machine
docker service ps <SERVICE_NAME>


(3) View stack service

# View on management node machine
docker stack ls


(4) View the stack's tasks

# Perform a view on the management node machine
docker stack ps <STACK_NAME>


(5) View services on other node machines

# View other node machines
docker ps

On the other nodes you can see the containers running successfully

(6) Access services
The service started on each node can be accessed successfully

3.8.1.5. Remove the stack

# Perform the removal on the management node
docker stack rm <STACK_NAME>

The next chapter will cover Docker + Kubernetes


Added by ArmanIc on Tue, 12 Oct 2021 22:45:23 +0300