Installing a Kubernetes (k8s) cluster on ubuntu-20.04.3 using kubeadm
1. Initialize the virtual machine environment
- Use VirtualBox with the ubuntu-20.04.3-live-server-amd64.iso image to create three virtual machines:
- abcMaster: 192.168.0.100
- abcNode1: 192.168.0.115
- abcNode2: 192.168.0.135
- Set the same root password on all three machines
sudo passwd root
- Modify the sshd configuration to allow root login over SSH (e.g. from Xshell)
# As root, edit /etc/ssh/sshd_config and set PermitRootLogin to yes
vim /etc/ssh/sshd_config
# Restart the sshd service
service ssh restart
- Disable the firewall, swap, and SELinux
# Disable the firewall
ufw disable
# Disable swap (comment out the swap entry in fstab)
vim /etc/fstab
# Disable SELinux (not present on Ubuntu, so nothing to do)
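Editing fstab only keeps swap off after the next reboot; swap usually also needs to be turned off for the running session, or kubeadm's preflight checks will complain. A minimal sketch (the sed pattern assumes the fstab swap entry contains the word "swap"):

# Turn swap off immediately for the running system
swapoff -a
# Comment out the swap entry so it stays off after reboot
sed -i '/swap/ s/^/#/' /etc/fstab
# Verify: the Swap row should show 0
free -m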
- Configure /etc/hosts on every node
cat >> /etc/hosts << EOF
192.168.0.100 abcmaster
192.168.0.115 abcnode1
192.168.0.135 abcnode2
EOF
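A quick sanity check that name resolution works from each machine (an illustrative verification step, not in the original article):

# Each name should answer from the expected IP
ping -c 1 abcmaster
ping -c 1 abcnode1
ping -c 1 abcnode2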
- Pass bridged IPv4 traffic to iptables chains
# Configure
cat >> /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Apply
sysctl --system
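If sysctl --system reports that the net.bridge.* keys are unknown, the br_netfilter kernel module is probably not loaded yet; loading it first is the usual fix (a sketch based on the standard kubeadm prerequisites, not part of the original article):

# Load the bridge netfilter module so the net.bridge.* sysctls exist
modprobe br_netfilter
# Load it automatically on every boot
echo br_netfilter > /etc/modules-load.d/k8s.conf
# Re-apply the settings
sysctl --system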
2. Install Docker
- Switch apt to a domestic (Aliyun) mirror
# Back up /etc/apt/sources.list
cp /etc/apt/sources.list /etc/apt/sources.list.bak
# Edit /etc/apt/sources.list with vim and replace the entries with:
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
# Update package lists
apt-get update
- Install docker
apt-get install -y docker.io
- Configure a domestic Docker registry mirror
# Configure
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://wi175f8g.mirror.aliyuncs.com"]
}
EOF
# Restart
systemctl daemon-reload
systemctl restart docker
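To confirm the mirror is active, and to check Docker's cgroup driver (which becomes relevant in error 2 below), docker info can be inspected:

# Should list https://wi175f8g.mirror.aliyuncs.com
docker info | grep -A 1 'Registry Mirrors'
# The default for Ubuntu's docker.io package is cgroupfs
docker info | grep -i 'cgroup driver'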
3. Install k8s
- Configure the k8s package source, then install kubelet, kubeadm, and kubectl
apt-get update && apt-get install -y apt-transport-https curl
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat >> /etc/apt/sources.list.d/kubernetes.list << EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
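Because kubeadm is sensitive to version skew between these components, the official install guide recommends pinning them so a routine apt upgrade cannot break the cluster:

# Prevent unattended upgrades of the k8s components
apt-mark hold kubelet kubeadm kubectl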
- Execute the kubeadm initialization command on the master node
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=192.168.0.100 --image-repository registry.aliyuncs.com/google_containers
Error 1:
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/coredns:v1.8.4: output: Error response from daemon: manifest for registry.aliyuncs.com/google_containers/coredns:v1.8.4 not found: manifest unknown: manifest unknown , error: exit status 1
Reason: the registry.aliyuncs.com/google_containers repository does not contain coredns:v1.8.4, so the image pull fails.
Solution: pull the coredns/coredns image directly with docker, then tag it with the name kubeadm expects:
docker pull coredns/coredns
docker tag coredns/coredns:latest registry.aliyuncs.com/google_containers/coredns:v1.8.4
docker rmi coredns/coredns
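To double-check that every image kubeadm needs is now present locally (the second command is a standard kubeadm subcommand; its exact output depends on the kubeadm version):

# The re-tagged coredns image should appear here
docker images | grep coredns
# List all images this kubeadm version expects
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers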
Error 2: kubelet.service fails to start
Jun 8 09:45:35 kubelet: F0608 09:45:35.392302 24268 server.go:266] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Jun 8 09:45:35 systemd: kubelet.service: main process exited, code=exited, status=255/n/a
Jun 8 09:45:35 systemd: Unit kubelet.service entered failed state.
Jun 8 09:45:35 systemd: kubelet.service failed.
Solution:
Modify the Docker service startup options so that Docker uses the systemd cgroup driver:
# vim /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
# Restart the docker service
systemctl daemon-reload && systemctl restart docker
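An equivalent fix, kept here as an alternative sketch, is to set the cgroup driver in /etc/docker/daemon.json instead of editing the systemd unit (merging it with the mirror configured earlier):

tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://wi175f8g.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker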
Error 3:
[ERROR Port-6443]: Port 6443 is in use .....
Solution: reset kubeadm
kubeadm reset
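kubeadm reset itself warns that it does not clean up iptables/IPVS rules or CNI configuration. If a re-initialization still misbehaves, the manual cleanup it hints at looks roughly like this:

# Flush rules left over from the previous initialization
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Remove stale CNI configuration and the old kubeconfig, if present
rm -rf /etc/cni/net.d
rm -f $HOME/.kube/config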
Re-execute the initialization command; this time it completes:
Your Kubernetes control-plane has initialized successfully!
Follow the prompts and run the follow-up commands:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
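Note that the export only lives for the current shell session; to keep kubectl working after a re-login, one option is to persist it in the shell profile:

echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bashrc
source ~/.bashrc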
View deployment status
kubectl get nodes
# Output
NAME        STATUS     ROLES                  AGE     VERSION
abcmaster   NotReady   control-plane,master   3m59s   v1.22.1
- Execute kubeadm join on abcnode1/2
Note: error 2 also occurs during this step, so apply the same fix on each abcnode before executing the kubeadm join command.
kubeadm join 192.168.0.100:6443 --token 7ni2ey.qkjhtp3ygsn0lswk \
    --discovery-token-ca-cert-hash sha256:2ed9136ae664f9c74083f174e748be747c7e2926bdcf05877da003bd44f7fcc1
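If the join happens more than 24 hours after kubeadm init, the bootstrap token will have expired. A fresh join command (token plus CA hash) can be printed on the master with:

kubeadm token create --print-join-command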
Output on success:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Initialization result:
# kubectl get nodes
NAME        STATUS     ROLES                  AGE    VERSION
abcmaster   NotReady   control-plane,master   20m    v1.22.1
abcnode1    NotReady   <none>                 4m6s   v1.22.1
abcnode2    NotReady   <none>                 13s    v1.22.1
All nodes are still NotReady at this point because no pod network add-on has been installed yet.
- Install the flannel network add-on (kube-flannel.yml) on abcmaster
# Pull and apply directly
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
The command failed because the address was unreachable, so instead download the kube-flannel.yml file from GitHub manually:
# 1. Download the kube-flannel.yml file and copy it to abcmaster
# 2. Apply it with kubectl
kubectl apply -f ./kube-flannel.yml
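Before applying, it is worth confirming that the Network value in the manifest's net-conf.json matches the --pod-network-cidr passed to kubeadm init (10.244.0.0/16 here, which is also flannel's default):

# Should print "Network": "10.244.0.0/16"
grep '"Network"' kube-flannel.yml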
Check the pod status:
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS              RESTARTS      AGE
coredns-7f6cbbb7b8-lx9m7            0/1     ContainerCreating   0             35m
coredns-7f6cbbb7b8-r6ctb            0/1     ContainerCreating   0             35m
etcd-abcmaster                      1/1     Running             1             35m
kube-apiserver-abcmaster            1/1     Running             1             35m
kube-controller-manager-abcmaster   1/1     Running             1             35m
kube-flannel-ds-amd64-m5w5w         0/1     CrashLoopBackOff    4 (79s ago)   3m39s
kube-flannel-ds-amd64-rmvj4         0/1     CrashLoopBackOff    5 (13s ago)   3m39s
kube-flannel-ds-amd64-wjw74         0/1     CrashLoopBackOff    4 (82s ago)   3m39s
kube-proxy-djxs6                    1/1     Running             1             19m
kube-proxy-q9c8h                    1/1     Running             0             15m
kube-proxy-s7cfq                    1/1     Running             0             35m
kube-scheduler-abcmaster            1/1     Running             1             35m
The kube-flannel-ds-amd64 pods fail to start. Checking one pod's log shows the following error:
# kubectl logs kube-flannel-ds-amd64-m5w5w -n kube-system
ERROR: Job failed (system failure): pods is forbidden: User "system:serviceaccount:dev:default" cannot create resource "pods" in API group "" in the namespace "dev"
Solution: execute the following command:
kubectl create clusterrolebinding gitlab-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts --namespace=dev
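To verify that the binding took effect, the exact permission named in the error can be checked directly (using the same service account the error message complained about):

kubectl auth can-i create pods --as=system:serviceaccount:dev:default -n dev
# Expected output: yes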
Check the pod status again:
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS             RESTARTS        AGE
coredns-7f6cbbb7b8-lx9m7            0/1     ImagePullBackOff   0               41m
coredns-7f6cbbb7b8-r6ctb            0/1     ErrImagePull       0               41m
etcd-abcmaster                      1/1     Running            1               42m
kube-apiserver-abcmaster            1/1     Running            1               42m
kube-controller-manager-abcmaster   1/1     Running            1               42m
kube-flannel-ds-amd64-75hbh         1/1     Running            0               4s
kube-flannel-ds-amd64-m5w5w         1/1     Running            6 (6m19s ago)   10m
kube-flannel-ds-amd64-wjw74         1/1     Running            6 (6m21s ago)   10m
kube-proxy-djxs6                    1/1     Running            1               26m
kube-proxy-q9c8h                    1/1     Running            0               22m
kube-proxy-s7cfq                    1/1     Running            0               41m
kube-scheduler-abcmaster            1/1     Running            1               42m
The coredns pods still cannot run.
Query the pod details:
root@abcmaster:~# kubectl get po coredns-7f6cbbb7b8-n9hnr -n kube-system -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2021-09-15T13:52:13Z"
  generateName: coredns-7f6cbbb7b8-
  labels:
    k8s-app: kube-dns
    pod-template-hash: 7f6cbbb7b8
  name: coredns-7f6cbbb7b8-n9hnr
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: coredns-7f6cbbb7b8
    uid: a66cd6e2-629a-4250-9732-01cf6331acb9
  resourceVersion: "6860"
  uid: 40b52d81-54c2-4882-87c4-a08ea5c66814
spec:
  containers:
  - args:
    - -conf
    - /etc/coredns/Corefile
    image: registry.aliyuncs.com/google_containers/coredns:v1.8.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 5
      httpGet:
        path: /health
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: coredns
    ports:
    - containerPort: 53
      name: dns
      protocol: UDP
    - containerPort: 53
      name: dns-tcp
      protocol: TCP
    - containerPort: 9153
      name: metrics
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /ready
        port: 8181
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 170Mi
      requests:
        cpu: 100m
        memory: 70Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_BIND_SERVICE
        drop:
        - all
      readOnlyRootFilesystem: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/coredns
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-2kjjw
      readOnly: true
  dnsPolicy: Default
  enableServiceLinks: true
  nodeName: abcnode2
  nodeSelector:
    kubernetes.io/os: linux
  preemptionPolicy: PreemptLowerPriority
  priority: 2000000000
  priorityClassName: system-cluster-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: coredns
  serviceAccountName: coredns
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      items:
      - key: Corefile
        path: Corefile
      name: coredns
    name: config-volume
  - name: kube-api-access-2kjjw
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    message: 'containers with unready status: [coredns]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    message: 'containers with unready status: [coredns]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: registry.aliyuncs.com/google_containers/coredns:v1.8.4
    imageID: ""
    lastState: {}
    name: coredns
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: 'rpc error: code = Unknown desc = Error response from daemon:
          manifest for registry.aliyuncs.com/google_containers/coredns:v1.8.4 not
          found: manifest unknown: manifest unknown'
        reason: ErrImagePull
  hostIP: 192.168.0.135
  phase: Pending
  podIP: 10.244.2.3
  podIPs:
  - ip: 10.244.2.3
  qosClass: Burstable
  startTime: "2021-09-15T13:52:13Z"
The output shows that the image cannot be pulled, even though it exists on abcmaster. Looking closer at nodeName and hostIP, the pod is not scheduled on abcmaster but on 192.168.0.135, i.e. abcnode2. The coredns image was only pulled and re-tagged on the master, so abcnode1/2 need the same manual fix.
Execute the following commands on abcnode1 and abcnode2 to pull and tag the image:
docker pull coredns/coredns
docker tag coredns/coredns:latest registry.aliyuncs.com/google_containers/coredns:v1.8.4
Check the pod status on the master again: all pods are now Running and all nodes are Ready. The k8s environment deployment is basically complete.
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS    RESTARTS      AGE
coredns-7f6cbbb7b8-n9hnr            1/1     Running   0             9m31s
coredns-7f6cbbb7b8-nc46c            1/1     Running   0             20m
etcd-abcmaster                      1/1     Running   1             80m
kube-apiserver-abcmaster            1/1     Running   1             80m
kube-controller-manager-abcmaster   1/1     Running   1             80m
kube-flannel-ds-amd64-75hbh         1/1     Running   0             38m
kube-flannel-ds-amd64-m5w5w         1/1     Running   6 (44m ago)   48m
kube-flannel-ds-amd64-wjw74         1/1     Running   6 (44m ago)   48m
kube-proxy-djxs6                    1/1     Running   1             64m
kube-proxy-q9c8h                    1/1     Running   0             60m
kube-proxy-s7cfq                    1/1     Running   0             80m
kube-scheduler-abcmaster            1/1     Running   1             80m
root@abcmaster:~# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
abcmaster   Ready    control-plane,master   81m   v1.22.1
abcnode1    Ready    <none>                 65m   v1.22.1
abcnode2    Ready    <none>                 61m   v1.22.1
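As a final smoke test (an illustrative extra, not part of the original deployment), a throwaway nginx deployment can be created and exposed to confirm that scheduling and networking work end to end:

# Create a test deployment and expose it on a NodePort
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
# Watch the pod land on a worker node and note the assigned port
kubectl get pods,svc -o wide
# Clean up when done
kubectl delete service nginx && kubectl delete deployment nginx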