Installing a Kubernetes (k8s) cluster on ubuntu-20.04.3 using kubeadm
1. Initialize the virtual machine environment
- Use VirtualBox with the ubuntu-20.04.3-live-server-amd64.iso image to create three virtual machines:
- abcMaster: 192.168.0.100
- abcNode1: 192.168.0.115
- abcNode2: 192.168.0.135
- Set the same root password on all three machines
sudo passwd root
- Modify the sshd configuration to allow root login over SSH (e.g. from Xshell)
# As root, edit /etc/ssh/sshd_config and set PermitRootLogin to yes
vim /etc/ssh/sshd_config
# Restart the sshd service
service ssh restart
- Disable the firewall, swap, and SELinux
# Disable the firewall
ufw disable
# Disable swap (comment out the swap entry in fstab)
vim /etc/fstab
# Disable SELinux (not present on Ubuntu, so nothing to do)
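Editing fstab only keeps swap off after the next reboot; swap usually also needs to be turned off for the running session, or kubeadm's preflight checks will complain. A minimal sketch (the sed pattern assumes the fstab swap entry contains the word "swap"):

# Turn swap off immediately for the running system
swapoff -a
# Comment out the swap entry so it stays off after reboot
sed -i '/swap/ s/^/#/' /etc/fstab
# Verify: the Swap row should show 0
free -m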
- Configure /etc/hosts on every node
cat >> /etc/hosts << EOF
192.168.0.100 abcmaster
192.168.0.115 abcnode1
192.168.0.135 abcnode2
EOF
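A quick sanity check that name resolution works from each machine (an illustrative verification step, not in the original article):

# Each name should answer from the expected IP
ping -c 1 abcmaster
ping -c 1 abcnode1
ping -c 1 abcnode2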
- Pass bridged IPv4 traffic to iptables chains
# Configure
cat >> /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Apply
sysctl --system
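If sysctl --system reports that the net.bridge.* keys are unknown, the br_netfilter kernel module is probably not loaded yet; loading it first is the usual fix (a sketch based on the standard kubeadm prerequisites, not part of the original article):

# Load the bridge netfilter module so the net.bridge.* sysctls exist
modprobe br_netfilter
# Load it automatically on every boot
echo br_netfilter > /etc/modules-load.d/k8s.conf
# Re-apply the settings
sysctl --system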
2. Install Docker
- Switch apt to a domestic (Aliyun) mirror
# Back up /etc/apt/sources.list
cp /etc/apt/sources.list /etc/apt/sources.list.bak
# Edit /etc/apt/sources.list with vim and replace the entries with:
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
# Update package lists
apt-get update
- Install docker
apt-get install -y docker.io
- Configure a domestic Docker registry mirror
# Configure
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://wi175f8g.mirror.aliyuncs.com"]
}
EOF
# Restart
systemctl daemon-reload
systemctl restart docker
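To confirm the mirror is active, and to check Docker's cgroup driver (which becomes relevant in error 2 below), docker info can be inspected:

# Should list https://wi175f8g.mirror.aliyuncs.com
docker info | grep -A 1 'Registry Mirrors'
# The default for Ubuntu's docker.io package is cgroupfs
docker info | grep -i 'cgroup driver'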
3. Install k8s
- Configure the k8s package source, then install kubelet, kubeadm, and kubectl
apt-get update && apt-get install -y apt-transport-https curl
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat >> /etc/apt/sources.list.d/kubernetes.list << EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
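Because kubeadm is sensitive to version skew between these components, the official install guide recommends pinning them so a routine apt upgrade cannot break the cluster:

# Prevent unattended upgrades of the k8s components
apt-mark hold kubelet kubeadm kubectl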
- Execute the kubeadm initialization command on the master node
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=192.168.0.100 --image-repository registry.aliyuncs.com/google_containers
Error 1:
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/coredns:v1.8.4: output: Error response from daemon: manifest for registry.aliyuncs.com/google_containers/coredns:v1.8.4 not found: manifest unknown: manifest unknown , error: exit status 1
Reason: the registry.aliyuncs.com/google_containers repository does not contain coredns:v1.8.4, so the image pull fails.
Solution: pull the coredns/coredns image directly with docker, then tag it with the name kubeadm expects:
docker pull coredns/coredns
docker tag coredns/coredns:latest registry.aliyuncs.com/google_containers/coredns:v1.8.4
docker rmi coredns/coredns
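To double-check that every image kubeadm needs is now present locally (the second command is a standard kubeadm subcommand; its exact output depends on the kubeadm version):

# The re-tagged coredns image should appear here
docker images | grep coredns
# List all images this kubeadm version expects
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers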
Error 2: kubelet.service fails to start
Jun 8 09:45:35 kubelet: F0608 09:45:35.392302 24268 server.go:266] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Jun 8 09:45:35 systemd: kubelet.service: main process exited, code=exited, status=255/n/a
Jun 8 09:45:35 systemd: Unit kubelet.service entered failed state.
Jun 8 09:45:35 systemd: kubelet.service failed.
Solution:
Modify the Docker service startup options so that Docker uses the systemd cgroup driver:
# vim /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
# Restart the docker service
systemctl daemon-reload && systemctl restart docker
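An equivalent fix, kept here as an alternative sketch, is to set the cgroup driver in /etc/docker/daemon.json instead of editing the systemd unit (merging it with the mirror configured earlier):

tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://wi175f8g.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker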
Error 3:
[ERROR Port-6443]: Port 6443 is in use .....
Solution: reset kubeadm
kubeadm reset
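kubeadm reset itself warns that it does not clean up iptables/IPVS rules or CNI configuration. If a re-initialization still misbehaves, the manual cleanup it hints at looks roughly like this:

# Flush rules left over from the previous initialization
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Remove stale CNI configuration and the old kubeconfig, if present
rm -rf /etc/cni/net.d
rm -f $HOME/.kube/config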
Re-execute the initialization command; this time it completes:
Your Kubernetes control-plane has initialized successfully!
Follow the prompts and run the follow-up commands:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
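Note that the export only lives for the current shell session; to keep kubectl working after a re-login, one option is to persist it in the shell profile:

echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bashrc
source ~/.bashrc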
View deployment status
kubectl get nodes
# Output
NAME        STATUS     ROLES                  AGE     VERSION
abcmaster   NotReady   control-plane,master   3m59s   v1.22.1
- Execute kubeadm join on abcnode1/2
Note: error 2 also occurs during this step, so apply the same fix on each abcnode before executing the kubeadm join command.
kubeadm join 192.168.0.100:6443 --token 7ni2ey.qkjhtp3ygsn0lswk \
    --discovery-token-ca-cert-hash sha256:2ed9136ae664f9c74083f174e748be747c7e2926bdcf05877da003bd44f7fcc1
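If the join happens more than 24 hours after kubeadm init, the bootstrap token will have expired. A fresh join command (token plus CA hash) can be printed on the master with:

kubeadm token create --print-join-command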
Output on success:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Initialization result:
# kubectl get nodes
NAME        STATUS     ROLES                  AGE    VERSION
abcmaster   NotReady   control-plane,master   20m    v1.22.1
abcnode1    NotReady   <none>                 4m6s   v1.22.1
abcnode2    NotReady   <none>                 13s    v1.22.1
All nodes are still NotReady at this point because no pod network add-on has been installed yet.
- Install the flannel network add-on (kube-flannel.yml) on abcmaster
# Pull and apply directly
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
The command failed because the address was unreachable, so instead download the kube-flannel.yml file from GitHub manually:
# 1. Download the kube-flannel.yml file and copy it to abcmaster
# 2. Apply it with kubectl
kubectl apply -f ./kube-flannel.yml
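Before applying, it is worth confirming that the Network value in the manifest's net-conf.json matches the --pod-network-cidr passed to kubeadm init (10.244.0.0/16 here, which is also flannel's default):

# Should print "Network": "10.244.0.0/16"
grep '"Network"' kube-flannel.yml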
Check the pod status:
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS              RESTARTS      AGE
coredns-7f6cbbb7b8-lx9m7            0/1     ContainerCreating   0             35m
coredns-7f6cbbb7b8-r6ctb            0/1     ContainerCreating   0             35m
etcd-abcmaster                      1/1     Running             1             35m
kube-apiserver-abcmaster            1/1     Running             1             35m
kube-controller-manager-abcmaster   1/1     Running             1             35m
kube-flannel-ds-amd64-m5w5w         0/1     CrashLoopBackOff    4 (79s ago)   3m39s
kube-flannel-ds-amd64-rmvj4         0/1     CrashLoopBackOff    5 (13s ago)   3m39s
kube-flannel-ds-amd64-wjw74         0/1     CrashLoopBackOff    4 (82s ago)   3m39s
kube-proxy-djxs6                    1/1     Running             1             19m
kube-proxy-q9c8h                    1/1     Running             0             15m
kube-proxy-s7cfq                    1/1     Running             0             35m
kube-scheduler-abcmaster            1/1     Running             1             35m
The kube-flannel-ds-amd64 pods fail to start. Checking one pod's log shows the following error:
# kubectl logs kube-flannel-ds-amd64-m5w5w -n kube-system
ERROR: Job failed (system failure): pods is forbidden: User "system:serviceaccount:dev:default" cannot create resource "pods" in API group "" in the namespace "dev"
Solution: execute the following command:
kubectl create clusterrolebinding gitlab-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts --namespace=dev
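To verify that the binding took effect, the exact permission named in the error can be checked directly (using the same service account the error message complained about):

kubectl auth can-i create pods --as=system:serviceaccount:dev:default -n dev
# Expected output: yes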
Check the pod status again:
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS             RESTARTS        AGE
coredns-7f6cbbb7b8-lx9m7            0/1     ImagePullBackOff   0               41m
coredns-7f6cbbb7b8-r6ctb            0/1     ErrImagePull       0               41m
etcd-abcmaster                      1/1     Running            1               42m
kube-apiserver-abcmaster            1/1     Running            1               42m
kube-controller-manager-abcmaster   1/1     Running            1               42m
kube-flannel-ds-amd64-75hbh         1/1     Running            0               4s
kube-flannel-ds-amd64-m5w5w         1/1     Running            6 (6m19s ago)   10m
kube-flannel-ds-amd64-wjw74         1/1     Running            6 (6m21s ago)   10m
kube-proxy-djxs6                    1/1     Running            1               26m
kube-proxy-q9c8h                    1/1     Running            0               22m
kube-proxy-s7cfq                    1/1     Running            0               41m
kube-scheduler-abcmaster            1/1     Running            1               42m
The coredns pods still cannot run.
Query the pod details:
root@abcmaster:~# kubectl get po coredns-7f6cbbb7b8-n9hnr -n kube-system -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2021-09-15T13:52:13Z"
  generateName: coredns-7f6cbbb7b8-
  labels:
    k8s-app: kube-dns
    pod-template-hash: 7f6cbbb7b8
  name: coredns-7f6cbbb7b8-n9hnr
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: coredns-7f6cbbb7b8
    uid: a66cd6e2-629a-4250-9732-01cf6331acb9
  resourceVersion: "6860"
  uid: 40b52d81-54c2-4882-87c4-a08ea5c66814
spec:
  containers:
  - args:
    - -conf
    - /etc/coredns/Corefile
    image: registry.aliyuncs.com/google_containers/coredns:v1.8.4
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 5
      httpGet:
        path: /health
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: coredns
    ports:
    - containerPort: 53
      name: dns
      protocol: UDP
    - containerPort: 53
      name: dns-tcp
      protocol: TCP
    - containerPort: 9153
      name: metrics
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /ready
        port: 8181
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 170Mi
      requests:
        cpu: 100m
        memory: 70Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_BIND_SERVICE
        drop:
        - all
      readOnlyRootFilesystem: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/coredns
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-2kjjw
      readOnly: true
  dnsPolicy: Default
  enableServiceLinks: true
  nodeName: abcnode2
  nodeSelector:
    kubernetes.io/os: linux
  preemptionPolicy: PreemptLowerPriority
  priority: 2000000000
  priorityClassName: system-cluster-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: coredns
  serviceAccountName: coredns
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      items:
      - key: Corefile
        path: Corefile
      name: coredns
    name: config-volume
  - name: kube-api-access-2kjjw
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    message: 'containers with unready status: [coredns]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    message: 'containers with unready status: [coredns]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2021-09-15T13:52:13Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: registry.aliyuncs.com/google_containers/coredns:v1.8.4
    imageID: ""
    lastState: {}
    name: coredns
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: 'rpc error: code = Unknown desc = Error response from daemon:
          manifest for registry.aliyuncs.com/google_containers/coredns:v1.8.4 not
          found: manifest unknown: manifest unknown'
        reason: ErrImagePull
  hostIP: 192.168.0.135
  phase: Pending
  podIP: 10.244.2.3
  podIPs:
  - ip: 10.244.2.3
  qosClass: Burstable
  startTime: "2021-09-15T13:52:13Z"
The output shows that the image cannot be pulled, even though it exists on abcmaster. Looking closer at nodeName and hostIP, the pod is not scheduled on abcmaster but on 192.168.0.135, i.e. abcnode2. The coredns image was only pulled and re-tagged on the master, so abcnode1/2 need the same manual fix.
Execute the following commands on abcnode1 and abcnode2 to pull and tag the image:
docker pull coredns/coredns
docker tag coredns/coredns:latest registry.aliyuncs.com/google_containers/coredns:v1.8.4
Check the pod status on the master again: all pods are now Running and all nodes are Ready. The k8s environment deployment is basically complete.
root@abcmaster:~# kubectl get pods -n kube-system
NAME                                READY   STATUS    RESTARTS      AGE
coredns-7f6cbbb7b8-n9hnr            1/1     Running   0             9m31s
coredns-7f6cbbb7b8-nc46c            1/1     Running   0             20m
etcd-abcmaster                      1/1     Running   1             80m
kube-apiserver-abcmaster            1/1     Running   1             80m
kube-controller-manager-abcmaster   1/1     Running   1             80m
kube-flannel-ds-amd64-75hbh         1/1     Running   0             38m
kube-flannel-ds-amd64-m5w5w         1/1     Running   6 (44m ago)   48m
kube-flannel-ds-amd64-wjw74         1/1     Running   6 (44m ago)   48m
kube-proxy-djxs6                    1/1     Running   1             64m
kube-proxy-q9c8h                    1/1     Running   0             60m
kube-proxy-s7cfq                    1/1     Running   0             80m
kube-scheduler-abcmaster            1/1     Running   1             80m
root@abcmaster:~# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
abcmaster   Ready    control-plane,master   81m   v1.22.1
abcnode1    Ready    <none>                 65m   v1.22.1
abcnode2    Ready    <none>                 61m   v1.22.1
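As a final smoke test (an illustrative extra, not part of the original deployment), a throwaway nginx deployment can be created and exposed to confirm that scheduling and networking work end to end:

# Create a test deployment and expose it on a NodePort
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
# Watch the pod land on a worker node and note the assigned port
kubectl get pods,svc -o wide
# Clean up when done
kubectl delete service nginx && kubectl delete deployment nginx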