This article is part of the Tungsten Fabric Starter Series, compiled and presented by the TF Chinese community from the hands-on experience of technology experts. The series is designed to help newcomers better understand the whole process of TF operation, installation, integration, and debugging. If you have relevant experience or questions, you are welcome to interact and communicate with the community.
Author: Tatsuya Naganawa Translator: TF Compilation Group
Cluster-wide update is an important capability: it brings the latest functionality into a cluster while preserving the SLA of the production cluster.
Because Tungsten Fabric internally uses protocols similar to MPLS-VPN, based on my attempts there is basic interoperability even when the module versions of Control and vRouter differ.
Therefore, the general idea is to first update the controllers one by one, and then update the vRouters one by one, using vMotion or maintenance mode as needed.
The Tungsten Fabric controller also supports a fantastic feature called ISSU, although I find the name confusing, because the Tungsten Fabric controller is much more like a route reflector than a routing engine.
Therefore, the basic idea is to first copy all configs to the newly created controllers (or route reflectors), and then update the vRouter settings (and, if the servers can be restarted, the vRouter modules as well) to use these new controllers. This process also makes rollback of a vRouter module update easier.
Let me describe this process below.
In-place update
Since ansible-deployer behaves idempotently, an update is not much different from an installation. The following commands update all modules.
cd contrail-ansible-deployer
git pull
vi config/instances.yaml
 (update CONTRAIL_CONTAINER_TAG)
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/install_contrail.yml
One limitation is that these commands restart almost all nodes at once, so it is not easy to restart the controllers and vRouters one by one. In addition, deleting the other nodes from instances.yaml will not work, because updating one node requires some parameters of the other nodes.
- For example, a vRouter update requires the control node's IP, which is derived from the nodes with the control role in instances.yaml
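As an illustration of that derivation, you can confirm on a vRouter node that the agent's XMPP server list points at the control-role nodes from instances.yaml. A minimal check, assuming the 5.x docker-compose container naming (vrouter_vrouter-agent_1 may differ in your environment):

# the agent's control-node list is rendered from the control-role nodes in instances.yaml
docker exec vrouter_vrouter-agent_1 grep -A1 'CONTROL-NODE' /etc/contrail/contrail-vrouter-agent.conf
# [CONTROL-NODE]
# servers=172.31.x.x:5269   <- the control-role node's IP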
To overcome this problem, starting with R2005, the ziu.yml playbook adds this feature, at least for the control plane, so that nodes are updated one by one.
cd contrail-ansible-deployer
git pull
vi config/instances.yaml
 (update CONTRAIL_CONTAINER_TAG)
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/ziu.yml
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/install_contrail.yml
As far as I have tried, the control plane is updated serially, with the control processes restarted one at a time, so I did not see packet drops.
- During install_contrail.yml, the restart of the control processes is skipped, because they have already been updated.
- When the vrouter-agent restart is performed, some packet drops still occur, so it is recommended to migrate workloads first if possible (see the sketch below).
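For an OpenStack cluster, draining a compute node before its vrouter-agent restarts could look like the following sketch; the instance UUID and the exact drain procedure are assumptions on my side, not something ziu.yml does for you.

# move workloads off the compute node first
nova live-migration <instance-uuid>   # let the scheduler pick the target host
# ... then let the playbook restart vrouter-agent on this node ...
contrail-status                       # afterwards, agent/nodemgr should show 'active' again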
ISSU
Even when the container formats differ greatly (from 4.x to 5.x, for example), ISSU can be used, because it creates a new controller cluster and replicates the data into it.
First, let me describe the simplest scenario, one old controller and one new controller, to walk through the entire process. All commands are typed on the new controller.
old-controller: ip: 172.31.2.209, hostname: ip-172-31-2-209
new-controller: ip: 172.31.1.154, hostname: ip-172-31-1-154

(both controllers are installed with this instances.yaml)
provider_config:
  bms:
    ssh_user: root
    ssh_public_key: /root/.ssh/id_rsa.pub
    ssh_private_key: /root/.ssh/id_rsa
    domainsuffix: local
    ntpserver: 0.centos.pool.ntp.org
instances:
  bms1:
    provider: bms
    roles:
      config_database:
      config:
      control:
      analytics:
      analytics_database:
      webui:
    ip: x.x.x.x ## controller's ip
contrail_configuration:
  CONTRAIL_CONTAINER_TAG: r5.1
  KUBERNETES_CLUSTER_PROJECT: {}
  JVM_EXTRA_OPTS: "-Xms128m -Xmx1g"
global_configuration:
  CONTAINER_REGISTRY: tungstenfabric

[commands]
1. Stop the batch jobs
docker stop config_devicemgr_1
docker stop config_schema_1
docker stop config_svcmonitor_1

2. Register the new control in cassandra, and set up bgp between the controls
docker exec -it config_api_1 bash
python /opt/contrail/utils/provision_control.py --host_name ip-172-31-1-154.local --host_ip 172.31.1.154 --api_server_ip 172.31.2.209 --api_server_port 8082 --oper add --router_asn 64512 --ibgp_auto_mesh

3. Synchronize the data between the controllers
vi contrail-issu.conf
(write down this)
[DEFAULTS]
old_rabbit_address_list = 172.31.2.209
old_rabbit_port = 5673
new_rabbit_address_list = 172.31.1.154
new_rabbit_port = 5673
old_cassandra_address_list = 172.31.2.209:9161
old_zookeeper_address_list = 172.31.2.209:2181
new_cassandra_address_list = 172.31.1.154:9161
new_zookeeper_address_list = 172.31.1.154:2181
new_api_info={"172.31.1.154": [("root"), ("password")]} ## ssh public-key can be used

image_id=`docker images | awk '/config-api/{print $3}' | head -1`
docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh $image_id -c "/usr/bin/contrail-issu-pre-sync -c /etc/contrail/contrail-issu.conf"

4. Start a process for live data synchronization
docker run --rm --detach -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-run-sync -c /etc/contrail/contrail-issu.conf"
(check the logs if necessary)
docker exec -t issu-run-sync tail -f /var/log/contrail/issu_contrail_run_sync.log

5. (update the vRouters)

6. Stop the run-sync job at the end, and synchronize the remaining data
docker rm -f issu-run-sync
image_id=`docker images | awk '/config-api/{print $3}' | head -1`
docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-post-sync -c /etc/contrail/contrail-issu.conf"
docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-zk-sync -c /etc/contrail/contrail-issu.conf"

7. Remove the old node from cassandra, and add the new one
vi issu.conf
(write down this)
[DEFAULTS]
db_host_info={"172.31.1.154": "ip-172-31-1-154.local"}
config_host_info={"172.31.1.154": "ip-172-31-1-154.local"}
analytics_host_info={"172.31.1.154": "ip-172-31-1-154.local"}
control_host_info={"172.31.1.154": "ip-172-31-1-154.local"}
api_server_ip=172.31.1.154

docker cp issu.conf config_api_1:issu.conf
docker exec -it config_api_1 python /opt/contrail/utils/provision_issu.py -c issu.conf

8. Start the batch jobs
docker start config_devicemgr_1
docker start config_schema_1
docker start config_svcmonitor_1
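As a quick sanity check between steps 2 and 3, you can confirm that the old controller now sees the new one as a BGP peer, using the same contrail-introspect-cli that appears later in this article (the relative script path is an assumption about where you cloned it):

# the new controller (172.31.1.154) should now be listed as a BGP peer
./contrail-introspect-cli/ist.py --host 172.31.2.209 ctr nei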
The following are possible checkpoints.
- After step 3, you can use contrail-api-cli ls -l * to check whether all the data has been copied successfully, and use ist.py ctr nei to check whether iBGP between the controllers is established (a sketch follows this list).
- After step 4, you can modify the old database to see if the changes can be propagated successfully to the new database.
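For reference, the step-3 checkpoint could look like the following; contrail-api-cli is a separately installed third-party tool, and the exact invocation here is an assumption:

# compare the objects on the old and the new config-api
contrail-api-cli --host 172.31.2.209 ls -l virtual-network
contrail-api-cli --host 172.31.1.154 ls -l virtual-network
# and check the iBGP session between the controllers
./contrail-introspect-cli/ist.py --host 172.31.1.154 ctr nei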
Next, I'll describe a more realistic scenario, with an orchestrator and two vRouters.
Orchestrator integration
To illustrate the case with the orchestrator, I attempted to deploy two vRouters and kubernetes with ansible-deployer.
Even when used in conjunction with an orchestrator, the overall process is not much different.
The important point is when to switch from the old kube-manager to the new one.
In a sense, kube-manager is similar to batch jobs such as schema-transformer, svc-monitor, and device-manager, since it dynamically subscribes to events from kube-apiserver and updates the Tungsten Fabric config-database. Therefore, I stop and start the old or new kube-manager (and, in practice, the webui as well) at the same time as those batch jobs, though this may need to be adapted to each setup.
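In my setup, that switch-over is just a handful of docker commands; the container names below are assumptions based on the 5.x docker-compose naming and may vary per release:

# on the old controller: stop the batch jobs, kube-manager, and webui
docker stop config_devicemgr_1 config_schema_1 config_svcmonitor_1
docker stop kubemanager_kubemanager_1 webui_web_1 webui_job_1
# on the new controller: start them when ISSU reaches the switch-over point
docker start config_devicemgr_1 config_schema_1 config_svcmonitor_1
docker start kubemanager_kubemanager_1 webui_web_1 webui_job_1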
The overall process in this example is shown below.
1. Set up a controller (with kube-manager and kubernetes-master) and two vRouters
2. Set up a new controller (with a kube-manager; the kubernetes-master is shared with the old controller)
3. Stop the batch jobs and the new controller's kube-manager and webui
4. Start the ISSU process and continue until run-sync is running
 -> iBGP is established between the controllers
5. Update the vRouters one by one with the new controller's ansible-deployer
 -> When a vRouter is moved to the new controller, the new controller also receives the route-target for k8s-default-pod-network, and ping keeps working between the containers (the ist.py ctr route summary and ping results are attached below)
6. After all vRouters are moved to the new controller, stop the batch jobs and the old controller's kube-manager and webui. After that, continue the ISSU process and start the batch jobs, kube-manager, and webui on the new controller
 -> The config-database must not be modified manually between the beginning and the end of this phase, so some maintenance time may be needed (the whole phase may last 5 to 15 minutes; ping keeps working, but creating new containers does not work until the new kube-manager is started)
7. Finally, stop control, config, and config-database on the old node
When updating the vRouters, I set provider: bms-maint for the controller and k8s-master nodes that had already been moved to the new version, to avoid interference from container restarts. I attach the original instances.yaml and the instances.yaml used for the vRouter update, so you can see the details.
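Since the full files are attached separately, here is only a minimal sketch of the idea behind the vRouter-update instances.yaml (roles abridged; treat it as an assumption about the layout, not the attached file itself):

instances:
  bms1:
    provider: bms-maint   # already moved to the new version: do not touch
    roles:
      config:
      control:
      k8s_master:
      kubemanager:
    ip: 172.31.13.9
  bms2:
    provider: bms         # this vRouter is re-provisioned against the new controller
    roles:
      vrouter:
      k8s_node:
    ip: 172.31.25.102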
I also attach the results of ist.py ctr nei and ist.py ctr route summary at each stage, to illustrate the details of what happened.
- Note that in this example I did not actually update the modules, because this setup is primarily intended to illustrate the ISSU process (ansible-deployer recreates the vrouter-agent container even when the module versions are the same, and the number of packets lost would not be much different if the modules were actually updated).
old-controller: 172.31.19.25
new-controller: 172.31.13.9
two-vRouters: 172.31.25.102, 172.31.33.175

// Before ISSU begins:
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
[root@ip-172-31-13-9 ~]#
-> iBGP is not configured yet

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr route summary
Introspect Host: 172.31.19.25
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 7        | 7     | 2             | 5               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 7        | 7     | 4             | 3               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 7        | 7     | 1             | 6               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr route summary
Introspect Host: 172.31.13.9
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 0        | 0     | 0             | 0               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 0        | 0     | 0             | 0               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 0        | 0     | 0             | 0               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]#
-> No routes were imported in the new controller

[root@ip-172-31-19-25 contrail-ansible-deployer]# kubectl get pod -o wide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm   1/1     Running   0          4m58s   10.47.255.249   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k   1/1     Running   0          4m58s   10.47.255.250   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 contrail-ansible-deployer]#

/ # ip -o a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000\    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
13: eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue \    link/ether 02:6b:dc:98:ac:95 brd ff:ff:ff:ff:ff:ff
13: eth0    inet 10.47.255.249/12 scope global eth0\       valid_lft forever preferred_lft forever
/ # ping 10.47.255.250
PING 10.47.255.250 (10.47.255.250): 56 data bytes
64 bytes from 10.47.255.250: seq=0 ttl=63 time=2.155 ms
64 bytes from 10.47.255.250: seq=1 ttl=63 time=0.904 ms
^C
--- 10.47.255.250 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.904/1.529/2.155 ms
/ #
-> Each of the two vRouters has one container, and ping between the two containers works fine.
// After provision_control:
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+------------------------+---------------+----------+----------+-----------+-------------+-----------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state      | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+-----------------+------------+-----------+
| ip-172-31-13-9.local   | 172.31.13.9   | 64512    | BGP      | internal  | Idle        | not advertising | 0          | n/a       |
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync         | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync         | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+-----------------+------------+-----------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
[root@ip-172-31-13-9 ~]#
-> The iBGP peer is configured on the old controller, but the new controller does not have that configuration yet (it is copied to the new controller when pre-sync is executed)

// After run-sync:
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-13-9.local   | 172.31.13.9   | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
+-----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                  | peer_address | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+-----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-19-25.local | 172.31.19.25 | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
+-----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]#
-> iBGP is established. ctr route summary does not change, because the new controller does not yet have the route-target for k8s-default-pod-network, and route-target filtering keeps those prefixes from being imported.
// After migrating the first vRouter to the new controller:
/ # ping 10.47.255.250
PING 10.47.255.250 (10.47.255.250): 56 data bytes
64 bytes from 10.47.255.250: seq=0 ttl=63 time=1.684 ms
64 bytes from 10.47.255.250: seq=1 ttl=63 time=0.835 ms
64 bytes from 10.47.255.250: seq=2 ttl=63 time=0.836 ms
(snip)
64 bytes from 10.47.255.250: seq=37 ttl=63 time=0.878 ms
64 bytes from 10.47.255.250: seq=38 ttl=63 time=0.823 ms
64 bytes from 10.47.255.250: seq=39 ttl=63 time=0.820 ms
64 bytes from 10.47.255.250: seq=40 ttl=63 time=1.364 ms
64 bytes from 10.47.255.250: seq=44 ttl=63 time=2.209 ms
64 bytes from 10.47.255.250: seq=45 ttl=63 time=0.869 ms
64 bytes from 10.47.255.250: seq=46 ttl=63 time=0.857 ms
64 bytes from 10.47.255.250: seq=47 ttl=63 time=0.855 ms
64 bytes from 10.47.255.250: seq=48 ttl=63 time=0.845 ms
64 bytes from 10.47.255.250: seq=49 ttl=63 time=0.842 ms
64 bytes from 10.47.255.250: seq=50 ttl=63 time=0.885 ms
64 bytes from 10.47.255.250: seq=51 ttl=63 time=0.891 ms
64 bytes from 10.47.255.250: seq=52 ttl=63 time=0.909 ms
64 bytes from 10.47.255.250: seq=53 ttl=63 time=0.867 ms
64 bytes from 10.47.255.250: seq=54 ttl=63 time=0.884 ms
64 bytes from 10.47.255.250: seq=55 ttl=63 time=0.865 ms
64 bytes from 10.47.255.250: seq=56 ttl=63 time=0.840 ms
64 bytes from 10.47.255.250: seq=57 ttl=63 time=0.877 ms
^C
--- 10.47.255.250 ping statistics ---
58 packets transmitted, 55 packets received, 5% packet loss
round-trip min/avg/max = 0.810/0.930/2.209 ms
/ #
-> Three packets were lost after the vrouter-agent restart (seq 40-44); while the vRouter was being migrated, ping otherwise kept working.

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-13-9.local   | 172.31.13.9   | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-19-25.local  | 172.31.19.25  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]#
-> Each controller now has one XMPP connection, and iBGP is set up between them

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr route summary
Introspect Host: 172.31.19.25
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 7        | 7     | 1             | 6               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 7        | 7     | 1             | 6               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 7        | 7     | 0             | 7               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr route summary
Introspect Host: 172.31.13.9
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 7        | 7     | 1             | 6               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 7        | 7     | 3             | 4               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 7        | 7     | 1             | 6               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]#
-> Since each controller now has at least one container on k8s-default-pod-network, they exchange the prefixes over iBGP and end up with the same prefixes

// After migrating the second vRouter to the new controller:
/ # ping 10.47.255.250
PING 10.47.255.250 (10.47.255.250): 56 data bytes
64 bytes from 10.47.255.250: seq=0 ttl=63 time=1.750 ms
64 bytes from 10.47.255.250: seq=1 ttl=63 time=0.815 ms
64 bytes from 10.47.255.250: seq=2 ttl=63 time=0.851 ms
64 bytes from 10.47.255.250: seq=3 ttl=63 time=0.809 ms
(snip)
64 bytes from 10.47.255.250: seq=34 ttl=63 time=0.853 ms
64 bytes from 10.47.255.250: seq=35 ttl=63 time=0.848 ms
64 bytes from 10.47.255.250: seq=36 ttl=63 time=0.833 ms
64 bytes from 10.47.255.250: seq=37 ttl=63 time=0.832 ms
64 bytes from 10.47.255.250: seq=38 ttl=63 time=0.910 ms
64 bytes from 10.47.255.250: seq=42 ttl=63 time=2.071 ms
64 bytes from 10.47.255.250: seq=43 ttl=63 time=0.826 ms
64 bytes from 10.47.255.250: seq=44 ttl=63 time=0.853 ms
64 bytes from 10.47.255.250: seq=45 ttl=63 time=0.851 ms
64 bytes from 10.47.255.250: seq=46 ttl=63 time=0.853 ms
64 bytes from 10.47.255.250: seq=47 ttl=63 time=0.851 ms
64 bytes from 10.47.255.250: seq=48 ttl=63 time=0.855 ms
64 bytes from 10.47.255.250: seq=49 ttl=63 time=0.869 ms
64 bytes from 10.47.255.250: seq=50 ttl=63 time=0.833 ms
64 bytes from 10.47.255.250: seq=51 ttl=63 time=0.859 ms
64 bytes from 10.47.255.250: seq=52 ttl=63 time=0.866 ms
64 bytes from 10.47.255.250: seq=53 ttl=63 time=0.840 ms
64 bytes from 10.47.255.250: seq=54 ttl=63 time=0.841 ms
64 bytes from 10.47.255.250: seq=55 ttl=63 time=0.854 ms
^C
--- 10.47.255.250 ping statistics ---
56 packets transmitted, 53 packets received, 5% packet loss
round-trip min/avg/max = 0.799/0.888/2.071 ms
/ #
-> 3 packets lost (seq 38-42)

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                 | peer_address | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-13-9.local | 172.31.13.9  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
+----------------------+--------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-19-25.local  | 172.31.19.25  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]#
-> The new controller now has both XMPP connections.
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr route summary
Introspect Host: 172.31.19.25
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 0        | 0     | 0             | 0               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 0        | 0     | 0             | 0               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 0        | 0     | 0             | 0               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr route summary
Introspect Host: 172.31.13.9
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| name                                                                                        | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
| default-domain:default-project:__link_local__:__link_local__.inet.0                         | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:__default__.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:dci-network:dci-network.inet.0                               | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:default-virtual-network:default-virtual-network.inet.0       | 0        | 0     | 0             | 0               | 0                |
| inet.0                                                                                      | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-fabric.inet.0                                   | 7        | 7     | 2             | 5               | 0                |
| default-domain:k8s-default:k8s-default-pod-network:k8s-default-pod-network.inet.0           | 7        | 7     | 4             | 3               | 0                |
| default-domain:k8s-default:k8s-default-service-network:k8s-default-service-network.inet.0   | 7        | 7     | 1             | 6               | 0                |
+--------------------------------------------------------------------------------------------+----------+-------+---------------+-----------------+------------------+
[root@ip-172-31-13-9 ~]#
-> The old controller no longer has those prefixes.
// At the end of the ISSU process, after the new kube-manager starts:
[root@ip-172-31-19-25 ~]# kubectl get pod -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm    1/1     Running   0          34m   10.47.255.249   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k    1/1     Running   0          34m   10.47.255.250   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-b8pxw   1/1     Running   0          15s   10.47.255.247   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-nv7z9   1/1     Running   0          15s   10.47.255.248   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 ~]#
-> New containers are created with new IPs (10.47.255.247 and 10.47.255.248 are addresses assigned by the new controller)

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.19.25 ctr nei
Introspect Host: 172.31.19.25
+----------------------+--------------+----------+----------+-----------+--------+-----------------+------------+-----------------------------+
| peer                 | peer_address | peer_asn | encoding | peer_type | state  | send_state      | flap_count | flap_time                   |
+----------------------+--------------+----------+----------+-----------+--------+-----------------+------------+-----------------------------+
| ip-172-31-13-9.local | 172.31.13.9  | 64512    | BGP      | internal  | Active | not advertising | 1          | 2019-Jun-23 05:37:02.614003 |
+----------------------+--------------+----------+----------+-----------+--------+-----------------+------------+-----------------------------+
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host 172.31.13.9 ctr nei
Introspect Host: 172.31.13.9
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
| ip-172-31-25-102.local | 172.31.25.102 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | 172.31.33.175 | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
+------------------------+---------------+----------+----------+-----------+-------------+------------+------------+-----------+
[root@ip-172-31-13-9 ~]#
-> The new controller no longer has an iBGP peer toward the old controller. The old controller still has its iBGP entry, although its processes will be stopped soon :)

// After control and config on the old controller are stopped:
[root@ip-172-31-19-25 ~]# kubectl get pod -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm    1/1     Running   0          48m   10.47.255.249   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k    1/1     Running   0          48m   10.47.255.250   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-b8pxw   1/1     Running   0          13m   10.47.255.247   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-nv7z9   1/1     Running   0          13m   10.47.255.248   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment3-68fb484676-ct9q9   1/1     Running   0          18s   10.47.255.245   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment3-68fb484676-mxbzq   1/1     Running   0          18s   10.47.255.246   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 ~]#
-> New containers can still be created

[root@ip-172-31-25-102 ~]# contrail-status
Pod      Service  Original Name           State    Id            Status
vrouter  agent    contrail-vrouter-agent  running  9a46a1a721a7  Up 33 minutes
vrouter  nodemgr  contrail-nodemgr        running  11fb0a7bc86d  Up 33 minutes

vrouter kernel module is PRESENT
== Contrail vrouter ==
nodemgr: active
agent: active
[root@ip-172-31-25-102 ~]#
-> The vRouters work fine with the new controller

/ # ping 10.47.255.250
PING 10.47.255.250 (10.47.255.250): 56 data bytes
64 bytes from 10.47.255.250: seq=0 ttl=63 time=1.781 ms
64 bytes from 10.47.255.250: seq=1 ttl=63 time=0.857 ms
^C
--- 10.47.255.250 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.857/1.319/1.781 ms
/ #
-> Ping between the vRouters succeeds
Backward compatibility
Since there are several ways to update a cluster (in-place, ISSU, with or without ifdown vhost0), choosing among them is also an important topic.
Before discussing the details, let me first describe the behavior of vrouter-agent up / down and ifup vhost0 / ifdown vhost0.
When restarting vrouter-agent, one might assume that the vrouter-agent container re-creates vhost0. In fact this is not the case, because vhost0 and vrouter.ko are tightly coupled, and deleting vhost0 requires unloading vrouter.ko from the kernel. So, from an operational point of view, ifdown vhost0 is required whenever you need to update not only vrouter-agent but also vrouter.ko (ifdown vhost0 internally executes rmmod vrouter).
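The coupling is easy to see on a compute node. A small check, assuming the 5.x docker-compose container naming (and note that the last command is disruptive):

docker restart vrouter_vrouter-agent_1   # container name may vary per release
ip link show vhost0                      # vhost0 is still present after the agent restart
lsmod | grep vrouter                     # vrouter.ko is still loaded
ifdown vhost0                            # disruptive: removes vhost0 and runs 'rmmod vrouter' internally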
Therefore, to discuss backward compatibility, the following three topics need to be explored.
1. Compatibility of the controller with vrouter-agent
  - If there is no backward compatibility, ISSU is required
2. Compatibility of vrouter-agent with vrouter.ko
  - If there is no backward compatibility, ifdown vhost0 is required, which causes a traffic loss of at least 5-10 seconds, and therefore in practice means traffic needs to be moved to other nodes first (for example, by live migration)
  - Because vrouter-agent synchronizes data with vrouter.ko over netlink, schema changes can cause unexpected vrouter-agent behavior (such as vrouter-agent segmentation faults in the ksync logic)
3. Compatibility of vrouter.ko with the kernel
  - If there is no backward compatibility, the kernel needs to be updated, which again means traffic needs to be moved to other nodes first
  - When vrouter.ko is built against different in-kernel APIs, it cannot be loaded by the kernel, and vhost0 and vrouter-agent cannot come up (a quick check is sketched after this list)
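A quick way to check topic 3 on a node is to compare the module's vermagic with the running kernel; a sketch, assuming vrouter.ko is installed where modinfo can find it:

uname -r                          # the running kernel
modinfo vrouter | grep vermagic   # the kernel vrouter.ko was built against; these must match
lsmod | grep vrouter              # whether the module is actually loaded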
For topics 2 and 3, kernels are inevitably updated for a variety of reasons, so a possible plan is to first pick the new kernel version, then pick a vrouter-agent / vrouter.ko that supports that kernel, and check whether that vrouter-agent works with the version of control currently in use.
- If that combination works well, an in-place update can be used; if it does not work for some reason, or if an easy rollback is required, use ISSU
For topic 1, ifmap maintains a whitelist for each version when importing the config-api definitions:
- void IFMapGraphWalker::AddNodesToWhitelist(): https://github.com/Juniper/contrail-controller/blob/master/src/ifmap/ifmap_graph_walker.cc#L349
Based on my attempts, backward compatibility seems good (and since the routing-information updates are similar to BGP, they should also work well in most cases).
To verify this, I tried setups with different module versions, and they still appear to work:
I-1. config 2002-latest, control 2002-latest, vrouter 5.0-latest, openstack queens
I-2. config 2002-latest, control 5.0-latest, vrouter 5.0-latest, openstack queens
II-1. config 2002-latest, control 2002-latest, vrouter r5.1, kubernetes 1.12
Note: Unfortunately, this combination does not work well (cni cannot get port information from vrouter-agent). I think this is due to the CNI version change between 5.0.x and 5.1 (0.2.0 -> 0.3.1).
II-2. config 2002-latest, control 2002-latest, vrouter 5.0-latest, kubernetes 1.12
Therefore, even if you do not need to move to new kernel and vRouter versions immediately, it is good practice to update config / control more frequently to pick up fixes for possible errors.