Cluster-wide updates are an important capability: they deliver the latest functionality to a cluster while preserving the SLA of a production cluster.

Because Tungsten Fabric uses protocols similar to those of MPLS-VPN, my tests show basic interoperability even when the Control and vRouter module versions differ.

Therefore, the general idea is to update the controllers one by one first, and then update the vRouters one by one, using vMotion or maintenance mode as needed.

The Tungsten Fabric controller also supports a useful feature called ISSU, but I think the name is confusing, because a Tungsten Fabric controller is much closer to a route reflector than to a routing engine.

Therefore, the basic idea is to first copy all configs to newly created controllers (or route reflectors), then update the vRouter settings (and update the vRouter module, if the server can be restarted) to use these new controllers. With this process, rolling back a vRouter module update also becomes easier.

Let me describe this process below.

In-place Update

Since ansible-deployer is idempotent, an update is not much different from an installation. The following commands update all modules.

cd contrail-ansible-deployer
git pull
vi config/instances.yaml
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/install_contrail.yml

One limitation is that this command restarts almost all nodes at once, so it is not easy to restart the controllers and vRouters one by one. In addition, deleting the other nodes from instances.yaml will not work, because updating one node requires some parameters of the other nodes.

  • For example, a vRouter update requires the control node's IP, which is derived from the nodes with the control role in instances.yaml.
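The dependency can be illustrated with a minimal Python sketch. It assumes the instances section of instances.yaml has already been loaded into a dict in which each node has an ip and a roles mapping. The node names and addresses are hypothetical, and the function is only an illustration, not ansible-deployer's own logic.

```python
from typing import Dict, List


def control_ips(instances: Dict[str, dict]) -> List[str]:
    """Return the IPs of all nodes that carry the control role."""
    ips = []
    for node in instances.values():
        roles = node.get("roles") or {}
        if "control" in roles:
            ips.append(node["ip"])
    return ips


# Hypothetical inventory: one controller and two vRouter nodes.
instances = {
    "controller1": {"ip": "192.0.2.10", "roles": {"config": None, "control": None}},
    "vrouter1": {"ip": "192.0.2.21", "roles": {"vrouter": None}},
    "vrouter2": {"ip": "192.0.2.22", "roles": {"vrouter": None}},
}

print(control_ips(instances))

# Deleting the controller entry leaves the vRouters without any control IP.
vrouters_only = {k: v for k, v in instances.items() if k.startswith("vrouter")}
print(control_ips(vrouters_only))
```

With the controller entry removed, the list is empty, which is why a trimmed-down instances.yaml cannot be used to update a single vRouter.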

To overcome this problem, starting with R2005, ziu.yml adds the ability to update nodes one by one, at least for the control plane.

cd contrail-ansible-deployer
git pull
vi config/instances.yaml
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/ziu.yml
ansible-playbook -e orchestrator=xxx -i inventory/ playbooks/install_contrail.yml

As far as I have tested, the control plane is updated serially, with the control processes restarted one after another, so I did not see any packet drops.

  • During install_contrail.yml, the restart of the control processes is skipped because they have already been updated.

  • When a vrouter-agent restart is performed, some packet loss still occurs, so workload migration is recommended where possible.


Even if the container formats differ greatly (from 4.x to 5.x, for example), we can use ISSU, because it creates a new controller cluster and replicates the data into it.

First, let me describe the simplest scenario, one old controller and one new controller, to show the entire process. All commands are typed on the new controller.

old controller:
 hostname: ip-172-31-2-209
new controller:
 hostname: ip-172-31-1-154

(both controllers are installed with this instances.yaml)
   ssh_user: root
   ssh_public_key: /root/.ssh/id_rsa.pub
   ssh_private_key: /root/.ssh/id_rsa
   domainsuffix: local
   ntpserver: 0.centos.pool.ntp.org
   provider: bms
   ip: x.x.x.x ## controller's ip
  JVM_EXTRA_OPTS: "-Xms128m -Xmx1g"
  CONTAINER_REGISTRY: tungstenfabric

1. Stop the batch jobs
docker stop config_devicemgr_1
docker stop config_schema_1
docker stop config_svcmonitor_1

2. Register the new control node in cassandra and run BGP between them
docker exec -it config_api_1 bash
python /opt/contrail/utils/provision_control.py --host_name ip-172-31-1-154.local --host_ip --api_server_ip --api_server_port 8082 --oper add --router_asn 64512 --ibgp_auto_mesh

3. Synchronize data between controllers
vi contrail-issu.conf
(write the following)
old_rabbit_address_list =
old_rabbit_port = 5673
new_rabbit_address_list =
new_rabbit_port = 5673
old_cassandra_address_list =
old_zookeeper_address_list =
new_cassandra_address_list =
new_zookeeper_address_list =
new_api_info={"": [("root"), ("password")]} ## ssh public-key can be used

image_id=`docker images | awk '/config-api/{print $3}' | head -1`

docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh $image_id -c "/usr/bin/contrail-issu-pre-sync -c /etc/contrail/contrail-issu.conf"

4. Start process for real-time data synchronization
docker run --rm --detach -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-run-sync -c /etc/contrail/contrail-issu.conf"

(Check logs if necessary)
docker exec -t issu-run-sync tail -f /var/log/contrail/issu_contrail_run_sync.log

5. (Update the vRouters)

6. Stop the job at the end and synchronize all data
docker rm -f issu-run-sync

image_id=`docker images | awk '/config-api/{print $3}' | head -1`
docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-post-sync -c /etc/contrail/contrail-issu.conf"
docker run --rm -it --network host -v $(pwd)/contrail-issu.conf:/etc/contrail/contrail-issu.conf --entrypoint /bin/bash -v /root/.ssh:/root/.ssh --name issu-run-sync $image_id -c "/usr/bin/contrail-issu-zk-sync -c /etc/contrail/contrail-issu.conf"
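
The pre-sync, run-sync, post-sync and zk-sync invocations above differ only in the binary name, the --detach flag and the container name. As a rough sketch, the following Python helper assembles the same docker argument list for each phase. The helper name issu_command and the image id abc123 are placeholders for illustration, not part of Tungsten Fabric.

```python
import shlex
from typing import List, Optional

CONF_IN_CONTAINER = "/etc/contrail/contrail-issu.conf"
PHASES = ("pre-sync", "run-sync", "post-sync", "zk-sync")


def issu_command(phase: str, image_id: str, conf_path: str,
                 detach: bool = False, name: Optional[str] = None) -> List[str]:
    """Build the docker run argument list for one ISSU phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown ISSU phase: {phase!r}")
    cmd = ["docker", "run", "--rm"]
    if detach:
        cmd.append("--detach")
    cmd += ["-it", "--network", "host",
            "-v", f"{conf_path}:{CONF_IN_CONTAINER}",
            "--entrypoint", "/bin/bash",
            "-v", "/root/.ssh:/root/.ssh"]
    if name:
        cmd += ["--name", name]
    cmd += [image_id, "-c",
            f"/usr/bin/contrail-issu-{phase} -c {CONF_IN_CONTAINER}"]
    return cmd


# "abc123" stands in for the image id found with `docker images`.
run_sync = issu_command("run-sync", "abc123", "/root/contrail-issu.conf",
                        detach=True, name="issu-run-sync")
print(shlex.join(run_sync))
```

Returning a list rather than a string means the result can be passed directly to subprocess.run without shell quoting issues.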

7. Remove the old nodes from cassandra and add the new ones
vi issu.conf
(write the following)
db_host_info={"": "ip-172-31-1-154.local"}
config_host_info={"": "ip-172-31-1-154.local"}
analytics_host_info={"": "ip-172-31-1-154.local"}
control_host_info={"": "ip-172-31-1-154.local"}

docker cp issu.conf config_api_1:issu.conf
docker exec -it config_api_1 python /opt/contrail/utils/provision_issu.py -c issu.conf

8. Start the batch jobs
docker start config_devicemgr_1
docker start config_schema_1
docker start config_svcmonitor_1

The following are possible checkpoints.

  1. After step 3, you can use contrail-api-cli ls -l * to check whether all the data has been copied successfully, and ist.py ctr nei to check whether iBGP between the controllers has come up.

  2. After step 4, you can modify the old database to see if the changes can be propagated successfully to the new database.
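
The iBGP checkpoint can also be checked by script. The Python sketch below assumes the neighbor table has been captured as text in the pipe-delimited layout that ist.py prints in this article. The function names are hypothetical and the sample rows are abbreviated.

```python
from typing import Dict, List


def parse_table(text: str) -> List[Dict[str, str]]:
    """Turn pipe-delimited table lines into one dict per data row."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip prompts, "Introspect Host:" and blank lines
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if not header:
            header = cells  # the first table line names the columns
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def ibgp_established(text: str) -> bool:
    """True if at least one BGP peer exists and all BGP peers are Established."""
    bgp = [row for row in parse_table(text) if row.get("encoding") == "BGP"]
    return bool(bgp) and all(row["state"] == "Established" for row in bgp)


before = """
| peer                 | peer_address | peer_asn | encoding | peer_type | state | send_state      |
| ip-172-31-13-9.local |              | 64512    | BGP      | internal  | Idle  | not advertising |
"""
after = """
| peer                 | peer_address | peer_asn | encoding | peer_type | state       | send_state |
| ip-172-31-13-9.local |              | 64512    | BGP      | internal  | Established | in sync    |
"""
print(ibgp_established(before), ibgp_established(after))  # prints: False True
```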

Next, I'll discuss a more realistic scenario, with an orchestrator and two vRouters.

Orchestrator Integration

To illustrate the case with the orchestrator, I attempted to deploy two vRouters and kubernetes with ansible-deployer.

Even when used in conjunction with an orchestrator, the overall process is not much different.

The point that needs attention is when to switch kube-manager over to the new one.

In a sense, since kube-manager dynamically subscribes to events from kube-apiserver and updates the Tungsten Fabric configuration database (config-database), it is similar to batch jobs such as schema-transformer, svc-monitor, and device-manager. Therefore, I stopped and started the old and new kube-manager (and, in fact, the webui as well) at the same time as these batch jobs, but this may need to be adjusted for each setup.

The overall process in this example is shown below.

1. Set up a controller (with a kube-manager and kubernetes-master) and two vRouters
2. Set up a new controller (with a kube-manager, but with the same kubernetes-master as the old controller)
3. Stop the batch jobs, kube-manager, and webui on the new controller
4. Start the ISSU process and continue until run-sync is running
   -> iBGP will be established between the controllers
5. Update the vRouters one by one using the new controller's ansible-deployer
   -> When a vRouter is moved to the new controller, the new controller also gets the route-target for k8s-default-pod-network, and ping still works between containers (ist.py ctr route summary and ping results are attached later)
6. After moving all vRouters to the new controller, stop the batch jobs, kube-manager, and webui on the old controller
   After that, continue with the ISSU process, then start the batch jobs, kube-manager, and webui on the new controller
   -> The config-database cannot be changed manually from the beginning to the end of this phase, so some maintenance time may be needed
   (The whole process may last 5 to 15 minutes. Ping keeps working, but new containers cannot be created until the new kube-manager is started)
7. Finally, stop the control, config, and config-database on the old node

When updating the vRouters, I used provider: bms-maint for the controller, the k8s_master, and the vRouters that had already been moved to the new controller, to avoid disruption from container restarts. I attached the original instances.yaml and the instances.yaml used for updating the vRouters, so that you can get more details.

I will also attach the results of ist.py ctr nei and ist.py ctr route summary at each stage to illustrate the details of what happened.

  • Note that in this example I did not actually update the module, because this setup is primarily intended to highlight the ISSU process (even if the module versions are the same, ansible-deployer recreates the vrouter-agent container, so the amount of packet loss would not be much different with an actual module update).

//Before ISSU begins:

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# 
[root@ip-172-31-13-9 ~]# 
[root@ip-172-31-13-9 ~]# 
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
[root@ip-172-31-13-9 ~]# 
 -> iBGP is not configured yet

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 7        | 7     | 2             | 5               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 7        | 7     | 4             | 3               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 7        | 7     | 1             | 6               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]# 
[root@ip-172-31-13-9 ~]# 
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 0        | 0     | 0             | 0               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 0        | 0     | 0             | 0               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 0        | 0     | 0             | 0               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]# 
 -> No routes were imported in the new controller
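
In the route summary output, long routing-table names wrap onto a second row whose numeric cells are empty. That makes the prefix counts awkward to compare by eye. The following Python sketch, with a hypothetical function name, joins the wrapped rows and returns the prefix count for each table. It assumes the table was captured as text in the layout shown above.

```python
from typing import Dict, List


def prefixes_per_table(text: str) -> Dict[str, int]:
    """Map each routing-table name to its prefix count, joining wrapped names."""
    entries: List[list] = []  # [name, prefixes] pairs in order of appearance
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # not a table row
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        name, prefixes = cells[0], cells[1]
        if name == "name":
            continue  # header row
        if prefixes == "":
            # Continuation row: the name wrapped, so append it to the last entry.
            if entries:
                entries[-1][0] += name
        else:
            entries.append([name, int(prefixes)])
    return {name: count for name, count in entries}


sample = """
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 7        | 7     | 2             | 5               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 7        | 7     | 4             | 3               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
"""
for table, count in prefixes_per_table(sample).items():
    print(count, table)
```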

[root@ip-172-31-19-25 contrail-ansible-deployer]# kubectl get pod -o wide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm   1/1     Running   0          4m58s   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k   1/1     Running   0          4m58s   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 contrail-ansible-deployer]# 

/ # ip -o a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000\    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
1: lo    inet scope host lo\       valid_lft forever preferred_lft forever
13: eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue \    link/ether 02:6b:dc:98:ac:95 brd ff:ff:ff:ff:ff:ff
13: eth0    inet scope global eth0\       valid_lft forever preferred_lft forever
/ # ping
PING ( 56 data bytes
64 bytes from seq=0 ttl=63 time=2.155 ms
64 bytes from seq=1 ttl=63 time=0.904 ms
--- ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.904/1.529/2.155 ms
/ # 
 -> Each of the two vRouters has one container, and ping between the two containers works.

//After provision_control:

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state      | flap_count | flap_time |
| ip-172-31-13-9.local   |   | 64512    | BGP      | internal  | Idle        | not advertising | 0          | n/a       |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync         | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync         | 0          | n/a       |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
[root@ip-172-31-13-9 ~]#
 -> iBGP is configured on the old controller, but the new controller does not have that configuration yet (it will be copied to the new controller when pre-sync runs)

//After run-sync:
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-13-9.local   |   | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                  | peer_address | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-19-25.local | | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]#
 -> iBGP is established. ctr route summary is unchanged, because the new controller does not have the route-target of k8s-default-pod-network, and route-target filtering blocks the import of those prefixes.
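
The filtering behavior can be pictured with a deliberately simplified Python sketch. It is not Tungsten Fabric's control-node code, and the route-target value and prefix are hypothetical. A route received from a peer is imported only if it carries at least one route-target in which the receiving controller has a local interest.

```python
from typing import Dict, List, Set


def import_routes(advertised: List[Dict], local_targets: Set[str]) -> List[Dict]:
    """Keep only routes that carry a route-target this controller imports."""
    return [r for r in advertised if local_targets & set(r["route_targets"])]


# Hypothetical route learned from the old controller over iBGP.
advertised = [
    {"prefix": "10.0.0.3/32", "route_targets": ["target:64512:8000001"]},
]

# No vRouter attached yet: the new controller imports nothing.
print(import_routes(advertised, set()))

# Once a vRouter with a pod in that network connects, the target becomes local.
print(import_routes(advertised, {"target:64512:8000001"}))
```

This matches the observation above: the prefixes appear on the new controller only after a vRouter hosting a container in k8s-default-pod-network has moved to it.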

//After migrating the first vRouter to the new controller:

/ # ping
PING ( 56 data bytes
64 bytes from seq=0 ttl=63 time=1.684 ms
64 bytes from seq=1 ttl=63 time=0.835 ms
64 bytes from seq=2 ttl=63 time=0.836 ms
64 bytes from seq=37 ttl=63 time=0.878 ms
64 bytes from seq=38 ttl=63 time=0.823 ms
64 bytes from seq=39 ttl=63 time=0.820 ms
64 bytes from seq=40 ttl=63 time=1.364 ms
64 bytes from seq=44 ttl=63 time=2.209 ms
64 bytes from seq=45 ttl=63 time=0.869 ms
64 bytes from seq=46 ttl=63 time=0.857 ms
64 bytes from seq=47 ttl=63 time=0.855 ms
64 bytes from seq=48 ttl=63 time=0.845 ms
64 bytes from seq=49 ttl=63 time=0.842 ms
64 bytes from seq=50 ttl=63 time=0.885 ms
64 bytes from seq=51 ttl=63 time=0.891 ms
64 bytes from seq=52 ttl=63 time=0.909 ms
64 bytes from seq=53 ttl=63 time=0.867 ms
64 bytes from seq=54 ttl=63 time=0.884 ms
64 bytes from seq=55 ttl=63 time=0.865 ms
64 bytes from seq=56 ttl=63 time=0.840 ms
64 bytes from seq=57 ttl=63 time=0.877 ms
--- ping statistics ---
58 packets transmitted, 55 packets received, 5% packet loss
round-trip min/avg/max = 0.810/0.930/2.209 ms
/ # 
 -> After the vrouter-agent restart, three packets were lost (between seq 40 and 44). Apart from that, ping keeps working while the vRouter is migrated to the new controller.
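
The loss figure can be read directly from the ping summary line. A small Python sketch, assuming the busybox-style summary format shown above (the function name is hypothetical):

```python
import re

SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def lost_packets(ping_output: str) -> int:
    """Return transmitted minus received, taken from the ping summary line."""
    match = SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    return sent - received


summary = "58 packets transmitted, 55 packets received, 5% packet loss"
print(lost_packets(summary))  # prints: 3
```

The summary line is used instead of gaps in the seq numbers because a pasted listing may not contain every reply line.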

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-13-9.local   |   | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-19-25.local  |  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# 
 -> Both controllers have an XMPP connection, and iBGP is established

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 7        | 7     | 1             | 6               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 7        | 7     | 1             | 6               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 7        | 7     | 0             | 7               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 7        | 7     | 1             | 6               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 7        | 7     | 3             | 4               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 7        | 7     | 1             | 6               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]# 
 -> Because both controllers have at least one container from k8s-default-pod-network, they exchange prefixes over iBGP, so they have the same prefixes

//After migrating the second vrouter to the new controller:
/ # ping
PING ( 56 data bytes
64 bytes from seq=0 ttl=63 time=1.750 ms
64 bytes from seq=1 ttl=63 time=0.815 ms
64 bytes from seq=2 ttl=63 time=0.851 ms
64 bytes from seq=3 ttl=63 time=0.809 ms
64 bytes from seq=34 ttl=63 time=0.853 ms
64 bytes from seq=35 ttl=63 time=0.848 ms
64 bytes from seq=36 ttl=63 time=0.833 ms
64 bytes from seq=37 ttl=63 time=0.832 ms
64 bytes from seq=38 ttl=63 time=0.910 ms
64 bytes from seq=42 ttl=63 time=2.071 ms
64 bytes from seq=43 ttl=63 time=0.826 ms
64 bytes from seq=44 ttl=63 time=0.853 ms
64 bytes from seq=45 ttl=63 time=0.851 ms
64 bytes from seq=46 ttl=63 time=0.853 ms
64 bytes from seq=47 ttl=63 time=0.851 ms
64 bytes from seq=48 ttl=63 time=0.855 ms
64 bytes from seq=49 ttl=63 time=0.869 ms
64 bytes from seq=50 ttl=63 time=0.833 ms
64 bytes from seq=51 ttl=63 time=0.859 ms
64 bytes from seq=52 ttl=63 time=0.866 ms
64 bytes from seq=53 ttl=63 time=0.840 ms
64 bytes from seq=54 ttl=63 time=0.841 ms
64 bytes from seq=55 ttl=63 time=0.854 ms
--- ping statistics ---
56 packets transmitted, 53 packets received, 5% packet loss
round-trip min/avg/max = 0.799/0.888/2.071 ms
/ #
 -> 3 packets were lost (between seq 38 and 42)

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                 | peer_address | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-13-9.local |  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-19-25.local  |  | 64512    | BGP      | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]# 
 -> The new controller has two XMPP connections.

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 0        | 0     | 0             | 0               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 0        | 0     | 0             | 0               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 0        | 0     | 0             | 0               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr route summary
Introspect Host:
| name                                               | prefixes | paths | primary_paths | secondary_paths | infeasible_paths |
| default-domain:default-                            | 0        | 0     | 0             | 0               | 0                |
| project:__link_local__:__link_local__.inet.0       |          |       |               |                 |                  |
| default-domain:default-project:dci-                | 0        | 0     | 0             | 0               | 0                |
| network:__default__.inet.0                         |          |       |               |                 |                  |
| default-domain:default-project:dci-network:dci-    | 0        | 0     | 0             | 0               | 0                |
| network.inet.0                                     |          |       |               |                 |                  |
| default-domain:default-project:default-virtual-    | 0        | 0     | 0             | 0               | 0                |
| network:default-virtual-network.inet.0             |          |       |               |                 |                  |
| inet.0                                             | 0        | 0     | 0             | 0               | 0                |
| default-domain:default-project:ip-fabric:ip-       | 7        | 7     | 2             | 5               | 0                |
| fabric.inet.0                                      |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-pod-network | 7        | 7     | 4             | 3               | 0                |
| :k8s-default-pod-network.inet.0                    |          |       |               |                 |                  |
| default-domain:k8s-default:k8s-default-service-    | 7        | 7     | 1             | 6               | 0                |
| network:k8s-default-service-network.inet.0         |          |       |               |                 |                  |
[root@ip-172-31-13-9 ~]#
 -> The old controller no longer has any prefixes.

//At the end of the ISSU process, the new kube-manager starts:
[root@ip-172-31-19-25 ~]# kubectl get pod -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm    1/1     Running   0          34m   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k    1/1     Running   0          34m   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-b8pxw   1/1     Running   0          15s   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-nv7z9   1/1     Running   0          15s   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 ~]# 
 -> Containers are created with new IPs (the new addresses come from the new controller)

[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                 | peer_address | peer_asn | encoding | peer_type | state  | send_state      | flap_count | flap_time                   |
| ip-172-31-13-9.local |  | 64512    | BGP      | internal  | Active | not advertising | 1          | 2019-Jun-23 05:37:02.614003 |
[root@ip-172-31-13-9 ~]# ./contrail-introspect-cli/ist.py --host ctr nei
Introspect Host:
| peer                   | peer_address  | peer_asn | encoding | peer_type | state       | send_state | flap_count | flap_time |
| ip-172-31-25-102.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
| ip-172-31-33-175.local | | 0        | XMPP     | internal  | Established | in sync    | 0          | n/a       |
[root@ip-172-31-13-9 ~]#
 -> The new controller no longer has an iBGP peering to the old controller. The old controller still has its iBGP peer entry, although that process will be stopped soon :)
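
The neighbor check above can also be scripted. This is only a minimal sketch, assuming the pipe-separated table layout that ist.py prints above (peer in the first column, state in the sixth); the function name count_not_established is my own and is not part of contrail-introspect-cli.

```shell
# Count peers whose state is not "Established" in an ist.py neighbor table
# read from stdin. Assumes the column order shown in the output above.
count_not_established() {
  awk -F'|' '
    NF >= 8 {
      peer = $2; state = $7               # $1 is empty (text before the first "|")
      gsub(/^[ \t]+|[ \t]+$/, "", peer)   # trim padding
      gsub(/^[ \t]+|[ \t]+$/, "", state)
      if (peer == "peer") next            # skip the header row
      if (state != "Established") n++
    }
    END { print n + 0 }
  '
}
```

Piping the output of the ist.py neighbor command shown above into count_not_established on the new controller should print 0 once all vRouters are attached. On the old controller, the leftover iBGP peer in Active state would be counted.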

//After the old config and control are stopped:
[root@ip-172-31-19-25 ~]# kubectl get pod -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE                                               NOMINATED NODE
cirros-deployment-75c98888b9-6qmcm    1/1     Running   0          48m   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment-75c98888b9-lxq4k    1/1     Running   0          48m   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-b8pxw   1/1     Running   0          13m   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment2-648b98685f-nv7z9   1/1     Running   0          13m   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
cirros-deployment3-68fb484676-ct9q9   1/1     Running   0          18s   ip-172-31-25-102.ap-northeast-1.compute.internal   <none>
cirros-deployment3-68fb484676-mxbzq   1/1     Running   0          18s   ip-172-31-33-175.ap-northeast-1.compute.internal   <none>
[root@ip-172-31-19-25 ~]# 
 -> New containers can still be created

[root@ip-172-31-25-102 ~]# contrail-status 
Pod      Service  Original Name           State    Id            Status         
vrouter  agent    contrail-vrouter-agent  running  9a46a1a721a7  Up 33 minutes  
vrouter  nodemgr  contrail-nodemgr        running  11fb0a7bc86d  Up 33 minutes  

vrouter kernel module is PRESENT
== Contrail vrouter ==
nodemgr: active
agent: active

[root@ip-172-31-25-102 ~]# 
 -> vRouter works well with the new controller

/ # ping
PING ( 56 data bytes
64 bytes from seq=0 ttl=63 time=1.781 ms
64 bytes from seq=1 ttl=63 time=0.857 ms
--- ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.857/1.319/1.781 ms
/ #
 -> Ping between vRouters succeeds
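
When this ping check is repeated during an update, the loss figure can be pulled out of the summary line. This is a small sketch, assuming the summary format shown above ("N% packet loss"); the helper name ping_loss_percent is hypothetical.

```shell
# Print the packet loss percentage found in ping output read from stdin.
# Prints nothing if no "% packet loss" summary line is present.
ping_loss_percent() {
  sed -n 's/.*[ ,]\([0-9][0-9.]*\)% packet loss.*/\1/p'
}
```

For example, the output of a ping run inside a container can be piped into ping_loss_percent, and any value other than 0 suggests that traffic was interrupted during that step.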

Backward compatibility

Since there are several ways to update the cluster (in-place or ISSU, with or without ifdown vhost0), choosing the method is also an important topic.

Before discussing the details, let me first describe the behavior of vrouter-agent up / down and ifup vhost0 / ifdown vhost0.

When you restart vrouter-agent, you might assume that both the vrouter-agent container and vhost0 are re-created.

In fact, this is not the case: vhost0 is tightly coupled with vrouter.ko, and to delete it, vrouter.ko has to be unloaded from the kernel. So from an operational point of view, ifdown vhost0 is required when not only vrouter-agent but also vrouter.ko needs to be updated (ifdown vhost0 also runs rmmod vrouter internally).
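
One way to observe this on a node is to look at the kernel module list before and after restarting vrouter-agent. This is a minimal sketch that only relies on the standard lsmod output format; the helper name module_listed is hypothetical.

```shell
# Return success if the named module appears in lsmod-style output on stdin.
module_listed() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on a vRouter node (not run here):
#   lsmod | module_listed vrouter && echo "vrouter.ko is loaded"
#   ip link show vhost0
```

If the description above holds, the module should still be listed after a plain vrouter-agent restart, and should be gone only after ifdown vhost0.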

Therefore, to discuss backward compatibility, you need to explore the following three topics.

  1. Compatibility of controllers with vrouter-agent
  • If there is no backward compatibility, ISSU is required
  2. Compatibility of vrouter-agent with vrouter.ko
  • If there is no backward compatibility, ifdown vhost0 is required, which causes at least 5-10 seconds of traffic loss, so in practice traffic needs to be moved to other nodes first, for example by live migration
  • Because vrouter-agent uses netlink to synchronize data with vrouter.ko, schema changes can cause unexpected vrouter-agent behavior (such as a vrouter-agent segmentation fault in the ksync logic)
  3. Compatibility of vrouter.ko with the kernel
  • If there is no backward compatibility, the kernel needs to be updated, which means that traffic needs to be moved to other nodes
  • When vrouter.ko expects a different in-kernel API, it cannot be loaded by the kernel, and vhost0 and vrouter-agent cannot be created

For 2 and 3, kernel updates are unavoidable for a variety of reasons, so one possible plan is to first choose the new kernel version, then choose a vrouter-agent / vrouter.ko version that supports that kernel, and finally check whether that vrouter-agent version works with the version of control currently in use.

  • If they work well together, use an in-place update; if they do not work for some reason, or if a rollback operation is required, use ISSU
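
The three compatibility questions above, together with the rollback requirement, can be summarized as a small decision helper. This is only a sketch of the reasoning in this section, not an official tool; the function name, the yes/no arguments, and the output strings are all made up for illustration.

```shell
# choose_update_method CTRL_AGENT_OK AGENT_KO_OK KO_KERNEL_OK NEED_ROLLBACK
# Each argument is "yes" or "no":
#   CTRL_AGENT_OK  controller works with the vrouter-agent version (topic 1)
#   AGENT_KO_OK    vrouter-agent works with the loaded vrouter.ko (topic 2)
#   KO_KERNEL_OK   vrouter.ko works with the running kernel (topic 3)
#   NEED_ROLLBACK  a rollback path is required
choose_update_method() {
  # Controller side: ISSU if controller and agent are incompatible
  # or if rollback is needed, otherwise in-place.
  if [ "$1" = "no" ] || [ "$4" = "yes" ]; then
    method="ISSU"
  else
    method="in-place"
  fi
  # Node side: ifdown vhost0 or a kernel update interrupts traffic,
  # so workloads should be moved away first.
  if [ "$2" = "no" ] || [ "$3" = "no" ]; then
    node="move-traffic-first"
  else
    node="restart-agent-only"
  fi
  echo "$method $node"
}
```

For example, choose_update_method yes no yes no prints "in-place move-traffic-first", which corresponds to an in-place controller update combined with live migration before ifdown vhost0.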

For 1, because ifmap maintains a white_list for each version when it imports config-api definitions, it seems to have fairly good backward compatibility, based on my attempts (since routing information updates are similar to BGP, they should also work well in most cases).

To verify this, I tried setups with different module versions, and they still seem to work.

I-1. config 2002-latest, control 2002-latest, vrouter 5.0-latest, openstack queens

I-2. config 2002-latest, control 5.0-latest, vrouter 5.0-latest, openstack queens

II-1. config 2002-latest, control 2002-latest, vrouter r5.1, kubernetes 1.12

Note: Unfortunately, this combination does not work very well (cni cannot get port information from vrouter-agent). I think this is due to the CNI version change between 5.0.x and 5.1 (0.2.0 -> 0.3.1).

II-2. config 2002-latest, control 2002-latest, vrouter 5.0-latest, kubernetes 1.12

Therefore, it is good practice to update config / control more frequently to pick up possible bug fixes, even if you do not need to change the kernel and vRouter versions immediately.

