Maintenance
Warning
This documentation is deprecated; please check here for its new home.
Supported only for Kubernetes clusters.
Note: The new procedure to replace nodes is supported for clusters created after Thu Jun 20, 2019 10:00. https://cern.service-now.com/service-portal?id=outage&n=OTG0050807
Node Upgrade with Node Groups
This procedure documents a way to upgrade cluster nodes - kernel, containerd, etc. - or some of the system components - kubelet, kube-proxy, ... - without redeploying the full cluster, by creating a new node group with an updated image and label properties.
In the example below we update the image (Fedora 35) to get a new kernel, and pass the required label to update the Nvidia GPU module.
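If you need to look up the ID of the new image first, a quick query against Glance can help. This is just a convenience sketch: the grep pattern is illustrative and the UUID is the one used in this example.
openstack image list | grep -i fedora-coreos
openstack image show 0548fdd4-2bf4-4678-9d43-b873da90f60f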
- Create a new node group (ng-5-16-13 below) with the new image and the new GPU label. This will be the node group replacing all the old nodes at the end of the process. In this case we start with 1 node, but we could have more:
openstack coe nodegroup list rocha-001
| default-master | 1 |
| default-worker | 2 |
| ng-5-16-13-200 | 1 |

openstack coe nodegroup create <CLUSTER> <NODEGROUP> \
  --flavor <NODEFLAVOR> --node-count 1 \
  --image 0548fdd4-2bf4-4678-9d43-b873da90f60f \
  --merge-labels \
  --labels nvidia_gpu_tag=35-5.15.18-200.fc35.x86_64-470.82.00
Example:
openstack coe nodegroup create rocha-001 ng-5-16-13 \
  --flavor m2.medium --node-count 1 \
  --image 0548fdd4-2bf4-4678-9d43-b873da90f60f \
  --merge-labels \
  --labels nvidia_gpu_tag=35-5.16.13-200.fc35.x86_64-470.82.00 \
  --labels ignition_version=3.3.0 \
  --labels containerd_tarball_url=https://s3.cern.ch/cri-containerd-release/cri-containerd-cni-1.4.13-1-linux-amd64.tar.gz \
  --labels containerd_tarball_sha256=d38fb61ae471d4658e3230a83c3b0033914de58a18588cac7a069225dd5c9122 \
  --labels kube_tag=v1.22.3-cern.1
- After a couple of minutes the new node should be up and ready to take workloads. If you don't want this immediately, make sure you cordon the new node:
kubectl get node
rocha-001-wk4lixixxz6y-master-0           ...   v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64    containerd://1.4.3
rocha-001-wk4lixixxz6y-node-0             ...   v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64    containerd://1.4.3
rocha-001-wk4lixixxz6y-node-1             ...   v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64    containerd://1.4.3
rocha-001-ng-5-16-1-xunwhnak3zpl-node-0   ...   v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64   containerd://1.4.13
- Validate the workloads landing on the new node(s) to check that all is well. You can also force new workloads onto the new node(s) by cordoning the old nodes:
kubectl cordon rocha-001-wk4lixixxz6y-node-0
rocha-001-wk4lixixxz6y-node-0   Ready,SchedulingDisabled   ...
- If things look good, wait for the workloads on the old (cordoned) nodes to finish, then gradually drop the empty old nodes and grow the new node group correspondingly (a drain sketch is shown after this procedure). In the first command below, the --nodes-to-remove option takes the index of the node(s) to be removed from the nodegroup:
openstack coe cluster resize rocha-001 1 \
  --nodegroup default-worker --nodes-to-remove 0
openstack coe cluster resize rocha-001 2 \
  --nodegroup ng-5-16-13-200
- At the end of this process, your old node group should have been resized to 0, with the new one taking its place:
kubectl get node
rocha-001-ng-5-16-1-xunwhnak3zpl-node-0   Ready   v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64   containerd://1.4.13
rocha-001-ng-5-16-1-xunwhnak3zpl-node-2   Ready   v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64   containerd://1.4.13
rocha-001-wk4lixixxz6y-master-0           Ready   v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64    containerd://1.4.3
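Before dropping an old node in the resize step above, you can also drain it to actively evict any remaining workloads (cordoning only stops new pods from being scheduled there). A minimal sketch using a node name from the example above; the flags shown are the usual ones needed when DaemonSets and emptyDir volumes are present:
# evict remaining pods from an old, already-cordoned node
kubectl drain rocha-001-wk4lixixxz6y-node-0 \
  --ignore-daemonsets --delete-emptydir-data
Once the old node group is down to 0, listing the node groups again (openstack coe nodegroup list rocha-001) should confirm that only the new group still has worker nodes.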
Node Replacement
Check the cluster node status:
$ kubectl get no
NAME STATUS ROLES AGE VERSION
mycluster-bzogfw4wjinp-minion-0 NotReady <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-1 Ready <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-2 Ready <none> 20d v1.11.2
You can pass the index of the node to remove and the desired final number of nodes. For example, to remove node mycluster-bzogfw4wjinp-minion-0, which is not healthy, and keep the same total number of nodes:
$ openstack coe cluster resize --nodes-to-remove 0 <cluster name or id> 3
Request to resize cluster mycluster has been accepted.
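After a couple of minutes the replacement node should join the cluster. You can confirm with kubectl, and optionally check the Magnum side as well; the grep pattern here is just a convenience:
$ kubectl get no
$ openstack coe cluster show <cluster name or id> | grep -E 'status|node_count'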
Node Replacement (Old procedure)
This is a temporary workaround and currently only works for Kubernetes. An equivalent command will be integrated in Magnum.
Check the cluster node status:
$ kubectl get no
NAME STATUS ROLES AGE VERSION
mycluster-bzogfw4wjinp-minion-0 NotReady <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-1 Ready <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-2 Ready <none> 20d v1.11.2
To remove node mycluster-bzogfw4wjinp-minion-0, which is not healthy, you can pass its index to a stack update, like so:
$ openstack coe cluster show <cluster-id> | grep stack_id
| stack_id | 6da935e1-4d4e-4688-b38f-fae204934d87 |
$ openstack stack update --existing --parameter minions_to_remove=0 <stack-id>
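While the update runs you can follow the stack status (mirroring the grep style used above); it should return to UPDATE_COMPLETE once the node has been replaced:
$ openstack stack show <stack-id> | grep stack_status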
After a couple of minutes you should see the new node available:
mycluster-bzogfw4wjinp-minion-0 NotReady <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-1 Ready <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-2 Ready <none> 20d v1.11.2
mycluster-bzogfw4wjinp-minion-3 Ready <none> 5m v1.11.2
The removed node is gone, but it still shows up in the Kubernetes node list, so you can explicitly drop it:
$ kubectl delete node mycluster-bzogfw4wjinp-minion-0
A current drawback is that the list of nodes in Magnum is not updated:
$ openstack coe cluster show <cluster-id> | grep node_addresses
| node_addresses | ['137.138.6.201', '137.138.7.46', '137.138.7.46'] |
This will be fixed with the Magnum integration.