
Maintenance

Warning

This documentation is deprecated; please check here for its new home.

Supported only for Kubernetes clusters.

Note: The new procedure to replace nodes is only supported for clusters created after Thu Jun 20, 2019 10:00. See https://cern.service-now.com/service-portal?id=outage&n=OTG0050807

Node Upgrade with Node Groups

This procedure documents how to upgrade cluster nodes (kernel, containerd, etc.) or some of the system components (kubelet, kube-proxy, ...) without redeploying the full cluster, by creating a new node group with an updated image and labels.

In the example below we update the image (Fedora 35) to get a new kernel, and pass the label required to update the NVIDIA GPU module.

  1. Create a new node group (ng-5-16-13 below) with the new image and the new GPU label. This node group will replace all the old nodes at the end of the process. In this case we start with 1 node, but we could have more.

    openstack coe nodegroup list rocha-001
    | default-master |          1 |
    | default-worker |          2 |
    | ng-5-16-13-200 |          1 |
    
    openstack coe nodegroup create <CLUSTER> <NODEGROUP> \
      --flavor <NODEFLAVOR> --node-count 1 \
      --image 0548fdd4-2bf4-4678-9d43-b873da90f60f \
      --merge-labels --labels nvidia_gpu_tag=35-5.15.18-200.fc35.x86_64-470.82.00
    

    Example:

    openstack coe nodegroup create rocha-001 ng-5-16-13 \
      --flavor m2.medium --node-count 1 \
      --image 0548fdd4-2bf4-4678-9d43-b873da90f60f \
      --merge-labels \
      --labels nvidia_gpu_tag=35-5.16.13-200.fc35.x86_64-470.82.00 \
      --labels ignition_version=3.3.0 \
      --labels containerd_tarball_url=https://s3.cern.ch/cri-containerd-release/cri-containerd-cni-1.4.13-1-linux-amd64.tar.gz \
      --labels containerd_tarball_sha256=d38fb61ae471d4658e3230a83c3b0033914de58a18588cac7a069225dd5c9122 \
      --labels kube_tag=v1.22.3-cern.1
    
  2. After a couple of minutes the new node should be up and ready to take workloads. If you don't want it to take workloads immediately, make sure you cordon the new node (see the sketch after this list).

    kubectl get node
    rocha-001-wk4lixixxz6y-master-0         ... v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64   containerd://1.4.3
    rocha-001-wk4lixixxz6y-node-0           ... v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64   containerd://1.4.3
    rocha-001-wk4lixixxz6y-node-1           ... v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64   containerd://1.4.3
    rocha-001-ng-5-16-1-xunwhnak3zpl-node-0 ... v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64  containerd://1.4.13
    
  3. Validate workloads landing on the new nodes to check that all is well. You can also force new workloads onto the new node(s) by cordoning the old nodes.

    kubectl cordon rocha-001-wk4lixixxz6y-node-0
    rocha-001-wk4lixixxz6y-node-0   Ready,SchedulingDisabled   ...
    
  4. If things look good, wait for workloads on the old (cordoned) nodes to finish, then gradually drop the empty old nodes and grow the new node group correspondingly. In the first command below, the --nodes-to-remove option takes the index of the node(s) to be removed from the node group. A consolidated sketch of this per-node cycle is shown after the list.

    openstack coe cluster resize rocha-001 1 \
      --nodegroup default-worker --nodes-to-remove 0
    
    openstack coe cluster resize rocha-001 2 \
      --nodegroup ng-5-16-13-200
    
  5. At the end of this process, your old node group should have been resized to 0, with the new one taking its place.

    kubectl get node
    rocha-001-ng-5-16-1-xunwhnak3zpl-node-0   Ready   v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64   containerd://1.4.13
    rocha-001-ng-5-16-1-xunwhnak3zpl-node-2   Ready   v1.22.3   Fedora CoreOS 35.20220227.2.1   5.16.13-200.fc35.x86_64   containerd://1.4.13
    rocha-001-wk4lixixxz6y-master-0           Ready   v1.22.3   Fedora CoreOS 34.20210529.3.0   5.12.7-300.fc34.x86_64    containerd://1.4.3
    
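The sketch below consolidates the per-node cycle of steps 2-4 into one sequence, using the node and node group names from the example above. The kubectl drain call and its flags are an assumption about how you may want to evict any remaining pods; they are not part of the original procedure, and you can simply wait for workloads to finish instead.

    # Optional (step 2): keep the new node from taking workloads right away
    kubectl cordon rocha-001-ng-5-16-1-xunwhnak3zpl-node-0

    # Step 3: stop new workloads from landing on one of the old nodes
    kubectl cordon rocha-001-wk4lixixxz6y-node-0

    # Assumption: evict what is still running there, skipping DaemonSet pods
    kubectl drain rocha-001-wk4lixixxz6y-node-0 \
      --ignore-daemonsets --delete-emptydir-data

    # Step 4: drop the now empty old node (index 0 of default-worker) ...
    openstack coe cluster resize rocha-001 1 \
      --nodegroup default-worker --nodes-to-remove 0

    # ... and grow the new node group to compensate
    openstack coe cluster resize rocha-001 2 \
      --nodegroup ng-5-16-13-200

Repeat the cordon/drain/resize steps node by node until the old node group is empty.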

Node Replacement

Check the cluster node status:

$ kubectl get no
NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2

You can pass the index of the node to remove, together with the desired final number of nodes. For example, to remove node mycluster-bzogfw4wjinp-minion-0, which is not healthy, while keeping the same number of nodes:

$ openstack coe cluster resize --nodes-to-remove 0 <cluster name or id> 3
Request to resize cluster mycluster has been accepted.
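
The resize runs asynchronously. A minimal way to follow it, assuming your cluster is called mycluster (the -c option only limits the columns printed), is to watch the cluster status and the node list:

$ openstack coe cluster show mycluster -c status -c node_count
$ kubectl get no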

Node Replacement (Old procedure)

This is a temporary workaround and currently only works for Kubernetes. An equivalent command will be integrated in Magnum.

Check the cluster node status:

$ kubectl get no
NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2

To remove node mycluster-bzogfw4wjinp-minion-0, which is not healthy, look up the Heat stack backing the cluster and pass the node index to a stack update:

$ openstack coe cluster show <cluster-id> | grep stack_id
| stack_id            | 6da935e1-4d4e-4688-b38f-fae204934d87                              |

$ openstack stack update --existing --parameter minions_to_remove=0 <stack-id>
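
The stack update runs asynchronously. If you want to follow its progress, one minimal check (reusing the <stack-id> from above) is the stack status:

$ openstack stack show <stack-id> -c stack_status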

After a couple of minutes you should see the new node available:

mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-3   Ready      <none>    5m        v1.11.2

The removed node is gone but still shows up in the Kubernetes node list, so you can explicitly drop it:

$ kubectl delete node mycluster-bzogfw4wjinp-minion-0

A current drawback is that the list of nodes in Magnum is not updated:

$ openstack coe cluster show <cluster-id> | grep node_addresses
| node_addresses      | ['137.138.6.201', '137.138.7.46', '137.138.7.46'] |

This will be fixed with the Magnum integration.

