
Maintenance

Supported only for Kubernetes clusters.

Note: The new node replacement procedure is supported for clusters created after Thu Jun 20, 2019 10:00. See https://cern.service-now.com/service-portal?id=outage&n=OTG0050807

Node Replacement

Check the cluster node status:

$ kubectl get no
NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2

Pass the index of the node you want to remove, together with the desired final number of nodes. For example, to remove the unhealthy node mycluster-bzogfw4wjinp-minion-0 while keeping the same number of nodes:

$ openstack coe cluster resize --nodes-to-remove 0 <cluster name or id> 3
Request to resize cluster mycluster has been accepted.
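The index passed to --nodes-to-remove is the numeric suffix of the minion's name. A minimal sketch, assuming the standard `<cluster>-<id>-minion-<index>` naming shown above (the node name here is just an example):

```shell
# Example node name as reported by `kubectl get no`
node="mycluster-bzogfw4wjinp-minion-0"

# The index is everything after the last dash in the node name
index="${node##*-}"

# Pass "$index" to: openstack coe cluster resize --nodes-to-remove "$index" <cluster> 3
echo "$index"
```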

Node Replacement (Old procedure)

This is a temporary workaround and currently only works for Kubernetes. An equivalent command will be integrated in Magnum.

Check the cluster node status:

$ kubectl get no
NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2

To remove the unhealthy node mycluster-bzogfw4wjinp-minion-0, first look up the Heat stack backing the cluster, then pass the node's index in a stack update:

$ openstack coe cluster show <cluster-id> | grep stack_id
| stack_id            | 6da935e1-4d4e-4688-b38f-fae204934d87                              |

$ openstack stack update --existing --parameter minions_to_remove=0 <stack-id>
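Rather than copying the id by hand, the stack id can be parsed out of the table row shown above; a sketch, assuming the `| stack_id | <uuid> |` layout printed by the client (the line below is a captured example):

```shell
# Example row as printed by `openstack coe cluster show <cluster-id> | grep stack_id`
line="| stack_id            | 6da935e1-4d4e-4688-b38f-fae204934d87                              |"

# The UUID is the fourth whitespace-separated field of the table row
stack_id=$(echo "$line" | awk '{print $4}')

# Use "$stack_id" in: openstack stack update --existing --parameter minions_to_remove=0 "$stack_id"
echo "$stack_id"
```

Recent versions of the OpenStack client can also emit a single field directly with `-f value -c stack_id`, which avoids the grep/awk step entirely.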

After a couple of minutes you should see the new node available:

$ kubectl get no
NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-2   Ready      <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-3   Ready      <none>    5m        v1.11.2

The removed node is gone, but it still shows up in the Kubernetes node list. You can drop it explicitly:

$ kubectl delete node mycluster-bzogfw4wjinp-minion-0
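If several replacements have been done, every stale entry can be found by filtering the STATUS column with awk; a sketch over sample `kubectl get no` output (the node names are examples):

```shell
# Sample output in the format produced by `kubectl get no`
nodes="NAME                              STATUS     ROLES     AGE       VERSION
mycluster-bzogfw4wjinp-minion-0   NotReady   <none>    20d       v1.11.2
mycluster-bzogfw4wjinp-minion-1   Ready      <none>    20d       v1.11.2"

# Print the name (field 1) of every row whose STATUS (field 2) is NotReady
echo "$nodes" | awk '$2 == "NotReady" {print $1}'
```

In practice this could be piped straight into the delete, e.g. `kubectl get no | awk '$2 == "NotReady" {print $1}' | xargs kubectl delete node` — but double-check the list before deleting, since a node can also be temporarily NotReady for other reasons.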

A current drawback is that the list of nodes in Magnum is not updated:

$ openstack coe cluster show <cluster-id> | grep node_addresses
| node_addresses      | ['137.138.6.201', '137.138.7.46', '137.138.7.46'] |

This will be fixed with the Magnum integration.


Last update: April 26, 2021