Node removal not supported, however it "works" in unexpected manner #95

danielskowronski · 2023-11-13T10:06:32Z

Some time ago, k0sctl added support for node removal.

This provider calls the phase needed to reset controllers, but it does not prepare the host list so that hosts can actually be removed. The ClusterResourceModelHost data structure lacks a Reset field, and there is no logic that translates the removal of a host from state into a flag update that the phase manager could pick up.
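A minimal sketch of the missing translation step, under stated assumptions: the Host type below is a simplified, hypothetical stand-in for ClusterResourceModelHost, MarkRemovedHosts is a made-up helper name, and hosts are matched by address. It is not the provider's actual code, only an illustration of turning "present in state, absent from plan" into a Reset flag.

```go
package main

import "fmt"

// Host is a hypothetical, simplified stand-in for the provider's
// ClusterResourceModelHost. Reset is the field the issue reports as missing.
type Host struct {
	Address string
	Role    string
	Reset   bool
}

// MarkRemovedHosts returns the planned hosts followed by every host that is
// present in the prior state but absent from the plan, with Reset set to true
// on the latter. Matching by Address is an assumption made for this sketch.
func MarkRemovedHosts(state, plan []Host) []Host {
	planned := make(map[string]struct{}, len(plan))
	for _, h := range plan {
		planned[h.Address] = struct{}{}
	}

	// Copy the plan so the caller's slice is not modified.
	result := append([]Host(nil), plan...)
	for _, h := range state {
		if _, ok := planned[h.Address]; !ok {
			h.Reset = true // h is a copy, so the state slice stays untouched
			result = append(result, h)
		}
	}
	return result
}

func main() {
	state := []Host{
		{Address: "10.0.0.1", Role: "controller"},
		{Address: "10.0.0.2", Role: "controller"},
		{Address: "10.0.0.3", Role: "worker"},
	}
	plan := []Host{
		{Address: "10.0.0.1", Role: "controller"},
		{Address: "10.0.0.3", Role: "worker"},
	}
	for _, h := range MarkRemovedHosts(state, plan) {
		fmt.Printf("%s role=%s reset=%t\n", h.Address, h.Role, h.Reset)
	}
}
```

Because this sketch matches hosts by address only, a host that is replaced by a new machine with the same IP would not be detected as removed; a real implementation would need some other identity for that case.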

This is quite problematic when, after a removal, a new host is added with the same IP, since the IP is the unique ID for many k0s structures: the result is split-brain. The cluster still tries to reach the new VM through the IP that was never removed (mainly from etcd), and the new VM is stuck in the cluster init phase but serves requests immediately. Control-plane HA requires a load balancer, so without sophisticated checks it can easily end up serving two clusters at the same time.

As per the docs, the workaround seems to be to manually execute k0s etcd leave --peer-address IP_ADDR on a live node. In most cases that is the node we want to delete, but it gets tricky when rebuilding a crashed VM, all the more so because destroy-time provisioners in TF only run on a clean destroy, not even with taint.


- add other option to pass SSH key - as raw PEM-encoded string alessiodionisi#94
- add k0sctlconfig as output, so it can be used with k0sctl CLI alessiodionisi#76
- handle situation, where k0s leader is not available - attempt to validate cluster on Read phase using all controllers
- investigate ways of actual node removal alessiodionisi#95

danielskowronski · 2023-12-04T17:56:17Z

This is not yet supported in k0sctl itself - k0sproject/k0sctl#603

danielskowronski mentioned this issue Nov 28, 2023

Raw SSH key support, k0sctl.yaml config output, read phase retries #96

Open
