Fix Kubernetes Node Handling #481

Open
dbw7 wants to merge 1 commit into SUSE:main from dbw7:kubernetes-node-fix
@dbw7

dbw7 commented Jun 2, 2026

Contributor

Closes #437

@dbw7 dbw7 requested a review from a team as a code owner June 2, 2026 13:49
@rdoxenham

Contributor

Pulled this patch in and deployed a single node with the following cluster.yaml and server.yaml:

% cat kubernetes/cluster.yaml
nodes:
- hostname: uc-test.rancher.local
  type: server
network:
    apiVIP: 192.168.122.100
    apiHost: 192.168.122.100.sslip.io

% cat kubernetes/config/server.yaml
token: totally-not-generated-one
selinux: true
cni: cilium
embedded-registry: true

And it worked just fine...

# cat /etc/rancher/rke2/config.yaml
cni: cilium
embedded-registry: true
selinux: true
tls-san:
    - 192.168.122.100
    - 192.168.122.100.sslip.io
token: totally-not-generated-one

I still need to try a multi-node deployment to make sure that still works, but this certainly fixes the issue recorded in #437. Thanks!!
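For anyone reading along, the rendered config.yaml above is consistent with a simple merge: the keys from server.yaml are carried over, and tls-san is filled in from network.apiVIP and network.apiHost in cluster.yaml. Below is a minimal Python sketch of that behaviour, inferred only from the output shown here. The function name render_rke2_config is hypothetical, and this is not the project's actual code.

```python
def render_rke2_config(server_config, network):
    """Merge server.yaml keys with SANs derived from cluster.yaml's network block.

    Hypothetical helper: it mirrors the output observed in this thread,
    not the real implementation.
    """
    config = dict(server_config)  # copy so the caller's dict is left untouched
    sans = list(config.get("tls-san", []))
    for key in ("apiVIP", "apiHost"):
        value = network.get(key)
        if value and value not in sans:
            sans.append(value)
    if sans:
        config["tls-san"] = sans
    return config


# Values taken from the cluster.yaml and server.yaml quoted above.
server_yaml = {
    "token": "totally-not-generated-one",
    "selinux": True,
    "cni": "cilium",
    "embedded-registry": True,
}
network = {"apiVIP": "192.168.122.100", "apiHost": "192.168.122.100.sslip.io"}

rendered = render_rke2_config(server_yaml, network)
for key in sorted(rendered):
    print(key, "=", rendered[key])
```

The bug tracked in #437 is titled "server.yaml file ignored on single-node cluster"; in terms of this sketch, that would correspond to the server_yaml keys never making it into the rendered result.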

@frelon frelon left a comment

Contributor

Nice work!

@dharmit

dharmit commented Jun 9, 2026

Contributor

> Pulled this patch in and deployed a single node with the following cluster.yaml and server.yaml:
>
> % cat kubernetes/cluster.yaml
> nodes:
> - hostname: uc-test.rancher.local
>   type: server
> network:
>     apiVIP: 192.168.122.100
>     apiHost: 192.168.122.100.sslip.io
>
> % cat kubernetes/config/server.yaml
> token: totally-not-generated-one
> selinux: true
> cni: cilium
> embedded-registry: true
>
> And it worked just fine...

I am unable to grasp what's wrong, because I see this even with the current main branch. However, the single-node VM created with this configuration still has single-node-example as its hostname, which I feel is wrong because:

  1. We set hostname: uc-test.rancher.local in cluster.yaml.
  2. When spinning up the VM using virt-install, I used the following command:
    sudo virt-install --name node01 \
    --ram 16384 \
    --vcpus 6 \
    --osinfo detect=on,name=sle-unknown \
    --graphics none \
    --console pty,target_type=serial \
    --network network=default,model=virtio,mac=FE:C4:05:42:8B:01 \
    --virt-type kvm \
    --import \
    --boot uefi,loader=/usr/share/qemu/ovmf-x86_64-ms-4m-code.bin,nvram.template=/usr/share/qemu/ovmf-x86_64-ms-4m-vars.bin \
    --disk path="/var/lib/libvirt/images/main.raw",format=raw
    So I was expecting the resulting VM's hostname to either be uc-test.rancher.local or node01, but it's single-node-example. 😕

@rdoxenham

Contributor

> I am unable to grasp what's wrong because I see this even with current main branch. However, the single-node VM created with this configuration still has single-node-example as its hostname which is something I feel is wrong because:
>
>   1. We set hostname: uc-test.rancher.local in cluster.yaml.

It will only match on this hostname if you have a uc-test.rancher.local entry in network/.

The default examples have a specific entry for single-node-example (note that this filename is expected to translate into a desired hostname). When nmc loads, it tries to match on MAC address to work out which node is which, which is why it sets your hostname to single-node-example:

rdo@tiw network % pwd
(snip) elemental/examples/elemental/customize/single-node

rdo@tiw network % grep mac-address single-node-example.yaml
    mac-address: FE:C4:05:42:8B:01

You will see that this mac address matches your virt-install call:

>   2. When spinning up the VM using virt-install, I used the following command:
>
>     sudo virt-install --name node01 \
>     --ram 16384 \
>     --vcpus 6 \
>     --osinfo detect=on,name=sle-unknown \
>     --graphics none \
>     --console pty,target_type=serial \
>     --network network=default,model=virtio,mac=FE:C4:05:42:8B:01 \
>     --virt-type kvm \
>     --import \
>     --boot uefi,loader=/usr/share/qemu/ovmf-x86_64-ms-4m-code.bin,nvram.template=/usr/share/qemu/ovmf-x86_64-ms-4m-vars.bin \
>     --disk path="/var/lib/libvirt/images/main.raw",format=raw
>
>     So I was expecting the resulting VM's hostname to either be uc-test.rancher.local or node01, but it's single-node-example. 😕

So, to fix this, rename your single-node-example.yaml file in network/ to uc-test.rancher.local.
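The matching described above can be sketched roughly as follows. This is a simplified Python illustration, not nmc's actual code: the function hostname_for_mac is hypothetical, the files are scanned with a regex much like the grep above rather than parsed as YAML, and keeping a .yaml extension on the renamed file is an assumption.

```python
import re
import tempfile
from pathlib import Path

# Matches lines such as "    mac-address: FE:C4:05:42:8B:01".
MAC_LINE = re.compile(r"^[ \t]*mac-address:[ \t]*(\S+)", re.MULTILINE)


def hostname_for_mac(network_dir, mac):
    """Return the hostname implied by the file in network_dir that declares `mac`.

    The hostname is taken from the file name with a trailing .yaml/.yml removed.
    Returns None when no file declares the address.
    """
    wanted = mac.lower()
    for path in sorted(Path(network_dir).iterdir()):
        if not path.is_file():
            continue
        declared = {m.lower() for m in MAC_LINE.findall(path.read_text())}
        if wanted in declared:
            name = path.name
            for ext in (".yaml", ".yml"):
                name = name.removesuffix(ext)
            return name
    return None


mac = "FE:C4:05:42:8B:01"  # the address passed to virt-install
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "single-node-example.yaml"
    example.write_text("    mac-address: FE:C4:05:42:8B:01\n")
    matched = hostname_for_mac(tmp, mac)
    unmatched = hostname_for_mac(tmp, "00:00:00:00:00:00")

    # Renaming the file changes the hostname that the same MAC resolves to.
    example.rename(Path(tmp) / "uc-test.rancher.local.yaml")
    renamed = hostname_for_mac(tmp, mac)

print(matched, unmatched, renamed)
```

In this sketch the --name node01 given to virt-install plays no part at all, which lines up with the hostname coming out as single-node-example.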

You have, however, stumbled on an issue: should we block the install if the available nodes don't match the desired node names in cluster.yaml, @atanasdinov @dirkmueller ?

@atanasdinov

Contributor

We used to fail in EIB; the implementation for Elemental, however, will just default to server.

The latter is more forgiving, but I guess it's something to discuss once again if it proves unexpected.
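To make the two options concrete, here is a small Python sketch of both policies. The function resolve_node_type and its strict flag are hypothetical names used for illustration only; neither EIB's nor Elemental's real code is shown here.

```python
def resolve_node_type(hostname, cluster_nodes, strict=False):
    """Look up the node type for `hostname` among the cluster.yaml node entries.

    strict=True  raises when the hostname is not listed (the "fail" policy).
    strict=False falls back to "server" (the forgiving policy).
    """
    for node in cluster_nodes:
        if node["hostname"] == hostname:
            return node["type"]
    if strict:
        raise LookupError(f"hostname {hostname!r} is not listed in cluster.yaml")
    return "server"


# cluster.yaml from this thread lists one node; the VM came up under another name.
nodes = [{"hostname": "uc-test.rancher.local", "type": "server"}]

forgiving = resolve_node_type("single-node-example", nodes)
try:
    resolve_node_type("single-node-example", nodes, strict=True)
    strict_error = None
except LookupError as err:
    strict_error = str(err)

print(forgiving)
print(strict_error)
```

With the forgiving policy the mismatched hostname goes unnoticed because the fallback happens to equal the intended type, which is how the single-node test above still worked.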

@ldevulder ldevulder left a comment

Member

LGTM. Manually tested the fix on my lab, works as expected!

@dharmit

dharmit commented Jun 9, 2026

Contributor

Thanks @rdoxenham for the elaborate responses. 🙏🏽

> You have, however, stumbled on an issue: should we block the install if the available nodes don't match the desired node names in cluster.yaml

I spent an unreasonable amount of time yesterday and today trying to understand what's going on here. At least something useful came out of it (a bug that kept me from hitting another bug). 😂

Development

Successfully merging this pull request may close these issues.

server.yaml file ignored on single-node cluster

6 participants