Homelab
Talos Cluster Expansion and HA
Expanding Your Talos Kubernetes Cluster and Enabling High Availability
Overview
Adding a third node[^1] and enabling high availability by promoting all nodes to control plane.
With 3 nodes, running all as control plane gives you etcd quorum - the cluster survives any single node failure. This is the sweet spot for homelabs. With 4+ nodes, you'd keep 3 control plane and add dedicated workers to reduce overhead.
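The quorum rule behind this is simple arithmetic: an etcd cluster of n members needs a majority (floor(n/2) + 1) of votes to stay writable, so it tolerates floor((n-1)/2) failures. A quick shell check makes the "sweet spot" visible:

```shell
# etcd quorum math: a cluster of n members needs a majority to stay writable
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))        # votes required
  tolerated=$(( (n - 1) / 2 ))   # members that can fail
  echo "members=$n quorum=$quorum tolerated=$tolerated"
done
```

Note that 4 members tolerate no more failures than 3 (still only one), which is why a fourth node is better spent as a dedicated worker than as a fourth etcd member.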
| Tip: | Having trouble? See v0.9.1 for what your setup should look like after completing this article. |
Before You Begin
Prerequisites
- Talhelper Cluster Bootstrap completed (cluster running)
- New node: BIOS configured per Talos Linux USB Installation
- New node: Booted from Talos installer ISO, in maintenance mode
Configure BIOS Settings
Before booting from USB, configure BIOS settings (press Delete on boot for GEEKOM XT15 MEGA):
| Setting | Location | Action |
|---|---|---|
| Boot Device Priority | Boot | Set USB KEY: UEFI:... first |
| Wake Up by LAN | Advanced > Power Management Configuration | Enable |
| Power-On after Power-Fail | Advanced > Power Management Configuration | Set to Power On |
See Talos Linux USB Installation for full BIOS details.
Gather Hardware Info
With the new node booted from the Talos installer ISO:
talosctl get disks --insecure --nodes 192.168.1.32
talosctl get links --insecure --nodes 192.168.1.32 | grep -E "^NODE|up.*true"
Note the disk model and network interface name.
Add New Node
Clone the config from an existing node and adjust.
Node Config
talos/talconfig.yaml:
nodes:
  # ... existing nodes ...
  - hostname: talos-node-3
    ipAddress: 192.168.1.32
    controlPlane: false # Start as worker, promote later
    installDiskSelector:
      size: ">= 2TB"
      model: WPBSN4M8-2TGP # Same hardware = same model
    networkInterfaces:
      - interface: enp172s0
        addresses:
          - 192.168.1.32/24
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.1.1
    schematic:
      customization:
        systemExtensions:
          officialExtensions:
            - siderolabs/i915
            - siderolabs/intel-ucode
            - siderolabs/iscsi-tools
    patches:
      - |-
        machine:
          kubelet:
            extraArgs:
              feature-gates: DynamicResourceAllocation=true
          files:
            - path: /etc/cri/conf.d/20-customization.part
              op: create
              content: |
                [plugins."io.containerd.cri.v1.runtime"]
                  cdi_spec_dirs = ["/var/run/cdi"]
| Note: | Copy the schematic and patches from an existing node to ensure identical configuration. Only hostname, ipAddress, and addresses differ. |
Commit Node Config
git add talos/talconfig.yaml
git commit -m "feat(talos): add third node"
Promote All Nodes to Control Plane
With 3 nodes, enable HA by making all control plane.
Update Control Plane
talos/talconfig.yaml:
nodes:
  - hostname: talos-node-1
    controlPlane: true # Already control plane
  - hostname: talos-node-2
    controlPlane: true # Changed from false
  - hostname: talos-node-3
    controlPlane: true # Changed from false
Commit Control Plane
git add talos/talconfig.yaml
git commit -m "feat(talos): make all control plane"
Regenerate Configs
Generate Machine Configs
cd ~/homelab/talos
SOPS_AGE_KEY_FILE=<(op document get "sops-key | homelab") \
talhelper genconfig
Apply to New Node
Add node 3 to the cluster first. This brings etcd to 2 members before touching node 2.
Apply Config
# Node 3 is in maintenance mode, needs --insecure
talosctl apply-config --insecure \
--nodes 192.168.1.32 \
--file clusterconfig/homelab-cluster-talos-node-3.yaml
Watch Node Join
Node 3 reboots and joins the cluster. Wait for it to appear:
# Watch for node 3 to join (Ctrl+C when ready)
kubectl get nodes -w
Verify etcd Members
Verify etcd now has 2 members:
talosctl --nodes 192.168.1.30 etcd members
Promote Existing Worker
Now apply the updated config to node 2 to promote it to control plane.
Apply Config
# Node 2 is already configured (not maintenance mode), no --insecure
talosctl apply-config \
--nodes 192.168.1.31 \
--file clusterconfig/homelab-cluster-talos-node-2.yaml
Watch Node Rejoin
This triggers a reboot. Node 2 restarts with control plane services and joins etcd.
kubectl get nodes -w
Verify etcd Members
Verify etcd now has 3 members:
talosctl --nodes 192.168.1.30 etcd members
Verify Final Health
Cluster Health
talosctl --nodes 192.168.1.30 health
All checks should show OK.
GPU Discovery
Verify GPU discovered on new node:
kubectl get resourceslices
Should show ResourceSlices for all 3 nodes.
Maintain Nodes
With 3 control plane nodes, you can safely take one offline for maintenance. etcd maintains quorum with 2 of 3 nodes.
Get MAC Address
Before shutting down, note the MAC address for Wake on LAN (HW ADDR column):
talosctl --nodes 192.168.1.32 get links | grep -E "^NODE|enp"
Graceful Shutdown
Drains workloads and stops services cleanly before powering off. Use when physically moving a node or for extended maintenance.
talosctl --nodes 192.168.1.32 shutdown
Wake on LAN
Power on a shutdown node remotely (requires Wake on LAN enabled in BIOS per Talos Linux USB Installation):
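Under the hood, the "magic packet" a WoL tool broadcasts is just 6 bytes of 0xFF followed by the target MAC address repeated 16 times, sent over UDP (commonly to port 9). A minimal bash sketch of that payload — the MAC and broadcast address below are placeholders, not values from this cluster:

```shell
mac="aa:bb:cc:dd:ee:ff"                        # placeholder MAC, not a real node
# magic packet = 6 bytes of 0xFF + the MAC (separators stripped) repeated 16 times
payload="$(printf 'ff%.0s' {1..6})$(printf "${mac//:/}%.0s" {1..16})"
echo "payload bytes: $(( ${#payload} / 2 ))"   # 102-byte packet
# to actually send it as a UDP broadcast on port 9 (what wakeonlan does):
# printf '%b' "$(sed 's/../\\x&/g' <<< "$payload")" > /dev/udp/192.168.1.255/9
```

In practice the `wakeonlan` command below does exactly this; the sketch is only to show why the MAC address gathered earlier is all the tool needs.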
wakeonlan <MAC_ADDRESS>
Reboot
Restarts a running node without full power cycle. Use after config changes that require a restart.
talosctl --nodes 192.168.1.32 reboot
Verify Cluster After Maintenance
When the node comes back online:
kubectl get nodes
talosctl --nodes 192.168.1.30 etcd members
| Note: | With 2 of 3 nodes running, the cluster remains fully operational. Avoid taking down 2 nodes simultaneously - etcd requires majority. |
Next Steps
With the new node added, update Tailscale to route traffic to it.
See: Tailscale ACL and Subnet Routes
Resources
Footnotes
[^1]: Sidero Labs, "Adding Nodes to a Cluster," talos.dev. Accessed: Dec. 20, 2025. [Online]. Available: https://www.talos.dev/latest/talos-guides/howto/scaling-up/