Talos Linux Management

This guide covers day-to-day management of Talos Linux clusters, including machine configuration, storage management, and common operations.

Machine Configuration

Talos uses multi-document machine configurations starting with version 1.5. Understanding how to manage these configurations is essential for cluster operations.
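
For reference, a multi-document configuration is a single YAML file: the main v1alpha1 document comes first, followed by additional documents separated by ---. The sketch below is illustrative only (hostname and disk selector are placeholder values, and the main document is trimmed):

# config.yaml -- structure only, not a complete working config
version: v1alpha1        # main machine configuration document
machine:
  network:
    hostname: worker-01
cluster: {}
---
apiVersion: v1alpha1     # additional documents follow, separated by ---
kind: UserVolumeConfig
name: storage-data
provisioning:
  diskSelector:
    match: disk.transport == "virtio"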

Viewing Configuration

# View full machine config
talosctl -n <node-ip> get mc -o yaml

# View specific resource types
talosctl -n <node-ip> get volumeconfigs
talosctl -n <node-ip> get uservolumeconfigs

# Check specific volume config (note: u- prefix for user volumes)
talosctl -n <node-ip> get vc u-storage-data

Configuration Document Types

Document Type            Purpose
v1alpha1 (main)          Core machine configuration
UserVolumeConfig         User-defined storage volumes
ExtensionServiceConfig   System extension services
SideroLinkConfig         SideroLink connectivity
NetworkConfig            Network configuration (Talos 1.12+)

Patching Machine Configuration

Strategic Merge Patches

Talos supports strategic merge patches for modifying configuration. JSON (RFC 6902) patches are not supported for multi-document machine configurations.

# Apply a patch from a file
talosctl -n <node-ip> patch mc --patch @patch.yaml

# Apply an inline patch
talosctl -n <node-ip> patch mc --patch '
machine:
  network:
    hostname: new-hostname
'

Removing Configuration with $patch: delete

Talos doesn't have a talosctl delete command for configuration documents. Instead, use the $patch: delete directive.

Why No Delete Command?

Talos treats machine configuration as the single source of truth. All changes go through config patches to:

  • Maintain an audit trail
  • Align with immutable OS philosophy
  • Ensure reproducible configurations

Remove a UserVolumeConfig

talosctl -n <node-ip> patch mc --patch '
apiVersion: v1alpha1
kind: UserVolumeConfig
name: storage-data
$patch: delete
'

Remove Multiple Documents

talosctl -n <node-ip> patch mc --patch '
apiVersion: v1alpha1
kind: UserVolumeConfig
name: storage-data
$patch: delete
---
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: some-extension
$patch: delete
'

Remove a Specific Field (Not Entire Document)

talosctl -n <node-ip> patch mc --patch '
apiVersion: v1alpha1
kind: LinkConfig
name: enp0s3
mtu:
  $patch: delete
'

Apply to Multiple Nodes

NODES="192.168.30.24 192.168.30.25 192.168.30.26"

for NODE in $NODES; do
  echo "Patching $NODE..."
  talosctl -n $NODE patch mc --patch '
apiVersion: v1alpha1
kind: UserVolumeConfig
name: storage-data
$patch: delete
'
done

Storage Management

UserVolumeConfig

A UserVolumeConfig creates a formatted partition mounted under /var/mnt/<name>. It is useful for:

  • OpenEBS hostpath storage
  • Application data directories
  • Log storage

Example UserVolumeConfig:

apiVersion: v1alpha1
kind: UserVolumeConfig
name: storage-data
provisioning:
  diskSelector:
    match: disk.transport == "virtio"
  minSize: 50GB
  maxSize: 100GB
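
To add this document to a running node, one option is to save it to a file and apply it as a strategic merge patch (the file name here is illustrative), then confirm it with the same commands used in Viewing Configuration:

# Apply the new document as a patch
talosctl -n <node-ip> patch mc --patch @storage-data.yaml

# Confirm the document and the resulting volume config (note the u- prefix)
talosctl -n <node-ip> get uservolumeconfigs
talosctl -n <node-ip> get vc u-storage-data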

Raw Block Devices (for Rook Ceph)

Rook Ceph requires raw, unformatted block devices. Do not use UserVolumeConfig for Rook Ceph disks.

Requirements for Rook Ceph:

  • No partitions
  • No filesystem
  • No LVM
  • Disk not mounted
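
To check whether a disk meets these requirements before handing it to Rook Ceph, the commands used elsewhere in this guide are sufficient (sdb is an example device name):

# A raw disk has no partition entries (e.g. no sdb1, sdb2) listed here
talosctl -n <node-ip> read /proc/partitions

# Cross-check size and model so you are looking at the intended device
talosctl -n <node-ip> get disks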

Wiping Disks

After removing a UserVolumeConfig, wipe the disk to prepare it for reuse:

# Wipe a single disk
talosctl -n <node-ip> wipe disk /dev/sdb --insecure

# Verify disk is clean (should show no filesystem)
talosctl -n <node-ip> get disks

Data Loss

Disk wiping is destructive and irreversible. Ensure:

  • All data is backed up
  • Workloads using the storage are stopped
  • You're wiping the correct disk (not the OS disk!)

Migrating Storage: OpenEBS to Rook Ceph

Complete workflow for migrating from OpenEBS (hostpath) to Rook Ceph:

# Define worker nodes
WN01=192.168.30.24
WN02=192.168.30.25
WN03=192.168.30.26

# Step 1: Delete workloads using OpenEBS storage
kubectl delete pvc --all -n <namespace>  # Be careful!

# Step 2: Remove UserVolumeConfig from all worker nodes
for NODE in $WN01 $WN02 $WN03; do
  echo "=== Removing volume config from $NODE ==="
  talosctl -n $NODE patch mc --patch '
apiVersion: v1alpha1
kind: UserVolumeConfig
name: storage-data
$patch: delete
'
done

# Step 3: Wait for volumes to unmount
sleep 10

# Step 4: Wipe the disks
for NODE in $WN01 $WN02 $WN03; do
  echo "=== Wiping disk on $NODE ==="
  talosctl -n $NODE wipe disk /dev/sdb --insecure
done

# Step 5: Verify disks are ready
for NODE in $WN01 $WN02 $WN03; do
  echo "=== Checking $NODE ==="
  talosctl -n $NODE get disks | grep -E "DEV|sdb"
done

After this, Rook Ceph will detect the clean disks and create OSDs automatically.
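
A quick way to confirm the OSDs actually came up, assuming Rook is deployed in the conventional rook-ceph namespace with the standard operator deployment name (adjust both if your install differs):

# OSD pods should appear once Rook discovers the clean disks
kubectl -n rook-ceph get pods -l app=rook-ceph-osd

# The operator log shows disk discovery and OSD provisioning
kubectl -n rook-ceph logs deploy/rook-ceph-operator --tail=50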

Common Operations

Check Cluster Health

# Talos health check
talosctl -n <node-ip> health

# Interactive dashboard
talosctl -n <node-ip> dashboard

# Service status
talosctl -n <node-ip> services

View Logs

# Kubelet logs
talosctl -n <node-ip> logs kubelet

# Follow logs in real-time
talosctl -n <node-ip> logs -f kubelet

# etcd logs
talosctl -n <node-ip> logs etcd

Reboot a Node

# Graceful reboot
talosctl -n <node-ip> reboot

# Check reboot progress
talosctl -n <node-ip> dmesg --follow

Reset a Node

Destructive Operation

This wipes all data and configuration from the node.

# Reset with graceful shutdown
talosctl -n <node-ip> reset --graceful

# Force reset (use with caution)
talosctl -n <node-ip> reset --graceful=false

Troubleshooting

JSON Patch Error on Multi-Document Config

Error:

JSON6902 patches are not supported for multi-document machine configuration

Solution: Use strategic merge patches with YAML syntax instead of JSON patches.
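
For example, a JSON 6902 patch like the first command below is rejected against a multi-document config; the second command expresses the same change as a strategic merge patch (the hostname value is just an example):

# Rejected on multi-document machine configs
talosctl -n <node-ip> patch mc --patch '[{"op": "replace", "path": "/machine/network/hostname", "value": "new-hostname"}]'

# Works: strategic merge patch
talosctl -n <node-ip> patch mc --patch '
machine:
  network:
    hostname: new-hostname
'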

Volume Config Not Deleting

Symptom: $patch: delete doesn't seem to work.

Check:

  1. Verify the document exists:

    talosctl -n <node-ip> get vc
    

  2. Ensure the name field in your patch matches the existing document:

    talosctl -n <node-ip> get mc -o yaml | grep -A5 "kind: UserVolumeConfig"
    

  3. Use the exact apiVersion, kind, and name in your patch.

Disk Wipe Fails

Error: Disk is in use.

Solution: Remove the UserVolumeConfig first, wait for unmount, then wipe:

# Remove config
talosctl -n <node-ip> patch mc --patch '...$patch: delete...'

# Wait for unmount
sleep 10

# Then wipe
talosctl -n <node-ip> wipe disk /dev/sdb --insecure

Finding the Correct Disk

# List all disks
talosctl -n <node-ip> get disks

# Detailed disk info
talosctl -n <node-ip> get disks -o yaml

# Check block devices
talosctl -n <node-ip> read /proc/partitions

Reference

Useful Commands

Command                                    Description
talosctl get mc -o yaml                    View full machine config
talosctl get vc                            List volume configs
talosctl get disks                         List block devices
talosctl patch mc --patch @file.yaml       Apply patch from file
talosctl wipe disk /dev/sdb --insecure     Wipe a disk
talosctl edit mc                           Edit config interactively