Skip to content

fix: exclude control plane endpoint from MetalMachine addresses#1605

Open
scatat wants to merge 1 commit intosiderolabs:mainfrom
scatat:fix/exclude-cp-endpoint-from-machine-addresses
Open

fix: exclude control plane endpoint from MetalMachine addresses#1605
scatat wants to merge 1 commit intosiderolabs:mainfrom
scatat:fix/exclude-cp-endpoint-from-machine-addresses

Conversation

@scatat
Copy link
Copy Markdown

@scatat scatat commented Mar 18, 2026

When Talos reports node addresses, it includes the shared control plane endpoint (e.g. a VIP). Sidero correctly stores these in ServerBinding.Spec.Addresses. But when the MetalMachine controller copies them into MetalMachine.Status.Addresses, the shared endpoint propagates to Machine.Status.Addresses, where CACPPT treats it as a per-machine Talos API endpoint. This breaks scale-down operations.

This PR filters the control plane endpoint at the CAPI boundary in metalmachine_controller.go, where the Cluster object is already in scope. ServerBinding retains the full address list.

Safe when ControlPlaneEndpoint.Host is empty — comparison against "" matches nothing, all addresses pass through.

Ref: siderolabs/cluster-api-control-plane-provider-talos#242

🤖 Co-authored with AI

@netlify
Copy link
Copy Markdown

netlify bot commented Mar 18, 2026

👷 Deploy request for wonderful-swartz-a1308c pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit c0e64a5

addresses := make([]capiv1.MachineAddress, 0, len(serverBinding.Spec.Addresses))
for _, addr := range serverBinding.Spec.Addresses {
// the control plane endpoint is shared, not per-machine
if addr == cluster.Spec.ControlPlaneEndpoint.Host {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has same issue as CACPPT PR in general - this might break valid setups with single node controlplane.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in f3d428b.

Added && len(serverBinding.Spec.Addresses) > 1 so the filter only fires when the machine has other addresses. Single-node CP where the endpoint is the only address keeps it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think it's correct, it's trying to guess something while the context is missing here.

Either it should be fixed on Talos side to stop reporting VIPs, or Sidero Metal should pull addresses itself.

When Talos reports node addresses, it includes the shared control plane
endpoint (VIP). The metalmachine controller copies these into
MetalMachine.Status.Addresses, where CACPPT treats them as per-machine
endpoints, breaking scale-down operations.

Filter the control plane endpoint at the CAPI boundary, guarded by
len > 1 so single-node control planes retain the address.

Ref: siderolabs/cluster-api-control-plane-provider-talos#242

Signed-off-by: scatat <stephentan@gmail.com>
@scatat scatat force-pushed the fix/exclude-cp-endpoint-from-machine-addresses branch from f3d428b to c0e64a5 Compare March 20, 2026 20:07
@StephenTan-TW
Copy link
Copy Markdown

Squashed to a single commit with DCO sign-off. The GPG identity check will fail since I'm not in the siderolabs org — that one needs a maintainer override.

@scatat scatat marked this pull request as ready for review March 20, 2026 20:08
@github-project-automation github-project-automation bot moved this to To Do in Planning Mar 20, 2026
@talos-bot talos-bot moved this from To Do to In Review in Planning Mar 20, 2026
@smira smira moved this from In Review to On Hold in Planning Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: On Hold

Development

Successfully merging this pull request may close these issues.

4 participants