Problem
Currently, the operator hardcodes PXE as the sole network boot method (SetPXEBootOnce) and uses it for every power-on cycle. There is no distinction between an initial provisioning boot (network) and a regular operational boot (local disk). This leads to two issues:
- No HTTP Boot support. HTTP Boot (UEFI HTTP) is increasingly preferred in modern data centers: it works across routed networks without DHCP relay and supports HTTPS for secure image delivery. The operator cannot serve this path today.
- No safe boot behavior on unexpected power transitions. If someone manually power-cycles a server outside the operator's control, the server boots from whatever the BIOS boot order dictates, potentially re-entering a PXE loop or booting an unintended image. There is no "safe stop" mechanism.
Design Principles
Boot overrides are best-effort signals
The Redfish BootSourceOverrideTarget property (Pxe, UefiHttp, Hdd, etc.) is not reliably honored across all vendors. On some hardware, setting UefiHttp still results in a PXE boot if that is what the BIOS network boot order defines. Because the operator cannot guarantee the actual boot method (due to vendor firmware behavior, DHCP configuration, and boot-operator behavior), the boot overrides serve primarily as validation gates and intent signals.
The actual boot method is determined by:
- BIOS boot order — configured by the operator (see below)
- Image content — a UKI is served via HTTP Boot; a traditional kernel/initramfs via PXE
- DHCP / boot-operator — infrastructure-level concerns outside the operator's direct control
Deterministic boot order: EFI Shell → HDD → Network
For the boot policy to work as intended, the BIOS boot order must be configured to:
| Priority | Device | Purpose |
|---|---|---|
| 1 | EFI Shell | Safe stop. Catches unexpected power transitions. |
| 2 | HDD | Regular operation. Boots the installed OS from local disk. |
| 3 | Network | Provisioning. Used when the operator sets a network boot override. |
This boot order is an external prerequisite. The metal-operator does not enforce it — it is the responsibility of the infrastructure team or a separate provisioning tool to configure the BIOS boot order before servers are onboarded.
Given this boot order, three distinct boot behaviors emerge:
- Operator-controlled network boot (first boot): The operator sets a Once boot override to Pxe or UefiHttp. The override takes precedence over the BIOS order → server network boots → installs to disk.
- Operator-controlled regular boot (subsequent boots): The operator sets a Once boot override to Hdd. The override takes precedence → server boots from local disk.
- Unexpected power transition (manual intervention, power glitch): No boot override is active (the previous Once override was consumed). The BIOS order takes over → EFI Shell is first → boot process stops. The server is in a safe, deterministic state and requires operator intervention to proceed.
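The three behaviors above follow from one rule: a Once override applies to exactly one boot and is then cleared, after which the BIOS order decides. A minimal sketch of that rule follows, assuming the override is always honored (which, as noted earlier, real firmware does not guarantee). The names BootSource, OnceOverride and NextBoot are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// BootSource is a hypothetical label for what the server ends up booting from.
type BootSource string

const (
	SourceEFIShell BootSource = "EFIShell"
	SourceHDD      BootSource = "Hdd"
	SourceNetwork  BootSource = "Network"
)

// biosOrder mirrors the prerequisite boot order: EFI Shell, then HDD, then Network.
var biosOrder = []BootSource{SourceEFIShell, SourceHDD, SourceNetwork}

// OnceOverride models a one-shot boot override (idealized: always honored).
type OnceOverride struct {
	target BootSource
	armed  bool
}

// Set arms the override for the next boot only.
func (o *OnceOverride) Set(t BootSource) {
	o.target, o.armed = t, true
}

// NextBoot returns the source used for the next power-on and consumes the
// override. Without an armed override, the first BIOS entry wins.
func (o *OnceOverride) NextBoot() BootSource {
	if o.armed {
		o.armed = false
		return o.target
	}
	return biosOrder[0]
}

func main() {
	var o OnceOverride

	o.Set(SourceNetwork)
	fmt.Println(o.NextBoot()) // operator-controlled first boot

	o.Set(SourceHDD)
	fmt.Println(o.NextBoot()) // operator-controlled regular boot

	fmt.Println(o.NextBoot()) // unexpected power cycle: falls back to BIOS order
}
```

The last call returns the EFI Shell entry because nothing re-armed the override, which is the safe-stop case.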
Proposed Changes
1. Boot Policy API
Add a BootPolicy struct to ServerClaimSpec and ServerBootConfigurationSpec. Since ServerMaintenance embeds ServerBootConfigurationSpec via serverBootConfigurationTemplate.spec, it inherits the field automatically.
Types
// BootPolicy defines the boot behavior for a server across its lifecycle.
type BootPolicy struct {
	// FirstBoot specifies the network boot method for the initial provisioning boot.
	// The operator uses this to validate that the image contains the correct artifacts
	// and to set the boot override for the first boot cycle.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Enum=Pxe;UefiHttp
	FirstBoot FirstBootMode `json:"firstBoot"`

	// Boot specifies the boot method for regular operation after initial provisioning.
	// +kubebuilder:validation:Enum=Hdd
	// +kubebuilder:default="Hdd"
	// +optional
	Boot BootMode `json:"boot,omitempty"`
}

// FirstBootMode specifies the network boot method for initial provisioning.
// +kubebuilder:validation:Enum=Pxe;UefiHttp
type FirstBootMode string

const (
	// FirstBootModePxe boots via PXE. Requires the image to contain
	// traditional kernel/initramfs artifacts.
	FirstBootModePxe FirstBootMode = "Pxe"
	// FirstBootModeUefiHttp boots via UEFI HTTP Boot. Requires the image
	// to contain a UKI (Unified Kernel Image) artifact.
	FirstBootModeUefiHttp FirstBootMode = "UefiHttp"
)

// BootMode specifies the boot method for regular operation.
// +kubebuilder:validation:Enum=Hdd
type BootMode string

const (
	// BootModeHdd boots from the local hard disk.
	BootModeHdd BootMode = "Hdd"
)
Image validation by firstBoot mode
The firstBoot mode determines what image artifacts are required. Validation happens before the ServerBootConfiguration is created — the controller that creates the SBC (ServerClaim controller or ServerMaintenance controller) validates the image against the firstBoot mode first:
| firstBoot | Required artifacts |
|---|---|
| Pxe | Kernel + initramfs in OCI image |
| UefiHttp | UKI media type in OCI image |
If the image does not contain the required artifacts, the SBC is not created. The controller emits an event on the requesting resource (ServerClaim or ServerMaintenance) and does not proceed. This avoids creating a resource that the boot-operator or other controllers might react to and fight over.
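The validation gate can be sketched as a pure function over the media types found in the image manifest. This is an illustration only: the media type strings below are placeholders, not the values used by any real image format, and ValidateImage is a hypothetical helper rather than an existing operator function.

```go
package main

import (
	"errors"
	"fmt"
)

type FirstBootMode string

const (
	FirstBootModePxe      FirstBootMode = "Pxe"
	FirstBootModeUefiHttp FirstBootMode = "UefiHttp"
)

// Placeholder media types (assumption): the real values depend on how the
// OS images are published.
const (
	mediaTypeKernel    = "application/vnd.example.kernel"
	mediaTypeInitramfs = "application/vnd.example.initramfs"
	mediaTypeUKI       = "application/vnd.example.uki"
)

// ErrMissingArtifacts signals that the SBC must not be created.
var ErrMissingArtifacts = errors.New("image is missing required boot artifacts")

// ValidateImage checks that the layer media types of an OCI image satisfy
// the artifact requirements of the given firstBoot mode.
func ValidateImage(mode FirstBootMode, layerMediaTypes []string) error {
	present := make(map[string]bool, len(layerMediaTypes))
	for _, mt := range layerMediaTypes {
		present[mt] = true
	}

	var required []string
	switch mode {
	case FirstBootModePxe:
		required = []string{mediaTypeKernel, mediaTypeInitramfs}
	case FirstBootModeUefiHttp:
		required = []string{mediaTypeUKI}
	default:
		return fmt.Errorf("unsupported firstBoot mode %q", mode)
	}

	for _, mt := range required {
		if !present[mt] {
			return fmt.Errorf("%w: firstBoot=%s requires %s", ErrMissingArtifacts, mode, mt)
		}
	}
	return nil
}

func main() {
	layers := []string{mediaTypeKernel, mediaTypeInitramfs}
	fmt.Println(ValidateImage(FirstBootModePxe, layers) == nil)
	fmt.Println(ValidateImage(FirstBootModeUefiHttp, layers))
}
```

A caller would emit the event on the ServerClaim or ServerMaintenance when the returned error is non-nil and skip SBC creation.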
ServerBootConfigurationStatus — new field
| Status field | Type | Description |
|---|---|---|
| httpBootURI | string (URI) | The resolved HTTP Boot URI. Set by the boot-operator after resolving the UKI artifact from the OCI image. Optional: if absent, the server obtains the URI via DHCP (DHCPv4 option 67, or DHCPv6 option 59 as defined in RFC 5970). |
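For orientation, the sketch below builds the kind of Redfish PATCH body that a UefiHttp Once override with an explicit URI could translate to. The property names (BootSourceOverrideEnabled, BootSourceOverrideTarget, HttpBootUri) come from the Redfish ComputerSystem schema. How the operator actually issues the request, and whether a given BMC accepts HttpBootUri, are assumptions. BuildHTTPBootPatch is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bootOverride mirrors the subset of the Redfish ComputerSystem "Boot"
// object that is relevant for a one-shot HTTP Boot override.
type bootOverride struct {
	BootSourceOverrideEnabled string `json:"BootSourceOverrideEnabled"`
	BootSourceOverrideTarget  string `json:"BootSourceOverrideTarget"`
	// HttpBootUri is omitted when empty so that the URI can be supplied
	// through DHCP instead.
	HttpBootUri string `json:"HttpBootUri,omitempty"`
}

type systemPatch struct {
	Boot bootOverride `json:"Boot"`
}

// BuildHTTPBootPatch returns the JSON body for a Once override to UefiHttp.
// httpBootURI corresponds to status.httpBootURI and may be empty.
func BuildHTTPBootPatch(httpBootURI string) ([]byte, error) {
	return json.Marshal(systemPatch{Boot: bootOverride{
		BootSourceOverrideEnabled: "Once",
		BootSourceOverrideTarget:  "UefiHttp",
		HttpBootUri:               httpBootURI,
	}})
}

func main() {
	body, err := BuildHTTPBootPatch("https://boot.example.com/artifacts/abc123/my-osimage.efi")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Leaving the URI empty drops the HttpBootUri key from the payload, which matches the "optional" semantics of the status field.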
First boot tracking — annotation on ServerBootConfiguration
The operator tracks whether the initial provisioning boot has occurred using an annotation on the ServerBootConfiguration resource.
| Annotation | Value | Description |
|---|---|---|
| metal.ironcore.dev/provisioned | "true" | Set by the controller on the SBC after the first boot succeeds. |
Controller logic:
- SBC has provisioned annotation → regular boot (use bootPolicy.boot)
- SBC does not have provisioned annotation → first boot (use bootPolicy.firstBoot)
No explicit cleanup logic is needed. The provisioning state is naturally scoped to the SBC lifecycle:
- New claim → new SBC is created without the annotation → first boot
- Claim deleted → SBC is garbage collected → state gone
- Discovery → internal SBC is created and deleted after discovery → no state leaks
- Maintenance → separate SBC → its annotation state is independent of the claim SBC and is not consulted, because maintenance always uses firstBoot mode
- CAPI move → SBC is moved with its annotations intact → state preserved
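The annotation-based decision, including the maintenance exception, fits in a few lines. The sketch below uses simplified stand-in types (plain strings instead of the API types) and a hypothetical SelectBootTarget helper. It is meant to show the branching, not the actual controller code.

```go
package main

import "fmt"

// provisionedAnnotation is the annotation key described above.
const provisionedAnnotation = "metal.ironcore.dev/provisioned"

// BootPolicy is a simplified stand-in for the API type.
type BootPolicy struct {
	FirstBoot string // "Pxe" or "UefiHttp"
	Boot      string // "Hdd"
}

// SelectBootTarget returns the boot override target for the next power-on.
// Maintenance SBCs always use the firstBoot mode. Claim SBCs use it only
// until the provisioned annotation is set.
func SelectBootTarget(annotations map[string]string, policy BootPolicy, maintenance bool) string {
	if maintenance {
		return policy.FirstBoot
	}
	// Reading from a nil map is safe in Go and yields the empty string.
	if annotations[provisionedAnnotation] == "true" {
		return policy.Boot
	}
	return policy.FirstBoot
}

func main() {
	policy := BootPolicy{FirstBoot: "UefiHttp", Boot: "Hdd"}

	fmt.Println(SelectBootTarget(nil, policy, false))
	fmt.Println(SelectBootTarget(map[string]string{provisionedAnnotation: "true"}, policy, false))
	fmt.Println(SelectBootTarget(map[string]string{provisionedAnnotation: "true"}, policy, true))
}
```

Because the decision depends only on the SBC's own annotations, deleting and recreating the SBC is enough to return to first-boot behavior.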
2. Resource Examples
ServerClaim — PXE provisioning (default, same as today)
apiVersion: metal.ironcore.dev/v1alpha1
kind: ServerClaim
metadata:
  name: my-claim
spec:
  power: "On"
  image: my-osimage:latest
  ignitionSecretRef:
    name: my-ignition-secret
  bootPolicy:
    firstBoot: Pxe
    boot: Hdd
ServerClaim — HTTP Boot provisioning with UKI
apiVersion: metal.ironcore.dev/v1alpha1
kind: ServerClaim
metadata:
  name: my-claim-httpboot
spec:
  power: "On"
  image: my-uki-osimage:latest # OCI image containing a UKI artifact
  ignitionSecretRef:
    name: my-ignition-secret
  bootPolicy:
    firstBoot: UefiHttp
    boot: Hdd
ServerBootConfiguration — created from claim
apiVersion: metal.ironcore.dev/v1alpha1
kind: ServerBootConfiguration
metadata:
  name: my-claim-sbc
spec:
  serverRef:
    name: my-server
  image: my-uki-osimage:latest
  ignitionSecretRef:
    name: my-ignition-secret
  bootPolicy:
    firstBoot: UefiHttp
    boot: Hdd
# After boot-operator resolves the UKI artifact:
# status:
#   state: Ready
#   httpBootURI: "https://boot.example.com/artifacts/abc123/my-osimage.efi"
ServerMaintenance — firmware update via HTTP Boot
Since ServerMaintenance.spec.serverBootConfigurationTemplate.spec embeds ServerBootConfigurationSpec, the bootPolicy field is automatically available:
apiVersion: metal.ironcore.dev/v1alpha1
kind: ServerMaintenance
metadata:
  name: firmware-update
  annotations:
    metal.ironcore.dev/reason: "Scheduled firmware update"
spec:
  serverRef:
    name: my-server
  policy: Enforced
  priority: 100
  serverPower: "On"
  serverBootConfigurationTemplate:
    name: firmware-update-boot
    spec:
      serverRef:
        name: my-server
      image: firmware-update-uki:latest
      ignitionSecretRef:
        name: firmware-update-ignition
      bootPolicy:
        firstBoot: UefiHttp
        boot: Hdd
3. Boot Lifecycle
First boot tracking
- After the server powers on successfully for the first time in Reserved state, the controller sets the annotation metal.ironcore.dev/provisioned: "true" on the SBC.
- No reset or cleanup logic is needed: the annotation is scoped to the SBC's lifecycle. When the SBC is deleted (claim removed, re-provisioning), the state is gone. A new SBC starts without the annotation.
- Entry into Maintenance state does not affect the claim SBC's annotation: maintenance uses its own SBC and always performs a network boot. When maintenance ends, the server returns to Reserved and the claim SBC still has its annotation → resumes HDD boot.
- CAPI move: The SBC is moved alongside the Server with all annotations intact → provisioning state is preserved.
Discovery lifecycle (no claim, no maintenance)
Discovery is an internal operator concern. No ServerClaim or ServerMaintenance exists, so no boot policy is consulted. The operator always uses PXE for the discovery boot — this is hardcoded, not configurable.
1. Server enters Initial state:
   a. Operator creates internal SBC for discovery (no provisioned annotation)
   b. PXE boot override → PowerOn
2. Server enters Discovery state:
   a. Server PXE boots, probe agent registers
3. Server enters Available state:
   a. Server powers off
   b. Internal discovery SBC is deleted, so no state leaks
4. ServerClaim arrives → Normal claim lifecycle begins (see below)
Normal claim lifecycle
1. ServerClaim created with bootPolicy: {firstBoot: Pxe, boot: Hdd}
2. ServerClaim controller creates SBC with bootPolicy propagated (no provisioned annotation)
3. Boot-operator validates image contains kernel/initramfs → SBC state = Ready
4. Server enters Reserved state:
   a. SBC has no provisioned annotation → network boot override (Pxe) → PowerOn
   b. Server network boots, installs OS to disk
   c. Server reaches running state → provisioned annotation set on SBC
5. Operator reboots server (e.g. power cycle via spec):
   a. SBC has provisioned annotation → HDD boot override → PowerOn
   b. Server boots from local disk
6. Manual power cycle (outside operator control):
   a. No boot override active (Once was consumed)
   b. BIOS order: EFI Shell → boot stops
   c. Operator must intervene to resume
Maintenance lifecycle
1. ServerMaintenance created with bootPolicy: {firstBoot: UefiHttp, boot: Hdd}
2. Server enters Maintenance state, maintenance SBC is created (no provisioned annotation)
3. Boot-operator validates UKI, writes httpBootURI → SBC state = Ready
4. Maintenance boot:
   a. Always uses firstBoot mode → network boot override (UefiHttp)
   b. Server boots maintenance image via HTTP Boot
5. Maintenance completes, ServerMaintenance removed, maintenance SBC deleted
6. Server returns to Reserved:
   a. Claim SBC still has provisioned annotation (unaffected by maintenance)
   b. HDD boot override → server resumes from local disk
Backwards Compatibility
bootPolicy is optional on ServerClaimSpec. If absent, the ServerClaim controller defaults to {firstBoot: Pxe, boot: Hdd} — identical to today's PXE-only behavior for the first boot. The HDD boot override for subsequent boots is new behavior but safe: servers provisioned via PXE already have an OS on disk.
- CRD upgrade is additive — new optional fields require no migration.
- Existing SBCs have no provisioned annotation, so existing servers in Reserved state will perform one network boot on the next power cycle (matching current behavior), then switch to HDD boot.
- The BIOS boot order (EFI Shell → HDD → Network) is an external prerequisite, not enforced by the operator. Existing setups with a different boot order continue to work but do not benefit from the EFI Shell safe-stop behavior.
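The defaulting rule for claims without a bootPolicy can be sketched as follows. The types are trimmed copies of the proposed API types, and DefaultBootPolicy is a hypothetical helper. Where the real defaulting would live (controller code, webhook, or CRD default markers) is not specified here.

```go
package main

import "fmt"

type FirstBootMode string
type BootMode string

const (
	FirstBootModePxe FirstBootMode = "Pxe"
	BootModeHdd      BootMode      = "Hdd"
)

// BootPolicy is a trimmed copy of the proposed API type.
type BootPolicy struct {
	FirstBoot FirstBootMode
	Boot      BootMode
}

// DefaultBootPolicy returns the effective policy for a claim. A nil policy
// (field absent on an existing ServerClaim) maps to {Pxe, Hdd}. A policy
// with an empty Boot gets Hdd, mirroring the kubebuilder default.
func DefaultBootPolicy(p *BootPolicy) BootPolicy {
	if p == nil {
		return BootPolicy{FirstBoot: FirstBootModePxe, Boot: BootModeHdd}
	}
	out := *p // copy so the caller's object is not mutated
	if out.Boot == "" {
		out.Boot = BootModeHdd
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", DefaultBootPolicy(nil))
	fmt.Printf("%+v\n", DefaultBootPolicy(&BootPolicy{FirstBoot: "UefiHttp"}))
}
```

Returning a copy keeps the stored spec untouched, so a claim created without bootPolicy still serializes without it.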
Future Extensions
- VirtualMedia: A third FirstBootMode value for mounting an ISO image via Redfish Virtual Media. Planned as a near-term addition.
- Additional BootMode values: e.g., Network for always-network-boot setups (stateless servers).