Following #2920 (which made the machine snapshot read a BMC's IP from the live interface), give operators explicit control over how a BMC's IP is assigned -- and make the default "allocate one and keep it." BMC IP churn has been a recurring source of pain (DPF re-registration, snapshot/topology mismatch, hosts the state machine can't process), so a dynamically-allocated BMC IP should be retained as AllocationType::Static unless an operator explicitly opts into a churnable address.
What this involves
Mirror the existing DpuMode field end-to-end with a new BmcIpAllocationType { Auto, Dynamic, Fixed, Retained } (default Auto):
- proto (crates/rpc/proto/forge.proto): BmcIpAllocationType enum + optional BmcIpAllocationType bmc_ip_allocation on ExpectedMachine (next free field number), beside dpu_mode.
- api-model (crates/api-model/src/expected_machine.rs): the enum (#[sqlx(type_name = "bmc_ip_allocation_t", rename_all = "snake_case")], serde, #[default] Auto), the ExpectedMachineData field, the FromRow read, and a resolve(has_address)-style method like DpuMode.
- migration: CREATE TYPE bmc_ip_allocation_t AS ENUM ('auto','dynamic','fixed','retained') + ALTER TABLE expected_machines ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto' (mirror 20260420043607_expected_machines_dpu_mode.sql).
- api-db (crates/api-db/src/expected_machine.rs): bind the column in create() + update().
- rpc (crates/rpc/src/model/expected_machine.rs): From/TryFrom, unspecified -> default like DpuMode.
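As a rough illustration of the enum described in the list above, the following std-only sketch shows the four variants, the Auto default, and a string round-trip using the same snake_case labels as the bmc_ip_allocation_t Postgres enum. The real type would derive the sqlx and serde traits instead; as_db_label and from_rpc are hypothetical helpers for this sketch, not existing functions in the codebase.

```rust
use std::str::FromStr;

/// How an expected machine's BMC IP is assigned.
/// Sketch only: the real enum would carry the sqlx/serde derives named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BmcIpAllocationType {
    #[default]
    Auto,
    Dynamic,
    Fixed,
    Retained,
}

impl BmcIpAllocationType {
    /// Hypothetical helper: the label as stored in `bmc_ip_allocation_t`.
    pub fn as_db_label(self) -> &'static str {
        match self {
            Self::Auto => "auto",
            Self::Dynamic => "dynamic",
            Self::Fixed => "fixed",
            Self::Retained => "retained",
        }
    }
}

impl FromStr for BmcIpAllocationType {
    type Err = String;

    /// Parse one of the four snake_case labels; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(Self::Auto),
            "dynamic" => Ok(Self::Dynamic),
            "fixed" => Ok(Self::Fixed),
            "retained" => Ok(Self::Retained),
            other => Err(format!("unknown bmc_ip_allocation value: {other}")),
        }
    }
}

/// Hypothetical stand-in for the rpc conversion: an unspecified (absent)
/// value falls back to the default, which is `Auto`.
pub fn from_rpc(value: Option<BmcIpAllocationType>) -> BmcIpAllocationType {
    value.unwrap_or_default()
}
```

Keeping the Rust labels identical to the migration's enum labels is what lets rename_all = "snake_case" work without per-variant overrides.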
Resolution + validation
Auto + address => Fixed; Auto + no address => Retained (new default).
Fixed requires bmc_ip_address (else InvalidArgument); Dynamic/Retained must not carry an address (else InvalidArgument).
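The resolution and validation rules above could look roughly like this. It is a minimal sketch: InvalidArgument here is a hypothetical local error type standing in for the API's status, and validate is an illustrative name, not an existing function.

```rust
use std::net::IpAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BmcIpAllocationType {
    #[default]
    Auto,
    Dynamic,
    Fixed,
    Retained,
}

/// Hypothetical error standing in for the API's InvalidArgument status.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidArgument(pub String);

impl BmcIpAllocationType {
    /// Collapse `Auto` into a concrete mode; explicit modes pass through.
    pub fn resolve(self, has_address: bool) -> Self {
        match (self, has_address) {
            (Self::Auto, true) => Self::Fixed,
            (Self::Auto, false) => Self::Retained,
            (explicit, _) => explicit,
        }
    }
}

/// Resolve the mode, then reject combinations the rules above forbid.
pub fn validate(
    mode: BmcIpAllocationType,
    bmc_ip_address: Option<IpAddr>,
) -> Result<BmcIpAllocationType, InvalidArgument> {
    let has_address = bmc_ip_address.is_some();
    let resolved = mode.resolve(has_address);
    match (resolved, has_address) {
        (BmcIpAllocationType::Fixed, false) => Err(InvalidArgument(
            "Fixed requires bmc_ip_address".to_string(),
        )),
        (BmcIpAllocationType::Dynamic | BmcIpAllocationType::Retained, true) => {
            Err(InvalidArgument(format!(
                "{resolved:?} must not carry bmc_ip_address"
            )))
        }
        _ => Ok(resolved),
    }
}
```

Resolving first means the validation match only has to reason about the three concrete modes; Auto can never fail validation on its own.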
Allocation behavior (the substantive part)
Fixed keeps today's path (preallocate the given IP as Static).
Retained / Auto-without-address: the BMC's auto-allocated IP must be persisted as AllocationType::Static so lease expiry (which deletes only Dhcp, dhcp/expire.rs) skips it. Since NextAvailableIp -> Dhcp is hardcoded (api-model/src/allocation_type.rs), add a "next-available, record as Static" path on the BMC preallocate flow (preallocate_bmc_machine_interface / site-explorer try_preallocate_one) -- reuse the existing retention hook if one's there; do not add a broad new AddressSelectionStrategy variant.
Dynamic keeps the expirable Dhcp behavior (the opt-out).
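To make the intended outcome concrete, here is a simplified model of how each resolved mode might map to the recorded allocation type, and why a Static record survives expiry. Everything in it is an assumption for illustration: AllocationType is reduced to the two variants mentioned above, and ResolvedBmcIpAllocation, BmcAddress, recorded_allocation_type, and expire_leases are hypothetical names. The expiry function also treats every Dhcp lease as already expired, which the real dhcp/expire.rs would not.

```rust
use std::net::IpAddr;

/// Simplified stand-in for the real AllocationType (only the two
/// variants discussed above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocationType {
    Static,
    Dhcp,
}

/// Hypothetical post-resolution mode: `Auto` has already been collapsed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolvedBmcIpAllocation {
    Dynamic,
    Fixed,
    Retained,
}

/// Which allocation type the BMC preallocate flow would record.
pub fn recorded_allocation_type(mode: ResolvedBmcIpAllocation) -> AllocationType {
    match mode {
        // Fixed: operator-supplied IP. Retained: next-available IP.
        // Both are persisted as Static so expiry leaves them alone.
        ResolvedBmcIpAllocation::Fixed | ResolvedBmcIpAllocation::Retained => {
            AllocationType::Static
        }
        // Dynamic: the opt-out, an expirable lease.
        ResolvedBmcIpAllocation::Dynamic => AllocationType::Dhcp,
    }
}

/// Hypothetical address record for a BMC interface.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BmcAddress {
    pub ip: IpAddr,
    pub allocation_type: AllocationType,
}

/// Toy model of lease expiry: removes Dhcp records, keeps Static ones.
/// Assumes every Dhcp lease passed in has already expired.
pub fn expire_leases(addresses: Vec<BmcAddress>) -> Vec<BmcAddress> {
    addresses
        .into_iter()
        .filter(|a| a.allocation_type != AllocationType::Dhcp)
        .collect()
}
```

In this model the only difference between Retained and Dynamic is the recorded allocation type, which matches the goal of adding a narrow "record as Static" path rather than a new address selection strategy.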
Behavior change (intentional -- call out for reviewers)
Existing rows default to Auto; with no bmc_ip_address that now means Retained, so BMC IPs stop churning by default. Trade-off: retained IPs are pinned and not returned to the pool on decommission -- operators who want reclamation set Dynamic. Document this on the field.
Notes
- BmcIpAllocationType is BMC-named, but the shape generalizes to host NICs / DPUs later.
- Tests: Auto/Fixed/Retained/Dynamic resolution + validation errors; a Retained BMC's IP is recorded Static and survives lease expiry.