Add bmc_ip_allocation to ExpectedMachine, retaining BMC IPs by default #2952

Description

@chet

Following #2920 (which made the machine snapshot read a BMC's IP from the live interface), give operators explicit control over how a BMC's IP is assigned -- and make the default "allocate one and keep it." BMC IP churn has been a recurring source of pain (DPF re-registration, snapshot/topology mismatch, hosts the state machine can't process), so a dynamically-allocated BMC IP should be retained as AllocationType::Static unless an operator explicitly opts into a churnable address.

What this involves

Mirror the existing DpuMode field end-to-end with a new BmcIpAllocationType { Auto, Dynamic, Fixed, Retained } (default Auto):

  • proto (crates/rpc/proto/forge.proto): BmcIpAllocationType enum + optional BmcIpAllocationType bmc_ip_allocation on ExpectedMachine (next free field number), beside dpu_mode.
  • api-model (crates/api-model/src/expected_machine.rs): the enum (#[sqlx(type_name = "bmc_ip_allocation_t", rename_all = "snake_case")], serde, #[default] Auto), the ExpectedMachineData field, the FromRow read, and a resolve(has_address)-style method like DpuMode.
  • migration: CREATE TYPE bmc_ip_allocation_t AS ENUM ('auto','dynamic','fixed','retained') + ALTER TABLE expected_machines ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto' (mirror 20260420043607_expected_machines_dpu_mode.sql).
  • api-db (crates/api-db/src/expected_machine.rs): bind the column in create() + update().
  • rpc (crates/rpc/src/model/expected_machine.rs): From/TryFrom, unspecified -> default like DpuMode.
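A minimal sketch of the rpc conversion step, assuming it follows the "unspecified -> default" rule described for DpuMode. `RpcBmcIpAllocationType` is a hypothetical stand-in for the prost-generated enum (the real variant names and numbers come from forge.proto), `from_rpc_field` is an illustrative helper name, and the sqlx/serde derives are left out so the snippet compiles without external crates:

```rust
/// Model-side enum as described in this issue (sqlx/serde attributes omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BmcIpAllocationType {
    #[default]
    Auto,
    Dynamic,
    Fixed,
    Retained,
}

/// Hypothetical stand-in for the generated proto enum; names and numbers are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RpcBmcIpAllocationType {
    Unspecified = 0,
    Auto = 1,
    Dynamic = 2,
    Fixed = 3,
    Retained = 4,
}

impl From<RpcBmcIpAllocationType> for BmcIpAllocationType {
    fn from(value: RpcBmcIpAllocationType) -> Self {
        match value {
            // Unspecified falls back to the model default (Auto).
            RpcBmcIpAllocationType::Unspecified => Self::default(),
            RpcBmcIpAllocationType::Auto => Self::Auto,
            RpcBmcIpAllocationType::Dynamic => Self::Dynamic,
            RpcBmcIpAllocationType::Fixed => Self::Fixed,
            RpcBmcIpAllocationType::Retained => Self::Retained,
        }
    }
}

impl From<BmcIpAllocationType> for RpcBmcIpAllocationType {
    fn from(value: BmcIpAllocationType) -> Self {
        match value {
            BmcIpAllocationType::Auto => Self::Auto,
            BmcIpAllocationType::Dynamic => Self::Dynamic,
            BmcIpAllocationType::Fixed => Self::Fixed,
            BmcIpAllocationType::Retained => Self::Retained,
        }
    }
}

/// The proto field is optional, so an absent value also maps to the default.
pub fn from_rpc_field(value: Option<RpcBmcIpAllocationType>) -> BmcIpAllocationType {
    value.map(Into::into).unwrap_or_default()
}
```

Treating both an absent field and `Unspecified` as `Auto` keeps older clients, which never set the field, on the new default.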

Resolution + validation

  • Auto + address => Fixed; Auto + no address => Retained (new default).
  • Fixed requires bmc_ip_address (else InvalidArgument); Dynamic/Retained must not carry an address (else InvalidArgument).
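The resolution and validation rules above could be sketched roughly as follows. The `resolve(has_address)` shape comes from this issue; `validate` and the local `ValidationError` type are hypothetical names used only for illustration, standing in for however the API layer actually reports InvalidArgument:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BmcIpAllocationType {
    #[default]
    Auto,
    Dynamic,
    Fixed,
    Retained,
}

/// Hypothetical error type standing in for the API's InvalidArgument error.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    InvalidArgument(&'static str),
}

impl BmcIpAllocationType {
    /// Auto collapses to Fixed or Retained; explicit choices pass through unchanged.
    pub fn resolve(self, has_address: bool) -> Self {
        match (self, has_address) {
            (Self::Auto, true) => Self::Fixed,
            (Self::Auto, false) => Self::Retained,
            (explicit, _) => explicit,
        }
    }

    /// Rejects inconsistent combinations, otherwise returns the resolved mode.
    pub fn validate(self, has_address: bool) -> Result<Self, ValidationError> {
        match (self, has_address) {
            (Self::Fixed, false) => Err(ValidationError::InvalidArgument(
                "Fixed requires bmc_ip_address",
            )),
            (Self::Dynamic | Self::Retained, true) => Err(ValidationError::InvalidArgument(
                "Dynamic/Retained must not carry bmc_ip_address",
            )),
            _ => Ok(self.resolve(has_address)),
        }
    }
}
```

For example, `BmcIpAllocationType::Auto.validate(false)` yields `Ok(Retained)`, while `BmcIpAllocationType::Fixed.validate(false)` is rejected.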

Allocation behavior (the substantive part)

  • Fixed keeps today's path (preallocate the given IP as Static).
  • Retained / Auto-without-address: the BMC's auto-allocated IP must be persisted as AllocationType::Static so lease expiry (which deletes only Dhcp, dhcp/expire.rs) skips it. Since NextAvailableIp -> Dhcp is hardcoded (api-model/src/allocation_type.rs), add a "next-available, record as Static" path on the BMC preallocate flow (preallocate_bmc_machine_interface / site-explorer try_preallocate_one) -- reuse the existing retention hook if one's there; do not add a broad new AddressSelectionStrategy variant.
  • Dynamic keeps the expirable Dhcp behavior (the opt-out).
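One way to picture the three allocation paths above is as a small planning function that decides where the address comes from and which AllocationType it is recorded under. This is a sketch, not the real preallocate flow: `BmcAddressPlan` and `plan_bmc_allocation` are hypothetical names, and `AllocationType` here lists only the two variants this issue mentions:

```rust
use std::net::{IpAddr, Ipv4Addr};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BmcIpAllocationType {
    Auto,
    Dynamic,
    Fixed,
    Retained,
}

/// Only the variants named in this issue; the real enum may have more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocationType {
    Static,
    Dhcp,
}

/// Hypothetical description of what the BMC preallocate flow should do.
#[derive(Debug, PartialEq, Eq)]
pub enum BmcAddressPlan {
    /// Preallocate the operator-supplied address.
    Given { ip: IpAddr, record_as: AllocationType },
    /// Take the next available address from the pool.
    NextAvailable { record_as: AllocationType },
}

pub fn plan_bmc_allocation(
    mode: BmcIpAllocationType,
    address: Option<IpAddr>,
) -> Result<BmcAddressPlan, &'static str> {
    use BmcIpAllocationType::*;
    match (mode, address) {
        // Fixed (or Auto with an address): today's path, recorded as Static.
        (Fixed, Some(ip)) | (Auto, Some(ip)) => Ok(BmcAddressPlan::Given {
            ip,
            record_as: AllocationType::Static,
        }),
        // Retained (or Auto without an address): next available, but recorded
        // as Static so lease expiry, which deletes only Dhcp rows, skips it.
        (Retained, None) | (Auto, None) => Ok(BmcAddressPlan::NextAvailable {
            record_as: AllocationType::Static,
        }),
        // Dynamic: the opt-out, keeps the expirable Dhcp behavior.
        (Dynamic, None) => Ok(BmcAddressPlan::NextAvailable {
            record_as: AllocationType::Dhcp,
        }),
        (Fixed, None) => Err("Fixed requires bmc_ip_address"),
        (Dynamic | Retained, Some(_)) => Err("Dynamic/Retained must not carry bmc_ip_address"),
    }
}

/// Example address used in the usage note below.
pub fn example_ip() -> IpAddr {
    IpAddr::V4(Ipv4Addr::new(10, 0, 0, 5))
}
```

With this shape, `plan_bmc_allocation(BmcIpAllocationType::Auto, None)` and `plan_bmc_allocation(BmcIpAllocationType::Dynamic, None)` both draw from the pool and differ only in `record_as`, which is the single property lease expiry looks at.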

Behavior change (intentional -- call out for reviewers)

Existing rows default to Auto; with no bmc_ip_address that now means Retained, so BMC IPs stop churning by default. Trade-off: retained IPs are pinned and not returned to the pool on decommission -- operators who want reclamation set Dynamic. Document this on the field.

Notes

Metadata

Assignees

Labels

No labels

Projects

Status
In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
