feat: support retaining a BMC IP address allocation#2955
Conversation
|
@coderabbitai PTAL, thanks! |
|
ᘛ⁐̤ᕐᐷ ✅ Action performedReview finished.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: ⛔ Files ignored due to path filters (8)
📒 Files selected for processing (25)
🚧 Files skipped from review as they are similar to previous changes (24)
Summary by CodeRabbit
WalkthroughAdds per-host BMC IP allocation support across the CLI, RPC, model, database, API, and site-explorer retention flow, and adds a new gRPC endpoint and schema for latest Site Explorer run metadata. ChangesBmcIpAllocationType end-to-end
Site Explorer last-run API
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 OpenGrep (1.23.0)crates/admin-cli/src/expected_machines/common.rs┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.28][ERROR]: unable to find a config; path crates/admin-cli/src/expected_machines/patch/args.rs┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.18][ERROR]: unable to find a config; path crates/admin-cli/src/expected_machines/add/args.rs┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.45][ERROR]: unable to find a config; path
🔧 ast-grep (0.44.0)crates/rpc/build.rsComment |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-28 10:09:05 UTC | Commit: 6d8c5e4 |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/admin-cli/src/expected_machines/update/mod.rs (1)
41-72: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winOmitting
bmc_ip_allocationin update JSON does not actually reset it.
ExpectedMachineJson::bmc_ip_allocationdefaults toNone, butpatch_expected_machine()interpretsNoneas “keep the stored value”. That makesexpected-machine updateinconsistent with both the comment above this call (“result matches the file”) andreplace_all_expected_machines, where omission means server-default/Auto. Removing the field from an update file will silently leave an existingdynamic/fixed/retainedsetting in place. Either sendSome(BmcIpAllocationType::Auto)for omitted update-file values or document that update files must spell outAutoexplicitly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/admin-cli/src/expected_machines/update/mod.rs` around lines 41 - 72, The expected-machine update path is treating omitted bmc_ip_allocation as “keep existing” via patch_expected_machine, which conflicts with the update-file semantics in ExpectedMachineJson and the “result matches the file” behavior. Update the expected_machines::update flow to translate a missing bmc_ip_allocation in the parsed JSON into an explicit Auto value before calling ctx.api_client.patch_expected_machine, or otherwise make the CLI/docs clearly require Auto to be written out; keep the fix localized around ExpectedMachineJson and patch_expected_machine call site.crates/admin-cli/src/expected_machines/patch/args.rs (1)
227-238: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
validate()still rejects several valid single-field patches.
run()forwardsdefault_pause_ingestion_and_poweron,bmc_retain_credentials,disable_lockdown, and metadata-only edits, but this guard does not count any of them. Commands that only change one of those fields still fail with the generic “one of the following options must be specified” error. Please derive this check from the actual patchable surface (or the same clap grouping) instead of maintaining a second manual list. As per path instructions, CLI changes should preserve clap behavior and actionable operator-facing error messages.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/admin-cli/src/expected_machines/patch/args.rs` around lines 227 - 238, The validation guard in `validate()` is using a stale hand-maintained allowlist, so it rejects valid single-field patches such as `default_pause_ingestion_and_poweron`, `bmc_retain_credentials`, `disable_lockdown`, and metadata-only edits forwarded by `run()`. Update the check to derive “any patchable field set” from the actual patch surface or the same clap grouping used for the CLI, rather than manually listing only the current BMC/chassis fields. Keep the existing operator-facing error style, but ensure the `validate()` gate matches every field that `run()` can legitimately forward.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@crates/api-db/migrations/20260628075228_expected_machines_bmc_ip_allocation.sql`:
- Around line 8-9: The new bmc_ip_allocation column is added in the migration,
but existing expected_machines rows are not backfilled from retained BMC leases.
Update this migration to also convert preexisting rows that have a DHCP BMC
address while bmc_ip_address is unset so they persist as static rather than
staying at the default auto; use the migration itself to preserve data for
upgraded deployments, keeping it idempotent and aligned with expected_machines
and bmc_ip_allocation_t.
In `@crates/rpc/build.rs`:
- Around line 56-63: The serde derive for BmcIpAllocationType currently
serializes enum variants with their Rust casing, which will break JSON/CLI
consistency for ExpectedMachineJson.bmc_ip_allocation. Update the type
attributes for rpc::forge::BmcIpAllocationType in build.rs to add a serde rename
policy that emits snake_case names while keeping the existing clap::ValueEnum
derive intact. This change should be applied on the same enum attribute block so
the generated JSON matches the lowercase values used by the CLI.
In `@crates/rpc/src/model/expected_machine.rs`:
- Around line 276-289: The `bmc_ip_allocation` conversion in
`expected_machine.rs` is collapsing `None` into `BmcIpAllocationType::default()`
too early, which causes the update path in `handlers::expected_machine::update`
to overwrite existing non-default values when the field is omitted. Keep
`bmc_ip_allocation` as `Option<_>` through the model/handler boundary and only
apply the default when creating a new record or after merging with the existing
row in the update flow. Add a regression test around the `ExpectedMachine`
update path to verify an omitted `bmc_ip_allocation` preserves an existing
`dynamic` or `retained` value.
In `@crates/site-explorer/src/lib.rs`:
- Around line 1884-1892: The BMC allocation reconciliation only promotes DHCP
leases to static in the `retains_dynamic_ip(false)` branch, so switching a host
from retained/auto to `bmc_ip_allocation=dynamic` never undoes the previous
reservation. Update the reconciliation logic around
`expected_machine.data.bmc_ip_allocation` and `try_retain_bmc` to explicitly
clear or revert the BMC lease back to an expirable DHCP state when dynamic is
selected, and ensure the inverse path uses the existing BMC MAC-based
lookup/retention helpers. Add coverage in `site_explorer.rs` for an
`auto/retained -> dynamic` transition so the database row is reconciled back
correctly.
---
Outside diff comments:
In `@crates/admin-cli/src/expected_machines/patch/args.rs`:
- Around line 227-238: The validation guard in `validate()` is using a stale
hand-maintained allowlist, so it rejects valid single-field patches such as
`default_pause_ingestion_and_poweron`, `bmc_retain_credentials`,
`disable_lockdown`, and metadata-only edits forwarded by `run()`. Update the
check to derive “any patchable field set” from the actual patch surface or the
same clap grouping used for the CLI, rather than manually listing only the
current BMC/chassis fields. Keep the existing operator-facing error style, but
ensure the `validate()` gate matches every field that `run()` can legitimately
forward.
In `@crates/admin-cli/src/expected_machines/update/mod.rs`:
- Around line 41-72: The expected-machine update path is treating omitted
bmc_ip_allocation as “keep existing” via patch_expected_machine, which conflicts
with the update-file semantics in ExpectedMachineJson and the “result matches
the file” behavior. Update the expected_machines::update flow to translate a
missing bmc_ip_allocation in the parsed JSON into an explicit Auto value before
calling ctx.api_client.patch_expected_machine, or otherwise make the CLI/docs
clearly require Auto to be written out; keep the fix localized around
ExpectedMachineJson and patch_expected_machine call site.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f8be7e37-cae7-4fc4-8495-6b61f3ae86a4
📒 Files selected for processing (22)
crates/admin-cli/src/expected_machines/add/args.rscrates/admin-cli/src/expected_machines/common.rscrates/admin-cli/src/expected_machines/patch/args.rscrates/admin-cli/src/expected_machines/patch/mod.rscrates/admin-cli/src/expected_machines/tests.rscrates/admin-cli/src/expected_machines/update/mod.rscrates/admin-cli/src/rpc.rscrates/api-core/src/handlers/expected_machine.rscrates/api-core/src/tests/expected_machine.rscrates/api-core/src/tests/sku.rscrates/api-db/migrations/20260628075228_expected_machines_bmc_ip_allocation.sqlcrates/api-db/src/expected_machine.rscrates/api-db/src/expected_machine/tests.rscrates/api-db/src/machine_interface.rscrates/api-db/src/machine_interface/tests.rscrates/api-model/src/expected_machine.rscrates/machine-a-tron/src/api_client.rscrates/rpc/build.rscrates/rpc/proto/forge.protocrates/rpc/src/model/expected_machine.rscrates/site-explorer/src/lib.rscrates/site-explorer/tests/site_explorer.rs
| ALTER TABLE expected_machines | ||
| ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto'; |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Backfill existing retained BMC leases in the migration.
Rows added here default to auto, but the migration never converts preexisting BMC dhcp address rows to static when bmc_ip_address is unset. On upgraded deployments, those machines remain vulnerable to lease expiry until site-explorer happens to reconcile them, which breaks the new default-retention behavior.
Proposed migration backfill
ALTER TABLE expected_machines
ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto';
+
+UPDATE machine_interface_addresses mia
+SET allocation_type = 'static'
+FROM machine_interfaces mi
+JOIN expected_machines em
+ ON em.bmc_mac_address = mi.mac_address
+WHERE mia.interface_id = mi.id
+ AND mi.interface_type = 'Bmc'
+ AND mia.allocation_type = 'dhcp'
+ AND em.bmc_ip_address IS NULL
+ AND em.bmc_ip_allocation = 'auto';As per path instructions, crates/api-db/migrations/**: review database migrations for forward/backward compatibility, idempotency assumptions, data preservation, and consistency with Rust and Go data models.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ALTER TABLE expected_machines | |
| ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto'; | |
| ALTER TABLE expected_machines | |
| ADD COLUMN bmc_ip_allocation bmc_ip_allocation_t NOT NULL DEFAULT 'auto'; | |
| UPDATE machine_interface_addresses mia | |
| SET allocation_type = 'static' | |
| FROM machine_interfaces mi | |
| JOIN expected_machines em | |
| ON em.bmc_mac_address = mi.mac_address | |
| WHERE mia.interface_id = mi.id | |
| AND mi.interface_type = 'Bmc' | |
| AND mia.allocation_type = 'dhcp' | |
| AND em.bmc_ip_address IS NULL | |
| AND em.bmc_ip_allocation = 'auto'; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@crates/api-db/migrations/20260628075228_expected_machines_bmc_ip_allocation.sql`
around lines 8 - 9, The new bmc_ip_allocation column is added in the migration,
but existing expected_machines rows are not backfilled from retained BMC leases.
Update this migration to also convert preexisting rows that have a DHCP BMC
address while bmc_ip_address is unset so they persist as static rather than
staying at the default auto; use the migration itself to preserve data for
upgraded deployments, keeping it idempotent and aligned with expected_machines
and bmc_ip_allocation_t.
Source: Path instructions
| .type_attribute( | ||
| ".forge.BmcIpAllocationType", | ||
| "#[cfg_attr(feature = \"cli\", derive(clap::ValueEnum))]", | ||
| ) | ||
| .type_attribute( | ||
| ".forge.BmcIpAllocationType", | ||
| "#[derive(serde::Serialize, serde::Deserialize)]", | ||
| ) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Where the JSON model and new field are declared/used:"
rg -n -C2 'ExpectedMachineJson|bmc_ip_allocation' crates/admin-cli/src crates/machine-a-tron/src
echo
echo "Current generated attributes for BmcIpAllocationType:"
rg -n -C2 'BmcIpAllocationType|serde\(rename_all' crates/rpc/build.rsRepository: NVIDIA/infra-controller
Length of output: 14925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Enum definition and values:"
rg -n -C3 'enum BmcIpAllocationType|BmcIpAllocationType' crates proto . --glob '!target' --glob '!*.lock'
echo
echo "Any serde rename attributes already applied to this enum or its JSON wrapper:"
rg -n -C3 'rename_all|rename =|BmcIpAllocationType' crates/admin-cli/src crates/rpc/build.rs --glob '!target'Repository: NVIDIA/infra-controller
Length of output: 50379
Use snake_case serde names for BmcIpAllocationType ExpectedMachineJson.bmc_ip_allocation serializes rpc::forge::BmcIpAllocationType directly, so this derive will emit Auto/Dynamic/Fixed/Retained in JSON unless the enum gets #[serde(rename_all = "snake_case")]. That keeps file-based CLI flows aligned with the lowercase values used on the command line.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/rpc/build.rs` around lines 56 - 63, The serde derive for
BmcIpAllocationType currently serializes enum variants with their Rust casing,
which will break JSON/CLI consistency for ExpectedMachineJson.bmc_ip_allocation.
Update the type attributes for rpc::forge::BmcIpAllocationType in build.rs to
add a serde rename policy that emits snake_case names while keeping the existing
clap::ValueEnum derive intact. This change should be applied on the same enum
attribute block so the generated JSON matches the lowercase values used by the
CLI.
| } else if expected_machine | ||
| .data | ||
| .bmc_ip_allocation | ||
| .retains_dynamic_ip(false) | ||
| { | ||
| // No operator-specified BMC IP, but the host's bmc_ip_allocation | ||
| // retains its auto-allocated address: pin the BMC interface's | ||
| // DHCP lease as Static so it survives lease expiry. | ||
| try_retain_bmc(&self.database_connection, expected_machine.bmc_mac_address).await; |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift
Dynamic mode cannot undo a previously retained BMC lease.
Once this path has promoted a DHCP BMC address to static, later changing the same host to bmc_ip_allocation=dynamic never reverts that row. retains_dynamic_ip(false) just stops calling try_retain_bmc, and retain_bmc_address_by_mac only performs dhcp -> static, so the old reservation survives indefinitely. That breaks the new mode contract (dynamic should leave the lease expirable) and makes allocation-mode changes sticky in the DB. Please add the inverse reconciliation path here as well, and cover an auto/retained -> dynamic transition in crates/site-explorer/tests/site_explorer.rs.
Also applies to: 3325-3349
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/site-explorer/src/lib.rs` around lines 1884 - 1892, The BMC allocation
reconciliation only promotes DHCP leases to static in the
`retains_dynamic_ip(false)` branch, so switching a host from retained/auto to
`bmc_ip_allocation=dynamic` never undoes the previous reservation. Update the
reconciliation logic around `expected_machine.data.bmc_ip_allocation` and
`try_retain_bmc` to explicitly clear or revert the BMC lease back to an
expirable DHCP state when dynamic is selected, and ensure the inverse path uses
the existing BMC MAC-based lookup/retention helpers. Add coverage in
`site_explorer.rs` for an `auto/retained -> dynamic` transition so the database
row is reconciled back correctly.
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
rest-api/flow/internal/nicoapi/nicoproto/site_explorer.proto (1)
104-107: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick winUse
google.protobuf.Timestampfor the run timestamps.started_at,finished_at,last_successful_finished_at, andlast_failed_finished_atare instants;stringleaves RFC 3339 parsing and validation to every client. Switch them togoogle.protobuf.Timestampand addgoogle/protobuf/timestamp.protobefore this schema ships.Suggested change
+import "google/protobuf/timestamp.proto"; ... - string started_at = 1; + google.protobuf.Timestamp started_at = 1; ... - string finished_at = 2; + google.protobuf.Timestamp finished_at = 2; ... - optional string last_successful_finished_at = 9; + optional google.protobuf.Timestamp last_successful_finished_at = 9; ... - optional string last_failed_finished_at = 10; + optional google.protobuf.Timestamp last_failed_finished_at = 10;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rest-api/flow/internal/nicoapi/nicoproto/site_explorer.proto` around lines 104 - 107, The run timestamp fields in site_explorer.proto should use a proper instant type instead of string. Update the schema for the run-related fields, including started_at, finished_at, last_successful_finished_at, and last_failed_finished_at, to use google.protobuf.Timestamp, and add the required google/protobuf/timestamp.proto import so clients do not have to parse RFC 3339 manually.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@rest-api/flow/internal/nicoapi/nicoproto/site_explorer.proto`:
- Around line 104-107: The run timestamp fields in site_explorer.proto should
use a proper instant type instead of string. Update the schema for the
run-related fields, including started_at, finished_at,
last_successful_finished_at, and last_failed_finished_at, to use
google.protobuf.Timestamp, and add the required google/protobuf/timestamp.proto
import so clients do not have to parse RFC 3339 manually.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b9df0648-2347-4a1f-a467-ff7561bf48ff
⛔ Files ignored due to path filters (8)
rest-api/flow/internal/nicoapi/gen/nico.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.gorest-api/flow/internal/nicoapi/gen/nico_grpc.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.go,!rest-api/**/*_grpc.pb.gorest-api/flow/internal/nicoapi/gen/site_explorer.pb.gois excluded by!**/*.pb.go,!**/gen/**,!rest-api/**/*.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico_grpc.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.go,!rest-api/**/*_grpc.pb.gorest-api/workflow-schema/schema/site-agent/workflows/v1/site_explorer_nico.pb.gois excluded by!**/*.pb.go,!rest-api/**/*.pb.gorest-api/workflow-schema/site-agent/workflows/v1/nico_nico.protois excluded by!rest-api/workflow-schema/site-agent/workflows/v1/*_nico.protorest-api/workflow-schema/site-agent/workflows/v1/site_explorer_nico.protois excluded by!rest-api/workflow-schema/site-agent/workflows/v1/*_nico.proto
📒 Files selected for processing (24)
crates/admin-cli/src/expected_machines/add/args.rscrates/admin-cli/src/expected_machines/common.rscrates/admin-cli/src/expected_machines/patch/args.rscrates/admin-cli/src/expected_machines/patch/mod.rscrates/admin-cli/src/expected_machines/tests.rscrates/admin-cli/src/expected_machines/update/mod.rscrates/admin-cli/src/rpc.rscrates/api-core/src/handlers/expected_machine.rscrates/api-core/src/tests/expected_machine.rscrates/api-core/src/tests/sku.rscrates/api-db/migrations/20260628075228_expected_machines_bmc_ip_allocation.sqlcrates/api-db/src/expected_machine.rscrates/api-db/src/expected_machine/tests.rscrates/api-db/src/machine_interface.rscrates/api-db/src/machine_interface/tests.rscrates/api-model/src/expected_machine.rscrates/machine-a-tron/src/api_client.rscrates/rpc/build.rscrates/rpc/proto/forge.protocrates/rpc/src/model/expected_machine.rscrates/site-explorer/src/lib.rscrates/site-explorer/tests/site_explorer.rsrest-api/flow/internal/nicoapi/nicoproto/nico.protorest-api/flow/internal/nicoapi/nicoproto/site_explorer.proto
✅ Files skipped from review due to trivial changes (1)
- crates/api-core/src/tests/sku.rs
🚧 Files skipped from review as they are similar to previous changes (21)
- crates/admin-cli/src/expected_machines/common.rs
- crates/admin-cli/src/expected_machines/update/mod.rs
- crates/api-db/src/expected_machine/tests.rs
- crates/api-db/src/machine_interface/tests.rs
- crates/machine-a-tron/src/api_client.rs
- crates/admin-cli/src/expected_machines/tests.rs
- crates/api-core/src/handlers/expected_machine.rs
- crates/rpc/build.rs
- crates/rpc/proto/forge.proto
- crates/rpc/src/model/expected_machine.rs
- crates/api-db/migrations/20260628075228_expected_machines_bmc_ip_allocation.sql
- crates/admin-cli/src/expected_machines/patch/mod.rs
- crates/api-core/src/tests/expected_machine.rs
- crates/admin-cli/src/rpc.rs
- crates/admin-cli/src/expected_machines/patch/args.rs
- crates/site-explorer/tests/site_explorer.rs
- crates/api-db/src/expected_machine.rs
- crates/api-db/src/machine_interface.rs
- crates/api-model/src/expected_machine.rs
- crates/admin-cli/src/expected_machines/add/args.rs
- crates/site-explorer/src/lib.rs
Following NVIDIA#2920 (the machine snapshot now reads a BMC's IP live from the interface), give operators explicit control over how a BMC's IP is assigned -- and make the default keep it. A BMC that picks up a dynamic address now retains it as a static reservation, so it survives DHCP lease expiry instead of churning. BMC IP churn has been a recurring source of trouble -- DPF re-registration, snapshot/topology mismatch, hosts the state machine can't place -- so retaining by default removes it at the root. A new bmc_ip_allocation on ExpectedMachine selects the behavior: auto (the default) keeps a configured bmc_ip_address fixed and otherwise retains an auto-allocated one; fixed requires an address; retained pins an auto-allocated address; dynamic is the opt-out that leaves the lease expirable. A retained address is recorded as a static allocation, which lease expiry already leaves alone. - Add the bmc_ip_allocation field end to end (proto, model, migration, conversions) plus a --bmc-ip-allocation flag on expected-machine add/update/patch - Pin a retained BMC's auto-allocated address as static during site-explorer reconciliation, so lease expiry can't reap it - Validate the combinations -- fixed needs an address; dynamic and retained reject one Tests added! This supports NVIDIA#2952 Signed-off-by: Chet Nichols III <chetn@nvidia.com>
Following #2920 (the machine snapshot now reads a BMC's IP live from the interface), give operators explicit control over how a BMC's IP is assigned -- and make the default keep it. A BMC that picks up a dynamic address now retains it as a static reservation, so it survives DHCP lease expiry instead of churning. BMC IP churn has been a recurring source of trouble -- DPF re-registration, snapshot/topology mismatch, hosts the state machine can't place -- so retaining by default removes it at the root.
A new
bmc_ip_allocationonExpectedMachineselects the behavior:auto(the default) keeps a configuredbmc_ip_addressfixed and otherwise retains an auto-allocated one;fixedrequires an address;retainedpins an auto-allocated address;dynamicis the opt-out that leaves the lease expirable. A retained address is recorded as a static allocation, which lease expiry already leaves alone.bmc_ip_allocationfield end to end (proto, model, migration, conversions) plus a--bmc-ip-allocationflag onexpected-machineadd/update/patchfixedneeds an address;dynamicandretainedreject oneTests added!
This supports #2952
REST proto regeneration. Adding
bmc_ip_allocationto the Core protos tripsCheck REST Core Proto Sync, which only runs on PRs that touch core.protofiles. Regenerating the REST mirror (make -C rest-api core-proto,make -C rest-api/flow gen-nicoapi-pb) also carries in two syncs that are already present in Core but had not yet been mirrored. They ride along because the gate requires the regenerated tree to be fully clean -- they are not part of this change:SiteExplorerLastRun/GetSiteExplorerLastRun(metrics(site-explorer): instrument endpoint update path #2891)EXPIRE_DHCP_LEASE_STATUS_FEATURE_DISABLED(fix(dhcp): Guard DHCP lease expiry handling with a feature flag. #2877; only theworkflow-schemacopy had drifted)