fix(pxe): scout-loader honors static_pxe_url override instead of hardcoding carbide-static-pxe.forge#2944
Conversation
…coding carbide-static-pxe.forge
The scout-loader hardcoded the rootfs (scout.squashfs) download host to
`carbide-static-pxe.forge` and ignored the configurable static-pxe URL that the
rest of the PXE flow already honors. The loader is the most critical early-boot
fetch, yet it was the one place that did not respect the override.
Repro: on a BlueField dpu-mode host, when the DPU's local resolver
(dnsmasq/HBN) is down or the site uses IP-only/custom DNS, the scout-loader
fails with:
LOADER: Failed to gather root filesystem information:
curl: (6) Could not resolve host: carbide-static-pxe.forge
so the host never finishes discovery -- even though the static-pxe server is
reachable by IP and a `static_pxe_url` override is configured.
Fix:
- crates/api-core/src/ipxe.rs: append `static_pxe_url=${base-url}` to the
scout.efi kernel command line (x86_64 and aarch64 host). `${base-url}` is
already set by the wrapping `pxe` template to the resolved
`{{ static_pxe_url }}/public/blobs/` (which honors
external_static_pxe_url / static_pxe_url_override), and iPXE expands it
before booting, so the scout kernel receives the resolved static-pxe base.
- pxe/common_files/scout-loader-rclocal: parse `static_pxe_url=` from
/proc/cmdline and build the rootfs URL from it; fall back to the existing
`http://carbide-static-pxe.forge` host when it is unset. This mirrors the
existing idiom in check-scout-updates.sh, which already derives its base from
a kernel cmdline arg with the same hostname fallback.
Backward-compatible: when no override is configured the loader falls back to
the original `carbide-static-pxe.forge` URL, so behavior is unchanged for
existing deployments. The explicit `newrootfs=<full-url>` override continues to
take precedence.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary by CodeRabbit
WalkthroughThe PXE boot command line now includes ChangesPXE boot URL propagation
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/api-core/src/ipxe.rs (1)
210-213: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConfirm the generated
command_lineis exercised by tests. The current iPXE tests cover boot-image selection, but not the serialized command line. Add a table-driven case for the ARM host and X86 paths that assertsstatic_pxe_url=${base-url}, plus a negative check that the DPUcarbide.efibranch still omits it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/api-core/src/ipxe.rs` around lines 210 - 213, The iPXE command-line serialization in the iPXE path is not covered by tests, so add assertions around the generated command_line for the relevant branches in ipxe.rs. Extend the existing table-driven tests to verify both ARM host and X86 host cases include static_pxe_url=${base-url}, and add a negative assertion for the DPU carbide.efi path to confirm it still does not include that field. Use the existing iPXE command-line construction and branch-selection symbols to place the new checks alongside the current boot-image tests.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/api-core/src/ipxe.rs`:
- Around line 210-213: The iPXE command-line serialization in the iPXE path is
not covered by tests, so add assertions around the generated command_line for
the relevant branches in ipxe.rs. Extend the existing table-driven tests to
verify both ARM host and X86 host cases include static_pxe_url=${base-url}, and
add a negative assertion for the DPU carbide.efi path to confirm it still does
not include that field. Use the existing iPXE command-line construction and
branch-selection symbols to place the new checks alongside the current
boot-image tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9d57241b-0e07-47cd-8430-a1dbf15fa36c
📒 Files selected for processing (2)
crates/api-core/src/ipxe.rspxe/common_files/scout-loader-rclocal
| // static_pxe_url passes the (already iPXE-expanded) static-pxe | ||
| // base to the scout-loader so it fetches scout.squashfs from the | ||
| // configured static-pxe server instead of hardcoding a hostname. | ||
| command_line: format!("mac={mac_address} console=tty0 console={console},115200 pci=realloc=off iommu=off cli_cmd=auto-detect machine_id={machine_interface_id} server_uri=[api_url] pxe_uri=[pxe_url] static_pxe_url=${{base-url}}"), |
There was a problem hiding this comment.
I don't think it's correct to pass base URL, isn't this a template, aka, shouldn't there be a static PXE url variable in scope?
The bug
The scout-loader hardcodes the rootfs (
scout.squashfs) download host tocarbide-static-pxe.forgeand ignores the configurable static-pxe URL that the rest of the PXE flow already honors:crates/api-core/src/cfg/file.rs:external_static_pxe_url: Option<String>crates/api-core/src/handlers/pxe.rs: resolvesstatic_pxe_url_overridecrates/pxe/src/routes/ipxe.rs: templatesstatic_pxe_urlinto the iPXE wrapperpxe/templates/pxe:set base-url {{ static_pxe_url }}/public/blobs/The scout-loader is the most critical early-boot fetch, yet it was the one place that did not respect the override. Before this change,
pxe/common_files/scout-loader-rclocaldid:newrootfsurl="http://carbide-static-pxe.forge/public/blobs/internal/${arch}/scout.squashfs"Verified repro (dpu-mode / unresolvable DNS)
On a BlueField dpu-mode host, when the DPU's local resolver (dnsmasq/HBN) is down or the site uses IP-only / custom DNS, the scout-loader fails with:
so the host never finishes discovery — even though the static-pxe server is reachable by IP and a
static_pxe_urloverride is configured.The fix
crates/api-core/src/ipxe.rs— appendstatic_pxe_url=${base-url}to thescout.efikernel command line (x86_64 and aarch64 host).${base-url}is already set by the wrappingpxetemplate to the resolved{{ static_pxe_url }}/public/blobs/(which honorsexternal_static_pxe_url/static_pxe_url_override), and iPXE expands it before booting, so the scout kernel receives the resolved static-pxe base. This reuses the value the override system already computes; no new plumbing.pxe/common_files/scout-loader-rclocal— parsestatic_pxe_url=from/proc/cmdlineand build the rootfs URL from it; fall back to the existinghttp://carbide-static-pxe.forgehost when it is unset. This mirrors the existing idiom inpxe/common_files/check-scout-updates.sh, which already derives its base from a kernel cmdline arg with the same hostname fallback.Resolution order in the loader:
newrootfs=<full-url>on the cmdline (explicit override) — unchanged, still highest priority.static_pxe_url=<base>on the cmdline (new) — the configured static-pxe URL.carbide-static-pxe.forge(fallback).Note: in current
mainthe per-archetc/rc.localfiles are generated at build time by thestage-scout-loader-rclocalMake task from the singlepxe/common_files/scout-loader-rclocalsource, so editing the one source file covers both profiles.Backward compatibility
When no override is configured the loader falls back to the original
http://carbide-static-pxe.forge/public/blobs/internal/${arch}/scout.squashfsURL, so behavior is unchanged for existing deployments.🤖 Generated with Claude Code