feat: soft-start on missing broker positions; add /status endpoint #9
The boot check used to bail the whole process if any configured symbol
was missing from broker positions. Mid-flight a vanishing position is
gentle (cached price stays, max-staleness eventually rejects requests,
other symbols keep working), but at boot the same condition produced
a total outage — and the "fail loud" rationale chose total outage as
the alert. Bad shape for prod: any future Fly restart after a position
went to 0 brings down every symbol, not just the affected one.
Also blocks new-token rollout: config.toml entries can't merge until
the hedging desk has acquired inventory, even though we can verify
wiring without a price.
Changes:
- src/main.rs: replace anyhow::bail! with a per-symbol error! log and a
summary warn!. Server starts in degraded mode with whatever symbols
have marks; healthy symbols quote normally; missing symbols get the
existing AppError::Unavailable (503) at request time.
- src/lib.rs: add /status returning JSON with signer, configured
symbols, and the currently-missing-from-cache set. /health stays
lenient ("ok" whenever process is running) so Fly liveness doesn't
recycle machines on a degraded-but-serving state. AppState now holds
the configured symbol list so /status can compute the missing set.
- tests/integration.rs: refactored test_app into test_app_with, which takes
(addr, symbol, price_opt) tuples so tests can build partial-cache states.
Added coverage for /status (both healthy and degraded), and an end-
to-end test that a configured-but-uncached symbol returns 503 from
/context/v1 while other symbols continue to serve.
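For reviewers, a minimal std-only sketch of the soft-start check described in the src/main.rs bullet above. It is not the code in this PR: `Marks`, `missing_symbols`, and `boot_check` are hypothetical names, the cache is modelled as a plain `HashMap`, and `eprintln!` stands in for the `error!`/`warn!` log macros.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the real quote cache: symbol -> last mark.
pub type Marks = HashMap<String, f64>;

/// Configured symbols that have no mark in the cache. A BTreeSet keeps
/// the result sorted so log lines and /status output are stable.
pub fn missing_symbols(configured: &[String], marks: &Marks) -> BTreeSet<String> {
    configured
        .iter()
        .filter(|symbol| !marks.contains_key(symbol.as_str()))
        .cloned()
        .collect()
}

/// Soft-start boot check: report every missing symbol, then carry on.
/// The previous behaviour was to return an error here and abort startup.
pub fn boot_check(configured: &[String], marks: &Marks) -> BTreeSet<String> {
    let missing = missing_symbols(configured, marks);
    for symbol in &missing {
        // One loud line per symbol (assumed to map to error! in the PR).
        eprintln!("ERROR no broker position for configured symbol {symbol}");
    }
    if !missing.is_empty() {
        // One summary line (assumed to map to warn! in the PR).
        eprintln!(
            "WARN starting in degraded mode: {} of {} symbols missing",
            missing.len(),
            configured.len()
        );
    }
    missing
}
```

The returned set is what a caller would use to decide which symbols are degraded; nothing in this sketch stops the server from starting.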
Closes RAI-657.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📝 Walkthrough
The PR implements degraded startup behavior for missing symbols.
Changes
Degraded startup with missing symbol reporting
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/main.rs`:
- Around line 87-95: Update the CLI/help text that documents
ALPACA_BROKER_ACCOUNT_ID and any related startup messaging to reflect that the
server now starts in a degraded mode instead of failing when symbols are
missing; locate the help text strings referenced near ALPACA_BROKER_ACCOUNT_ID
in main.rs (and any usage in the CLI/parser setup) and change wording to
indicate missing symbols will be served as 503 and the server exposes the
missing set via /status rather than aborting startup; keep references to
degraded/partial-serving behavior and mention monitoring via /status so the docs
align with QuoteCache/partial-serving logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b2ffb25d-3c33-4b33-9c7f-d3ab2f35f138
📒 Files selected for processing (3)
- src/lib.rs
- src/main.rs
- tests/integration.rs
// first /context/v1 request doesn't race the poll loop. Missing
// symbols are logged loudly but no longer fatal: the server starts
// in a partial-serving state where healthy symbols quote normally
// and missing symbols return 503 at request time. /status exposes
// the missing set so monitoring can pick up the partial state. We
// chose this over the old hard-bail because the bail took the whole
// oracle down on the next Fly restart whenever any single position
// went to 0 — and the "alert" was the outage itself.
let cache = Arc::new(QuoteCache::new());
Update CLI help text to match degraded-startup behavior.
Startup now continues in degraded mode, but the CLI docs for ALPACA_BROKER_ACCOUNT_ID still say startup fails when a symbol is missing (lines 42-43). Please align the help text with the current behavior.
Suggested diff
- /// Must be the issuer's account that holds every symbol listed in
- /// config.toml — startup will fail loud if any registered symbol
- /// has no current position.
+ /// Should be the issuer's account that backs configured symbols.
+ /// Missing positions do not block startup; affected symbols return
+ /// 503 until inventory appears (see /status).
Wires CEG, DRAM, TSM and SGOV through the oracle. Each entry maps the Base wrapper address to the Alpaca ticker; `config.toml` is the runtime registry the server uses to resolve order tokens to symbols when serving /context/v1.

Includes the matching `examples/probe_local.rs` `TOKENS` update so the local smoke test can probe any of the four. Also folds in an `ORACLE_URL` env override on probe_local: set `ORACLE_URL=https://st0x-oracle-server.fly.dev/context/v1` to point the probe at prod; otherwise it falls back to the local server at `127.0.0.1:3000` as before.

Wrapper addresses match the entries in S01-Issuer registry PRs #21 and #22.

With PR #9's resilience landed, the four new tokens will start in the "missing broker position" state (503 at /context/v1) until the issuer omnibus acquires inventory; that's tracked in RAI-729. No deploy gating needed; symbols come online automatically when positions appear.

Closes RAI-569.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
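The `ORACLE_URL` fallback mentioned in the commit message above could be sketched roughly as follows. This is an illustration, not the code in `examples/probe_local.rs`: the function names, the `http://` scheme and `/context/v1` path on the local default, and the choice to treat a blank value as unset are all assumptions.

```rust
use std::env;

/// Assumed local default: the commit message only gives the host and
/// port (127.0.0.1:3000); the scheme and path here are guesses.
const LOCAL_ORACLE_URL: &str = "http://127.0.0.1:3000/context/v1";

/// Picks the override when one is present and non-blank, otherwise the
/// local default. Kept separate from the env lookup so it can be tested
/// without touching process-wide environment state.
pub fn resolve_oracle_url(override_url: Option<String>) -> String {
    match override_url {
        Some(url) if !url.trim().is_empty() => url,
        _ => LOCAL_ORACLE_URL.to_string(),
    }
}

/// Reads ORACLE_URL from the environment; unset or non-UTF-8 values
/// fall through to the local default.
pub fn oracle_url() -> String {
    resolve_oracle_url(env::var("ORACLE_URL").ok())
}
```

With this shape, running the probe with `ORACLE_URL=https://st0x-oracle-server.fly.dev/context/v1` set would target prod, and running it with no variable set would keep the local behaviour.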
Summary by CodeRabbit
- `/status` endpoint that displays the current signer address and reports any missing configured symbols
- `/context/v1` endpoint returns HTTP 503 status when handling requests for unavailable configured symbols
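To make the two behaviours in the summary concrete, here is a rough std-only sketch of the request-time lookup and a `/status`-style body. Everything named here is hypothetical: the real error is `AppError::Unavailable`, the JSON field names (`signer`, `symbols`, `missing`) are guesses, and the hand-rolled JSON assumes plain ticker strings that need no escaping.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for AppError::Unavailable.
#[derive(Debug, PartialEq)]
pub struct Unavailable;

impl Unavailable {
    /// HTTP status the PR maps this error to.
    pub const STATUS: u16 = 503;
}

/// Request-time lookup: a configured symbol with no cached mark is an
/// error for that symbol only; other symbols are unaffected.
pub fn price_for(marks: &HashMap<String, f64>, symbol: &str) -> Result<f64, Unavailable> {
    marks.get(symbol).copied().ok_or(Unavailable)
}

/// Renders a list of plain strings as a JSON array (no escaping).
fn json_list(items: &[&str]) -> String {
    let quoted: Vec<String> = items.iter().map(|s| format!("\"{s}\"")).collect();
    format!("[{}]", quoted.join(","))
}

/// Builds a /status-like body: signer, configured symbols, and the
/// subset of configured symbols currently missing from the cache.
pub fn status_json(signer: &str, configured: &[&str], marks: &HashMap<String, f64>) -> String {
    let missing: Vec<&str> = configured
        .iter()
        .copied()
        .filter(|symbol| !marks.contains_key(*symbol))
        .collect();
    format!(
        "{{\"signer\":\"{signer}\",\"symbols\":{},\"missing\":{}}}",
        json_list(configured),
        json_list(&missing)
    )
}
```

A monitor polling an endpoint shaped like this could alert on a non-empty `missing` array while `/health` keeps reporting "ok".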