feat(request): enforce request-line bans via BS /auth/check-request-ban#154
Merged
Conversation
jakebromberg
added a commit
that referenced
this pull request
May 31, 2026
… fail-open boundaries Addresses the substantive bucket of the max-effort review (PR #154): * **Ban bypass closed**: BanCheckClient now validates fingerprint against BS's UUID regex. A banned listener appending `X-Device-Fingerprint: not-a-uuid` would otherwise trigger BS 400 -> ROM treating it as `BanCheckUnavailableError` -> router failing open. Malformed fingerprint is dropped client-side; if no signal survives, the standard no-signal path engages and the router skips the BS call entirely. * **Required-key validation**: BS 200 must carry the `banned` key. A regression returning `{}` previously coerced to `banned=False` silently disabling enforcement; now raises `BanCheckUnavailableError` so the router fails open AND the operator sees the Sentry breadcrumb instead of a quiet drop in `request_blocked` events. * **InvalidURL fail-open**: `_NETWORK_ERRORS` broadened from the three explicit TransportError subclasses to `httpx.HTTPError` (the base) so a misconfigured `BS_CHECK_REQUEST_BAN_URL` (typo, missing scheme) fails open via the documented path instead of escaping as an unhandled `httpx.InvalidURL` 500-ing every /request. * **Shadow-ban hardened**: `posthog_client.capture` on the banned path is wrapped in try/except so a PostHog ingest outage can't prevent the 403 and surface a 500 instead — which would tell a banned listener that the backend is doing extra work, defeating shadow-ban. Tests: +9 cases (TestFingerprintValidation, TestMissingBannedKey, TestInvalidUrlFailsOpen) covering the new boundaries.
Adds an optional pre-parse gate on POST /request that consults Backend-Service's new POST /auth/check-request-ban endpoint (BS#1261) so abusive listeners are blocked before consuming Groq TPM, LML cache budget, or Slack noise. Behind the ENFORCE_REQUEST_BANS feature flag (default off) so the code can deploy before iOS 3.2 reaches App Store rollout. services/ban_check_client.py is a typed httpx wrapper. It treats BS 401/404 as proceed-as-unauth (v3.1 clients send no Authorization header and must never see 401 here) and raises BanCheckUnavailableError on network/timeout/5xx/400 so the router can fail open. routers/request.py runs the ban check after the empty-message guard and before parse. Banned callers get 403 with no Slack/Groq/LML and a request_blocked PostHog event (user_id, fingerprint, ban_reason, ban_source). BS outages log a Sentry breadcrumb and emit a new ban_check_unavailable degraded_mode plus an always-on ban_check_degraded telemetry property. config/settings.py + core/dependencies.py add ENFORCE_REQUEST_BANS, BS_CHECK_REQUEST_BAN_URL, BS_INTERNAL_KEY (held for future internal calls; the public /auth/check-request-ban handler does not gate on it). Unit coverage in tests/unit/test_ban_check_client.py and tests/unit/test_request_ban_enforcement.py for the full behavior matrix; contract smoke in tests/integration/ gated on BS_CHECK_REQUEST_BAN_URL (external_api marker, skipped by default). docs/architecture.md + docs/env-vars.md + .env.example updated. Closes #150
… fail-open boundaries Addresses the substantive bucket of the max-effort review (PR #154): * **Ban bypass closed**: BanCheckClient now validates fingerprint against BS's UUID regex. A banned listener appending `X-Device-Fingerprint: not-a-uuid` would otherwise trigger BS 400 -> ROM treating it as `BanCheckUnavailableError` -> router failing open. Malformed fingerprint is dropped client-side; if no signal survives, the standard no-signal path engages and the router skips the BS call entirely. * **Required-key validation**: BS 200 must carry the `banned` key. A regression returning `{}` previously coerced to `banned=False` silently disabling enforcement; now raises `BanCheckUnavailableError` so the router fails open AND the operator sees the Sentry breadcrumb instead of a quiet drop in `request_blocked` events. * **InvalidURL fail-open**: `_NETWORK_ERRORS` broadened from the three explicit TransportError subclasses to `httpx.HTTPError` (the base) so a misconfigured `BS_CHECK_REQUEST_BAN_URL` (typo, missing scheme) fails open via the documented path instead of escaping as an unhandled `httpx.InvalidURL` 500-ing every /request. * **Shadow-ban hardened**: `posthog_client.capture` on the banned path is wrapped in try/except so a PostHog ingest outage can't prevent the 403 and surface a 500 instead — which would tell a banned listener that the backend is doing extra work, defeating shadow-ban. Tests: +9 cases (TestFingerprintValidation, TestMissingBannedKey, TestInvalidUrlFailsOpen) covering the new boundaries.
…eturn pytest.fail() in mocked httpx handlers tripped mypy's [return] check on CI. AssertionError gives mypy the explicit NoReturn it wants and the test semantic is identical.
cdf5dea to
fe8d023
Compare
jakebromberg
added a commit
that referenced
this pull request
Jun 1, 2026
… fail-open boundaries Addresses the substantive bucket of the max-effort review (PR #154): * **Ban bypass closed**: BanCheckClient now validates fingerprint against BS's UUID regex. A banned listener appending `X-Device-Fingerprint: not-a-uuid` would otherwise trigger BS 400 -> ROM treating it as `BanCheckUnavailableError` -> router failing open. Malformed fingerprint is dropped client-side; if no signal survives, the standard no-signal path engages and the router skips the BS call entirely. * **Required-key validation**: BS 200 must carry the `banned` key. A regression returning `{}` previously coerced to `banned=False` silently disabling enforcement; now raises `BanCheckUnavailableError` so the router fails open AND the operator sees the Sentry breadcrumb instead of a quiet drop in `request_blocked` events. * **InvalidURL fail-open**: `_NETWORK_ERRORS` broadened from the three explicit TransportError subclasses to `httpx.HTTPError` (the base) so a misconfigured `BS_CHECK_REQUEST_BAN_URL` (typo, missing scheme) fails open via the documented path instead of escaping as an unhandled `httpx.InvalidURL` 500-ing every /request. * **Shadow-ban hardened**: `posthog_client.capture` on the banned path is wrapped in try/except so a PostHog ingest outage can't prevent the 403 and surface a 500 instead — which would tell a banned listener that the backend is doing extra work, defeating shadow-ban. Tests: +9 cases (TestFingerprintValidation, TestMissingBannedKey, TestInvalidUrlFailsOpen) covering the new boundaries.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds an optional pre-parse gate on
POST /requestthat consults Backend-Service's newPOST /auth/check-request-banendpoint (BS#1261) so abusive listeners are blocked before consuming Groq TPM, LML cache budget, or Slack noise. Behind theENFORCE_REQUEST_BANSfeature flag (default off) so the code can land and deploy before iOS 3.2 (WXYC/wxyc-ios-64#351) reaches App Store rollout.Behavior matrix
request_blockedPostHog event withuser_id+fingerprint+ban_reason+ban_sourcePOST /requestdegraded_mode=ban_check_unavailable, proceedWhy not forward
X-Internal-Key?The ticket asks for it, but BS#1261's
apps/auth/app.tsregistration explicitly comments that the endpoint is "intentionally public (no X-Internal-Key gate)" — callers authenticate per-request via JWT + fingerprint, and BS bounds the cost with per-IP rate limiting. Forwarding the header would be a no-op at best and confusing at worst, so we don't.BS_INTERNAL_KEYis still parsed into settings because ROM holds the same secret for future internal calls (e.g. ROM#151's admin API path on BS).Files
services/ban_check_client.py(new) — typed httpx wrapper.BanCheckResult(frozen dataclass),BanCheckUnavailableError. Treats BS 401/404 as proceed-as-unauth; raises on network/timeout/5xx/400 so the router can fail open.routers/request.py— ban check runs after the empty-message guard and before parse, exactly per the ticket. Newban_check_unavailabledegraded mode + always-onban_check_degradedtelemetry property so LML-down and ban-check-down show up together in PostHog.core/dependencies.py—get_ban_check_clientreturns None unless both flag and URL are set, so the router has a single guard.config/settings.py—bs_check_request_ban_url,bs_internal_key,enforce_request_bans(default False).tests/unit/test_ban_check_client.py(new, 12 tests) — client contract: happy paths, proceed-as-unauth, fail-open, request shape.tests/unit/test_request_ban_enforcement.py(new, 11 tests) — router behavior matrix, including the LML-down × BS-down precedence case.tests/unit/test_dependencies.py—get_ban_check_clientprovider gating (flag off, URL unset, both set).tests/integration/test_ban_enforcement_e2e.py(new,external_api) — live BS contract smoke; skipped unlessBS_CHECK_REQUEST_BAN_URLis set so it's safe to leave in the default CI matrix.docs/architecture.md,docs/env-vars.md,.env.example— flow + env-var documentation.Local checks
ruff check .cleanruff format --check .cleanmypy .clean (63 files)pytest tests/unit/ -q→ 378 passed (+24 new)check_marker_ci_sync.py→ PASSTest plan
request_blockedemitted with all 4 propertiesdegraded_mode=ban_check_unavailabledegraded_mode=search_unavailable(more severe wins) +ban_check_degraded=true(independent property)BS_CHECK_REQUEST_BAN_URL=... TEST_ENV=staging pytest tests/integration/test_ban_enforcement_e2e.py -m external_api)ENFORCE_REQUEST_BANS=trueon staging and verify a known-banned fingerprint gets 403 from staging ROMRelated
Closes #150