Send X-Caller-Budget-Ms to LML on /lookup (pair with LML#370 + BS#1053) #161

@jakebromberg

Problem

services/lookup_client.py calls POST /api/v1/lookup on LML without an X-Caller-Budget-Ms header. LML accepts this header per LML#345 (closed) and uses it to bound its search-pipeline budget, falling back to the env default LML_SEARCH_BUDGET_MS=4000ms when absent.

After #158 (closed 2026-06-09) raised ROM's per_attempt_timeout to 20s, the mismatch widened: ROM gives LML up to 20s to find a match, but LML internally cuts off at 4s because it doesn't know the caller will wait longer. Conversely, when ROM is in a degraded budget posture, LML can't shorten its work either.

This is the ROM-side mirror of WXYC/Backend-Service#1053. Same problem, different repo, same header.

Why this matters now

  • LML#370 is in flight: cascade-exhaustion hard cap that's supposed to respect the caller budget. Without the header, LML#370 will cap at LML_SEARCH_BUDGET_MS regardless of ROM's actual intent.
  • #158 (Bump LML lookup timeout; reconsider retry on TimeoutException) bought ROM more wall-clock time to wait for a result, but that extra time is wasted if LML still gives up at 4s internally and returns "no match." User-visible symptom: cold-path lookups still degrade to search_unavailable even when LML could have found a match in the 5-15s window.

Suggested fix

In the client's __init__ in services/lookup_client.py (or wherever the lookup HTTP request is composed), expose a caller_budget_ms parameter. When the caller does not pass one, derive it from the effective per-attempt budget minus ~200ms of transport overhead (matching LML#345's server-side subtraction), with a floor of 1000ms, and set the X-Caller-Budget-Ms request header on each POST /api/v1/lookup:

caller_budget_ms = max(int(self.per_attempt_timeout * 1000) - 200, 1000)
headers["X-Caller-Budget-Ms"] = str(caller_budget_ms)

Pair with Authorization: Bearer <LML_API_KEY> in the existing _auth_headers() builder.
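
A minimal sketch of how the budget computation and the auth header could sit together, assuming nothing about the real client beyond what this issue states. The names compute_caller_budget_ms, build_lookup_headers, TRANSPORT_OVERHEAD_MS, and MIN_CALLER_BUDGET_MS are hypothetical and not taken from services/lookup_client.py; the 200ms overhead and 1000ms floor mirror the snippet above.

```python
from typing import Dict, Optional

TRANSPORT_OVERHEAD_MS = 200  # assumed allowance for transport, per "~200ms" above
MIN_CALLER_BUDGET_MS = 1000  # floor taken from the snippet above


def compute_caller_budget_ms(
    per_attempt_timeout: float, explicit_ms: Optional[int] = None
) -> int:
    """Return the budget (in ms) to advertise to LML for one attempt."""
    if explicit_ms is not None:
        # An explicit caller-supplied budget overrides the derived value.
        return explicit_ms
    derived = int(per_attempt_timeout * 1000) - TRANSPORT_OVERHEAD_MS
    return max(derived, MIN_CALLER_BUDGET_MS)


def build_lookup_headers(
    api_key: str, per_attempt_timeout: float, explicit_ms: Optional[int] = None
) -> Dict[str, str]:
    """Hypothetical header builder combining auth and the caller budget."""
    budget = compute_caller_budget_ms(per_attempt_timeout, explicit_ms)
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Caller-Budget-Ms": str(budget),
    }
```

Keeping the computation in a pure function would let the unit tests pin the header value without making an HTTP call; whether the real client wants it in _auth_headers() or a separate builder is a design call for the PR.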

Acceptance

  • LookupServiceClient sets X-Caller-Budget-Ms from the effective per-attempt timeout.
  • Unit test pins: when per_attempt_timeout=20.0, header is 19800.
  • Unit test pins: when caller passes an explicit caller_budget_ms, the explicit value wins.
  • Confirm in Sentry post-deploy: ROM's LML httpx spans on prod include the lml.caller_budget_ms attribute that LML emits server-side (per LML#345).
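
The two unit-test pins above might look roughly like this. StubLookupClient is a hypothetical stand-in that models only the budget logic, since the real LookupServiceClient constructor and header builder are not shown in this issue; the actual tests would target the real class instead.

```python
from typing import Dict, Optional


class StubLookupClient:
    """Hypothetical stand-in for LookupServiceClient (budget logic only)."""

    def __init__(
        self, per_attempt_timeout: float, caller_budget_ms: Optional[int] = None
    ) -> None:
        self.per_attempt_timeout = per_attempt_timeout
        self.caller_budget_ms = caller_budget_ms

    def request_headers(self) -> Dict[str, str]:
        if self.caller_budget_ms is not None:
            # Explicit value wins over the derived default.
            budget = self.caller_budget_ms
        else:
            budget = max(int(self.per_attempt_timeout * 1000) - 200, 1000)
        return {"X-Caller-Budget-Ms": str(budget)}


def test_header_derived_from_per_attempt_timeout() -> None:
    client = StubLookupClient(per_attempt_timeout=20.0)
    assert client.request_headers()["X-Caller-Budget-Ms"] == "19800"


def test_explicit_caller_budget_wins() -> None:
    client = StubLookupClient(per_attempt_timeout=20.0, caller_budget_ms=5000)
    assert client.request_headers()["X-Caller-Budget-Ms"] == "5000"
```

Both functions follow pytest's test_ naming convention, so they would be collected automatically if dropped into a test module.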

Related
