Retry rate-limited and transient runs-index fetches #558
Open
robinaugh wants to merge 2 commits into
The runs index is rate limited (100 req/min), so paging a large result set with `rwx runs list` would die mid-run on a 429 (or a transient 5xx / empty body), forcing callers to hand-roll their own backoff loop. ListRuns now retries 429s (honoring Retry-After), transient 5xx, transient transport errors, and empty/non-JSON 200 bodies, with bounded exponential backoff reusing the existing retry.Backoff. Retries are announced via a RetryProgress writer so a multi-page list does not look hung; the command routes it to stderr under --json so structured stdout stays clean.
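For reviewers, the retry decision described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `retryable`, `retryDelay`, `progressLine`, and the constants are hypothetical names and values, every 5xx is treated as transient, and only the delay-seconds form of `Retry-After` is parsed (the HTTP-date form is left out). The real change reuses the existing `retry.Backoff`, whose API is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// Hypothetical limits for illustration; the PR's actual values may differ.
const (
	maxAttempts = 5
	baseDelay   = 1 * time.Second
	maxDelay    = 60 * time.Second
)

// retryable reports whether a response should be retried: 429, any 5xx
// (assumed transient here), or a 200 whose body is empty or not valid JSON.
// Other statuses, including non-429 4xx, are surfaced immediately.
func retryable(status int, body []byte) bool {
	switch {
	case status == 429:
		return true
	case status >= 500 && status <= 599:
		return true
	case status == 200:
		return !json.Valid(body) // an empty body is not valid JSON
	default:
		return false
	}
}

// retryDelay honors a Retry-After value given in whole seconds; otherwise it
// falls back to exponential backoff capped at maxDelay. attempt is 1-based.
func retryDelay(retryAfter string, attempt int) time.Duration {
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	if attempt < 1 {
		attempt = 1
	}
	d := baseDelay << (attempt - 1)
	if d <= 0 || d > maxDelay { // d <= 0 guards against shift overflow
		d = maxDelay
	}
	return d
}

// progressLine formats the announcement written to the retry-progress writer.
func progressLine(delay time.Duration, nextAttempt int) string {
	return fmt.Sprintf("rate limited by RWX, retrying in %ds (attempt %d/%d)",
		int(delay.Seconds()), nextAttempt, maxAttempts)
}

func main() {
	if retryable(429, nil) {
		fmt.Println(progressLine(retryDelay("60", 1), 2))
	}
}
```

Keeping classification, delay selection, and message formatting as separate pure functions makes each one easy to unit test without a live server.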
ListSandboxRuns also goes through ListRuns, so the new retry/backoff applied to sandbox commands too — but silently, making a transient blip look like a multi-second hang on the hot `sandbox exec` discovery path. Thread a retry-progress writer through ListSandboxRuns and pass s.Stderr from the sandbox call sites so the wait is announced. The mock's method takes the writer (matching the interface) but ignores it, keeping the existing zero-arg MockListSandboxRuns test literals unchanged.
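The writer threading and the mock shape described above might look something like the sketch below. The types and signatures are simplified stand-ins, not the real ones: `runLister`, `realClient`, `mockClient`, `discover`, and `captureProgress` are hypothetical, and the real `ListSandboxRuns` takes and returns more than is shown.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// runLister is a hypothetical stand-in for the client interface; the only
// point is that the method now accepts a retry-progress writer.
type runLister interface {
	ListSandboxRuns(retryProgress io.Writer) ([]string, error)
}

// realClient simulates a client that had to wait once and announces it.
type realClient struct{}

func (realClient) ListSandboxRuns(retryProgress io.Writer) ([]string, error) {
	if retryProgress != nil {
		fmt.Fprintln(retryProgress, "rate limited by RWX, retrying in 60s (attempt 2/5)")
	}
	return []string{"run-1"}, nil
}

// mockClient keeps a zero-arg function field, so existing test literals that
// set MockListSandboxRuns compile unchanged.
type mockClient struct {
	MockListSandboxRuns func() ([]string, error)
}

// ListSandboxRuns matches the interface but deliberately ignores the writer.
func (m mockClient) ListSandboxRuns(_ io.Writer) ([]string, error) {
	return m.MockListSandboxRuns()
}

// discover is a hypothetical sandbox call site that passes its stderr through.
func discover(c runLister, stderr io.Writer) ([]string, error) {
	return c.ListSandboxRuns(stderr)
}

// captureProgress runs discover with an in-memory stderr and returns what was
// announced on it.
func captureProgress(c runLister) string {
	var stderr bytes.Buffer
	if _, err := discover(c, &stderr); err != nil {
		return ""
	}
	return stderr.String()
}

func main() {
	fmt.Print(captureProgress(realClient{}))
	mock := mockClient{MockListSandboxRuns: func() ([]string, error) { return nil, nil }}
	fmt.Printf("mock announced %d bytes\n", len(captureProgress(mock)))
}
```

Because only the method signature changes and the mock's function field does not, tests that never exercise retries need no edits.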
11bc0b5 to da01b7d
Background
Problem
The runs index is rate limited, so paging a large result set with `rwx runs list` died mid-run on a 429 (or a transient 5xx / empty body), forcing callers to hand-roll their own backoff.
Solution
- Retries live in `api.ListRuns`, so every paged `list` and the internal `ListSandboxRuns` get it consistently.
- Retries `429` (honoring `Retry-After`), transient `5xx`, transient transport errors, and empty/non-JSON `200`s; non-retryable `4xx` surface immediately. Bounded backoff reuses the existing `retry.Backoff`.
- Retries are announced (`rate limited by RWX, retrying in 60s (attempt 2/5)`), routed to stderr under `--json` so stdout stays parseable. Sandbox run lookups (`ListSandboxRuns`) surface the same progress.
- `Retry-After` header rather than echoing a number that would drift.
Further confirmation needed