feat: KSM-881 throttle retry with exponential backoff #68
Open
mgallego-keeper wants to merge 2 commits into
Detect HTTP 403 {"error":"throttled"} inside PostQuery and retry up to
maxThrottleRetries (5) with exponential backoff (baseThrottleDelaySec = 11s,
doubling: 11/22/44/88/176s) plus +/-25% jitter, honoring retry_after from the
response body when present. Returns the new sentinel ErrThrottled (checkable
via errors.Is) once retries are exhausted.
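For reviewers who want to check the arithmetic, the delay rule described above can be sketched roughly as follows. This is an illustration, not the code in core/core.go: the throttleDelay signature, the jitterFunc type, and the choice to apply retry_after without jitter are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	maxThrottleRetries   = 5  // value taken from the PR description
	baseThrottleDelaySec = 11 // value taken from the PR description
)

// jitterFunc returns a multiplier; the random version stays within
// [0.75, 1.25). The type name is hypothetical and exists so tests can
// pin the jitter to a fixed value.
type jitterFunc func() float64

func randomJitter() float64 { return 0.75 + rand.Float64()*0.5 }

// throttleDelay returns the wait before retry number attempt (0-based).
// Assumption: a positive retryAfterSec wins outright and is not jittered.
func throttleDelay(attempt int, retryAfterSec float64, jitter jitterFunc) time.Duration {
	if retryAfterSec > 0 {
		return time.Duration(retryAfterSec * float64(time.Second))
	}
	// 11s doubled once per attempt: 11, 22, 44, 88, 176.
	base := float64(baseThrottleDelaySec) * float64(uint(1)<<uint(attempt))
	return time.Duration(base * jitter() * float64(time.Second))
}

func main() {
	noJitter := func() float64 { return 1.0 }
	for attempt := 0; attempt < maxThrottleRetries; attempt++ {
		fmt.Println(attempt, throttleDelay(attempt, 0, noJitter))
	}
}
```

Passing the jitter source as a parameter is one way to get the deterministic tests the description mentions; the PR itself may wire the seam differently.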
The throttle check sits before HandleHttpError in the PostQuery loop, so the
existing key-rotation retry path is unaffected. No new public config; the sleep
is injectable in tests via a new optional Context.Sleep field (mirrors the
existing Context.Transport seam).
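A minimal sketch of how an optional sleep seam like this can work, assuming a pared-down Context type. The sleepFn helper and the sleepRecorder type are hypothetical names for illustration and are not taken from the SDK.

```go
package main

import (
	"fmt"
	"time"
)

// Context is a stand-in for the SDK's context type; only the Sleep
// seam is modelled here.
type Context struct {
	Sleep func(time.Duration) // nil means "use time.Sleep"
}

// sleepFn (hypothetical helper) resolves the effective sleep function,
// falling back to time.Sleep when no override is set.
func (c *Context) sleepFn() func(time.Duration) {
	if c != nil && c.Sleep != nil {
		return c.Sleep
	}
	return time.Sleep
}

// sleepRecorder captures requested delays instead of blocking, which is
// what lets an end-to-end test assert on delays without waiting.
type sleepRecorder struct{ delays []time.Duration }

func (r *sleepRecorder) Sleep(d time.Duration) { r.delays = append(r.delays, d) }

func main() {
	rec := &sleepRecorder{}
	ctx := &Context{Sleep: rec.Sleep}

	// Production code would call ctx.sleepFn()(delay) between retries.
	ctx.sleepFn()(11 * time.Second)
	ctx.sleepFn()(22 * time.Second)

	fmt.Println(len(rec.delays), "sleeps recorded, none actually waited")
}
```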
No version bump; CHANGELOG entry added under Unreleased. Adds core/throttle_test.go
(unit) and test/throttle_test.go (end-to-end); TestHTTPErrorErrorsAs switched to a
non-throttle code since "throttled" is now retried. go test ./... green in both modules.
idimov-keeper
previously approved these changes
Jun 8, 2026
Mirrors the Python SDK fix (PR #1033, per @stas-schaller review): parseThrottle only inspected the response body, so a non-403 response (e.g. 500/502) carrying {"error":"throttled"} would be retried 5x before failing. Gate the throttle parse/retry on ksmRs.StatusCode == 403 so non-403 responses fall straight through to HandleHttpError. Adds a regression test (TestThrottleBodyNon403NotRetried).
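The status gate can be illustrated with a small sketch. It assumes a parseThrottle that takes the status code directly, whereas the commit describes gating at the call site on ksmRs.StatusCode, and the throttleInfo type is hypothetical. Only the "error" and "retry_after" fields are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// throttleInfo is a hypothetical result type for this example.
type throttleInfo struct {
	Throttled     bool
	RetryAfterSec float64 // 0 when absent, non-numeric, or not positive
}

// parseThrottle reports a throttle only for HTTP 403 with
// {"error":"throttled"}; any other status falls through untouched.
func parseThrottle(statusCode int, body []byte) throttleInfo {
	if statusCode != http.StatusForbidden {
		return throttleInfo{}
	}
	var payload struct {
		Error      string          `json:"error"`
		RetryAfter json.RawMessage `json:"retry_after"`
	}
	if err := json.Unmarshal(body, &payload); err != nil || payload.Error != "throttled" {
		return throttleInfo{}
	}
	info := throttleInfo{Throttled: true}
	var secs float64
	// Ignore retry_after values that are missing, non-numeric, or negative.
	if len(payload.RetryAfter) > 0 && json.Unmarshal(payload.RetryAfter, &secs) == nil && secs > 0 {
		info.RetryAfterSec = secs
	}
	return info
}

func main() {
	body := []byte(`{"error":"throttled","retry_after":30}`)
	fmt.Printf("403: %+v\n", parseThrottle(http.StatusForbidden, body))
	fmt.Printf("502: %+v\n", parseThrottle(http.StatusBadGateway, body))
}
```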
Summary
Implements throttle retry with exponential backoff for the Go SDK (epic KSM-876, task KSM-881). The Keeper backend throttles with HTTP 403 {"error":"throttled"}; previously the SDK returned a *KeeperHTTPError on the first throttle. Now PostQuery retries transparently, so every SecretsManager operation (GetSecrets, CreateSecret, UpdateSecret, …) gains throttle resilience with no caller changes.
What changed
- core/core.go: in the PostQuery loop, detect throttle responses before HandleHttpError and retry with exponential backoff + jitter; return ErrThrottled (wrapped) once retries are exhausted. Adds parseThrottle (detection / retry_after) and throttleDelay (backoff + jitter), plus a throttleJitter seam for deterministic tests.
- core/errors.go: new exported sentinel ErrThrottled (check with errors.Is(err, ksm.ErrThrottled)).
- core/keeper_globals.go: maxThrottleRetries = 5, baseThrottleDelaySec = 11 (1s margin over the backend's 10s memcached TTL). No new public config.
- core/payload.go: optional Context.Sleep func(time.Duration) field (defaults to time.Sleep); a no-real-sleep test seam mirroring the existing Context.Transport, preserved across PrepareContext rebuilds.
- CHANGELOG.md: entry under ## Unreleased. No version bump.
Algorithm
Delays are 11s, 22s, 44s, 88s, 176s (each ±25% jitter) → ~341s in total if all 5 retries are used, before jitter (roughly 256s to 426s once jitter is applied). A retry_after in the response body takes precedence when present. HandleHttpError is untouched, so key-rotation retry is unaffected.
Tests
- core/throttle_test.go (unit): throttleDelay exponential sequence, retry_after precedence, jitter bounds (pinned via the throttleJitter seam); parseThrottle table (throttled / result_code / other error / non-JSON / empty / non-numeric / negative retry_after).
- test/throttle_test.go (end-to-end via GetSecrets, sleeps recorded through Context.Sleep): retry-then-success, multi-throttle delay bounds, exhaustion → errors.Is(ErrThrottled), retry_after honored, throttle + key-rotation compose, non-throttle 403 / non-JSON 502 not retried.
- TestHTTPErrorErrorsAs switched to a non-throttle error code (throttling is now retried).
Jira: KSM-881 · mirrors the Python implementation (Keeper-Security/secrets-manager#1031).
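To make the exhaustion path concrete, here is a hedged sketch of a retry loop that gives up with a wrapped sentinel after five retries. The postWithRetry function, its send callback, the recorder type, and the isThrottled helper are all hypothetical stand-ins for illustration. Jitter and retry_after are left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrThrottled mirrors the sentinel named in this PR; the message text
// is an assumption.
var ErrThrottled = errors.New("throttled")

const maxThrottleRetries = 5

// postWithRetry calls send until it stops reporting a throttle or the
// retry budget is used up. sleep is injected so tests never block.
func postWithRetry(send func() (throttled bool, err error), sleep func(time.Duration)) error {
	base := 11 * time.Second
	for attempt := 0; ; attempt++ {
		throttled, err := send()
		if !throttled {
			return err // success or a non-throttle error: no retry here
		}
		if attempt == maxThrottleRetries {
			return fmt.Errorf("giving up after %d retries: %w", maxThrottleRetries, ErrThrottled)
		}
		sleep(base << uint(attempt)) // 11s, 22s, 44s, 88s, 176s
	}
}

// isThrottled shows the caller-side check against the wrapped sentinel.
func isThrottled(err error) bool { return errors.Is(err, ErrThrottled) }

// recorder stores requested delays instead of sleeping.
type recorder struct{ delays []time.Duration }

func (r *recorder) sleep(d time.Duration) { r.delays = append(r.delays, d) }

func (r *recorder) total() time.Duration {
	var sum time.Duration
	for _, d := range r.delays {
		sum += d
	}
	return sum
}

func main() {
	rec := &recorder{}
	alwaysThrottled := func() (bool, error) { return true, nil }
	err := postWithRetry(alwaysThrottled, rec.sleep)
	fmt.Println("throttled after exhaustion:", isThrottled(err))
	fmt.Println("sleeps:", len(rec.delays), "total:", rec.total())
}
```

Because the sentinel is wrapped with %w, callers can keep using errors.Is even if the SDK adds context to the error message.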