Skip to content

record circuit-breaker success only on the actual hedged winner#14

Open
HrachShah wants to merge 1 commit into
mainfrom
fix/hedged-record-success-only-on-winner
Open

record circuit-breaker success only on the actual hedged winner#14
HrachShah wants to merge 1 commit into
mainfrom
fix/hedged-record-success-only-on-winner

Conversation

@HrachShah

@HrachShah HrachShah commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Summary

Executor.execute_hedged was calling circuit.record_success() on every involved provider's circuit breaker after a hedged request finished. The loser provider — which was either cancelled mid-flight or completed after the winner had already returned — did NOT serve the user, so marking its circuit CLOSED was wrong. It let a degraded provider appear healthy, silently lengthened the time between failure and circuit-trip, and the next user request to that provider would go to a broken backend that hadn't been probed again.

This change:

  1. Makes hedged_execute return a HedgedResult(response, winner_name, loser_names) so the caller knows which provider actually won.
  2. execute_hedged now records success on the winner's circuit and explicitly records failure (or treats the loser as still-pending) on the losers.
  3. Fixes an adjacent bug where the old contextlib.suppress(asyncio.CancelledError, Exception) swallowed real provider errors during cancel-and-drain.
  4. Drains late-completing loser tasks so we don't get Task was destroyed but it is pending! warnings at shutdown.

Tests

  • test_hedged_winner_records_success_only_on_winner: pre-opens both circuits with 3 failures, sleeps past the recovery window, runs a hedge where A is 10× faster than B, and asserts that A's circuit ends CLOSED with score 1.0 while B's circuit does NOT end CLOSED.
  • test_hedged_all_fail_records_failure_on_both: all providers raise, both circuits stay CLOSED (threshold=5, only 1 failure each).

Both pass. The full existing unit suite (85 tests) still passes.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed circuit breaker crediting in parallel request execution to properly track provider performance. Only the fastest-responding provider is now marked as successful, while slower providers that lose the race are correctly recorded as failures rather than successes.
  • Tests

    • Added test coverage for circuit breaker state transitions during parallel request execution, including scenarios with varying response speeds and complete failures.

The old execute_hedged() looped every involved provider's circuit
breaker and called record_success() on all of them, including the
loser(s). The loser was either cancelled mid-flight or completed
after the winner had already returned. Either way the loser's
provider did NOT serve the user,

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @HrachShah, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

hedging.py gains a HedgedResult type that bundles the winning ChatCompletionResponse with the winner and loser provider names. hedged_execute now returns this type instead of a raw response. execute_hedged in executor.py consumes it to record record_success() only for the winner and record_failure(None) for each loser. Two new unit tests verify these circuit-breaker state transitions.

Changes

Winner-only circuit breaker credit in hedged execution

Layer / File(s) Summary
HedgedResult type and hedged_execute rework
freerelay/core/execution/hedging.py
Adds HedgedResult with __slots__ = ("response", "winner_name", "loser_names"); changes hedged_execute return type to HedgedResult; reworks the asyncio cancellation loop to walk the done set for winner/loser identification, drains un-awaited successful loser tasks, and returns a populated HedgedResult.
Executor integration and circuit breaker tests
freerelay/core/execution/executor.py, tests/unit/test_hedging_winner_only_success.py
execute_hedged renames the result variable, calls record_success() only on the winner CB and record_failure(None) on each loser CB, and returns result.response. Tests mock two providers, pre-open both circuits, verify the winner CB transitions to CLOSED while the loser CB does not, and confirm both CBs record failures when all providers raise.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    note over execute_hedged,LoserCB: Success path
  end
  participant Caller
  participant execute_hedged
  participant hedged_execute
  participant WinnerCB as winner CB
  participant LoserCB as loser CB

  Caller->>execute_hedged: hedged request
  execute_hedged->>hedged_execute: providers, request
  hedged_execute-->>execute_hedged: HedgedResult(response, winner_name, loser_names)
  execute_hedged->>WinnerCB: record_success()
  execute_hedged->>LoserCB: record_failure(None)
  execute_hedged-->>Caller: result.response
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 The fastest provider wins the race,
Its circuit breaker beams with grace.
The slower ones who lost the run
Get failure notes — fair's fairly done.
Only winners earn the CLOSED seal,
A hedging hop with righteous zeal! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the primary change: circuit breaker success recording is now limited to the actual hedged winner provider, which directly addresses the main bug fix.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/hedged-record-success-only-on-winner

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
tests/unit/test_hedging_winner_only_success.py (1)

62-63: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Docstring contradicts the actual assertion.

The docstring says B's circuit "stays OPEN", but the test sleeps past recovery_timeout (auto-transition to HALF_OPEN) and only asserts != CLOSED (Lines 100-102). Align the wording with the HALF_OPEN expectation to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_hedging_winner_only_success.py` around lines 62 - 63, The
docstring at the beginning of the test function states that B's circuit "stays
OPEN", but the test code sleeps past recovery_timeout (which causes an automatic
state transition) and the assertion on lines 100-102 only verifies the state is
not CLOSED. Update the docstring to accurately reflect that B's circuit
transitions to HALF_OPEN due to the recovery timeout, aligning the documentation
with the actual behavior being tested by the assertions.
freerelay/core/execution/hedging.py (1)

115-124: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant drain loop over done losers.

These loser_providers are taken from the done set with exc is None, and their results were already retrieved via task.exception() in the loop above. They are not pending, so the "Task was destroyed but it is pending!" rationale doesn't apply, and the nested O(losers × done) scan plus t.result() is a no-op for already-completed successful tasks.

♻️ Suggested simplification
     if winner_task is not None:
         logger.info("Hedged winner: %s", winner_provider.name)
-        # Drain any leftover successful losers so their task objects
-        # aren't GC'd in a 'pending result' state — that's a common
-        # asyncio footgun that produces "Task was destroyed but it is
-        # pending!" warnings at shutdown.
-        for loser in loser_providers:
-            for t in done:
-                if tasks.get(t) is loser:
-                    with contextlib.suppress(asyncio.CancelledError, Exception):
-                        t.result()
-                    break
         return HedgedResult(
             response=winner_task.result(),
             winner_name=winner_provider.name,
             loser_names=tuple(p.name for p in loser_providers),
         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@freerelay/core/execution/hedging.py` around lines 115 - 124, The nested loop
block iterating over loser_providers and done to call t.result() on
already-completed successful tasks is redundant. Since these successful losers
had their results already retrieved via task.exception() in the preceding loop
above and are not pending, this drain serves no purpose while creating
unnecessary O(losers × done) complexity. Remove the entire nested loop block
(the for loser in loser_providers loop through the break statement) that scans
the done set looking for matching task objects and calls t.result(), as it
provides no functional benefit for tasks that have already completed
successfully.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@freerelay/core/execution/hedging.py`:
- Around line 88-91: In the pending task cancellation loop starting with `for
task in pending:`, the current code only suppresses `asyncio.CancelledError`,
but if a pending task raises a non-CancelledError exception after cancellation,
it will propagate and escape the function before the winner is determined.
Instead of only suppressing CancelledError in the `await task` statement,
capture and log any unexpected exceptions that occur during the cancel/drain
phase, then suppress them so the function can continue to determine the winner.
- Around line 100-129: The HedgedResult class needs to be extended to track
failed losers (providers that completed with exceptions) along with their
exceptions, since currently only successful losers are included in loser_names
while failed losers are silently dropped from the errors list when a winner
exists. Modify the HedgedResult dataclass to add a new field for failed losers
with their exceptions, then in the hedging logic around lines 100-129, populate
this field by tracking which loser providers had exceptions, and include this
data when returning the HedgedResult. Finally, update the execute_hedged
function in executor.py to iterate over these failed losers and record their
failures in the circuit breaker feedback to prevent losing track of provider
degradation.

In `@tests/unit/test_hedging_winner_only_success.py`:
- Around line 91-93: The assertion message in the test exceeds the 88-character
line limit (currently 99 characters on line 92). Split the f-string message
across multiple lines to comply with the linting requirement. You can either
wrap the string by breaking it into concatenated parts or restructure how the
assertion message is formatted to keep each line under 88 characters while
maintaining the readability and meaning of the assertion about a_breaker.state
and CircuitState.CLOSED.

---

Nitpick comments:
In `@freerelay/core/execution/hedging.py`:
- Around line 115-124: The nested loop block iterating over loser_providers and
done to call t.result() on already-completed successful tasks is redundant.
Since these successful losers had their results already retrieved via
task.exception() in the preceding loop above and are not pending, this drain
serves no purpose while creating unnecessary O(losers × done) complexity. Remove
the entire nested loop block (the for loser in loser_providers loop through the
break statement) that scans the done set looking for matching task objects and
calls t.result(), as it provides no functional benefit for tasks that have
already completed successfully.

In `@tests/unit/test_hedging_winner_only_success.py`:
- Around line 62-63: The docstring at the beginning of the test function states
that B's circuit "stays OPEN", but the test code sleeps past recovery_timeout
(which causes an automatic state transition) and the assertion on lines 100-102
only verifies the state is not CLOSED. Update the docstring to accurately
reflect that B's circuit transitions to HALF_OPEN due to the recovery timeout,
aligning the documentation with the actual behavior being tested by the
assertions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ef72af54-6eab-44fc-be71-2dd4c3c4517a

📥 Commits

Reviewing files that changed from the base of the PR and between ec9d17f and d234e7b.

📒 Files selected for processing (3)
  • freerelay/core/execution/executor.py
  • freerelay/core/execution/hedging.py
  • tests/unit/test_hedging_winner_only_success.py

Comment on lines 88 to 91
for task in pending:
task.cancel()
with contextlib.suppress(asyncio.CancelledError, Exception):
with contextlib.suppress(asyncio.CancelledError):
await task

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Non-CancelledError from a cancelled task can escape and mask the winner.

If a pending task ignores/suppresses cancellation and then raises a non-CancelledError exception, await task re-raises it here, propagating out of hedged_execute before the winner is even determined — discarding a valid winning response. The comment's stated scenario (a task that "raised before we got here") can't apply to a pending task, since a task that already raised is in done, not pending. Consider logging and suppressing unexpected errors from the cancel/drain phase instead of letting them propagate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@freerelay/core/execution/hedging.py` around lines 88 - 91, In the pending
task cancellation loop starting with `for task in pending:`, the current code
only suppresses `asyncio.CancelledError`, but if a pending task raises a
non-CancelledError exception after cancellation, it will propagate and escape
the function before the winner is determined. Instead of only suppressing
CancelledError in the `await task` statement, capture and log any unexpected
exceptions that occur during the cancel/drain phase, then suppress them so the
function can continue to determine the winner.

Comment on lines +100 to +129
loser_providers: list[BaseProvider] = []
errors: list[BaseException] = []

for task in done:
if task.exception() is None:
winner = tasks[task]
logger.info("Hedged winner: %s", winner.name)
# Cancel any remaining done tasks that weren't checked yet
return task.result()
errors.append(task.exception()) # type: ignore[arg-type]
exc = task.exception()
if exc is None and winner_task is None:
winner_task = task
winner_provider = tasks[task]
elif exc is None:
loser_providers.append(tasks[task])
else:
errors.append(exc)

if winner_task is not None:
logger.info("Hedged winner: %s", winner_provider.name)
# Drain any leftover successful losers so their task objects
# aren't GC'd in a 'pending result' state — that's a common
# asyncio footgun that produces "Task was destroyed but it is
# pending!" warnings at shutdown.
for loser in loser_providers:
for t in done:
if tasks.get(t) is loser:
with contextlib.suppress(asyncio.CancelledError, Exception):
t.result()
break
return HedgedResult(
response=winner_task.result(),
winner_name=winner_provider.name,
loser_names=tuple(p.name for p in loser_providers),
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm executor is the only consumer of hedged_execute / HedgedResult
rg -nP '\bhedged_execute\s*\(' --type=py
rg -nP '\bHedgedResult\b' --type=py

Repository: HrachShah/FreeRelay

Length of output: 157


🏁 Script executed:

#!/bin/bash
# First, find the hedging.py file and explore the directory structure
find . -name "hedging.py" -type f 2>/dev/null | head -20
ls -la freerelay/core/execution/ 2>/dev/null || echo "Path not found, searching for execution directory"
find . -type d -name "execution" 2>/dev/null | head -10

Repository: HrachShah/FreeRelay

Length of output: 794


🏁 Script executed:

#!/bin/bash
# Search for files that might contain HedgedResult or hedged_execute
rg "HedgedResult|hedged_execute" --type py -l 2>/dev/null | head -20

Repository: HrachShah/FreeRelay

Length of output: 157


🏁 Script executed:

#!/bin/bash
# Broader search to understand project structure
git ls-files | grep -E "(hedging|executor)" | head -20

Repository: HrachShah/FreeRelay

Length of output: 339


🏁 Script executed:

#!/bin/bash
# Read hedging.py to verify the code at lines 100-129 and see HedgedResult definition
cat -n freerelay/core/execution/hedging.py

Repository: HrachShah/FreeRelay

Length of output: 5581


🏁 Script executed:

#!/bin/bash
# Read executor.py to verify how HedgedResult is used
cat -n freerelay/core/execution/executor.py

Repository: HrachShah/FreeRelay

Length of output: 8380


🏁 Script executed:

#!/bin/bash
# Verify if there are any tests that show the intended behavior
cat -n tests/unit/test_hedging_winner_only_success.py

Repository: HrachShah/FreeRelay

Length of output: 5512


Failed loser providers are silently dropped when a winner exists, preventing their circuit breakers from recording the failure.

When a winner completes successfully, loser_providers only collects done tasks where exc is None (line 109 of hedging.py); a loser that completed with an exception is pushed into the errors list (line 111), which is consumed only when no winner is found (line 133). This means a co-running loser that returned a real error has its exception swallowed and is not included in loser_names (line 128).

In executor.py (lines 171–179), execute_hedged iterates only over result.loser_names to record circuit breaker feedback. Losers that failed are not in loser_names, so their circuit breakers never observe their failure. A struggling provider that errors while another provider wins will have its failure silently discarded, breaking the circuit breaker's ability to track degradation.

Consider exposing failed losers (with their status/exception) on HedgedResult so the executor can record a real failure for them and update their circuit breakers accordingly.

🧰 Tools
🪛 GitHub Actions: CI / 0_typecheck.txt

[error] 114-114: mypy error: Item "None" of "BaseProvider | None" has no attribute "name" [union-attr]


[error] 127-127: mypy error: Item "None" of "BaseProvider | None" has no attribute "name" [union-attr]

🪛 GitHub Actions: CI / typecheck

[error] 114-114: mypy: Item "None" of "BaseProvider | None" has no attribute "name" [union-attr]


[error] 127-127: mypy: Item "None" of "BaseProvider | None" has no attribute "name" [union-attr]

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@freerelay/core/execution/hedging.py` around lines 100 - 129, The HedgedResult
class needs to be extended to track failed losers (providers that completed with
exceptions) along with their exceptions, since currently only successful losers
are included in loser_names while failed losers are silently dropped from the
errors list when a winner exists. Modify the HedgedResult dataclass to add a new
field for failed losers with their exceptions, then in the hedging logic around
lines 100-129, populate this field by tracking which loser providers had
exceptions, and include this data when returning the HedgedResult. Finally,
update the execute_hedged function in executor.py to iterate over these failed
losers and record their failures in the circuit breaker feedback to prevent
losing track of provider degradation.

Comment on lines +91 to +93
assert a_breaker.state == CircuitState.CLOSED, (
f"winner A should have success recorded and circuit reset to CLOSED, got {a_breaker.state}"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Fix CI lint failure: line exceeds 88 chars (E501).

CI fails on Line 92 (99 > 88). Wrap the assertion message.

🧵 Proposed fix
-    assert a_breaker.state == CircuitState.CLOSED, (
-        f"winner A should have success recorded and circuit reset to CLOSED, got {a_breaker.state}"
-    )
+    assert a_breaker.state == CircuitState.CLOSED, (
+        "winner A should have success recorded and circuit reset to CLOSED, "
+        f"got {a_breaker.state}"
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
assert a_breaker.state == CircuitState.CLOSED, (
f"winner A should have success recorded and circuit reset to CLOSED, got {a_breaker.state}"
)
assert a_breaker.state == CircuitState.CLOSED, (
"winner A should have success recorded and circuit reset to CLOSED, "
f"got {a_breaker.state}"
)
🧰 Tools
🪛 GitHub Actions: CI / 1_lint.txt

[error] 92-92: E501 Line too long (99 > 88)

🪛 GitHub Actions: CI / lint

[error] 92-92: E501 Line too long (99 > 88)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_hedging_winner_only_success.py` around lines 91 - 93, The
assertion message in the test exceeds the 88-character line limit (currently 99
characters on line 92). Split the f-string message across multiple lines to
comply with the linting requirement. You can either wrap the string by breaking
it into concatenated parts or restructure how the assertion message is formatted
to keep each line under 88 characters while maintaining the readability and
meaning of the assertion about a_breaker.state and CircuitState.CLOSED.

Source: Pipeline failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant