feat: fail safe when configuring the optional recency replica#119

Closed
macdiesel wants to merge 1 commit into bbeggs/newest-courses-sort-replica from bbeggs/recency-replica-failsafe

Conversation

@macdiesel

Member

Summary

Stacked on top of #118 (the new recency-replica index). Splits the fail-safe behavior into its own PR so the index feature and the resilience change can be reviewed independently.

Base branch: bbeggs/newest-courses-sort-replica (PR #118). I'll retarget this to master once #118 merges. The diff here is only the fail-safe change.

What & why

Configuring the recently-published replica is an optional, additive step. Before this change, if set_replica_index_settings raised AlgoliaException (for example, because the replica was not yet declared or Algolia returned a transient error), the exception propagated out of configure_algolia_index and aborted the entire reindex, including configuration of the primary relevance index that every learner's search depends on.

This change wraps only the recency-replica step, so the exception is logged and the step is skipped, leaving the primary index and its base replica configured. The degraded state is always the safe base (relevance) sort; the recency replica catches up on the next successful run.
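A minimal sketch of the wrapping described above, not the project's actual code. `AlgoliaException` is a local stand-in for the real Algolia exception class, and the function signature, the settings arguments, and `FakeClient` are hypothetical names introduced for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class AlgoliaException(Exception):
    """Local stand-in for the Algolia client's exception (assumption)."""


class FakeClient:
    """Hypothetical client used only to exercise the sketch."""

    def __init__(self, fail_replica=False):
        self.fail_replica = fail_replica
        self.configured = []

    def set_index_settings(self, settings):
        self.configured.append("primary")

    def set_replica_index_settings(self, settings):
        # Low-level method: it re-raises rather than swallowing errors.
        if self.fail_replica:
            raise AlgoliaException("replica not yet declared")
        self.configured.append("recency")


def configure_algolia_index(client, primary_settings, recency_settings):
    """Configure the primary index, then best-effort configure the recency replica.

    Returns True if the recency replica was configured, False if it was skipped.
    """
    # The primary step is deliberately not wrapped: a failure here still aborts.
    client.set_index_settings(primary_settings)

    try:
        # Optional, additive step: only this call is guarded.
        client.set_replica_index_settings(recency_settings)
    except AlgoliaException:
        logger.exception("Recency replica configuration failed; keeping relevance sort.")
        return False
    return True
```

With `FakeClient(fail_replica=True)`, the call returns `False` and the primary index is still recorded as configured, which is the degraded-but-safe state the description refers to.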

  • Kept set_replica_index_settings re-raising (a faithful low-level client method); the "this replica is optional" policy lives at the orchestration layer in configure_algolia_index.
  • New test test_configure_algolia_index_recency_replica_failure_is_safe forces the replica step to raise and asserts: no propagation, primary + base replica still configured, failure logged.
  • ADR 0014 updated with the backend fail-safe decision + consequence.
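The assertions listed for the new test could look roughly like the sketch below. It is not the real test_configure_algolia_index_recency_replica_failure_is_safe: the function under test is a compact stand-in, `AlgoliaException` is a local placeholder, and the `replica=` keyword and logger name are assumptions made for illustration.

```python
import logging
import unittest
from unittest import mock

logger = logging.getLogger("algolia_utils_sketch")


class AlgoliaException(Exception):
    """Local stand-in for the Algolia client's exception (assumption)."""


def configure_algolia_index(client):
    """Compact stand-in for the orchestration function under test."""
    client.set_index_settings()
    client.set_replica_index_settings(replica="base")
    try:
        client.set_replica_index_settings(replica="recency")
    except AlgoliaException:
        logger.exception("Recency replica configuration failed; skipping.")


class RecencyReplicaFailSafeTest(unittest.TestCase):
    def test_recency_replica_failure_is_safe(self):
        client = mock.Mock()

        def replica_side_effect(replica):
            # Force only the recency step to fail.
            if replica == "recency":
                raise AlgoliaException("replica not yet declared")

        client.set_replica_index_settings.side_effect = replica_side_effect

        with self.assertLogs("algolia_utils_sketch", level="ERROR") as captured:
            configure_algolia_index(client)  # must not propagate the exception

        # Primary index and base replica were still configured.
        client.set_index_settings.assert_called_once_with()
        client.set_replica_index_settings.assert_any_call(replica="base")
        # The failure was logged.
        self.assertIn("Recency replica configuration failed", captured.output[0])
```

Using `assertLogs` checks the "failure logged" condition without depending on the project's logging configuration.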

Test

  • 6 targeted tests pass, including the new safe-fail test; the full test_algolia_utils.py is green except for 3 pre-existing, time-sensitive test_get_course_run* failures that also fail on master.
  • isort + pylint clean on changed files.

🤖 Generated with Claude Code

@macdiesel force-pushed the bbeggs/recency-replica-failsafe branch 2 times, most recently from 126d2bb to 140fa30 on June 12, 2026 16:00
@codecov

codecov Bot commented Jun 12, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (bbeggs/newest-courses-sort-replica@a7eda5c). Learn more about missing BASE report.

Additional details and impacted files
@@                          Coverage Diff                          @@
##             bbeggs/newest-courses-sort-replica     #119   +/-   ##
=====================================================================
  Coverage                                      ?   85.62%           
=====================================================================
  Files                                         ?      109           
  Lines                                         ?     6638           
  Branches                                      ?      814           
=====================================================================
  Hits                                          ?     5684           
  Misses                                        ?      821           
  Partials                                      ?      133           


@macdiesel force-pushed the bbeggs/recency-replica-failsafe branch 3 times, most recently from cb896fb to 072ca47 on June 15, 2026 18:10

Configuring a sort replica is additive and must never abort the reindex. Wrap each
per-replica set_index_settings(index_name=...) call in the unified registry loop so
an AlgoliaException is logged and skipped, leaving the primary index and the other
replicas configured. The degraded state is always the safe base (relevance) sort;
the failed replica catches up on the next successful run. Record the fail-safe in
ADR 0014 and cover it with a test (base replica succeeds while recency fails).
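
A rough sketch of the per-replica loop this commit message describes, under stated assumptions: the registry shape (replica index name mapped to settings), `configure_replicas`, and `FlakyClient` are hypothetical, and `AlgoliaException` is a local stand-in. Only the `set_index_settings(index_name=...)` call shape comes from the message itself.

```python
import logging

logger = logging.getLogger(__name__)


class AlgoliaException(Exception):
    """Local stand-in for the Algolia client's exception (assumption)."""


class FlakyClient:
    """Hypothetical client that fails for a chosen set of index names."""

    def __init__(self, failing=()):
        self.failing = set(failing)
        self.configured = []

    def set_index_settings(self, settings, index_name):
        if index_name in self.failing:
            raise AlgoliaException(f"cannot configure {index_name}")
        self.configured.append(index_name)


def configure_replicas(client, registry):
    """Configure each replica independently and return the names that failed."""
    failed = []
    for index_name, settings in registry.items():
        try:
            client.set_index_settings(settings, index_name=index_name)
        except AlgoliaException:
            # One replica failing must not stop the others or the reindex.
            logger.exception("Skipping replica %s; it will retry on the next run.", index_name)
            failed.append(index_name)
    return failed
```

Because the `try` sits inside the loop, a failure on one replica leaves every other replica's configuration intact.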

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@macdiesel

Member Author

Folded into #118 — the fail-safe commit was cherry-picked onto bbeggs/newest-courses-sort-replica, so the replica feature and its safe configuration now ship as a single PR. Closing this in favor of #118.
