Skip to content

feat: "newest courses first" Algolia sort — config-gated replica registry + fail-safe#118

Open
macdiesel wants to merge 20 commits into
masterfrom
bbeggs/newest-courses-sort-replica
Open

feat: "newest courses first" Algolia sort — config-gated replica registry + fail-safe#118
macdiesel wants to merge 20 commits into
masterfrom
bbeggs/newest-courses-sort-replica

Conversation

@macdiesel

@macdiesel macdiesel commented Jun 12, 2026

Copy link
Copy Markdown
Member

Summary

Backend support for a "newest courses first" sort on the enterprise Learner Portal search page, via a recency-sorted Algolia replica. Builds and configures the replica; the waffle flag (edx-enterprise) and the MFE point search at it (separate PRs).

This PR also generalizes replica handling into one config-gated registry and makes replica configuration fail-safe. (The fail-safe was briefly split into a separate PR #119 — now folded back in, so the replica feature and its safe configuration ship together.)

What's here

  • Recency replicarecently_published_timestamp (earliest published course-run start) added to course records; a replica that leads customRanking with desc(recently_published_timestamp). Undated courses get 0 so they sort last (not the far-future ALGOLIA_DEFAULT_TIMESTAMP).
  • Unified replica registryALGOLIA_REPLICA_CONFIG_KEYS (in api_client/constants) is the single source of truth for all sort replicas (base duration replica + recency). _build_algolia_replicas, configure_algolia_index, and the secured-key restrictIndices all loop it. A replica is declared/configured/restricted only when its settings.ALGOLIA index-name key is set — inert by default (no virtual(None)), so it's safe to deploy ahead of ops enabling it. Adding a future sort is a registry entry + the field its ranking sorts on.
  • Unified set_index_settings — folded the old primary_index boolean / separate replica method into one name-based set_index_settings(index_settings, index_name=None) routing through _get_index.
  • Fail-safe configuration — each replica's set_index_settings is wrapped so an AlgoliaException is logged and skipped; one replica failing never aborts the reindex (primary + other replicas stay configured). Degraded state is always the safe base sort.
  • Docs — ADR 0014 records the design (registry, config-vs-flag layering, the missing-index open question + operational mitigation); REPLICA_INDEX_NAME documented as the duration replica that powers MFE video search.
  • Fixed incremental_reindex_algolia for the unified set_index_settings signature.

Test

  • Full test_algolia_utils.py / test_algolia.py / command tests green (only the 3 pre-existing time-sensitive test_get_course_run* failures remain, which fail on master too). isort + pylint clean.

Rollout (see ADR 0014)

Deploy + reindex_algolia here before ops sets the MFE env var / enables the enterprise.search_default_sort_newest waffle flag.

🤖 Generated with Claude Code

@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.96%. Comparing base (3975505) to head (672f1e4).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #118      +/-   ##
==========================================
+ Coverage   85.67%   85.96%   +0.29%     
==========================================
  Files         109      109              
  Lines        6700     6777      +77     
  Branches      827      840      +13     
==========================================
+ Hits         5740     5826      +86     
+ Misses        830      821       -9     
  Partials      130      130              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@macdiesel macdiesel force-pushed the bbeggs/newest-courses-sort-replica branch from ced0c19 to bdb8e35 Compare June 12, 2026 02:04
macdiesel added a commit that referenced this pull request Jun 12, 2026
Configuring the recently-published replica is an optional, additive step. If it
raises AlgoliaException (replica not yet declared, transient error), the failure
would otherwise propagate out of configure_algolia_index and abort the whole
reindex -- taking down configuration of the primary relevance index that every
learner depends on.

Wrap the recency-replica set_index_settings(index_name=...) call so the exception
is logged and skipped, leaving the primary index and its base replica configured.
The degraded state is always the safe base (relevance) sort; the recency replica
catches up on the next successful run. Record this fail-safe in ADR 0014 and cover
it with a test.

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
macdiesel added a commit that referenced this pull request Jun 12, 2026
Configuring the recently-published replica is an optional, additive step. If it
raises AlgoliaException (replica not yet declared, transient error), the failure
would otherwise propagate out of configure_algolia_index and abort the whole
reindex -- taking down configuration of the primary relevance index that every
learner depends on.

Wrap the recency-replica set_index_settings(index_name=...) call so the exception
is logged and skipped, leaving the primary index and its base replica configured.
The degraded state is always the safe base (relevance) sort; the recency replica
catches up on the next successful run. Record this fail-safe in ADR 0014 and cover
it with a test.

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
macdiesel added a commit that referenced this pull request Jun 15, 2026
Configuring an optional sort replica is additive and must never abort the
reindex. Wrap each per-replica set_index_settings(index_name=...) call (now driven
by the registry loop in configure_algolia_index) so an AlgoliaException is logged
and skipped, leaving the primary index and base replica configured. The degraded
state is always the safe base (relevance) sort; the replica catches up on the next
successful run. Record the fail-safe in ADR 0014 and cover it with a test.

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
macdiesel added a commit that referenced this pull request Jun 15, 2026
Configuring a sort replica is additive and must never abort the reindex. Wrap each
per-replica set_index_settings(index_name=...) call in the unified registry loop so
an AlgoliaException is logged and skipped, leaving the primary index and the other
replicas configured. The degraded state is always the safe base (relevance) sort;
the failed replica catches up on the next successful run. Record the fail-safe in
ADR 0014 and cover it with a test (base replica succeeds while recency fails).

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
macdiesel added a commit that referenced this pull request Jun 15, 2026
Configuring a sort replica is additive and must never abort the reindex. Wrap each
per-replica set_index_settings(index_name=...) call in the unified registry loop so
an AlgoliaException is logged and skipped, leaving the primary index and the other
replicas configured. The degraded state is always the safe base (relevance) sort;
the failed replica catches up on the next successful run. Record the fail-safe in
ADR 0014 and cover it with a test (base replica succeeds while recency fails).

Stacked on the new-index PR (#118).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@macdiesel macdiesel changed the title feat: recently-published Algolia replica for "newest courses first" sort feat: "newest courses first" Algolia sort — config-gated replica registry + fail-safe Jun 15, 2026
macdiesel and others added 11 commits June 15, 2026 15:18
…t" sort

Indexes recently_published_timestamp (the course's earliest published run start, sharing the is_course_new_content signal) and configures a desc()-sorted Algolia replica so the Learner Portal search can offer a "newest courses first" sort. The replica is only declared/configured when RECENTLY_PUBLISHED_REPLICA_INDEX_NAME is set (safe no-op until ops provisions it); the name is also added to the secured-API-key restrictIndices so the MFE's scoped key can query it.

ENT-11384

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reframes the docs/references note as docs/decisions/0014, an ADR whose central (proposed) decision is how to handle the recency replica being unavailable: rely on config-gating (covered today) + rollout order + the waffle-flag kill-switch, rather than runtime index-existence detection. Records the runtime-fallback alternative as the escape hatch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ourse-object field

Raises patch coverage on the newest-sort change: refactors the module-level replica-list construction into a testable _build_algolia_replicas() (covers both the configured and unconfigured branches); adds a _algolia_object_from_product course test asserting recently_published_timestamp; and adds generate_secured_api_key tests covering the recency replica being included/omitted from restrictIndices (the property and both branches).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The recency-replica coverage tests call the module-level _build_algolia_replicas
and _algolia_object_from_product helpers directly. Add the method-scoped
# pylint: disable=protected-access used elsewhere in this file so CI lint passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the two-valued primary_index boolean with an optional index_name that
routes through the existing _get_index() helper (which already defaults to the
primary index). This folds the separate set_replica_index_settings method back
into set_index_settings: one method now configures the primary, the base
REPLICA_INDEX_NAME replica, and the recently-published replica uniformly.

Removes the near-duplicate method (same guard + try/except/log) and the docstring
cross-reference that signalled the overlap. All three call sites live in
configure_algolia_index; tests updated to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The unify refactor removed set_index_settings' primary_index kwarg, breaking this
caller (E1123 unexpected-keyword-arg). Configure the replica by name instead, and
wire the SDK client onto the AlgoliaSearchClient so the index_name lookup
(_get_index) resolves the replica -- the old primary_index=False path used the
cached replica_index handle and didn't need the client set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Generalize the single recency-replica gate into a registry of optional sort
replicas (OPTIONAL_ALGOLIA_REPLICA_CONFIG_KEYS in api_client/constants, the shared
source of truth for which replicas exist). The indexer (_build_algolia_replicas,
configure_algolia_index) and the secured-API-key restriction now loop over the
registry: each replica is declared, configured, and made queryable ONLY when its
settings.ALGOLIA index-name key is non-empty.

This makes "not in config -> not used" structural and obvious (inert by default,
the safe thing to check in), and reduces a future sort order to a registry entry
plus the field its customRanking sorts on -- no change to the gating logic. The
waffle flag remains the instant, scoped runtime on/off; the config gate only
governs whether the replica infrastructure exists. Updates ADR 0014.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The registry previously covered only the optional sort replicas; the base
duration replica was still special-cased (hardcoded algolia_replica_index,
a dedicated configure call, a separate secured-key append). Generalize so ALL
replicas flow through one registry, base first:

- ALGOLIA_REPLICA_CONFIG_KEYS (api_client/constants) now lists every replica
  (REPLICA_INDEX_NAME, then RECENTLY_PUBLISHED_REPLICA_INDEX_NAME), in
  declaration order, as the single source of truth.
- _build_algolia_replicas / _configured_replicas / configure_algolia_index and
  the secured-key restrictIndices all loop the registry; each replica is
  declared/configured/restricted only when its index name is set. This also
  fixes a latent virtual(None) for the base replica when REPLICA_INDEX_NAME is
  unset.
- The client's init_index/index_exists still eagerly manage the primary +
  base-replica handles -- the required-core-pair lifecycle, intentionally left
  separate from "which replicas get declared/configured."

Updates ADR 0014 and tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
REPLICA_INDEX_NAME is the base replica sorted by desc(duration); the learner-portal
MFE points its video search (SEARCH_INDEX_IDS.VIDEOS) at this index (its
ALGOLIA_REPLICA_INDEX_NAME env var). Document this at the settings default and in
the replica registry, and note that the ALGOLIA dict is replaced (not merged) from
deployment config. No rename -- the key is an established, ops-facing config name.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Configuring a sort replica is additive and must never abort the reindex. Wrap each
per-replica set_index_settings(index_name=...) call in the unified registry loop so
an AlgoliaException is logged and skipped, leaving the primary index and the other
replicas configured. The degraded state is always the safe base (relevance) sort;
the failed replica catches up on the next successful run. Record the fail-safe in
ADR 0014 and cover it with a test (base replica succeeds while recency fails).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Step-by-step engineer guide: declare the settings key, register it in
ALGOLIA_REPLICA_CONFIG_KEYS, index the field the sort needs, define the replica's
customRanking, and map key -> settings. Covers the names-in-settings vs
definitions-in-code split, the deploy/reindex + ALGOLIA-is-replaced caveat, the
optional MFE wiring, and the inert-by-default / fail-safe properties. Cross-refs
ADR 0014 and uses a worked "price: low to high" example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@macdiesel macdiesel force-pushed the bbeggs/newest-courses-sort-replica branch from c079f32 to da3bfa4 Compare June 15, 2026 19:23
macdiesel and others added 3 commits June 15, 2026 15:23
…art, any status)

Align the newest-courses sort with master's ENT-11386: is_new_content and the sort now
share _earliest_course_run_start, which keys off the earliest course-run start of ANY
status (the Discovery release date) rather than published runs only -- so an old course
with a recent re-publish no longer looks new or sorts as newest.

Rename to match the new semantics (the index-name key is brand-new and unreleased, so no
ops migration is needed):
  recently_published_timestamp           -> recently_released_timestamp
  get_course_recently_published_timestamp -> get_course_recently_released_timestamp
  RECENTLY_PUBLISHED_REPLICA_INDEX_NAME   -> RECENTLY_RELEASED_REPLICA_INDEX_NAME
  ALGOLIA_RECENTLY_PUBLISHED_..._SETTINGS -> ALGOLIA_RECENTLY_RELEASED_..._SETTINGS
  _earliest_published_course_run_start    -> _earliest_course_run_start

Updates tests, ADR 0014, and the how-to guide. Matching MFE rename follows in #1511.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- incremental_reindex_algolia: drop the now-dead replica_index init; the
  replica is configured by name via set_index_settings(index_name=...),
  which initializes it lazily through _get_index().
- configure_algolia_index: log a single-line WARNING when a replica fails
  to configure (set_index_settings already logs the traceback before
  re-raising), so a failed replica no longer emits two stack traces.
- configure_algolia_index: build the primary's declared replicas list
  dynamically from settings.ALGOLIA rather than an import-time global, so
  it stays consistent with the replicas actually configured and can be
  varied in tests via override_settings. Drop the ALGOLIA_REPLICAS global.
- ADR 0014: document that the virtual replica mirrors non-course records,
  which have no recently_released_timestamp and so sort last by design.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@macdiesel macdiesel marked this pull request as ready for review June 15, 2026 19:54
@macdiesel macdiesel requested review from a team as code owners June 15, 2026 19:54
Copilot AI review requested due to automatic review settings June 15, 2026 19:54

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds backend support for a “newest courses first” Algolia sort by introducing a recency-based replica index, while also consolidating replica handling behind a config-gated registry and making replica configuration resilient to partial failures.

Changes:

  • Adds recently_released_timestamp to course Algolia records and defines a recency-sorted replica (desc(recently_released_timestamp)).
  • Introduces a shared, config-gated replica registry (ALGOLIA_REPLICA_CONFIG_KEYS) and updates index configuration + secured-key restrictions to loop over configured replicas only.
  • Makes replica configuration fail-safe (replica failures are logged and skipped so primary configuration continues) and updates incremental reindex tooling/tests for the unified set_index_settings API.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
enterprise_catalog/settings/base.py Documents and adds the new RECENTLY_RELEASED_REPLICA_INDEX_NAME setting key.
enterprise_catalog/apps/search/management/commands/incremental_reindex_algolia.py Updates incremental reindex flow to configure replica settings via index_name=.
enterprise_catalog/apps/search/management/commands/tests/test_incremental_reindex_algolia.py Adjusts tests for lazy replica initialization via name-based settings calls.
enterprise_catalog/apps/catalog/algolia_utils.py Adds replica registry plumbing, recency replica settings, fail-safe per-replica configuration, and recently_released_timestamp computation.
enterprise_catalog/apps/catalog/tests/test_algolia_utils.py Adds coverage for recency timestamp, configured-replica gating, and safe-fail configuration behavior.
enterprise_catalog/apps/api_client/constants.py Defines ALGOLIA_REPLICA_CONFIG_KEYS as the shared replica registry.
enterprise_catalog/apps/api_client/algolia.py Implements name-based set_index_settings routing and secured-key restriction over all configured replicas.
enterprise_catalog/apps/api_client/tests/test_algolia.py Adds tests for set_index_settings(index_name=...) and restrictIndices behavior with/without the recency replica.
docs/how_to/add_an_algolia_sort_replica.rst Adds a step-by-step guide for adding future Algolia sort replicas via the registry pattern.
docs/decisions/0014-newest-courses-sort-replica.rst Adds ADR documenting the design, rollout strategy, and operational tradeoffs.

Comment thread enterprise_catalog/apps/api_client/algolia.py

@pwnage101 pwnage101 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic LGTM! Just have some nits and strong desires to rename things.

Comment thread enterprise_catalog/apps/catalog/algolia_utils.py Outdated
Comment thread enterprise_catalog/apps/catalog/algolia_utils.py Outdated
Comment thread enterprise_catalog/apps/catalog/algolia_utils.py Outdated
Comment thread enterprise_catalog/settings/base.py Outdated
Comment thread docs/decisions/0014-newest-courses-sort-replica.rst
- Rename _build_algolia_replicas -> _get_algolia_replica_names (it returns
  a list of replica names, not replicas) and the local replicas ->
  replica_names; add a list[str] return annotation. Update references in
  tests and the how-to guide.
- Add an int return annotation to get_course_recently_released_timestamp so
  the int return reads clearly despite the datetime-sounding name.
- settings/base.py: document each replica's purpose (the MFE video-search
  sort; the newest-courses sort) rather than the volatile customRanking
  detail, which now lives only in the *_REPLICA_INDEX_SETTINGS dicts.
- ADR 0014: move the design (what the replica does + the rollout) out of
  Context into Decision, leaving Context to background and the problem
  statement.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 16, 2026 01:56
…ilure

When set_index_settings() is called without index_name, it targets the cached
handle (self.algolia_index), which a caller may have wired to an alternate index
(the incremental reindex command's --index-name path). The failure log used
self.algolia_index_name (the configured primary name), which is stale or empty
for that caller. Derive the effective name from the handle instead, falling back
to the configured name. Addresses Copilot review feedback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Comment on lines +273 to +276
for config_key in ALGOLIA_REPLICA_CONFIG_KEYS:
index_name = settings.ALGOLIA.get(config_key)
if index_name:
configured.append((index_name, ALGOLIA_REPLICA_INDEX_SETTINGS_BY_CONFIG_KEY[config_key]))
Comment on lines +472 to +478
# Declare the replicas dynamically (from the current settings.ALGOLIA) rather than from an
# import-time global, so the set the primary declares stays consistent with the replicas
# _configured_replicas() actually configures below -- and so tests can vary it via
# override_settings.
primary_settings = {**ALGOLIA_INDEX_SETTINGS, 'replicas': _get_algolia_replica_names()}
algolia_client.set_index_settings(primary_settings)
for index_name, index_settings in _configured_replicas():
…initialized client

Addresses two Copilot review comments:
- _configured_replicas(): a configured replica key with no entry in
  ALGOLIA_REPLICA_INDEX_SETTINGS_BY_CONFIG_KEY (registry drift) is now skipped
  with an error log instead of raising KeyError and aborting the reindex --
  consistent with the per-replica fail-safe behavior.
- configure_algolia_index(): fail fast with ImproperlyConfigured when the
  primary index handle is missing (init_index() bailed on missing INDEX_NAME /
  credentials). Since ALGOLIA is replaced per-environment, this prevents a
  misconfigured run from silently no-op'ing and reporting success.

Adds tests for both paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +24 to +27
* An Algolia *virtual* replica exists as soon as it is declared on the primary
index's settings (it mirrors the primary's records); it does not wait for a
populated record set. So once this service is deployed and ``reindex_algolia``
has run, the replica exists.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Selecting virtual replicas would be an Alternative Considered, not part of the Context section. Please move this below.

Also I don't think I agree with this as a foundation for rejecting virtual replicas. Virtual replicas simply don't support fully customizable sort orders, since there's always a built-in "relevance" factor as the top priority sort order. We'd use virtual replicas to save money and record usage, but the tradeoff is less precise and non-deterministic sorting.

Comment on lines +21 to +23
* ``ALGOLIA`` is *replaced* (not merged) from the deployment YAML, so a replica
is only live once ops sets its index-name key in ``edx-internal`` and a
``reindex_algolia`` run declares it on the primary.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI docs do this a lot where they explain low-level details, rather than what actually matters and what the end goals are. Very few people will read this in the first place, let alone understand it and have the correct takeaway, even if it is factually true.

My version is simpler:

Suggested change
* ``ALGOLIA`` is *replaced* (not merged) from the deployment YAML, so a replica
is only live once ops sets its index-name key in ``edx-internal`` and a
``reindex_algolia`` run declares it on the primary.
* The fundamental design of the enterprise-catalog logic assumes we will only provision one replica, so there is only one setting (``ALGOLIA['REPLICA_INDEX_NAME']``) and one python function to provision that one replica.

Comment on lines +98 to +99
Consequences
------------

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one missing consequence: cost. Another full replica (non-virtual) doubles our record count.

return settings.ALGOLIA.get('REPLICA_INDEX_NAME')

@property
def replica_index_names(self):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the amount of misdirection here is tripping me up. can we make it more directly settings-driven, something like:

# settings/base.py
ALGOLIA = {
    'INDEX_NAME': '',
    # Base replica: the long-standing sort the learner-portal MFE points its video search
    # (SEARCH_INDEX_IDS.VIDEOS) at -- the MFE's ALGOLIA_REPLICA_INDEX_NAME env var. Its
    # customRanking lives in ALGOLIA_REPLICA_INDEX_SETTINGS.
    'REPLICA_INDEX_NAME': '',
    'ADDITIONAL_REPLICA_INDEX_SETTINGS': {'newest_courses_index_repl': 'ALGOLIA_RECENTLY_RELEASED_REPLICA_INDEX_SETTINGS'}, 
    'APPLICATION_ID': '',
    'API_KEY': '',
}

this defines both the additional replica index names that are enabled and which algolia index setting to use.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but actually, that's still a lot of indirection, which is a code smell (with the original settings/algolia_utils.py relationship, not with your code). can we just yoink the settings values from algolia_utils into settings?

ALGOLIA_RECENTLY_RELEASED_REPLICA_INDEX_SETTINGS = {
    'customRanking': [
        'desc(recently_released_timestamp)',
        'asc(metadata_language)',
        'asc(visible_via_association)',
        'asc(created)',
        'desc(course_bayesian_average)',
        'desc(recent_enrollment_count)',
    ],
}
ALGOLIA = {
    'INDEX_NAME': '',
    # Base replica: the long-standing sort the learner-portal MFE points its video search
    # (SEARCH_INDEX_IDS.VIDEOS) at -- the MFE's ALGOLIA_REPLICA_INDEX_NAME env var. Its
    # customRanking lives in ALGOLIA_REPLICA_INDEX_SETTINGS.
    'REPLICA_INDEX_NAME': '',
    'ADDITIONAL_REPLICA_INDEX_SETTINGS': {'newest_courses_index_repl': ALGOLIA_RECENTLY_RELEASED_REPLICA_INDEX_SETTINGS}, 
    'APPLICATION_ID': '',
    'API_KEY': '',
}

in a later PR, we could clean up the rest of the algolia settings later to follow a similar pattern. I think that will be much more straightforward everywhere.

Copilot AI review requested due to automatic review settings June 16, 2026 14:03
…text)

Addresses Troy's ADR comments:
- Context: state the pre-existing single-replica design assumption (one setting,
  one provisioning function) as historical background, and frame the problem as
  adding a *second* replica.
- Decision: state explicitly that we declare a *virtual* replica.
- Alternatives considered: add 'standard (non-virtual) replica' -- deterministic
  newest-first but ~doubles record count/cost; we accept virtual's relevance-bias
  tradeoff to avoid that. Documents the escape hatch.
- Consequences: add the 'no added record cost' consequence (a standard replica
  would roughly double the indexed record count).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.

macdiesel and others added 2 commits June 16, 2026 11:07
…A_INDEX_SETTINGS)

Per iloveagent57's review, collapse the replica registry's indirection into a single
settings-driven map:

- settings/base.py: define ALGOLIA['ADDITIONAL_REPLICA_INDEX_SETTINGS'], an
  index_name -> index settings map (config-as-code), and move the recency
  customRanking constant (ALGOLIA_RECENTLY_RELEASED_REPLICA_INDEX_SETTINGS) here.
- settings/production.py: add 'ALGOLIA' to DICT_UPDATE_KEYS so the deployment YAML
  is merged (not replaced) -- ops can override per-env index names/credentials while
  the code-defined replica settings are preserved.
- Remove ALGOLIA_REPLICA_CONFIG_KEYS (constants.py) and
  ALGOLIA_REPLICA_INDEX_SETTINGS_BY_CONFIG_KEY (algolia_utils.py); _get_algolia_replica_names,
  _configured_replicas, and replica_index_names now read settings.ALGOLIA directly
  (base REPLICA_INDEX_NAME + ADDITIONAL_REPLICA_INDEX_SETTINGS keys). No more cross-module
  registry, so the constants.py circular-import workaround is gone and the settings-map
  drift (a KeyError risk Copilot flagged) is structurally impossible.

Behavior change: the recency replica now ships declared in code (always declared in
every env once reindexed), rather than being inert until ops set an index name. The
replica is virtual (no record cost), and user-facing exposure is still gated by the
enterprise.search_default_sort_newest waffle flag + Optimizely experiment.

Updates ADR 0014 and the how-to guide; rewrites the affected tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 16, 2026 16:14

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Comment on lines +453 to +457
# init_index() logs and returns without setting algolia_index when a required name
# (INDEX_NAME / REPLICA_INDEX_NAME) or credential is missing. set_index_settings() would
# then silently no-op for every index, making a misconfigured reindex look successful.
# Since ALGOLIA is replaced (not merged) per-environment, fail loudly here instead.
raise ImproperlyConfigured(
Comment on lines +110 to +113
#. The replica's name and settings ship in code (``ADDITIONAL_REPLICA_INDEX_SETTINGS``), so no
edx-internal change is required to *declare* it. If an environment needs a different index name,
ops overrides ``ALGOLIA['ADDITIONAL_REPLICA_INDEX_SETTINGS']`` in edx-internal -- because
``ALGOLIA`` is merged, only the keys ops sets are overridden.
Comment on lines +4 to +8
Status
------

Proposed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants