Feature: reconcile ontologyRank against Solr state (skip already-current ontologies) #300

Merged: mdorf merged 1 commit into develop from feature/rank-propagator on Jun 23, 2026

Conversation

@mdorf mdorf commented Jun 23, 2026

Member

Summary

Makes RankSolrPropagator reconcile against Solr's actual state instead of tracking what it last wrote. For each ontology it first asks Solr how many docs are not already at the current rank (a cheap rows:0 count with a negative range filter); if none are stale it skips the ontology entirely, otherwise it cursor-scans just the stale docs and atomic-updates them. This replaces the Redis last-propagated skip-cache (removed, along with the now-obsolete force flag).
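As a rough illustration of the stale-count check described in the summary, the request parameters might be built along these lines. This is a sketch under stated assumptions, not the code in RankSolrPropagator: the helper names `stale_count_params` and `skip_ontology?` and the `submissionAcronym` field name are hypothetical, and only `ontologyRank`, the `rows: 0` count, and the negative range filter are taken from the description.

```ruby
# Build Solr select parameters that COUNT the docs of one ontology whose
# ontologyRank differs from the current rank, without fetching any docs.
#
# Assumptions (not from the PR): the helper name and the acronym field name.
def stale_count_params(acronym, rank,
                       acronym_field: "submissionAcronym",
                       rank_field: "ontologyRank")
  {
    q: "*:*",
    rows: 0, # only numFound is needed, so no documents are returned
    fq: [
      # restrict the count to a single ontology
      "#{acronym_field}:\"#{acronym}\"",
      # negative range filter: exclude docs already at the current rank
      "-#{rank_field}:[#{rank} TO #{rank}]"
    ]
  }
end

# An ontology can be skipped when Solr reports no stale docs for it.
def skip_ontology?(num_found)
  num_found.zero?
end
```

Because the filter is negative, docs that have no `ontologyRank` value at all would also be counted as stale, which fits the goal of reconciling against whatever Solr actually holds.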

Why

The previous skip-cache only knew what we had propagated, so the first run couldn't skip ontologies whose Solr rank was already correct (set at index time and never drifted) and re-runs depended on whatever a prior run happened to record. Reconciling against Solr means even the first run skips already-current ontologies, and any run is self-healing and resumable from Solr's real state.

Staging result

A full pass over all 1898 ontologies on staging completed in 107.9 s (vs a projected ~1.5–2 h of full rewrites). Spot-checked the end state directly against staging Solr — MESH (355,402 docs), NCIT (206,378), DDSS (800,621), MDRFRE (82,908) each report 0 docs not at the current rank, confirming the speed comes from skipping redundant writes, not from skipping needed ones.

Notes

…ogies

Replace the Redis last-propagated skip-cache with a per-ontology Solr
check: count docs whose ontologyRank is not already the current value
(rows:0 with a negative range fq) and skip the ontology when none are
stale. This lets even the FIRST run skip ontologies whose Solr rank
already matches (set at index time and never drifted), and makes any
re-run resume from Solr's actual state rather than from what a prior
run happened to record. Only the stale docs are scanned and updated
(cursor over the negative-range filter is safe: updated docs leave the
set only behind the id-ascending cursor). Drops the Redis dependency
and the now-obsolete force flag.

Refs ncbo/ncbo_cron#132
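
The cursor-safety argument in the commit message can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `FakeSolr` is an in-memory stand-in rather than a real Solr client, and `reconcile` is not the actual RankSolrPropagator method. It only mirrors the described flow: count stale docs, skip when there are none, otherwise page through the stale set in id-ascending order and update each page.

```ruby
# Hypothetical in-memory stand-in for Solr, just enough to exercise the loop.
class FakeSolr
  attr_reader :docs, :writes

  # docs: { "id" => { acronym: "ONT", rank: 0.1 }, ... }
  def initialize(docs)
    @docs = docs
    @writes = 0
  end

  # Equivalent of the rows:0 count with a negative range filter.
  def stale_count(acronym, rank)
    @docs.count { |_, d| d[:acronym] == acronym && d[:rank] != rank }
  end

  # Equivalent of one cursor page: stale docs strictly after the cursor,
  # sorted by id ascending. "*" plays the role of the initial cursor mark.
  def stale_page(acronym, rank, cursor, rows)
    ids = @docs.select do |id, d|
      d[:acronym] == acronym && d[:rank] != rank &&
        (cursor == "*" || id > cursor)
    end.keys.sort.first(rows)
    { ids: ids, next_cursor: ids.empty? ? cursor : ids.last }
  end

  # Equivalent of an atomic update that sets only the rank field.
  def atomic_set_rank(ids, rank)
    ids.each { |id| @docs[id][:rank] = rank }
    @writes += ids.size
  end
end

# Reconcile one ontology against the (fake) index state.
def reconcile(solr, acronym, rank, rows: 2)
  return :skipped if solr.stale_count(acronym, rank).zero?

  cursor = "*"
  loop do
    page = solr.stale_page(acronym, rank, cursor, rows)
    solr.atomic_set_rank(page[:ids], rank) unless page[:ids].empty?
    # An unchanged cursor means the scan is exhausted.
    break if page[:next_cursor] == cursor
    cursor = page[:next_cursor]
  end
  :updated
end
```

Docs updated on one page drop out of the stale set, but they all sort at or before the cursor, so the next page (ids strictly after the cursor) is unaffected. A second call with the same rank then returns `:skipped` because the count is zero.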

codecov Bot commented Jun 23, 2026


Codecov Report

❌ Patch coverage is 94.44444% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80.97%. Comparing base (e559fe9) to head (43c6dec).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ogies_linked_data/services/rank_solr_propagator.rb | 94.44% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #300      +/-   ##
===========================================
- Coverage    81.06%   80.97%   -0.09%     
===========================================
  Files          101      101              
  Lines         6902     6892      -10     
===========================================
- Hits          5595     5581      -14     
- Misses        1307     1311       +4     

| Flag | Coverage Δ | |
|---|---|---|
| ag | 80.86% <94.44%> (+<0.01%) | ⬆️ |
| fs | 80.90% <94.44%> (-0.02%) | ⬇️ |
| gd | 80.91% <94.44%> (+0.04%) | ⬆️ |
| unittests | 80.97% <94.44%> (-0.09%) | ⬇️ |
| vo | 80.89% <94.44%> (-0.11%) | ⬇️ |

@mdorf mdorf merged commit c6f83eb into develop Jun 23, 2026
11 of 12 checks passed
@mdorf mdorf changed the title from "Feature: Reconcile ontologyRank against Solr state (skip already-current ontologies)" to "Feature: reconcile ontologyRank against Solr state (skip already-current ontologies)" on Jun 23, 2026