fix(search): never set live refresh_interval to -1 on reindex promotion #29153
harshach wants to merge 1 commit into
…on promote

A misconfigured liveIndexSettings.refreshInterval of "-1" (refresh disabled) was re-applied verbatim to the promoted index by pickRefreshInterval, leaving newly indexed documents unsearchable until a manual _refresh -- the "reindex finishes but the page is empty" symptom.

Guard the live-revert so a disabled refresh is never used as the live value: override to 1s and log a warning. Both promotion paths (finalizeReindex and promoteEntityIndex) funnel through pickRefreshInterval, so the guard covers the centralized and per-entity flows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
❌ PR checklist incomplete
This PR cannot be merged until the following are addressed on its linked issue:
The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit); issue/project changes do not re-trigger it automatically. Maintainers can bypass this check by adding the
Code Review ✅ Approved
Prevents the reindex promotion from applying a disabled refresh_interval (-1) to live indexes by overriding it to 1s. This ensures newly indexed documents remain searchable, and the fix is verified by a new unit test.
@Test
@DisplayName("Live refresh_interval '-1' is never applied as a live value (guard -> 1s)")
void liveRefreshDisabledIsOverriddenToDefault() {
  // Misconfiguration: liveIndexSettings.refreshInterval is "-1" (refresh disabled), e.g.
  // copied from the bulk side or a stale saved config. Without the guard the revert would
  // faithfully re-apply "-1" to the promoted index, leaving it unsearchable until a manual
  // _refresh (the "reindex finishes but the page is empty" symptom). The revert must override
  // it back to the near-real-time default.
  org.openmetadata.schema.system.IndexSettings live =
      new org.openmetadata.schema.system.IndexSettings().withRefreshInterval("-1");
  org.openmetadata.schema.system.BulkIndexOverrides bulk =
      new org.openmetadata.schema.system.BulkIndexOverrides().withRefreshInterval("-1");
  String json = DefaultRecreateHandler.buildRevertJson(live, bulk);
  assertNotNull(json);
  assertTrue(json.contains("\"refresh_interval\":\"1s\""));
  assertFalse(json.contains("\"refresh_interval\":\"-1\""));
}
Test doesn't isolate the primary bug scenario
The new test passes -1 for both live and bulk, but because the live branch wins (first if in pickRefreshInterval), bulk's value never influences the result. The scenario where the misconfiguration exists on live alone (live="-1", bulk=null) — i.e., no bulk-reindex was active — is the most direct form of the bug and goes untested. A separate parameterized case for bulk=null would pin that path and prevent a future refactor from inadvertently skipping the guard only when bulk is absent.
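The missing case can be illustrated with a small table of inputs and expected results. This is only a sketch: `pick` is a hypothetical, string-only stand-in for the guarded selection described in this PR, not the real `DefaultRecreateHandler.pickRefreshInterval`, and the class and method names are invented for illustration. In the actual test class the same rows would presumably drive a parameterized JUnit test against `buildRevertJson`.

```java
import java.util.Objects;

// Hypothetical sketch: none of these names exist in the OpenMetadata code base.
final class LiveOnlyMisconfigCases {

  // Stand-in for the guarded selection: live wins if set, a bulk override alone
  // falls back to the "1s" default, and a resolved "-1" is overridden to "1s".
  static String pick(String live, String bulk) {
    String chosen = live != null ? live : (bulk != null ? "1s" : null);
    return chosen != null && chosen.trim().equals("-1") ? "1s" : chosen;
  }

  // Each row: live value, bulk value, expected result.
  static final String[][] CASES = {
    {"-1", null, "1s"}, // live-only misconfiguration (the case the review asks for)
    {"-1", "-1", "1s"}, // the case the existing test already covers
    {"5s", "-1", "5s"}, // a valid live value is kept as-is
    {null, "-1", "1s"}, // only bulk configured: default live value
    {null, null, null}, // nothing configured: nothing to revert
  };

  // Counts rows whose actual result differs from the expected one.
  static int countFailures() {
    int failures = 0;
    for (String[] c : CASES) {
      if (!Objects.equals(pick(c[0], c[1]), c[2])) {
        failures++;
      }
    }
    return failures;
  }
}
```

Keeping the live-only row separate from the row where both values are "-1" means a refactor that applies the guard only when bulk is present would fail exactly one row, which makes the regression easy to locate.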
🟡 Playwright Results: all passed (17 flaky)
✅ 4294 passed · ❌ 0 failed · 🟡 17 flaky · ⏭️ 88 skipped
🟡 17 flaky test(s) (passed on retry)
How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace



Describe your changes:
I added a guard in DefaultRecreateHandler.pickRefreshInterval so a refresh_interval of -1 (refresh disabled) is never applied as a live serving value during reindex promotion. A misconfigured liveIndexSettings.refreshInterval=-1 was being re-applied verbatim to the promoted index, which left newly indexed documents unsearchable until a manual _refresh (the "reindex finishes but the page is empty" symptom). When the resolved live value would be -1, it is now overridden to 1s and a warning is logged so operators can spot the bad config. Both promotion paths, centralized finalizeReindex and per-entity promoteEntityIndex, funnel through this method, so the guard covers both flows. Added a unit test asserting the revert JSON yields refresh_interval:"1s" and never -1.
Type of change:
High-level design:
N/A — small, self-contained change to the reindex live-settings revert.
Tests:
Use cases covered
liveIndexSettings.refreshInterval is -1: the promoted index is left with a near-real-time 1s refresh instead of a disabled refresh, so create-then-search returns results without a manual _refresh.
Unit tests
DefaultRecreateHandlerTest.BuildRevertJsonTests#liveRefreshDisabledIsOverriddenToDefault: RED without the fix (the old code returned the live -1).
mvn -pl openmetadata-service test -Dtest=DefaultRecreateHandlerTest → 39 tests, 0 failures.
Backend integration tests
Ingestion integration tests
Playwright (UI) tests
Manual testing performed
Verified via the unit test above; no local stack run required for this internal logic change.
UI screen recording / screenshots:
Not applicable.
Checklist:
Fixes #<issue-number> above.
Greptile Summary
This fix guards against a misconfigured liveIndexSettings.refreshInterval=-1 being re-applied verbatim to a promoted index during reindex, which left newly indexed documents invisible to search until a manual _refresh. A new isRefreshDisabled helper and a warn+override in pickRefreshInterval ensure that a disabled refresh (-1) is never used as the live value; it is replaced with 1s.
Adds an isRefreshDisabled(String) helper and overrides any -1 refresh interval to "1s" with a warning log, covering both the finalizeReindex and promoteEntityIndex promotion paths.
The new test verifies that buildRevertJson with live.refreshInterval="-1" emits "1s", though the isolated case of live="-1" with bulk=null is not separately exercised.
Confidence Score: 4/5
Safe to merge; the change is narrowly scoped to the reindex promotion path and the override only fires for the pathological -1 value.
The production logic is correct and the fix covers both promotion paths. The one gap is that the test exercises both live and bulk set to -1 simultaneously, so the simpler misconfiguration of live=-1 with no active bulk run is not directly pinned by a test; a future refactor could break that branch silently.
The test file would benefit from an additional case covering live="-1" with bulk=null.
Important Files Changed
Guard in pickRefreshInterval that prevents a live refresh_interval of -1 from being applied on index promotion; it overrides the value to 1s and logs a warning. Logic refactored from two early-returns to a single isRefreshDisabled helper; semantics preserved for all existing paths.
New liveRefreshDisabledIsOverriddenToDefault test asserting the guard produces 1s and never -1. The test exercises the scenario where both live and bulk carry -1, but the pure live=-1, bulk=null case (the most direct form of the misconfiguration) is not explicitly covered.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pickRefreshInterval called\nlive, bulk] --> B{live != null\n&& live.refreshInterval != null?}
    B -- Yes --> C[refreshInterval = live.refreshInterval]
    B -- No --> D{bulk != null\n&& bulk.refreshInterval != null?}
    D -- Yes --> E[refreshInterval = '1s'\ndefault live value]
    D -- No --> F[refreshInterval = null]
    C --> G{isRefreshDisabled?\nrefreshInterval.trim == '-1'}
    E --> G
    F --> G
    G -- Yes --> H[LOG.warn\nOverride to '1s']
    H --> I[refreshInterval = '1s']
    G -- No --> J[return refreshInterval as-is]
    I --> J
Reviews (1): Last reviewed commit: "fix(search): never apply refresh_interva..."
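The decision flow in the flowchart can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `Settings` is a hypothetical stand-in for both `IndexSettings` and `BulkIndexOverrides`, the class name is invented, and `java.util.logging` is swapped in for whatever logger the real `DefaultRecreateHandler` uses so the example stays dependency-free.

```java
import java.util.logging.Logger;

// Hypothetical, simplified stand-in for the guard described in this PR.
final class RefreshIntervalGuardSketch {
  private static final Logger LOG =
      Logger.getLogger(RefreshIntervalGuardSketch.class.getName());

  // Near-real-time default used for live serving.
  static final String DEFAULT_LIVE_REFRESH = "1s";

  // Assumed minimal shape: a settings object with an optional refresh interval.
  static final class Settings {
    final String refreshInterval;

    Settings(String refreshInterval) {
      this.refreshInterval = refreshInterval;
    }
  }

  // True when the value disables refresh, i.e. it is "-1" after trimming.
  static boolean isRefreshDisabled(String value) {
    return value != null && "-1".equals(value.trim());
  }

  static String pickRefreshInterval(Settings live, Settings bulk) {
    String refreshInterval;
    if (live != null && live.refreshInterval != null) {
      // An explicit live value takes precedence.
      refreshInterval = live.refreshInterval;
    } else if (bulk != null && bulk.refreshInterval != null) {
      // Bulk changed the interval, so revert to the default live value.
      refreshInterval = DEFAULT_LIVE_REFRESH;
    } else {
      // Nothing was configured, so there is nothing to revert.
      refreshInterval = null;
    }
    if (isRefreshDisabled(refreshInterval)) {
      // Guard: a disabled refresh must never become the live value.
      LOG.warning("Live refresh_interval is -1; overriding to " + DEFAULT_LIVE_REFRESH);
      refreshInterval = DEFAULT_LIVE_REFRESH;
    }
    return refreshInterval;
  }
}
```

Placing the check after the selection, rather than inside the live branch, means the guard runs regardless of which branch produced the value, which matches the single `isRefreshDisabled` node that all three paths feed into in the flowchart.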