Skip to content

test(opensearch): phase-aware test bootstrap to run the indexing flow under a migration phase (#36266)#36268

Open
fabrizzio-dotCMS wants to merge 4 commits into
mainfrom
issue-36266-contentletdatagen-phase-control
Open

test(opensearch): phase-aware test bootstrap to run the indexing flow under a migration phase (#36266)#36268
fabrizzio-dotCMS wants to merge 4 commits into
mainfrom
issue-36266-contentletdatagen-phase-control

Conversation

@fabrizzio-dotCMS

@fabrizzio-dotCMS fabrizzio-dotCMS commented Jun 22, 2026

Copy link
Copy Markdown
Member

Proposed Changes

Closes #36266.

Integration tests don't go through the production startup path
(InitServlet.init() → ContentletIndexAPI.checkAndInitializeIndex() → IndexStartupValidator), so
an OpenSearch index set is never created — and an unreachable OS service is never detected — unless
a test bootstraps it explicitly. This blocked running the existing test flow under an ES→OS
migration phase and having content (and everything else that gets indexed, not just contentlets)
land in OpenSearch the way it lands in ElasticSearch.

This PR adds a small, separate phase-aware bootstrap helper and wires it into the test init, so
the whole indexing flow can run under a chosen migration phase with no per-data-gen
instrumentation
. The migration phase is already global config that the phase router reads on every
index operation — setting it and bootstrapping the stores is all that's needed.

MigrationPhaseStoreBootstrap (new test util)

  • ensureStoresForCurrentPhase() — for phases that involve OpenSearch (1/2/3), validates OS
    reachability and fails hard with a clear message if the cluster is unreachable, then
    creates/registers the index set (checkAndInitializeIndex() + cache clear). Phase 0 = ES only,
    OS untouched.
  • assertOpenSearchReachable(phase) — standalone fail-fast reachability check.
  • Reachability uses a plain client.info() rather than the full IndexStartupValidator.validate()
    on purpose: the latter also asserts endpoint separation, which fails by design in the
    single-cluster opensearch-upgrade profile.
  • Stricter than production on purpose: production is lenient in phases 1–2 (haltMigration()
    ES fallback), but a migration test that silently degraded to ES would mask the very problem it
    exists to catch.

IntegrationTestInitService.init()

  • Phase-gated bootstrap: when FEATURE_FLAG_OPEN_SEARCH_PHASE selects a phase ≥ 1, init() calls
    ensureStoresForCurrentPhase(). Phase 0 (the default) is a no-op, so existing single-store
    runs are unchanged.
  • Set the phase globally for a run and the entire existing suite exercises the ES→OS flow.

ContentletDataGen — reverted

  • The earlier per-checkin migrationPhase() / targetStore() instrumentation was removed.
    Phase control belongs upstream (global config + init), not bolted onto a single data-gen, so it
    now covers the full indexing flow rather than just contentlets created by one generator.

MigrationPhaseStoreBootstrapIT (new IT, registered in OpenSearchUpgradeSuite)

Bootstraps both stores in setup (phase 1 + ensureStoresForCurrentPhase()) and asserts placement by
querying the document _id (identifier_languageId_variantId) directly on each provider's working
index — noise-free, sidesteps the inferIndexToHit bootstrap gap. Content is created through the
ordinary ContentletDataGen (no override); the phase router does the routing.

Test Coverage
phase3_indexesIntoOpenSearchOnly ordinary data-gen under phase 3 → present in OS, absent from ES (runs single- & two-cluster)
phase1_dualWriteIndexesIntoBothStores ordinary data-gen under phase 1 → both stores (two-cluster only — see below)

How to test

./mvnw verify -pl :dotcms-integration \
  -Dcoreit.test.skip=false -Dopensearch.upgrade.test=true \
  -Dit.test=MigrationPhaseStoreBootstrapIT

To run the existing MainSuite battery under a phase (single-cluster ⇒ only phase 3 is
meaningful), use the new opensearch-phase-suite Maven profile:

./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false \
  -Dopensearch.phase.test=true -Dopensearch.phase=3

Notes / known limitation

The phase-1 dual-write test skips itself in the single-cluster opensearch-upgrade profile via
assumeFalse(esSameAsOs()): the legacy ES RestHighLevelClient cannot parse an OpenSearch 3.x
content bulk-write response (NullPointerException in DocWriteResponse), so ES content writes only
work against a truly separate ES cluster. The phase-3 OpenSearch-only path is unaffected.

🤖 Generated with Claude Code

…36266)

Add per-checkin migration-phase control to ContentletDataGen so tests can
direct a generated contentlet to ElasticSearch, OpenSearch, or both during
the ES->OS migration:

- migrationPhase(MigrationPhase): phase-oriented primary API
- targetStore(ES|OS|BOTH): store-oriented convenience over the phase override
- per-checkin scope: wraps the checkin in persist(), captures and restores the
  prior FEATURE_FLAG_OPEN_SEARCH_PHASE value (restored even on throw)
- backward compatible: no override -> respects the globally-active phase
- applied to both persist(...) overloads and nextPersistedAndPublish()

Add ContentletDataGenPhaseControlIT (registered in OpenSearchUpgradeSuite),
which asserts placement by document _id directly on each provider's working
index. ES content-write scenarios skip in the single-cluster opensearch-upgrade
profile (the legacy ES client cannot content-bulk-write to OS 3.x); the OS-only
placement and per-checkin restore tests run everywhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @fabrizzio-dotCMS's task in 1m 10s —— View job


Rollback Safety Analysis

  • Gather context
  • Read rollback-unsafe categories reference
  • Get full PR diff
  • Analyze diff against unsafe categories
  • Apply label / post findings

Result: ✅ Safe to Rollback

This PR is test-only — all 5 changed files live under dotcms-integration/src/test/ (test code) or dotcms-integration/pom.xml (build/profile config for the integration test module). None of the changes affect production application code, database schema, Elasticsearch/OpenSearch index mappings, content model serialization, REST/GraphQL API contracts, push-publishing bundle format, OSGi public interfaces, or VTL viewtool contracts.

Files changed:

File Scope
dotcms-integration/pom.xml New Maven profile opensearch-phase-suite — test runner config only
dotcms-integration/.../OpenSearchUpgradeSuite.java Registers new IT class in test suite
dotcms-integration/.../MigrationPhaseStoreBootstrapIT.java New integration test class
dotcms-integration/.../IntegrationTestInitService.java Phase-aware OS bootstrap in test init only
dotcms-integration/.../MigrationPhaseStoreBootstrap.java New test utility class

Category-by-category verdict:

Category Finding
C-1 Structural Data Model No runonce tasks, no schema changes
C-2 ES/OS Mapping Change No mapping code touched (ESMappingAPIImpl, etc. unchanged)
C-3 Content JSON Model Version No changes to ImmutableContentlet or CURRENT_MODEL_VERSION
C-4 DROP TABLE / DROP COLUMN None
H-1 One-Way Data Migration None
H-2 RENAME TABLE / COLUMN None
H-3 PK Restructuring None
H-4 New Field Type None
H-5 Binary Storage Provider None
H-6 DROP PROCEDURE / FUNCTION None
H-7 NOT NULL without default None
H-8 VTL Viewtool Contract None
M-1–M-4 Medium categories None

Rolling back to the previous release after merging this PR is fully safe — N-1 never sees these test classes or the Maven profile at runtime.

Label applied: AI: Safe To Rollback

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — deepseek.v3.2

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:532withPhaseOverride(() -> publish(contentlet)) returns Void, but withPhaseOverride expects a Supplier<R> returning R. This will cause a compilation error because publish returns void. Use () -> { publish(contentlet); return null; } or a separate helper.

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:365withPhaseOverride method uses a generic type R but the publish call in nextPersistedAndPublish passes a void-returning lambda. This mismatch will cause a compilation error. The lambda should return a value (e.g., null).


Run: #27987978396 · tokens: in: 6258 · out: 194 · total: 6452

@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — qwen.qwen3-next-80b-a3b

New Issues

  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:280Config.setProperty(MigrationPhase.FLAG_KEY, prior) in withPhaseOverride may set a null string value, which could corrupt the global migration phase config if prior is null (no null check or fallback). This risks breaking all concurrent tests or services relying on the global phase flag.
  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:280Config.setProperty is called unconditionally in finally, even if the prior value was never set (i.e., prior == null). This overwrites a potentially unset flag with "null", which is a valid string value in config — corrupting state for other tests or services.
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:147esHasDoc uses Try.of(...).getOrElse(false) to swallow IOException from Elasticsearch client. No logging or error reporting on failure — masks index connectivity or auth issues as "document not found".
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:157osHasDoc uses Try.of(...).getOrElse(false) to swallow OpenSearchException from OS client. No logging or error handling — hides cluster failures during test assertions.
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:278Config.getStringProperty(MigrationPhase.FLAG_KEY, null) assumes the config key exists and is a string, but does not validate that overridePhase.ordinal() is a valid config value. If MigrationPhase enum changes, ordinal may become invalid (e.g., non-integer string), corrupting config state.
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:279Config.setProperty is not wrapped in WrapInTransaction — but this is a config write, not DB. However, if Config internally uses DB-backed storage (as dotCMS does in some modes), this is a non-transactional write that could leave config in inconsistent state if the JVM crashes mid-write. Must verify if Config is DB-backed in test profile.

Existing

  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:147 — prior finding still present: esHasDoc silently swallows client exceptions — unchanged in diff.

Resolved

  • dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:280Config.setProperty now restores prior value, fixing previous leakage of phase override across tests (was not present before this PR).

Run: #28134452418 · tokens: in: 6568 · out: 1006 · total: 7574

@fabrizzio-dotCMS fabrizzio-dotCMS changed the title test(a11y): add migration-phase / store control to ContentletDataGen (#36266) test(opensearch): phase-aware test bootstrap to run the indexing flow under a migration phase (#36266) Jun 25, 2026
…er a migration phase (#36266)

Replace the per-checkin ContentletDataGen instrumentation with a separate,
phase-aware bootstrap helper wired into the test init, so the whole indexing
flow (not just contentlets) runs under a chosen ES->OS migration phase.

- MigrationPhaseStoreBootstrap: ensureStoresForCurrentPhase() validates OS
  reachability (fails hard in phases 1/2/3, stricter than production on purpose)
  and creates/registers the index set; assertOpenSearchReachable() exposed for
  the create/publish path.
- IntegrationTestInitService.init(): phase-gated bootstrap; phase 0 is a no-op so
  existing single-store runs are unchanged.
- Revert ContentletDataGen to its pre-PR baseline (no targetStore/migrationPhase).
- Replace ContentletDataGenPhaseControlIT with MigrationPhaseStoreBootstrapIT,
  asserting store placement via the ordinary data-gen under the active phase.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — qwen.qwen3-next-80b-a3b

New Issues

  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:236Config.setProperty(FLAG_KEY, String.valueOf(phase.ordinal())) writes raw ordinal as string, corrupting global config state if test fails mid-run — violates dotCMS convention that config values must be enum names, not ordinals, and risks persistent corruption across test suites
  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/util/MigrationPhaseStoreBootstrap.java:72MigrationPhase.current() is called without ensuring Config.getStringProperty(FLAG_KEY) is initialized — if unset, returns null, causing assertOpenSearchReachable(null) to silently skip OS reachability check, enabling false-positive test passes
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:189esHasDoc() swallows Elasticsearch client exceptions without logging or rethrow — masks connectivity/auth failures, allowing tests to pass when ES is unreachable
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:198osHasDoc() swallows OpenSearch client exceptions without logging or rethrow — masks OS cluster failures, enabling silent test pass on broken backend
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/util/IntegrationTestInitService.java:85MigrationPhaseStoreBootstrap.ensureStoresForCurrentPhase() called during test init without @WrapInTransaction — if Config uses DB-backed storage, partial write on JVM crash corrupts global phase flag
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:236setPhase() uses ordinal() — if MigrationPhase enum order changes, config value becomes invalid, breaking phase routing silently

Existing

  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:280 — Config.setProperty sets null or "null" string as phase flag, corrupting global config state for all tests and services
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:147 — esHasDoc swallows Elasticsearch client exceptions without logging, masking connectivity/auth failures
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:157 — osHasDoc swallows OpenSearch client exceptions without logging, hiding cluster failures
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:278 — Config.setProperty uses ordinal() without validating it produces a valid config value — enum change could break config
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:279 — Config.setProperty called without transaction — may corrupt config if Config uses DB-backed storage and JVM crashes mid-write

Resolved

  • dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:147 — replaced by new esHasDoc in MigrationPhaseStoreBootstrapIT.java — but still broken (see above)
  • dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/ContentletDataGenPhaseControlIT.java:157 — replaced by new osHasDoc in MigrationPhaseStoreBootstrapIT.java — but still broken (see above)

Run: #28185141660 · tokens: in: 6850 · out: 1442 · total: 8292

…uite battery under a migration phase (#36266)

Adds a Maven profile that runs the regular MainSuite/Junit5Suite battery against a
single OpenSearch 3.x cluster under a chosen ES->OS migration phase, so the impact of
a phase on the existing test flow is measurable and repeatable (instead of an ad-hoc
exported env var).

  ./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false \
      -Dopensearch.phase.test=true -Dopensearch.phase=3

Injects DOT_FEATURE_FLAG_OPEN_SEARCH_PHASE into the failsafe fork (read by Config ->
MigrationPhase.current()); IntegrationTestInitService.init() bootstraps the OS index
set for that phase. Single cluster => phase 3 is the meaningful one (phases 1/2 need
two separate ES and OS clusters). Includes inherit the base MainSuite config.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — qwen.qwen3-next-80b-a3b

New Issues

  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:259setPhase() uses phase.ordinal() to write to Config.setProperty, directly re-introducing the same config corruption risk flagged in prior findings (MigrationPhaseStoreBootstrapIT.java:236 and ContentletDataGen.java:278) — enum ordinal changes silently break phase routing and corrupt global config state across test suites
  • 🔴 Critical: dotcms-integration/src/test/java/com/dotcms/util/MigrationPhaseStoreBootstrap.java:87MigrationPhase.current() is called without validating Config.getStringProperty(FLAG_KEY) is initialized — if DOT_FEATURE_FLAG_OPEN_SEARCH_PHASE is unset or malformed, MigrationPhase.current() returns null, causing assertOpenSearchReachable(null) to silently skip OS reachability check, leading to silent fallback to ES-only behavior in phase 1/2 tests
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:189esHasDoc() swallows Exception from RestHighLevelClientProvider without logging or rethrow — masks ES connectivity/auth/index errors, making test failures opaque
  • 🟠 High: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:198osHasDoc() swallows Exception from OpenSearch client without logging or rethrow — masks OS connectivity/auth/index errors, hiding cluster failures
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/util/IntegrationTestInitService.java:85MigrationPhaseStoreBootstrap.ensureStoresForCurrentPhase() called without @WrapInTransaction — risks config corruption if Config.setProperty uses DB-backed storage and JVM crashes mid-write (same as prior finding at IntegrationTestInitService.java:85)
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:259setPhase() uses ordinal() without validating it produces a valid config value — enum change breaks phase routing silently (same as prior finding at ContentletDataGen.java:278)

Existing

  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/util/IntegrationTestInitService.java:85 — MigrationPhaseStoreBootstrap.ensureStoresForCurrentPhase() called without @WrapInTransaction — risks config corruption on JVM crash
  • 🟡 Medium: dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:259 — setPhase() uses ordinal() — enum order change breaks phase routing silently

Resolved

  • dotcms-integration/src/test/java/com/dotcms/content/elasticsearch/business/MigrationPhaseStoreBootstrapIT.java:236 — This line no longer exists; replaced by setPhase() in new file
  • dotcms-integration/src/test/java/com/dotcms/datagen/ContentletDataGen.java:280 — This line no longer exists; replaced by new setPhase() in MigrationPhaseStoreBootstrapIT

Run: #28188474262 · tokens: in: 8121 · out: 1070 · total: 9191

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Add migration-phase / store control to ContentletDataGen for ES→OS dual-write tests

1 participant