Skip to content

Fixes #29137: Preserve data products across domain deletes#29138

Open
harshach wants to merge 6 commits into
mainfrom
harshach/dp-domain-delete
Open

Fixes #29137: Preserve data products across domain deletes#29138
harshach wants to merge 6 commits into
mainfrom
harshach/dp-domain-delete

Conversation

@harshach

@harshach harshach commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Describe your changes:

Fixes #29137
Fixes #29146

This PR now includes the backend fix plus repro coverage for the data product/domain delete issue.

  • Synchronizes data product domain -> dataProduct CONTAINS relationships when a data product domain is changed.
  • Filters domain delete cascade so a data product shared with another surviving domain is detached from the deleted domain, not hard-deleted.
  • Preserves the existing behavior that a data product is deleted when its last containing domain is recursively deleted.
  • Keeps the Playwright and integration repro coverage that first exposed the original visibility loss.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change
  • Documentation

Tests:

Backend integration tests

  • Added/updated integration tests in openmetadata-integration-tests/.
  • test_changeDataProductDomain_thenDeleteOriginalDomain_preservesDataProduct
  • test_deleteOneDomainForMultiDomainDataProduct_preservesDataProductUntilLastDomain

Playwright tests

  • Added Playwright repro coverage in openmetadata-ui/src/main/resources/ui/playwright/e2e/Features/DataProductDomainMigration.spec.ts.

Local validation

  • mvn -pl openmetadata-service,openmetadata-integration-tests spotless:check
  • mvn -pl openmetadata-integration-tests -am -DskipTests install
  • mvn -pl openmetadata-integration-tests '-Dit.test=DataProductResourceIT#test_changeDataProductDomain_thenDeleteOriginalDomain_preservesDataProduct+test_deleteOneDomainForMultiDomainDataProduct_preservesDataProductUntilLastDomain' -DfailIfNoTests=false failsafe:integration-test failsafe:verify\n\n#\n### Checklist:\n- [x] I have read the CONTRIBUTING document.\n- [x] My PR title is Fixes <issue-number>: <short explanation>.\n- [x] My PR is linked to GitHub issues via Fixes #29137 and Fixes #29146 above.\n- [x] I have added tests around the fixed behavior.\n

Summary by Gitar

  • Logic fix:
    • Added domainDeleteSubtree mechanism in DomainRepository to track hierarchy during recursive deletes.
    • Updated filterChildrenForDeleteCascade to identify and detach DataProduct entities from domains being deleted.
    • Implemented updateDataProductDomainContainment in DataProductRepository to manage relationship cleanup during domain transitions.
  • Refactoring:
    • Added cascade filtering hooks to EntityRepository to allow child relationship management during hardDelete operations.
  • Testing/Utility:
    • Standardized MultiDomainRule toggling in integration tests to restore state correctly instead of hardcoding true.
  • Localization:
    • Added Swedish (sv-se) translations for SSO configuration testing and login validation messages.

This will update automatically on new commits.

Greptile Summary

This PR fixes a data loss bug where deleting a domain would cascade-delete data products that had already been migrated to another domain, because the CONTAINS relationship between the old domain and the data product was never removed when domains were updated. The fix adds CONTAINS relationship synchronization on domain change and a cascade-filter hook that retains data products with surviving domains during domain hard-deletes.

  • DataProductRepository: adds updateDataProductDomainContainment to sync CONTAINS relationships when a data product's domains change via PATCH/PUT, closing the gap that caused orphaned CONTAINS rows.
  • DomainRepository: overrides prepareChildrenForHardDeleteCascade (both single-entity and bulk paths) to identify data products shared with surviving domains, detach them from only the deleting domains, and update their JSON+search index; uses a ThreadLocal subtree-set to avoid re-initialising context across recursive calls.
  • EntityRepository: adds prepareChildrenForHardDeleteCascade / enterBulkHardDeleteCascade hooks (no-ops by default) called only when hardDelete=true, keeping the soft-delete path unchanged.

Confidence Score: 5/5

Safe to merge. The root cause (stale CONTAINS relationships not cleaned up on domain migration) is correctly fixed in both the update path and the delete-cascade path, with no regressions on soft-delete.

The three-layer fix is logically consistent: CONTAINS rows are now maintained on every domain change, the cascade filter correctly identifies data products with surviving domains and detaches them instead of deleting them, and the ThreadLocal subtree set is properly initialised and cleaned up in all entry points. The hardDelete guard prevents the cascade filter from running during soft-deletes, addressing the concern raised in the previous review thread. Two new integration tests and a Playwright repro test cover the fixed scenarios. The only observations are minor defensive-coding gaps.

openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java — the two suggestions around null-safety and the unconditional storeEntity call in detachRetainedDataProductsFromDeletingDomains are worth a quick look before merge.

Important Files Changed

Filename Overview
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java Core of the fix: adds ThreadLocal subtree tracking, two overloads of prepareChildrenForHardDeleteCascade for single-entity and bulk paths, and detachRetainedDataProductsFromDeletingDomains. Logic is sound; minor null-safety concern on find() return.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataProductRepository.java Adds updateDataProductDomainContainment to keep CONTAINS relationships in sync on domain change; correctly pairs with the base-class HAS management in EntityUpdater.updateDomains.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java Adds no-op hook overrides (prepareChildrenForHardDeleteCascade, enterBulkHardDeleteCascade) and gates them correctly behind hardDelete=true in both the single-entity and bulk paths.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DataProductResourceIT.java Two new integration tests cover both repro scenarios. ResourceLock used correctly for the multi-domain rule test; helper correctly saves/restores rule state.
openmetadata-ui/src/main/resources/ui/playwright/e2e/Features/DataProductDomainMigration.spec.ts New Playwright repro test. Cleanup in finally block is bare sequential awaits; if movedDataProduct.delete throws, the remaining cleanups including afterAction() are skipped.

Reviews (5): Last reviewed commit: "Preserve multi-domain rule state in test..." | Re-trigger Greptile

@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Jun 17, 2026
@harshach harshach marked this pull request as ready for review June 17, 2026 18:35
@harshach harshach requested a review from a team as a code owner June 17, 2026 18:35
@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

✅ PR checks passed

The linked issue has a description and all required Shipping project fields set. Thanks!

Comment on lines +405 to +410
} finally {
await movedDataProduct.delete(apiContext);
await originalDomain.delete(apiContext);
await vehicleDomain.delete(apiContext);
await afterAction();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Cleanup not error-isolated; departs from file's established pattern

The established pattern in this spec (see the no-assets test at lines 309–326) wraps each cleanup call in its own try/catch and uses a fresh apiContext. Here, all four calls are sequential bare awaits inside a single finally. If movedDataProduct.delete(apiContext) throws for any reason (network hiccup, FQN lookup failure), neither vehicleDomain.delete nor afterAction() will execute, leaking the vehicle domain and the API context across tests and causing flakiness. Additionally, originalDomain.delete(apiContext) is called on an entity the test already hard-deleted (line ~380), so a 404 there could silently interrupt the remainder of cleanup if Domain.delete is ever tightened to surface the response status.

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 62%
62.26% (66560/106901) 44.01% (37227/84574) 45.36% (11181/24645)

@harshach harshach added the To release Will cherry-pick this PR into the release branch label Jun 17, 2026
@harshach harshach changed the title Fixes #29137: Add data product domain deletion repro tests Fixes #29137: Preserve data products across domain deletes Jun 17, 2026
@gitar-bot

gitar-bot Bot commented Jun 17, 2026

Copy link
Copy Markdown
Code Review ✅ Approved 4 resolved / 4 findings

Prevents data product loss during domain deletion by synchronizing containment relationships and implementing a safe-detach cascade filter. All previously identified integration test and cleanup issues have been addressed.

✅ 4 resolved
Bug: Delete-cascade filter has side effects, runs on restore/soft-delete

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java:499-513 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java:554-568 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:6391-6402 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:6320-6323 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:6486-6489 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:6598-6601
DomainRepository.filterChildrenForDeleteCascade(List<Domain>, children) is not a pure filter: through retainedSharedDataProductIds -> detachRetainedDataProductsFromDeletingDomains it permanently deletes CONTAINS/HAS relationship rows and re-indexes the data product. This override is invoked from dispatchToContainedChildren, which is shared by ALL three subtree walks: bulkHardDeleteSubtree (line 6598), bulkSoftDeleteSubtree (line 6486), and bulkRestoreSubtree (line 6320). Domain happens to be hard-delete-only today, but if a domain (or a future domain-containing parent) goes through the soft-delete or restore cascade, this code will destructively detach shared data products from a domain that is merely being soft-deleted or even restored — corrupting the data-product/domain links. A method named filterChildrenForDeleteCascade performing irreversible writes that fire on restore is surprising and unsafe.

Suggested fix: keep the override a pure filter and move the detach side effects into the hard-delete path only (e.g., a hook invoked from bulkHardDeleteSubtree/deleteChildren for hardDelete=true), or guard the detach so it runs solely for hard delete. At minimum, do not mutate relationships from the shared restore/soft-delete walk.

Edge Case: deletingDomainIds fallback may mis-retain DP when entered at non-root

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java:533-536 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DomainRepository.java:457-470
deletingDomainIds falls back to collectDomainSubtreeIds(parentIds) when the domainDeleteSubtree ThreadLocal is null. The ThreadLocal is only seeded in the overridden deleteChildren(UUID,...), which is reached via the top-level delete() entry. If a domain subtree is ever entered through bulkHardDeleteSubtree/dispatchToContainedChildren at a non-root level without the ThreadLocal set, collectDomainSubtreeIds only walks downward from the current batch and excludes ancestor domains that are also being deleted. A data product shared between the current domain and a deleting ancestor would then be classified as 'retained' (its ancestor parent looks non-deleting) and incorrectly preserved/detached instead of cascade-deleted. The standard recursive API delete is safe because the ThreadLocal is seeded at the root, so this is an edge case, but the fallback is not equivalent to the seeded value.

Quality: IT bypasses SdkClients, mutates relationships via repository directly

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DataProductResourceIT.java:1251-1265
attachDataProductToDomain reaches into Entity.getEntityRepository(DATA_PRODUCT) and Entity.getCollectionDAO().dataProductDAO().update(...) directly inside an integration test, instead of going through SdkClients/the REST API as the IT conventions require. This couples the test to in-process repository internals (it would not work against a remote server) and sidesteps the production code path that creates these relationships. If multi-domain attachment can't be expressed via the public SDK, consider exposing/using the proper API; otherwise document why the direct manipulation is unavoidable.

Quality: Test toggles global multi-domain rule, risking concurrent-test flakiness

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DataProductResourceIT.java:1249-1259
The new helper updateDataProductDomainsWithMultiDomainRuleDisabled disables the global "Multiple Domains are not allowed" rule via /v1/system/settings, then re-enables it in finally. Two robustness issues:

  1. Global state under CONCURRENT execution: @ResourceLock("MULTI_DOMAIN_RULE") only serializes tests that declare the same lock (DataProductResourceIT, TableResourceIT, DatabaseServiceResourceIT). Any other test running in parallel that creates a multi-domain assignment and expects the rule to reject it could intermittently pass/fail during the window the rule is disabled. This is a known flakiness vector for mutating shared system settings.

  2. The finally restores the rule to a hardcoded true rather than the previously observed value. If the deployment/default ever has this rule disabled, the test would leave it enabled, leaking state to subsequent tests.

Suggested: capture the prior enabled state with EntityRulesUtil.isRuleEnabled(...) and restore that exact value in the finally, and confirm every test that depends on this rule (in any class) holds the MULTI_DOMAIN_RULE resource lock.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud

Copy link
Copy Markdown

@sonarqubecloud

Copy link
Copy Markdown

@github-actions

Copy link
Copy Markdown
Contributor

🟡 Playwright Results — all passed (9 flaky)

✅ 4303 passed · ❌ 0 failed · 🟡 9 flaky · ⏭️ 88 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 301 0 1 4
🟡 Shard 2 809 0 1 9
🟡 Shard 3 814 0 2 8
🟡 Shard 4 855 0 2 12
🟡 Shard 5 732 0 1 47
🟡 Shard 6 792 0 2 8
🟡 9 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should create audit log entry when glossary is updated (shard 1, 1 retry)
  • Features/ColumnBulkOperations.spec.ts › should show disabled edit button when no columns are selected (shard 2, 1 retry)
  • Features/McpChat.spec.ts › Clicking MCP Chat sidebar nav should navigate to chat page (shard 3, 1 retry)
  • Flow/CustomizeWidgets.spec.ts › Data Products Widget (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Time Interval (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate DataProduct Rule Any_In (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel_KnowledgeCenter.spec.ts › Should remove user owner for knowledgeCenter (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage service type filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs To release Will cherry-pick this PR into the release branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Data product with multiple domains is deleted when one domain is recursively deleted Add repro coverage for data product domain deletion visibility

1 participant