fix: avoid SSA caBundle conflict installing opentelemetry-operator over pre-existing CRDs #23
Merged
prathamesh-sonpatki merged 4 commits intoJun 1, 2026
…er pre-existing CRDs

When OTel CRDs already exist and cert-manager-cainjector owns .spec.conversion.webhook.clientConfig.caBundle (a prior cert-manager-based install), Helm v4's server-side apply fails:

    conflict with "cert-manager-cainjector" ... .spec.conversion.webhook.clientConfig.caBundle

The opentelemetry-operator chart ships CRDs as templates gated by crds.create (NOT in the chart's crds/ dir), so the kube-prometheus-stack approach (helm show crds + --skip-crds) is a no-op here. Instead, render CRDs with helm template --include-crds, force-apply them out-of-band (which upgrades the schema and takes field ownership from cainjector), and install with --set crds.create=false so that Helm never re-applies the CRDs and therefore never re-conflicts on them.

Verified live in kind (Helm v4 + cert-manager): the unpatched script reproduces the exact cainjector caBundle conflict; the patched script installs cleanly (operator 1/1 Ready, all CRDs intact).

- install_operator: crds.create=false + out-of-band CRD render/force-apply
- tests/test_install_operator.py: unit coverage for the codepath
- tests/integration/kind-e2e.sh: operator-crd-conflict mode
- CI: add operator-crd-conflict to matrix; run CRD-conflict legs on Helm v4
Address code review on the cainjector caBundle conflict fix:

- The out-of-band CRD apply ran as a `helm template | awk | kubectl apply | grep || true` pipeline. The script uses `set -e` but not `pipefail`, so only the trailing `grep ... || true` exit status was checked: a failed render or failed apply was masked and the install still proceeded to crds.create=false, leaving stale/missing CRDs with no error. Capture each stage to a variable and check it explicitly, aborting via log_error on render failure, empty CRD set, or apply failure.
- Broaden CRD detection from the single opentelemetrycollectors CRD to any *.opentelemetry.io CRD, so a partial pre-existing set still triggers the mitigation.
- Correct the stale test module docstring (the fix uses crds.create=false, not --skip-crds).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
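The masking described in this commit can be reproduced without Helm or kubectl. The following is a minimal sketch, not the PR's code: `render_crds` and `apply_crds` are hypothetical stand-ins for the real render and apply stages, and `log_error` is a stand-in for the script's helper, whose implementation is not shown here.

```shell
#!/usr/bin/env bash
set -e   # no pipefail, matching the script described in the commit

# Hypothetical stand-ins (assumptions, not from the PR).
render_crds() { echo "render failed" >&2; return 1; }   # simulates a failed render
apply_crds()  { cat >/dev/null; }                        # simulates the apply stage
log_error()   { echo "ERROR: $*" >&2; }

# Masked variant: without pipefail, the pipeline's status is that of the
# last command, and `|| true` then discards even that. The render failure
# is invisible to the caller.
masked_apply() {
  render_crds 2>/dev/null | apply_crds | grep -q applied || true
}

# Staged variant: capture each stage and check it explicitly.
staged_apply() {
  local rendered
  if ! rendered="$(render_crds 2>/dev/null)"; then
    log_error "CRD render failed"
    return 1
  fi
  if [ -z "$rendered" ]; then
    log_error "rendered CRD set is empty"
    return 1
  fi
  if ! printf '%s\n' "$rendered" | apply_crds; then
    log_error "CRD apply failed"
    return 1
  fi
}
```

With the failing `render_crds` above, `masked_apply` returns success while `staged_apply` returns non-zero, which is the difference the review asked for. Declaring `local rendered` on its own line matters: combining `local` with the assignment would report the status of `local` rather than of the command substitution.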
The mock helm previously emitted clean `---`-separated docs with no `# Source:` comments, so the awk CRD-isolation filter was never tested against the shape `helm template --include-crds` actually produces.

- Make the mock emit Helm-realistic output: each doc prefixed with a `# Source:` comment, no leading `---` on the first doc, two CRDs plus a Deployment.
- Capture the manifest stream the out-of-band force-apply receives on stdin (mock kubectl) and return it from run_install.
- Add test_force_apply_receives_crds_only: asserts both CRDs are applied and the Deployment is filtered out, validating the awk filter end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
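To make the stream shape concrete, here is a sketch of a CRD-isolation filter run against the kind of output this commit describes. The awk program is an illustration written for this example and is not the filter from the PR; the `# Source:` paths and the resource names in `sample_stream` are likewise illustrative assumptions.

```shell
#!/usr/bin/env bash

# Keep only YAML documents whose top-level kind is CustomResourceDefinition.
# Documents are split on `---` lines; a leading `---` on the first doc is optional.
filter_crds() {
  awk '
    function emit() {
      if (is_crd) printf "---\n%s", doc
      doc = ""; is_crd = 0
    }
    /^---[ \t]*$/ { emit(); next }
    /^kind:[ \t]*CustomResourceDefinition[ \t]*$/ { is_crd = 1 }
    { doc = doc $0 "\n" }
    END { emit() }
  '
}

# Illustrative stream: "# Source:" comments, no leading "---",
# two CRDs plus a Deployment.
sample_stream() {
  cat <<'EOF'
# Source: example-chart/templates/crd-collectors.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: opentelemetrycollectors.opentelemetry.io
---
# Source: example-chart/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-operator
---
# Source: example-chart/templates/crd-instrumentations.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: instrumentations.opentelemetry.io
EOF
}
```

Running `sample_stream | filter_crds` passes both CRD documents through and drops the Deployment. Matching `kind:` only at column 0 keeps an indented `kind:` inside some other document from being mistaken for a CRD.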
The operator-crd-conflict integration test seeded CRDs from the operator repo's `main` branch, whose schema can drift from the chart version the script actually installs. Pin to v0.129.1 (the appVersion of chart OPERATOR_VERSION=0.92.1) so the seeded CRDs are the exact schema the mitigation upgrades from. Comment notes both must be bumped together.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Customer install of `opentelemetry-operator` fails on a cluster that already has the OTel CRDs (left by a prior cert-manager-based install). `cert-manager-cainjector` owns `.spec.conversion.webhook.clientConfig.caBundle` via server-side apply and re-injects it continuously. Helm v4 applies CRDs with SSA but without `--force-conflicts`, so the install aborts with a field-ownership conflict against `cert-manager-cainjector` on that field.

`adopt_otel_crds` can't fix this: it can't win a race against an active controller that re-asserts the field.

Why the kube-prometheus pattern doesn't work here
`kube-prometheus-stack` ships CRDs in the chart's `crds/` dir, so `helm show crds` + `--skip-crds` works. The `opentelemetry-operator` chart ships CRDs as templates gated by `crds.create`, so `helm show crds` is empty and `--skip-crds` is a no-op. (Confirmed: `helm template --set crds.create=false` renders 0 CRDs.)

Fix
In `install_operator`, when OTel CRDs already exist:

1. Render the CRDs with `helm template --include-crds` (filtered to CRD docs).
2. Force-apply them out-of-band (`kubectl apply --server-side --force-conflicts`), which upgrades the schema to the chart version and takes field ownership.
3. Install with `--set crds.create=false` so Helm never re-applies the CRDs.

Verification
Live in kind (Helm v4.1.3 + cert-manager, cainjector owning `caBundle`):

- Unpatched script: reproduces the `cert-manager-cainjector` caBundle conflict.
- Patched script: installs cleanly (operator 1/1 Ready, all CRDs intact).
- `tests/test_install_operator.py` 4/4; `adopt_otel`/`adopt_prom` 8/8 each; `bats unit.bats` 27/27.
- New `operator-crd-conflict` integration mode; CRD-conflict CI legs bumped to Helm v4 so the SSA codepath is exercised.
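The fix described in this PR (detect existing OTel CRDs, render, force-apply, then install with CRD creation disabled) can be sketched as below. This is an illustration under stated assumptions, not the script from the PR: the release name, namespace, and `open-telemetry` repo alias are assumed, `install_operator_sketch` and its helpers are hypothetical names, and only the chart version 0.92.1 and the flags themselves come from the PR text.

```shell
#!/usr/bin/env bash
set -e

# Assumed values for illustration; only OPERATOR_VERSION appears in the PR.
RELEASE="opentelemetry-operator"
NAMESPACE="opentelemetry-operator-system"
CHART="open-telemetry/opentelemetry-operator"   # assumes this repo alias is already added
OPERATOR_VERSION="0.92.1"

log_error() { echo "ERROR: $*" >&2; }   # stand-in for the script's helper

# True if any *.opentelemetry.io CRD already exists in the cluster.
otel_crds_exist() {
  local names
  names="$(kubectl get crd -o name 2>/dev/null)" || return 1
  printf '%s\n' "$names" | grep -q '\.opentelemetry\.io$'
}

# Keep only documents whose top-level kind is CustomResourceDefinition.
filter_crds() {
  awk '
    function emit() { if (is_crd) printf "---\n%s", doc; doc = ""; is_crd = 0 }
    /^---[ \t]*$/ { emit(); next }
    /^kind:[ \t]*CustomResourceDefinition[ \t]*$/ { is_crd = 1 }
    { doc = doc $0 "\n" }
    END { emit() }
  '
}

# Render, filter, and force-apply the CRDs, checking each stage explicitly.
force_apply_crds() {
  local rendered crds
  if ! rendered="$(helm template "$RELEASE" "$CHART" --version "$OPERATOR_VERSION" \
        --namespace "$NAMESPACE" --include-crds)"; then
    log_error "helm template failed"
    return 1
  fi
  crds="$(printf '%s\n' "$rendered" | filter_crds)"
  if [ -z "$crds" ]; then
    log_error "no CRDs found in rendered chart"
    return 1
  fi
  if ! printf '%s\n' "$crds" \
      | kubectl apply --server-side --force-conflicts -f - >/dev/null; then
    log_error "CRD force-apply failed"
    return 1
  fi
}

install_operator_sketch() {
  set --   # extra helm arguments, empty unless CRDs already exist
  if otel_crds_exist; then
    force_apply_crds || return 1
    set -- --set crds.create=false
  fi
  helm upgrade --install "$RELEASE" "$CHART" --version "$OPERATOR_VERSION" \
    --namespace "$NAMESPACE" --create-namespace "$@"
}
```

Calling `install_operator_sketch` on a cluster with no OTel CRDs falls through to a plain install, so the chart creates the CRDs itself. The out-of-band path and `crds.create=false` are used only when a pre-existing set is detected.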