Skip to content

draft for umbrella approach to OLMv1#1935

Closed
maleck13 wants to merge 2 commits into
mainfrom
olmv1-poc
Closed

draft for umbrella approach to OLMv1#1935
maleck13 wants to merge 2 commits into
mainfrom
olmv1-poc

Conversation

@maleck13

@maleck13 maleck13 commented Apr 27, 2026

Copy link
Copy Markdown
Collaborator

@jasonmadigan This covers the problem and a proposed approach that was recommended by the OLM team. This is a very early very generated doc at this point and needs digging into further and the details refining

Summary by CodeRabbit

Documentation

  • Added RFC proposal defining a transition to an OLMv1 umbrella operator model for dependency management, including user install workflows, dependency-aware upgrade flows, manifest deployment strategies, and failure handling guidance.

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c716f22a-f75c-446b-ba1c-6d6a606791ae

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

A new RFC document is introduced proposing replacement of OLMv0 automatic dependency installation with an OLMv1 umbrella operator approach using ClusterExtension to install and manage the full Kuadrant component set as a unified deployment.

Changes

Cohort / File(s) Summary
OLM Dependency Model Transition RFC
doc/proposals/olm-dependency-model-transition.md
New proposal document detailing the transition from OLMv0 to OLMv1 umbrella operator model, including installation workflows, dependency-aware upgrade flows, component manifest acquisition approaches, controller categories, version pinning, RBAC requirements, failure handling expectations, and design rationale.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

Poem

🐰 An umbrella of operators, beneath one extension's care,
Dependencies dance in harmony, no scattered bits to spare,
From OLMv0's scattered ways to v1's unified grace,
Kuadrant finds its home at last—a tidy, ordered place! ☔

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'draft for umbrella approach to OLMv1' directly relates to the main change: a new RFC document proposing the umbrella operator approach for OLMv1 dependency management.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch olmv1-poc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
doc/proposals/olm-dependency-model-transition.md (4)

98-103: Add language identifier to code block.

The fenced code block showing version pinning examples should specify a language identifier for better rendering and syntax highlighting (e.g., bash, sh, or env).

📝 Suggested fix
-```
+```bash
 KUADRANT_OPERATOR_IMAGE=quay.io/kuadrant/kuadrant-operator:v1.2.0
 AUTHORINO_OPERATOR_IMAGE=quay.io/kuadrant/authorino-operator:v0.14.0
 LIMITADOR_OPERATOR_IMAGE=quay.io/kuadrant/limitador-operator:v0.11.0
 DNS_OPERATOR_IMAGE=quay.io/kuadrant/dns-operator:v0.8.0
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @doc/proposals/olm-dependency-model-transition.md around lines 98 - 103, The
fenced code block containing the environment variable examples
(KUADRANT_OPERATOR_IMAGE, AUTHORINO_OPERATOR_IMAGE, LIMITADOR_OPERATOR_IMAGE,
DNS_OPERATOR_IMAGE) needs a language identifier for proper syntax highlighting;
update the opening fence from tobash (or env/sh) so the block
reads as a bash/env snippet and retains the same variable lines unchanged.


</details>

---

`76-82`: **Expand tradeoff analysis for manifest management approaches.**

The two approaches are described briefly, but the proposal would benefit from more detailed tradeoff analysis:
- **Build pipeline**: How does each approach affect CI/CD complexity?
- **Update workflow**: How do manifest-only changes propagate in each model?
- **Testing**: Which approach simplifies integration testing?
- **Reference implementation**: Which approach does cluster-olm-operator use, and why?

This information would help stakeholders evaluate which approach to adopt and is closely tied to the unresolved question on line 177.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@doc/proposals/olm-dependency-model-transition.md` around lines 76 - 82,
Expand the tradeoff analysis under "Component Manifest Management" by adding
comparative details for the two approaches: describe CI/CD/build pipeline
impacts for "Embedded manifests" and "Init container extraction", explain how
manifest-only updates propagate in each model (rebuild image vs. independent
component image rollout or volume updates), outline testing implications (which
simplifies integration/unit/e2e testing and mock strategies), and state which
pattern "cluster-olm-operator" uses and why (reference the "cluster-olm-operator
pattern" and justify with operational/practical reasons); ensure you mention
these four headings (Build pipeline, Update workflow, Testing, Reference
implementation) so reviewers can directly map the tradeoffs to the unresolved
question on line 177.
```

</details>

---

`159-161`: **Consider basic component enable/disable flags in initial implementation.**

Whilst selective deployment is appropriately scoped as future work, consider whether the initial implementation should include basic enable/disable flags for each component, even if all are enabled by default. This would:
- Validate that the architecture truly supports selective deployment
- Reduce migration burden when implementing the full `KuadrantInstall` CR API
- Provide an escape hatch for users who want to manage specific components independently

However, this may add complexity that's not warranted if the selective deployment timeline is uncertain.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@doc/proposals/olm-dependency-model-transition.md` around lines 159 - 161, Add
basic per-component enable/disable flags to the initial implementation so
operators can opt out of individual components; modify the deployment
configuration and operator bootstrap code to read new boolean flags (e.g.,
componentEnablements map or individual flags like enableGateway,
enableOIDCPolicy, enableTelemetry) and short-circuit creation/reconciliation of
corresponding components when false, ensure safety checks in the umbrella
operator reconcile loop to prevent removal if active policy CRs exist (reference
KuadrantInstall as the eventual CR API to mirror), and document defaults as
enabled so behavior remains unchanged unless flags are set.
```

</details>

---

`128-128`: **Expand CRD conflict handling guidance.**

The CRD conflict scenario mentions error reporting but doesn't specify the recovery path. Consider adding:
- Whether users should manually remove conflicting CRDs
- If the umbrella operator could support CRD adoption from existing installations
- Whether a pre-flight check could detect conflicts before deployment

This relates to the unresolved question on line 179 about migration from standalone installations and should be addressed together.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@doc/proposals/olm-dependency-model-transition.md` at line 128, The doc's CRD
conflict note ("**CRD conflict** - if a CRD already exists on the cluster...")
is too brief and lacks a recovery path; expand the section to enumerate clear
remediation and automation options: state whether operators should manually
remove conflicting CRDs, describe pros/cons and steps for manual removal,
document a possible "CRD adoption" approach the umbrella operator could
implement (including prerequisites and safety checks), and recommend adding a
pre-flight conflict detection step that fails fast with instructions; also
cross-reference and reconcile this guidance with the unresolved migration
question about standalone installations (the note around line 179) so both
places present a consistent migration strategy.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @doc/proposals/olm-dependency-model-transition.md:

  • Around line 116-122: Update the "Upgrade Flow" section to add a new subsection
    that defines explicit backwards-compatibility criteria and an acceptance-testing
    plan: (1) for each dependency named in the flow (Authorino, Limitador, DNS)
    state what API/behavioral contracts must be preserved (CRD shapes, REST/gRPC
    endpoints, config options, semantic versioning rules) to be considered
    backwards-compatible; (2) list concrete test scenarios and suites the release
    must run (smoke rollbacks, mixed-version integration tests with old
    kuadrant-operator + new dependencies, stateful migration tests, API contract
    tests, e2e traffic validation) and where they run (CI vs release gating); (3)
    describe failure detection and remediation steps (automated rollback, alerting,
    blocked promotion) and who owns the decision; and (4) require maintaining a
    version compatibility matrix mapping kuadrant-operator versions to supported
    dependency versions. Reference the "Upgrade Flow" section, the umbrella
    operator, kuadrant-operator, and the dependency names (Authorino, Limitador,
    DNS) when inserting this subsection.
  • Around line 105-114: Expand the RBAC section under "RBAC" to include a
    concrete permissions matrix: list exact API groups, resources and verbs (e.g.,
    apiregistration.k8s.io/v1, apiextensions.k8s.io/v1 CRDs:
    get,list,watch,create,update,patch,delete; apps/v1 Deployments:
    get,list,watch,create,update,patch,delete; rbac.authorization.k8s.io
    ClusterRole/ClusterRoleBinding: create,bind; core/v1
    Namespaces/ServiceAccounts/Services: create,delete,get,list,watch), clearly mark
    which permissions are cluster-scoped vs namespace-scoped, provide an example
    ClusterRole manifest for the umbrella operator's ServiceAccount (referencing
    "ClusterExtension's ServiceAccount" and "umbrella operator") and an example Role
    for a component operator, and add a short comparison paragraph showing
    differences vs the current kuadrant-operator RBAC scope so reviewers can see
    added/removed privileges.
  • Around line 5-6: Update the two placeholder links in the document where the
    strings "RFC PR:
    Kuadrant/architecture#0000"
    and "Issue tracking:
    Kuadrant/architecture#0000"
    appear: replace the #0000 placeholders in both the pull request and issue
    URLs/text with the actual PR number and actual issue number respectively so the
    markdown links point to the real RFC PR and tracking issue before merging.

Nitpick comments:
In @doc/proposals/olm-dependency-model-transition.md:

  • Around line 98-103: The fenced code block containing the environment variable
    examples (KUADRANT_OPERATOR_IMAGE, AUTHORINO_OPERATOR_IMAGE,
    LIMITADOR_OPERATOR_IMAGE, DNS_OPERATOR_IMAGE) needs a language identifier for
    proper syntax highlighting; update the opening fence from tobash (or
    env/sh) so the block reads as a bash/env snippet and retains the same
    variable lines unchanged.
  • Around line 76-82: Expand the tradeoff analysis under "Component Manifest
    Management" by adding comparative details for the two approaches: describe
    CI/CD/build pipeline impacts for "Embedded manifests" and "Init container
    extraction", explain how manifest-only updates propagate in each model (rebuild
    image vs. independent component image rollout or volume updates), outline
    testing implications (which simplifies integration/unit/e2e testing and mock
    strategies), and state which pattern "cluster-olm-operator" uses and why
    (reference the "cluster-olm-operator pattern" and justify with
    operational/practical reasons); ensure you mention these four headings (Build
    pipeline, Update workflow, Testing, Reference implementation) so reviewers can
    directly map the tradeoffs to the unresolved question on line 177.
  • Around line 159-161: Add basic per-component enable/disable flags to the
    initial implementation so operators can opt out of individual components; modify
    the deployment configuration and operator bootstrap code to read new boolean
    flags (e.g., componentEnablements map or individual flags like enableGateway,
    enableOIDCPolicy, enableTelemetry) and short-circuit creation/reconciliation of
    corresponding components when false, ensure safety checks in the umbrella
    operator reconcile loop to prevent removal if active policy CRs exist (reference
    KuadrantInstall as the eventual CR API to mirror), and document defaults as
    enabled so behavior remains unchanged unless flags are set.
  • Line 128: The doc's CRD conflict note ("CRD conflict - if a CRD already
    exists on the cluster...") is too brief and lacks a recovery path; expand the
    section to enumerate clear remediation and automation options: state whether
    operators should manually remove conflicting CRDs, describe pros/cons and steps
    for manual removal, document a possible "CRD adoption" approach the umbrella
    operator could implement (including prerequisites and safety checks), and
    recommend adding a pre-flight conflict detection step that fails fast with
    instructions; also cross-reference and reconcile this guidance with the
    unresolved migration question about standalone installations (the note around
    line 179) so both places present a consistent migration strategy.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `da05c1dc-7456-4835-89bf-e3b5c7002143`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 725014c1ddb9ec567f4b51c46d884884ecd59cac and 0a1a7bc3a6c29afe3095e9f67628ea350bd8d274.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `doc/proposals/olm-dependency-model-transition.md`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread doc/proposals/olm-dependency-model-transition.md Outdated
Comment thread doc/proposals/olm-dependency-model-transition.md Outdated
Comment thread doc/proposals/olm-dependency-model-transition.md Outdated
@codecov

codecov Bot commented Apr 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.06%. Comparing base (3d8d122) to head (0a1a7bc).
⚠️ Report is 233 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1935      +/-   ##
==========================================
- Coverage   75.15%   75.06%   -0.10%     
==========================================
  Files         120      123       +3     
  Lines       10487    11594    +1107     
==========================================
+ Hits         7882     8703     +821     
- Misses       2223     2447     +224     
- Partials      382      444      +62     
Flag Coverage Δ
bare-k8s-integration 21.83% <ø> (-2.37%) ⬇️
controllers-integration 56.90% <ø> (-1.90%) ⬇️
envoygateway-integration 43.11% <ø> (-1.02%) ⬇️
gatewayapi-integration 17.33% <ø> (-1.74%) ⬇️
istio-integration 47.23% <ø> (-1.25%) ⬇️
unit 24.89% <ø> (+2.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
api (u) 85.88% <ø> (ø)
internal (u) 76.48% <65.58%> (-0.28%) ⬇️
pkg (u) 34.40% <100.00%> (ø)
see 39 files with indirect coverage changes
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@maleck13 maleck13 marked this pull request as draft April 27, 2026 13:54
@maleck13

Copy link
Copy Markdown
Collaborator Author

@jasonmadigan actually created in the kuadrant operator repo as that was useful for context but prob should move to arch repo tbh

@eguzki eguzki left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks promising. I like the umbrella approach idea. A couple of unanswered questions after reading the proposal:

  • What will the new umbrella operator be named? I think kuadrant operator is the best candidate, with a kuadrant-controller component managing all policy reconciliation work.

  • What's being replaced? Will the current kuadrant operator's new OLM release become the OLMv1-compatible umbrella operator, or are we looking at a brand new operator release? Will the current operators continue to be released to the OLM public catalog?

@jasonmadigan jasonmadigan self-assigned this May 6, 2026
@maleck13

Copy link
Copy Markdown
Collaborator Author

What will the new umbrella operator be named? I think kuadrant operator is the best candidate, with a kuadrant-controller component managing all policy reconciliation work.

+1

What's being replaced? Will the current kuadrant operator's new OLM release become the OLMv1-compatible umbrella operator. or are we looking at a brand new operator release?

Still to be decided. In an ideal world we would replace and everything would "seamlessly" migrate to v1 compatible but I am not sure right now how feasible that is.

@eguzki eguzki left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving some further thoughts about it

Each umbrella operator release embeds component image references as environment variables or build-time constants:

```
KUADRANT_OPERATOR_IMAGE=quay.io/kuadrant/kuadrant-operator:v1.2.0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be the bundle image actually, right? quay.io/kuadrant/kuadrant-operator-bundle ?


### Upgrade Flow

1. OLMv1 deploys the new umbrella operator image containing updated component image references.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be my only concern here. The umbrella operator would be doing OLM operator's job. We need to implement it and maintain it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but we have to take on that role in some form given OLM dependencies are going away. We either build a big CSV with all the dependencies in it or we do something more controllable such as the umbrella operator.

maleck13 added 2 commits June 8, 2026 08:56
Signed-off-by: craig <cbrookes@redhat.com>
…ers helm templates

Signed-off-by: craig <cbrookes@redhat.com>
@maleck13

maleck13 commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

@eguzki @jasonmadigan @didierofrivia I have made some updates to the proposal. Mainly it focuses on leveraging Helm at the core of the new operator.

@didierofrivia Interested in any thoughts you may have on this proposal

@Boomatang you may be interested in this also

@maleck13

maleck13 commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Will move this to the arch repo instead of keeping it here actually

@maleck13

maleck13 commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Closing in favour of Kuadrant/architecture#179

@maleck13 maleck13 closed this Jun 8, 2026
@github-project-automation github-project-automation Bot moved this to Done in Kuadrant Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants