Draft: Add srelib integration POC#839

Draft
RaphaelBut wants to merge 1 commit into openshift:main from RaphaelBut:srelib-poc

RaphaelBut (Contributor) commented Jun 12, 2026

Demonstrate how to include and use the srelib plugin library in CAD. Adds an adapter that wraps srelib's v1.Client behind CAD's ocm.Client interface, a plugin launcher for managing the srelib process lifecycle, and a runnable example showing the end-to-end flow.

What type of PR is this?

draft POC, not to be merged

( fully vibed by claude, no thinking^^ )

Summary by CodeRabbit

Release Notes

  • Chores

    • Updated Go toolchain from 1.25.7 to 1.26.3
    • Updated AWS SDK dependencies and added new plugin framework dependencies
  • New Features

    • Added srelib plugin system integration enabling cluster information retrieval, organization lookups, and support role ARN management
  • Tests

    • Added unit tests for srelib adapter implementation and functionality

Demonstrate how to include and use the srelib plugin library in CAD.
Adds an adapter that wraps srelib's v1.Client behind CAD's ocm.Client
interface, a plugin launcher for managing the srelib process lifecycle,
and a runnable example showing the end-to-end flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 12, 2026

Walkthrough

This PR introduces plugin-based integration of srelib as an ocm.Client implementation. It upgrades dependencies for plugin management, implements a subprocess launcher for srelib, wraps the plugin client to expose the ocm.Client interface, provides adapter tests, and demonstrates end-to-end usage through a POC.

Changes

SRElib Plugin Integration for OCM Client

| Layer / File(s) | Summary |
|---|---|
| Dependencies and Plugin Launching (go.mod, pkg/srelib/launcher.go) | Go bumped to 1.26.3; adds HashiCorp plugin dependencies and srelib as a local module. Implements LaunchPlugin to spawn the srelib subprocess, connect via go-plugin RPC, and manage lifecycle through PluginClient with Client() accessor and Close() terminator. |
| SRElib-to-OCM Adapter (pkg/srelib/adapter.go) | Adapter wraps srelib v1.Client and implements ocm.Client interface. Delegates GetClusterInfo, GetOrganizationID, and GetSupportRoleARN to the underlying client; all remaining ocm.Client methods return "not supported" errors or nil. |
| Adapter Validation Tests (pkg/srelib/adapter_test.go) | MockSrelibClient enables isolated testing. Tests verify GetClusterInfo passthrough, GetSupportRoleARN delegation, and that unsupported methods return appropriate errors with GetConnection returning nil. |
| Integration Demonstration (poc/srelib-integration/main.go) | POC launches srelib plugin via LaunchPlugin, wraps with Adapter, reads SRELIB_PLUGIN_PATH and CLUSTER_ID from environment, fetches and displays cluster info and support role ARN, then cleanly closes the plugin. |

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (3 errors, 2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Ote Binary Stdout Contract | ❌ Error | Added poc/srelib-integration/main.go main() writes to stdout via fmt.Println/Printf (e.g., “=== CAD + srelib POC ===”, “Launching...”), violating the JSON-on-stdout contract. | Redirect all main() output to os.Stderr (or remove/gate it) so stdout remains JSON-only for any OTE harness/consumer. |
| No-Weak-Crypto | ❌ Error | PR adds pkg/pullsecret/pullsecret.go with non-constant-time token check: string(clusterTokenBytes) != expectedToken (line 252). | Use constant-time comparison (e.g., subtle.ConstantTimeCompare on token bytes, after length checks) instead of !=/== on token material. |
| No-Sensitive-Data-In-Logs | ❌ Error | poc/srelib-integration/main.go logs customer-specific cluster ID/name/state and Support ARN via fmt.Printf to stdout (and errors to stderr). | Remove or redact the fmt.Printf/fmt.Println of cluster identifiers and Support ARN (and avoid dumping raw errors); use non-sensitive summary logs or guard behind debug with redaction. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | pkg/srelib/adapter_test.go TestAdapter_UnsupportedMethods asserts three unrelated behaviors (multiple methods + GetConnection nil) in one test; violates single-responsibility requirement. | Split TestAdapter_UnsupportedMethods into separate tests for each unsupported method and GetConnection, and optionally add assertion messages for easier diagnosis. |
| Description check | ❓ Inconclusive | The description identifies PR type as 'draft POC' and explains the three main components added, but is missing most required template sections including test coverage checkboxes and pre-checks. | Complete the PR template by filling in all sections: select checkbox for 'This PR may not need unit tests' if applicable, and mark pre-check boxes completed or noted as N/A for a draft. |
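
The No-Weak-Crypto resolution above recommends `subtle.ConstantTimeCompare` for token material. A minimal sketch of one way to do that is shown here; `tokensEqual` is a hypothetical helper, not code from the PR, and hashing both sides first is an assumption made so that the compared inputs always have the same length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// tokensEqual reports whether two tokens match without short-circuiting
// on the first differing byte. Both inputs are hashed first so the
// comparison always runs over two 32-byte digests, regardless of the
// original token lengths.
func tokensEqual(got, want string) bool {
	g := sha256.Sum256([]byte(got))
	w := sha256.Sum256([]byte(want))
	return subtle.ConstantTimeCompare(g[:], w[:]) == 1
}

func main() {
	fmt.Println(tokensEqual("s3cr3t", "s3cr3t"))
	fmt.Println(tokensEqual("s3cr3t", "s3cr3T"))
}
```

Comparing the raw bytes directly with `subtle.ConstantTimeCompare` also works; in that case the function returns 0 immediately when the lengths differ, which is why the resolution mentions length checks.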
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly indicates this is a draft POC adding srelib integration, which aligns with the main changeset focus of adding adapter, launcher, and example components. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | pkg/srelib/adapter_test.go uses static Go test function names (TestAdapter_*); no Ginkgo It/Describe/Context/When titles or dynamic suffixes/timestamps found. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests added in PR. Only standard Go unit tests in adapter_test.go using testing.T framework, which are outside the scope of MicroShift compatibility checks. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR #839 only updates go.mod/go.sum and adds pkg/srelib adapter/launcher + POC main; no new test/e2e or Ginkgo e2e tests are present to assess for SNO compatibility. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR #839 changes only go.mod/go.sum and new Go POC/adapter/launcher files; no Kubernetes manifests/controllers or scheduling-constraint patterns (affinity/topologySpread/node-role) were introduced. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR-added files (go.mod, pkg/srelib/*.go, pkg/srelib/*_test.go, poc/srelib-integration/main.go) contain no new Ginkgo e2e tests (no It/Describe/Context/When), so no IPv4/external-connectivity assump... |
| Container-Privileges | ✅ Passed | Repo scan found no K8s/YAML using privileged:true, hostPID/hostNetwork/hostIPC:true, SYS_ADMIN, runAsUser:0, or allowPrivilegeEscalation:true; PR adds only Go files. |

openshift-ci Bot (Contributor) commented Jun 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RaphaelBut

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the "approved" label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Jun 12, 2026
RaphaelBut marked this pull request as draft on June 12, 2026 09:48
openshift-ci Bot added the "do-not-merge/work-in-progress" label (Indicates that a PR should not merge because it is a work in progress.) on Jun 12, 2026

openshift-ci Bot (Contributor) commented Jun 12, 2026

@RaphaelBut: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/lint | e686d30 | link | true | /test lint |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
pkg/srelib/launcher.go (1)

58-60: ⚡ Quick win

Consider adding graceful shutdown before Kill().

The Close() method immediately kills the plugin process. Consider attempting a graceful shutdown first (e.g., calling Exited() or setting a timeout) before falling back to Kill() to allow the plugin to clean up resources.

💡 Example graceful shutdown pattern
 func (p *PluginClient) Close() {
-	p.raw.Kill()
+	// Wait up to 5s for the plugin to exit on its own; kill it if it has not
+	select {
+	case <-p.raw.Exited():
+		return
+	case <-time.After(5 * time.Second):
+		p.raw.Kill()
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/srelib/launcher.go` around lines 58 - 60, The Close() method currently
calls p.raw.Kill() immediately; change it to attempt a graceful shutdown first
by signaling or waiting for the process to exit (e.g., call p.raw.Signal /
p.raw.Wait / Exited() if available) with a bounded timeout, and only call
p.raw.Kill() as a fallback if the process does not exit within that timeout;
update PluginClient.Close to perform the graceful stop + timeout logic around
the existing p.raw.Kill() call so the plugin has a chance to cleanup before
being killed.
pkg/srelib/adapter.go (1)

40-42: 💤 Low value

Silent failure for missing subscription may hide errors.

When the cluster has no subscription, the method returns an empty string with nil error. Depending on CAD's expectations, a missing subscription might indicate a data integrity issue that should be surfaced as an error rather than silently returning empty.

Alternative approach
 	cmv1Sub, ok := cluster.GetSubscription()
 	if !ok {
-		return "", nil
+		return "", fmt.Errorf("cluster %s has no subscription", clusterID)
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/srelib/adapter.go` around lines 40 - 42, The code silently returns an
empty string and nil error when cluster.GetSubscription() reports no
subscription (cmv1Sub, ok := cluster.GetSubscription()); change this to return a
non-nil error instead of nil so callers can detect the missing subscription
(e.g., return "", fmt.Errorf("missing subscription for cluster") or a typed
error), and update any callers or tests that expect the previous silent
behavior; locate the check for ok after cluster.GetSubscription() in
pkg/srelib/adapter.go to implement this change.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 45add7dd-2e8a-41ee-847d-10acdf9bdac2

📥 Commits

Reviewing files that changed from the base of the PR and between 10ac9fc and e686d30.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • go.mod
  • pkg/srelib/adapter.go
  • pkg/srelib/adapter_test.go
  • pkg/srelib/launcher.go
  • poc/srelib-integration/main.go

Comment thread go.mod
github.com/openshift/hive/apis v0.0.0-20260430175100-52fd62d6be80
github.com/openshift/osd-network-verifier v1.7.0
github.com/openshift/osde2e-common v0.0.0-20260421185005-d390433565e5
github.com/petrkotas/srelib v0.0.0

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Replace directive points to non-existent directory.

The replace directive references ../srelib, which does not exist in the repository. This will break builds for anyone cloning the repo. For a POC that requires local development with an external module, document the required setup steps in a README or remove the replace directive before sharing.

📝 Suggested approach

For POC/development purposes, either:

  1. Document the setup requirement (e.g., "Clone srelib as a sibling directory")
  2. Use a proper version reference if srelib is published
  3. Add a note in the PR description about the local setup requirement

Also applies to: 230-230

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@go.mod` at line 29, The go.mod contains a replace directive pointing to a
local path ("../srelib") that does not exist and will break builds; either
remove the replace directive from go.mod or replace it with a valid module
version (e.g., use the published module version for github.com/petrkotas/srelib)
and, if you need to keep local development behavior, add a README note
instructing contributors to clone srelib as a sibling directory or run 'go mod
edit -replace' locally—update the replace entry in go.mod or the project
documentation accordingly.

Source: Linters/SAST tools

Comment thread pkg/srelib/adapter.go
Comment on lines +79 to +81
func (a *Adapter) GetConnection() *sdk.Connection {
	return nil
}

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

GetConnection() returning nil will cause downstream panics.

The GetConnection() method returns nil, but downstream CAD code dereferences the connection without nil checks. For example, pkg/backplane/backplane.go:74 calls config.OcmClient.GetConnection().TokensContext(ctx), which will panic with a nil pointer dereference.

🔧 Possible approaches
  1. Document the limitation: If srelib-backed clients should not use backplane, add a runtime check or document which investigations are compatible.
  2. Return an error-wrapped stub: Return a connection that returns errors for token operations.
  3. Panic early with clear message: Panic in GetConnection() with a message explaining srelib doesn't support OCM SDK connections.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/srelib/adapter.go` around lines 79 - 81, The GetConnection method on
Adapter currently returns nil which causes downstream panics; implement and
return a stub connection that satisfies the sdk.Connection interface and returns
explicit errors for operations (e.g., TokensContext) so callers get clear
failures instead of nil derefs. Create a type (e.g., sreStubConnection) in the
same package that implements the sdk.Connection methods used by consumers and
returns context-aware errors, then have (*Adapter).GetConnection return
&sreStubConnection{}; ensure the error messages clearly state that srelib does
not support OCM SDK connections to aid debugging.

openshift-ci Bot added the "needs-rebase" label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Jun 18, 2026

openshift-ci Bot (Contributor) commented Jun 18, 2026

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
