Skip to content

CCXDEV-8768: Add a new conditional gatherer for CRs under the alert namespace#1308

Open
ncaak wants to merge 8 commits into
openshift:masterfrom
ncaak:feat/CCXDEX-8768/conditional-gatherer-cepthcluster-alerts
Open

CCXDEV-8768: Add a new conditional gatherer for CRs under the alert namespace#1308
ncaak wants to merge 8 commits into
openshift:masterfrom
ncaak:feat/CCXDEX-8768/conditional-gatherer-cepthcluster-alerts

Conversation

@ncaak

@ncaak ncaak commented Jun 19, 2026

Copy link
Copy Markdown
Contributor
  • This PR implements a new conditional gatherer. This new gatherer will try to retrieve a CR given its GVR always under the namespace where the alert is firing.

  • The pod definition gatherer was also upgraded to accept a prefix in its params. If a prefix is set, the gatherer will check for all pods that matches the prefix. Always under the same namespace where the alert is firing.

  • Some unit tests were added to the pod definition gatherer to adapt to the new functionality.

Categories

  • Bugfix
  • Data Enhancement
  • Feature
  • Backporting
  • Others (CI, Infrastructure, Documentation)

Sample Archive

  • docs/insights-archive-sample/conditional/namespaces/openshift-storage/crd/ceph.rook.io/v1/cephclusters/ocs-storagecluster-cephcluster.json

Documentation

  • docs/gathered-data.md

Unit Tests

  • pkg/gatherers/conditional/gather_pod_definition_test.go
  • pkg/gatherers/conditional/gather_cr_definition_test.go

Privacy

Yes. There are no sensitive data in the newly collected information.

Changelog

No

Breaking Changes

No

References

https://redhat.atlassian.net/browse/CCXDEV-8768

Summary by CodeRabbit

Release Notes

  • New Features

    • Added conditional custom resource definition gathering to collect resource definitions based on firing alerts and provided resource parameters.
  • Enhancements

    • Updated pod_definition to optionally use pod_prefix to gather multiple pods from the alert namespace.
    • Updated the gathering rule schema to support the new CR definition parameters and pod_prefix.
  • Documentation

    • Refreshed gathered-data documentation and added an example custom resource manifest.
  • Tests

    • Added unit tests covering CR definition gathering and pod-prefix filtering behavior.

@openshift-ci-robot

openshift-ci-robot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

@ncaak: This pull request references CCXDEV-8768 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

  • This PR implements a new conditional gatherer. This new gatherer will try to retrieve a CR given its GVR always under the namespace where the alert is firing.

  • The pod definition gatherer was also upgraded to accept a prefix in its params. If a prefix is set, the gatherer will check for all pods that matches the prefix. Always under the same namespace where the alert is firing.

  • Some unit tests were added to the pod definition gatherer to adapt to the new functionality.

Categories

  • Bugfix
  • Data Enhancement
  • Feature
  • Backporting
  • Others (CI, Infrastructure, Documentation)

Sample Archive

  • docs/insights-archive-sample/conditional/namespaces/openshift-storage/crd/ceph.rook.io/v1/cephclusters/ocs-storagecluster-cephcluster.json

Documentation

  • docs/gathered-data.md

Unit Tests

  • pkg/gatherers/conditional/gather_pod_definition_test.go
  • pkg/gatherers/conditional/gather_cr_definition_test.go

Privacy

Yes. There are no sensitive data in the newly collected information.

Changelog

No

Breaking Changes

No

References

https://redhat.atlassian.net/browse/CCXDEV-8768

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Two conditional gatherers are extended: a new CRDefinition gatherer collects custom resources by GVR from namespaces where an alert is firing, and PodDefinition now supports an optional pod_prefix to gather matching pods by name prefix. Wiring, schema, tests, and docs were updated accordingly.

Changes

Conditional gatherers

Layer / File(s) Summary
CRDefinition contract and wiring
pkg/gatherers/conditional/gathering_functions.go, pkg/gatherers/conditional/gathering_rule.schema.json, docs/gathered-data.md
Adds GatherCRDefinition, GatherCRDefinitionParams, JSON parameter parsing, builder registration, schema validation for cr_definition, and a new gathered-data entry.
CRDefinition implementation and tests
pkg/gatherers/conditional/gather_cr_definition.go, pkg/gatherers/conditional/gather_cr_definition_test.go, docs/insights-archive-sample/conditional/namespaces/openshift-storage/crd/ceph.rook.io/v1/cephclusters/ocs-storagecluster-cephcluster.json
Implements dynamic CR listing by alert namespace and records emission, with tests using fake dynamic clients and CephCluster fixtures. The archive sample adds a full CephCluster manifest.
PodDefinition prefix handling
pkg/gatherers/conditional/gather_pod_definition.go, docs/gathered-data.md
Updates pod gathering to branch on pod_prefix, listing and filtering pods by prefix when set and preserving single-pod lookup otherwise.
PodDefinition tests and schema
pkg/gatherers/conditional/gather_pod_definition_test.go, pkg/gatherers/conditional/gathering_rule.schema.json
Extends pod definition validation and tests for prefix-based gathering and fallback behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly captures the main change: adding a new conditional CR gatherer.
Description check ✅ Passed The description matches the template well and includes all required sections with relevant details.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed The added tests use static t.Run names only; no Ginkgo titles or dynamic values (pods, namespaces, UUIDs, timestamps) appear.
Test Structure And Quality ✅ Passed These are standard table-driven unit tests, not Ginkgo; they use isolated fake clients, no waits, and match existing package test patterns.
Microshift Test Compatibility ✅ Passed PASS: The added tests are plain Go unit tests (testing.T), with no new Ginkgo e2e specs or MicroShift-unsupported OpenShift APIs.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No new Ginkgo/e2e tests were added; the changed tests are plain Go unit tests (t.Run/t.Parallel) and contain no multi-node or SNO assumptions.
Topology-Aware Scheduling Compatibility ✅ Passed Changed files are docs, sample CR data, and gatherer logic only; no new deployment/controller scheduling code, and the sample manifest doesn’t use the flagged topology patterns.
Ote Binary Stdout Contract ✅ Passed No process-level entrypoints or stdout writes were added in touched files; the PR only changes gatherer logic, tests, docs, and sample data.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No new Ginkgo e2e tests were added; the new tests are unit tests and contain no IPv4-only assumptions or external connectivity.
No-Weak-Crypto ✅ Passed No MD5/SHA1/DES/RC4/3DES/Blowfish, custom crypto, or secret/token comparisons were present in the changed files.
Container-Privileges ✅ Passed No changed manifest or code introduced privileged, hostPID/Network/IPC, SYS_ADMIN, or allowPrivilegeEscalation settings; recursive scan of the added JSON found 0 hits.
No-Sensitive-Data-In-Logs ✅ Passed New logs only include namespaces, pod names, and GVR/error context; no secrets, tokens, PII, or full object contents are logged.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from katushiik11 and opokornyy June 19, 2026 15:28
@openshift-ci

openshift-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ncaak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 19, 2026
@ncaak

ncaak commented Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2026
@opokornyy

Copy link
Copy Markdown
Contributor

/retest

@opokornyy

Copy link
Copy Markdown
Contributor

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
pkg/gatherers/conditional/gather_cr_definition_test.go (1)

43-43: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Rename the test to match the repository's Go test convention.

TestGatherer_gatherCRDefinition does not follow the required Test_<FunctionName> / Test_<GatherName>_<FunctionName> form.

Suggested rename
-func TestGatherer_gatherCRDefinition(t *testing.T) {
+func Test_Gatherer_gatherCRDefinition(t *testing.T) {

As per coding guidelines, **/*_test.go: "Name test functions as Test_<FunctionName> or Test_<GatherName>_<FunctionName>."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gather_cr_definition_test.go` at line 43, Rename
the test function TestGatherer_gatherCRDefinition to follow the Go test naming
convention used in this repo, using the Test_<GatherName>_<FunctionName>
pattern. Update the declaration in gather_cr_definition_test.go so the test name
clearly matches the gatherer and method under test, and keep the rest of the
test logic unchanged.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/gatherers/conditional/gather_cr_definition.go`:
- Around line 75-109: The namespace-based CR listing in the alertInstances loop
can emit duplicate records when multiple alert instances resolve to the same
namespace. Update gather_cr_definition.go to deduplicate namespaces before
calling dynamicClient.Resource(gvr).Namespace(namespace).List, then iterate the
unique namespace set and keep the existing record creation logic in the inner CR
list loop. Use getAlertPodNamespace, dynamicClient.Resource, and the
record.Record path builder as the key locations to adjust.

In `@pkg/gatherers/conditional/gather_pod_definition.go`:
- Around line 88-99: The empty-prefix fallback in gather_pod_definition.go is
currently unreachable because the gathering_rule.schema.json contract makes
pod_prefix required and non-empty. Update the schema and runtime behavior to
match: either make pod_prefix optional or relax its validation to allow an empty
string, and ensure GatherPodDefinition/related prefix handling still behaves
correctly when params.PodPrefix is absent or blank.
- Around line 90-94: The pod gathering flow in gatherPodDefinition should not
return immediately when coreClient.Pods(...).List fails for a single namespace.
Update the error handling to append the list error to errs, preserve any already
collected records, and continue processing the remaining alert instances instead
of exiting early. Keep the logic in gatherPodDefinition and the pod listing loop
intact, but ensure partial results are returned even when one namespace lookup
fails.

In `@pkg/gatherers/conditional/gathering_rule.schema.json`:
- Around line 252-255: The `resource` field in `cr_definition` is too
restrictive because `gathering_rule.schema.json` only allows lowercase letters
and digits, which rejects valid hyphenated CR resource names. Update the schema
pattern for the `resource` property so it accepts hyphens as well, and keep the
change scoped to the `cr_definition` validation path used by the new gatherer.
- Around line 210-223: The `pod_definition` schema in
`gathering_rule.schema.json` is too strict for existing `validation.go` behavior
and current rule usage. Update the `required` fields and `pod_prefix` validation
so `alert_name`-only rules are accepted, and relax the `pod_prefix` pattern in
the `properties.pod_prefix` definition to allow controller-generated prefixes
like `alertmanager-main-` instead of only full DNS labels. Keep the schema
aligned with how `validation.go` validates rules at runtime and use the existing
`pod_definition`/`pod_prefix` symbols to locate the change.

---

Nitpick comments:
In `@pkg/gatherers/conditional/gather_cr_definition_test.go`:
- Line 43: Rename the test function TestGatherer_gatherCRDefinition to follow
the Go test naming convention used in this repo, using the
Test_<GatherName>_<FunctionName> pattern. Update the declaration in
gather_cr_definition_test.go so the test name clearly matches the gatherer and
method under test, and keep the rest of the test logic unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: c37c5a6d-30d6-4548-959a-10542f41b47f

📥 Commits

Reviewing files that changed from the base of the PR and between 314becc and 624ea37.

📒 Files selected for processing (8)
  • docs/gathered-data.md
  • docs/insights-archive-sample/conditional/namespaces/openshift-storage/crd/ceph.rook.io/v1/cephclusters/ocs-storagecluster-cephcluster.json
  • pkg/gatherers/conditional/gather_cr_definition.go
  • pkg/gatherers/conditional/gather_cr_definition_test.go
  • pkg/gatherers/conditional/gather_pod_definition.go
  • pkg/gatherers/conditional/gather_pod_definition_test.go
  • pkg/gatherers/conditional/gathering_functions.go
  • pkg/gatherers/conditional/gathering_rule.schema.json

Comment on lines +75 to +109
for _, alertLabels := range alertInstances {
namespace, err := getAlertPodNamespace(alertLabels)
if err != nil {
klog.Warningf(logMissingLabel, err.Error(), params.AlertName)
errs = append(errs, err)
continue
}

gvr := schema.GroupVersionResource{
Group: params.Group,
Version: params.Version,
Resource: params.Resource,
}

crList, err := dynamicClient.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Warningf("CR not found in %s namespace (GVR: %v): %v", namespace, gvr, err)
errs = append(errs, err)
continue
}

for i := range crList.Items {
item := &crList.Items[i]
records = append(records, record.Record{
Name: fmt.Sprintf(
"%s/namespaces/%s/crd/%s/%s/%s/%s",
g.GetName(),
namespace,
params.Group,
params.Version,
params.Resource,
item.GetName()),
Item: record.JSONMarshaller{Object: item.Object},
})
}

@coderabbitai coderabbitai Bot Jun 24, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Deduplicate namespaces before listing CRs.

If the same alert fires for multiple objects in one namespace, this loop lists the same GVR once per alert instance and emits the same archive paths repeatedly. That produces duplicate conditional/namespaces/<ns>/crd/... records for a single namespace.

Suggested fix
 	var errs []error
 	var records []record.Record
+	seenNamespaces := map[string]struct{}{}

 	for _, alertLabels := range alertInstances {
 		namespace, err := getAlertPodNamespace(alertLabels)
 		if err != nil {
 			klog.Warningf(logMissingLabel, err.Error(), params.AlertName)
 			errs = append(errs, err)
 			continue
 		}
+		if _, seen := seenNamespaces[namespace]; seen {
+			continue
+		}
+		seenNamespaces[namespace] = struct{}{}

 		gvr := schema.GroupVersionResource{
 			Group:    params.Group,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for _, alertLabels := range alertInstances {
namespace, err := getAlertPodNamespace(alertLabels)
if err != nil {
klog.Warningf(logMissingLabel, err.Error(), params.AlertName)
errs = append(errs, err)
continue
}
gvr := schema.GroupVersionResource{
Group: params.Group,
Version: params.Version,
Resource: params.Resource,
}
crList, err := dynamicClient.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Warningf("CR not found in %s namespace (GVR: %v): %v", namespace, gvr, err)
errs = append(errs, err)
continue
}
for i := range crList.Items {
item := &crList.Items[i]
records = append(records, record.Record{
Name: fmt.Sprintf(
"%s/namespaces/%s/crd/%s/%s/%s/%s",
g.GetName(),
namespace,
params.Group,
params.Version,
params.Resource,
item.GetName()),
Item: record.JSONMarshaller{Object: item.Object},
})
}
var errs []error
var records []record.Record
seenNamespaces := map[string]struct{}{}
for _, alertLabels := range alertInstances {
namespace, err := getAlertPodNamespace(alertLabels)
if err != nil {
klog.Warningf(logMissingLabel, err.Error(), params.AlertName)
errs = append(errs, err)
continue
}
if _, seen := seenNamespaces[namespace]; seen {
continue
}
seenNamespaces[namespace] = struct{}{}
gvr := schema.GroupVersionResource{
Group: params.Group,
Version: params.Version,
Resource: params.Resource,
}
crList, err := dynamicClient.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Warningf("CR not found in %s namespace (GVR: %v): %v", namespace, gvr, err)
errs = append(errs, err)
continue
}
for i := range crList.Items {
item := &crList.Items[i]
records = append(records, record.Record{
Name: fmt.Sprintf(
"%s/namespaces/%s/crd/%s/%s/%s/%s",
g.GetName(),
namespace,
params.Group,
params.Version,
params.Resource,
item.GetName()),
Item: record.JSONMarshaller{Object: item.Object},
})
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gather_cr_definition.go` around lines 75 - 109, The
namespace-based CR listing in the alertInstances loop can emit duplicate records
when multiple alert instances resolve to the same namespace. Update
gather_cr_definition.go to deduplicate namespaces before calling
dynamicClient.Resource(gvr).Namespace(namespace).List, then iterate the unique
namespace set and keep the existing record creation logic in the inner CR list
loop. Use getAlertPodNamespace, dynamicClient.Resource, and the record.Record
path builder as the key locations to adjust.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncaak I think this is a valid comment that could save a few API requests. Can you have a look?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +88 to +99
if len(params.PodPrefix) > 0 {
// New logic has been introduced to retrieve a list of pods with a given prefix (from params)
podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Errorf("failed to list pods in namespace %s: %v", podNamespace, err)
return nil, []error{err}
}

podDefinitions = filterPodsByPrefix(podList, params.PodPrefix)
//
} else {
// Previous logic to retrieve ONLY the pod definition from the firing alert

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

pod_prefix fallback is unreachable under current schema contract.

The empty-prefix branch is implemented here, but pkg/gatherers/conditional/gathering_rule.schema.json:207-225 marks pod_prefix as required with a non-empty regex, so validated configs cannot hit this fallback path. Align schema and runtime behavior (make pod_prefix optional or allow empty string).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gather_pod_definition.go` around lines 88 - 99, The
empty-prefix fallback in gather_pod_definition.go is currently unreachable
because the gathering_rule.schema.json contract makes pod_prefix required and
non-empty. Update the schema and runtime behavior to match: either make
pod_prefix optional or relax its validation to allow an empty string, and ensure
GatherPodDefinition/related prefix handling still behaves correctly when
params.PodPrefix is absent or blank.

Comment on lines +90 to +94
podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Errorf("failed to list pods in namespace %s: %v", podNamespace, err)
return nil, []error{err}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Avoid aborting the whole gather on a single namespace list failure.

Line 93 returns immediately, which discards already collected records/errs and skips remaining alert instances. Keep partial results and continue.

Suggested fix
-			podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
+			podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
 			if err != nil {
 				klog.Errorf("failed to list pods in namespace %s: %v", podNamespace, err)
-				return nil, []error{err}
+				errs = append(errs, err)
+				continue
 			}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Errorf("failed to list pods in namespace %s: %v", podNamespace, err)
return nil, []error{err}
}
podList, err := coreClient.Pods(podNamespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Errorf("failed to list pods in namespace %s: %v", podNamespace, err)
errs = append(errs, err)
continue
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gather_pod_definition.go` around lines 90 - 94, The
pod gathering flow in gatherPodDefinition should not return immediately when
coreClient.Pods(...).List fails for a single namespace. Update the error
handling to append the list error to errs, preserve any already collected
records, and continue processing the remaining alert instances instead of
exiting early. Keep the logic in gatherPodDefinition and the pod listing loop
intact, but ensure partial results are returned even when one namespace lookup
fails.

Comment on lines 210 to +223
"required": [
"alert_name"
"alert_name",
"pod_prefix"
],
"properties": {
"alert_name": {
"type": "string",
"title": "AlertName",
"pattern": "^[a-zA-Z0-9_]{1,128}$"
},
"pod_prefix": {
"type": "string",
"title": "PodPrefix",
"pattern": "^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Relax the pod_definition schema before this becomes the contract.

validation.go enforces this schema before runtime, so requiring pod_prefix rejects existing alert_name-only rules outright. The current regex also only accepts a complete DNS label, not common prefixes like alertmanager-main-, which is exactly what this feature needs for controller-generated pod names.

Suggested schema change
                 "^pod_definition$": {
                     "type": "object",
                     "title": "GatherPodDefinitionParams",
                     "required": [
-                        "alert_name",
-                        "pod_prefix"
+                        "alert_name"
                     ],
                     "properties": {
                         "alert_name": {
                             "type": "string",
                             "title": "AlertName",
                             "pattern": "^[a-zA-Z0-9_]{1,128}$"
                         },
                         "pod_prefix": {
                             "type": "string",
                             "title": "PodPrefix",
-                            "pattern": "^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$"
+                            "minLength": 1,
+                            "maxLength": 253,
+                            "pattern": "^[a-z0-9][a-z0-9.-]{0,252}$"
                         }
                     }
                 },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"required": [
"alert_name"
"alert_name",
"pod_prefix"
],
"properties": {
"alert_name": {
"type": "string",
"title": "AlertName",
"pattern": "^[a-zA-Z0-9_]{1,128}$"
},
"pod_prefix": {
"type": "string",
"title": "PodPrefix",
"pattern": "^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$"
"required": [
"alert_name"
],
"properties": {
"alert_name": {
"type": "string",
"title": "AlertName",
"pattern": "^[a-zA-Z0-9_]{1,128}$"
},
"pod_prefix": {
"type": "string",
"title": "PodPrefix",
"minLength": 1,
"maxLength": 253,
"pattern": "^[a-z0-9][a-z0-9.-]{0,252}$"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gathering_rule.schema.json` around lines 210 - 223,
The `pod_definition` schema in `gathering_rule.schema.json` is too strict for
existing `validation.go` behavior and current rule usage. Update the `required`
fields and `pod_prefix` validation so `alert_name`-only rules are accepted, and
relax the `pod_prefix` pattern in the `properties.pod_prefix` definition to
allow controller-generated prefixes like `alertmanager-main-` instead of only
full DNS labels. Keep the schema aligned with how `validation.go` validates
rules at runtime and use the existing `pod_definition`/`pod_prefix` symbols to
locate the change.

Comment on lines +252 to +255
"resource": {
"type": "string",
"title": "Resource",
"pattern": "^[a-z0-9]{1,64}$"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Allow hyphenated CR resource names in cr_definition.

The resource pattern only permits [a-z0-9], so valid plural resources such as network-attachment-definitions are rejected during schema validation and can never reach the new gatherer.

Suggested schema change
                         "resource": {
                             "type": "string",
                             "title": "Resource",
-                            "pattern": "^[a-z0-9]{1,64}$"
+                            "pattern": "^[a-z0-9]([a-z0-9-]{0,251}[a-z0-9])?$"
                         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"resource": {
"type": "string",
"title": "Resource",
"pattern": "^[a-z0-9]{1,64}$"
"resource": {
"type": "string",
"title": "Resource",
"pattern": "^[a-z0-9]([a-z0-9-]{0,251}[a-z0-9])?$"
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/gatherers/conditional/gathering_rule.schema.json` around lines 252 - 255,
The `resource` field in `cr_definition` is too restrictive because
`gathering_rule.schema.json` only allows lowercase letters and digits, which
rejects valid hyphenated CR resource names. Update the schema pattern for the
`resource` property so it accepts hyphens as well, and keep the change scoped to
the `cr_definition` validation path used by the new gatherer.

return nil, []error{err}
}

const logMissingLabel = "%s at alertName: %s"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to the top of the file?

Comment on lines +83 to +87
gvr := schema.GroupVersionResource{
Group: params.Group,
Version: params.Version,
Resource: params.Resource,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved outside the for loop?

Comment on lines +75 to +109
for _, alertLabels := range alertInstances {
namespace, err := getAlertPodNamespace(alertLabels)
if err != nil {
klog.Warningf(logMissingLabel, err.Error(), params.AlertName)
errs = append(errs, err)
continue
}

gvr := schema.GroupVersionResource{
Group: params.Group,
Version: params.Version,
Resource: params.Resource,
}

crList, err := dynamicClient.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{})
if err != nil {
klog.Warningf("CR not found in %s namespace (GVR: %v): %v", namespace, gvr, err)
errs = append(errs, err)
continue
}

for i := range crList.Items {
item := &crList.Items[i]
records = append(records, record.Record{
Name: fmt.Sprintf(
"%s/namespaces/%s/crd/%s/%s/%s/%s",
g.GetName(),
namespace,
params.Group,
params.Version,
params.Resource,
item.GetName()),
Item: record.JSONMarshaller{Object: item.Object},
})
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncaak I think this is a valid comment that could save a few API requests. Can you have a look?

@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

@ncaak: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/insights-operator-serial-tests 624ea37 link true /test insights-operator-serial-tests
ci/prow/insights-operator-e2e-tests 624ea37 link true /test insights-operator-e2e-tests
ci/prow/insights-operator-slow-tests 624ea37 link true /test insights-operator-slow-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants