Skip to content

NETOBSERV-2768: Implement DUMP_EVENTS_ON_FAILURE env var#2799

Open
Amoghrd wants to merge 9 commits into
netobserv:mainfrom
Amoghrd:NETOBSERV-2768
Open

NETOBSERV-2768: Implement DUMP_EVENTS_ON_FAILURE env var#2799
Amoghrd wants to merge 9 commits into
netobserv:mainfrom
Amoghrd:NETOBSERV-2768

Conversation

@Amoghrd

@Amoghrd Amoghrd commented Jun 9, 2026

Copy link
Copy Markdown
Member

Description

  • Implement DUMP_EVENTS_ON_FAILURE env var and default it to false.
  • Summary of all tests ran at the end of the run
  • Fix upgrade test wherein later tests would deploy released version of the operator since we are updating the global var in the test. Now using a local var in the test such that the global var is not polluted
  • Fix YAML lint errors

Dependencies

n/a

Test Results

Ran 2 tests
Passed: 1, Failed: 1, Skipped: 1

FAILED TESTS (1):

  1. [sig-netobserv] Network_Observability with Loki Author:aramesha-NonPreRelease-Critical-59746-NetObserv upgrade testing [Serial]
    /Users/amoghrd/repos/netobserv/netobserv-operator/integration-tests/backend/test_flowcollector.go:853 [646.025 seconds]
    Error: Expected
    : 1.12.0
    not to equal
    : 1.12.0

PASSED TESTS (1):

  1. [sig-netobserv] Network_Observability with Loki Author:aramesha-NonPreRelease-High-79015-Verify PacketTranslation feature [Serial] [695.229 seconds]

SKIPPED TESTS (1):

  1. [sig-netobserv] Network_Observability with Loki Author:memodi-NonPreRelease-Medium-77894-TechPreview Network Policies Correlation [Serial] [399.483 seconds]
    Reason: Skipping because the TechPreviewNoUpgrade is not enabled on the cluster.

Checklist

  • Does the changes in PR need specific configuration or environment set up for testing?
    • if so please describe it in PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Summary by CodeRabbit

  • Tests
    • New env var support: DUMP_EVENTS_ON_FAILURE enables cluster-wide event/log dumping on test failure for better diagnostics.
    • Improved test reporting: after-suite now lists failed, passed, and skipped tests with clearer totals and per-test timings and failure details.
    • More reliable upgrade tests and cleanup guidance: upgrade flows avoid shared global state and emphasize uninstalling the operator after completion.

@openshift-ci-robot

openshift-ci-robot commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

@Amoghrd: This pull request references NETOBSERV-2768 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Description

Implement DUMP_EVENTS_ON_FAILURE env var and default it to false.

Dependencies

n/a

Checklist

  • Does the changes in PR need specific configuration or environment set up for testing?
    • if so please describe it in PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@Amoghrd Amoghrd requested a review from oliver-smakal June 9, 2026 18:48
@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign stleerh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Amoghrd Amoghrd requested a review from memodi June 9, 2026 18:48
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Test initialization now reads DUMP_EVENTS_ON_FAILURE to control dump-on-failure. After-suite reporting collects g.SpecReport entries into passed/failed/skipped slices and prints detailed FAILED/PASSED/SKIPPED sections. The upgrade test avoids mutating global catalog source objects and removes an OCP version skip guard.

Changes

Test logging, reporting, and upgrade flow

Layer / File(s) Summary
Environment-controlled event dumping
integration-tests/backend/backend_suite_test.go
Import os/strconv and read DUMP_EVENTS_ON_FAILURE, setting exutil.TestContext.DumpLogsOnFailure before test init.
After-suite spec summary refactor
integration-tests/backend/backend_suite_test.go
Replace integer counters with passedSpecs, failedSpecs, skippedSpecs slices; collect g.SpecReport entries, compute totals, and print separate FAILED/PASSED/SKIPPED sections listing spec text, per-spec timing, and failure/reason details.
Upgrade test catalog/source selection
integration-tests/backend/test_flowcollector.go
Remove SkipIfOCPBelow("v4.10"); create local NOReleased* catalog/source objects instead of mutating globals for older-version deployment; stop reassigning NOcatSrc.Name/Namespace when applying latest catalog.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: implementing the DUMP_EVENTS_ON_FAILURE environment variable.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed PR description covers main changes (DUMP_EVENTS_ON_FAILURE, test summary, upgrade test fix) and includes test results and checklist completion.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Amoghrd

Amoghrd commented Jun 9, 2026

Copy link
Copy Markdown
Member Author

/hold

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
integration-tests/backend/backend_suite_test.go (1)

31-35: Improve parsing of DUMP_EVENTS_ON_FAILURE in integration-tests/backend/backend_suite_test.go.
os.Getenv("DUMP_EVENTS_ON_FAILURE") == "true" is brittle (case/1/etc. won’t work); use strconv.ParseBool and treat invalid values as false.

Suggested patch
 import (
 	"flag"
 	"fmt"
 	"os"
+	"strconv"
 	"testing"
@@
-	dumpEvents := false
-	if os.Getenv("DUMP_EVENTS_ON_FAILURE") == "true" {
-		dumpEvents = true
-	}
-	exutil.TestContext.DumpLogsOnFailure = dumpEvents
+	dumpEvents, err := strconv.ParseBool(os.Getenv("DUMP_EVENTS_ON_FAILURE"))
+	if err != nil {
+		dumpEvents = false
+	}
+	exutil.TestContext.DumpLogsOnFailure = dumpEvents
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/backend_suite_test.go` around lines 31 - 35,
Replace the brittle equality check against "true" with strconv.ParseBool on the
DUMP_EVENTS_ON_FAILURE env var: call
strconv.ParseBool(os.Getenv("DUMP_EVENTS_ON_FAILURE")), treat any parse error as
false, assign the resulting boolean to dumpEvents (or set
exutil.TestContext.DumpLogsOnFailure directly), and ensure
exutil.TestContext.DumpLogsOnFailure is set to that parsed value so case/
"1"/"t" are handled correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@integration-tests/backend/backend_suite_test.go`:
- Around line 150-154: The summary prints "totalEvaluated/totalEvaluated" which
always shows 100%; update both fmt.Printf calls to use the actual passed count
(len(passedSpecs)) as the numerator instead of totalEvaluated so the headline
reflects real pass/fail ratio; reference the existing symbols totalEvaluated,
passedSpecs and report.RunTime.Seconds() when making this change.

---

Nitpick comments:
In `@integration-tests/backend/backend_suite_test.go`:
- Around line 31-35: Replace the brittle equality check against "true" with
strconv.ParseBool on the DUMP_EVENTS_ON_FAILURE env var: call
strconv.ParseBool(os.Getenv("DUMP_EVENTS_ON_FAILURE")), treat any parse error as
false, assign the resulting boolean to dumpEvents (or set
exutil.TestContext.DumpLogsOnFailure directly), and ensure
exutil.TestContext.DumpLogsOnFailure is set to that parsed value so case/
"1"/"t" are handled correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 622346ba-74f8-45bd-b12c-4b7bc4dfad48

📥 Commits

Reviewing files that changed from the base of the PR and between 6ad1987 and 01105a1.

📒 Files selected for processing (1)
  • integration-tests/backend/backend_suite_test.go

Comment thread integration-tests/backend/backend_suite_test.go Outdated
@Amoghrd Amoghrd force-pushed the NETOBSERV-2768 branch 3 times, most recently from 22fee16 to 366bcfc Compare June 10, 2026 14:50
@Amoghrd

Amoghrd commented Jun 10, 2026

Copy link
Copy Markdown
Member Author

/unhold

@Amoghrd Amoghrd added the needs-review Tells that the PR needs a review label Jun 10, 2026
@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Copy link
Copy Markdown

@kapiljain1989: changing LGTM is restricted to collaborators

Details

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kapjain-rh

Copy link
Copy Markdown
Member

/lgtm

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (fbfc6c1) to head (366bcfc).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@     Coverage Diff      @@
##   main   #2799   +/-   ##
============================
============================
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread integration-tests/backend/backend_suite_test.go Outdated
@openshift-ci openshift-ci Bot removed the lgtm label Jun 12, 2026
@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown

New changes are detected. LGTM label has been removed.

Comment thread integration-tests/backend/backend_suite_test.go
@Amoghrd Amoghrd requested a review from memodi June 17, 2026 14:30
@memodi

memodi commented Jun 17, 2026

Copy link
Copy Markdown
Member

new tests are now running for e2etests in CI, the lint-fmt test is failing

@Amoghrd

Amoghrd commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

@memodi Ran YAML lint locally and pushed changed. CI has an issue with YAML lint command not found. Fix in release#80672

@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Copy link
Copy Markdown

@Amoghrd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2etests-lint-fmt 259eebf link true /test e2etests-lint-fmt
ci/prow/e2etest 259eebf link true /test e2etest

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Amoghrd

Amoghrd commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

/test rehearse-80672-pull-ci-netobserv-netobserv-operator-main-e2etests-lint-fmt

@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Copy link
Copy Markdown

@Amoghrd: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test ci-bundle-noo-bundle
/test e2etest
/test e2etests-lint-fmt
/test images

The following commands are available to trigger optional jobs:

/test e2e-operator

Use /test all to run all jobs.

Details

In response to this:

/test rehearse-80672-pull-ci-netobserv-netobserv-operator-main-e2etests-lint-fmt

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference needs-review Tells that the PR needs a review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants