diff --git a/.claude/agents/ci-agent.md b/.claude/agents/ci-agent.md new file mode 100644 index 00000000..4fffa9c6 --- /dev/null +++ b/.claude/agents/ci-agent.md @@ -0,0 +1,283 @@ +--- +name: ci-agent +description: CI/CD validation and workflow integrity. Use when validating Tekton pipelines, checking local/CI parity, debugging CI failures, or ensuring pre-commit hooks mirror CI checks. +tools: Bash, Read, Grep, WebFetch, WebSearch +model: sonnet +--- + +# CI Agent + +CI/CD validation and workflow integrity for this operator. + +## Responsibilities + +### Primary Tasks +- Validate Tekton pipeline integrity +- Ensure local/CI parity +- Detect missing CI checks +- Optimize pipeline execution ordering +- Verify pre-commit mirrors CI + +### CI/CD Components + +**Tekton Pipelines** (`.tekton/`): +- `deadmanssnitch-operator-pull-request.yaml`: PR validation +- `deadmanssnitch-operator-push.yaml`: Main branch builds +- `deadmanssnitch-operator-pko-pull-request.yaml`: PKO validation +- `deadmanssnitch-operator-pko-push.yaml`: PKO deployment + +**Pipeline Stages:** +1. Checkout code +2. Build container image +3. Run linting (golangci-lint) +4. Run unit tests +5. Security scanning (gitleaks, gosec) +6. E2E testing (separate pipeline) +7. 
PKO packaging (separate pipeline) + +## Local/CI Parity + +### Pre-commit ↔ CI Mapping + +| Pre-commit Hook | CI Equivalent | Purpose | +|----------------|---------------|---------| +| `go-build` | Tekton compile check | Ensure code compiles | +| `golangci-lint` | Tekton lint job | Static analysis | +| `gitleaks` | Tekton security scan | Secret detection | +| `go-mod-tidy` | CI dependency check | No uncommitted go.mod/sum | +| `rbac-wildcard-check` | CI security policy | No wildcard RBAC | + +**Parity validation:** +```bash +# Check pre-commit uses same golangci-lint version as CI +grep "rev:" .pre-commit-config.yaml | grep golangci-lint +# Should match version in boilerplate pipeline + +# Check gitleaks version +grep "rev:" .pre-commit-config.yaml | grep gitleaks +``` + +### Running Full CI Locally + +```bash +# Lint (same as CI) +make go-check + +# Tests (same environment as CI) +boilerplate/_lib/container-make go-test + +# Build (same as CI) +make docker-build + +# Full validation +prek run --all-files +make go-test +make go-build +``` + +## Pipeline Validation + +### Tekton Pipeline Health Checks + +```bash +# Check for valid YAML +yamllint .tekton/*.yaml + +# Check for missing required steps +grep "pipelineRef:" .tekton/*.yaml +grep "params:" .tekton/*.yaml +``` + +### Required CI Steps + +Every PR pipeline MUST include: +- ✅ Checkout code +- ✅ Build image +- ✅ Run golangci-lint +- ✅ Run gitleaks +- ✅ Run unit tests +- ✅ Build succeeds + +E2E pipeline additionally includes: +- ✅ Deploy to test cluster +- ✅ Run e2e tests +- ✅ Cleanup + +### Missing Check Detection + +```bash +REQUIRED_CHECKS=( + "golangci-lint" + "gitleaks" + "go test" + "go build" + "rbac-wildcard-check" +) + +for check in "${REQUIRED_CHECKS[@]}"; do + if ! 
grep -q "$check" .tekton/*.yaml; then + echo "WARNING: $check not found in CI" + fi +done +``` + +## Usage + +Invoke when: +- Tekton pipelines modified +- Pre-commit hooks changed +- New validation steps added +- CI failures need investigation +- Optimization needed + +## Commands + +```bash +# Validate Tekton YAML +yamllint .tekton/*.yaml + +# Check pipeline references +grep "pipelineRef:" .tekton/*.yaml + +# Compare pre-commit and CI tools +diff <(grep "rev:" .pre-commit-config.yaml) <(echo "# CI versions from boilerplate") + +# Test container build (same as CI) +make docker-build + +# Run in CI-equivalent environment +boilerplate/_lib/container-make +``` + +## Execution Ordering Optimization + +**Current order (fastest first):** +1. File hygiene (2s) - check-merge-conflict, trailing-whitespace, EOF +2. YAML syntax (2s) - validate deploy/ manifests +3. Secret scan (5s) - gitleaks +4. Go build (10s cached) - compile check +5. Go mod tidy (10s) - dependency drift +6. RBAC check (5s) - wildcard detection +7. 
Static analysis (15s cached) - golangci-lint + +**Why this order:** +- Quick checks first provide fast feedback +- Fail fast on common issues (formatting, secrets) +- Expensive checks (lint) run last +- Total target: <30s on typical changeset + +## Integration with Boilerplate + +This operator uses Red Hat boilerplate: +- **Pipeline source**: `https://github.com/openshift/boilerplate` +- **Pipeline path**: `pipelines/docker-build-oci-ta/pipeline.yaml` +- **Updates**: `make boilerplate-update` + +When boilerplate updates: +- Check for breaking changes +- Test locally before merging +- Update pre-commit hooks to match + +## CI Failure Investigation + +### Lint Failures +```bash +# Reproduce locally +make go-check + +# Or exact CI environment +boilerplate/_lib/container-make go-check +``` + +### Test Failures +```bash +# Reproduce locally +make go-test + +# CI environment +boilerplate/_lib/container-make go-test + +# Check for environment differences +env | grep -E "GO|CI|BUILD" +``` + +### Build Failures +```bash +# Reproduce locally +make docker-build + +# Check Dockerfile +cat build/Dockerfile + +# Verify base image +grep "FROM" build/Dockerfile +``` + +### Secret Scan Failures +```bash +# Reproduce locally +gitleaks detect --source . --verbose + +# Check specific file +gitleaks detect --source . 
--log-opts="" +``` + +## Escalation Conditions + +Escalate to human when: +- CI pipeline consistently fails but local passes +- Tekton pipeline syntax errors +- Boilerplate update breaks CI +- New required check needs adding +- Pipeline execution time >10 minutes +- Konflux/Tekton infrastructure issues + +## Output Format + +Report CI issues in this format: +```text +CI Status: FAILING +Pipeline: deadmanssnitch-operator-pull-request +Stage: golangci-lint +Error: Exit code 1 + +Local Reproduction: + make go-check + +Root Cause: +Fix: +``` + +## Performance Targets + +- **PR pipeline**: <5 minutes total +- **Lint**: <1 minute +- **Unit tests**: <2 minutes +- **Build**: <3 minutes +- **E2E pipeline**: <15 minutes + +If exceeded, investigate: +- Cache misses +- Network issues +- Test parallelization +- Resource constraints + +## CI Security Considerations + +**Pipeline security:** +- Don't disable required checks +- Don't allow bypassing on PRs +- Require approvals for `.tekton/` changes +- Validate pipeline changes carefully + +**Secret handling in CI:** +- Use Tekton Secrets for credentials +- Don't log secrets +- Don't expose secrets in params +- Rotate secrets regularly + +**Image security:** +- Base images from trusted registries +- Scan images for vulnerabilities +- Don't use `latest` tag +- Sign images (if applicable) diff --git a/.claude/agents/docs-agent.md b/.claude/agents/docs-agent.md new file mode 100644 index 00000000..ea984362 --- /dev/null +++ b/.claude/agents/docs-agent.md @@ -0,0 +1,183 @@ +--- +name: docs-agent +description: Documentation maintenance and synchronization. Use when updating docs after code changes, validating command examples, keeping CLAUDE.md synchronized, or fixing documentation drift. +tools: Bash, Read, Edit, Grep +model: sonnet +--- + +# Docs Agent + +Documentation maintenance and synchronization for this operator. 
+ +## Responsibilities + +### Primary Tasks +- Update documentation after code changes +- Ensure command examples remain valid +- Keep CLAUDE.md synchronized with actual workflows +- Validate Markdown formatting +- Check for broken links (if applicable) + +### Documentation Files +- `README.md`: Project overview, badges, links +- `CONTRIBUTING.md`: Contribution guidelines +- `DEVELOPMENT.md`: Developer commands +- `TESTING.md`: Testing guidelines +- `CLAUDE.md` / `AGENTS.md`: AI agent guidance +- `docs/development.md`: Original development guide + +## Update Triggers + +Update docs when: +- **Make targets added/removed**: Update `DEVELOPMENT.md` and `AGENTS.md` +- **API types changed**: Update `README.md` or relevant docs +- **Test framework changes**: Update `TESTING.md` +- **New dependencies**: Update `DEVELOPMENT.md` +- **Pre-commit hooks changed**: Update `CONTRIBUTING.md` +- **Claude Code hooks changed** (`.claude/settings.json`): Update `.claude/hooks/README.md` +- **Build process changed**: Update `DEVELOPMENT.md` and `AGENTS.md` + +## Validation Checks + +### Command Examples +```bash +# Extract make targets from docs and verify they exist +grep -h 'make [a-z]' *.md | grep -oP 'make \K[a-z-]+' | sort -u | while read t; do + make -n "$t" &>/dev/null || echo "Missing target: $t" +done + +# Dry-run make to verify targets exist +make -n go-build +make help +``` + +### Markdown Linting +```bash +# Check for code blocks without language tags +grep -E '```$' *.md + +# Find relative links to verify they exist +grep -E '\[.*\]\(\.' 
*.md +``` + +### Consistency Checks +- All `make` targets in docs exist in `Makefile` +- Pre-commit hooks listed match `prek.toml` and `hack/prek.ci.toml` +- Dependencies in docs match `go.mod` +- Go version in docs matches `go.mod` + +## Usage + +Invoke when: +- Code changes affect documented workflows +- New features added +- Build process modified +- Contributing guidelines need updates + +## Auto-Update Patterns + +### Make Targets +When `Makefile` changes, sync: +- `DEVELOPMENT.md` command reference +- `AGENTS.md` development commands section +- `README.md` if new primary targets added + +### Pre-commit Hooks +When `prek.toml`, `hack/prek.ci.toml`, or `.claude/settings.json` changes, sync: +- `CONTRIBUTING.md` validation section +- `AGENTS.md` validation strategy +- `.claude/hooks/README.md` hook configuration + +### Dependencies +When `go.mod` changes (major versions), sync: +- `DEVELOPMENT.md` prerequisites +- `README.md` badges/requirements + +## Documentation Style + +### Consistency Rules +- Use `bash` for code blocks, not `sh` or `shell` +- Commands should be copy-pasteable +- Include expected output for non-obvious commands +- Use `# Comments` to explain complex commands +- Prefer real examples over placeholders +- Capitalize "Markdown" as a proper noun + +### Link Format +- Use relative paths for internal docs: `[Testing](./TESTING.md)` +- Use full URLs for external links: `[go.uber.org/mock](https://pkg.go.dev/go.uber.org/mock)` +- Check links exist before committing + +## Documentation Sections to Maintain + +### README.md +- Project description stays current +- Badges reflect actual status +- Links to docs are correct +- Quick start is up to date + +### CONTRIBUTING.md +- prek setup matches `prek.toml` and `hack/prek.ci.toml` +- Required checks match CI pipeline +- Examples use current commands +- Security guidelines current + +### DEVELOPMENT.md +- All commands work as documented +- File paths are correct +- Prerequisites match actual requirements 
+- Troubleshooting addresses real issues + +### TESTING.md +- Test commands use current framework (testify/assert + GoMock) +- Mock generation steps reference correct location: `pkg/dmsclient/mock/` +- Coverage instructions work + +### AGENTS.md +- Agent rules reflect current workflows +- Commands are accurate and tested +- Security guardrails comprehensive +- Repo-specific constraints current + +## Escalation Conditions + +Escalate to human when: +- Major architectural docs need rewriting +- Conflicting information across multiple docs +- Command examples fail validation +- Documentation strategy needs rethinking +- Breaking changes require migration guide + +## Integration Points + +- Update docs in same PR as code changes +- Keep docs in sync with implementation +- No separate "docs update" PRs unless fixing errors + +## Validation Commands + +```bash +# Check all markdown files +find . -name "*.md" -not -path "./vendor/*" -not -path "./.git/*" -not -path "./boilerplate/*" + +# Verify make targets exist +grep -h 'make [a-z-]*' *.md | grep -oP '`make \K[a-z-]+' | sort -u + +# Check for dead links (manual review) +grep -r '\[.*\](' *.md +``` + +## Output Format + +When updating docs, report: +```text +Updated: DEVELOPMENT.md +- Fixed Go version requirement: 1.17 -> 1.25.4 +- Removed nonexistent make target: make tools +- Updated mock path: pkg/util/test/generated/ -> pkg/dmsclient/mock/ + +Validated: +- All make targets exist and work +- All command examples tested +- Links checked +``` diff --git a/.claude/agents/lint-agent.md b/.claude/agents/lint-agent.md new file mode 100644 index 00000000..10831f84 --- /dev/null +++ b/.claude/agents/lint-agent.md @@ -0,0 +1,104 @@ +--- +name: lint-agent +description: Automated linting and code quality enforcement. Use when running formatting checks, executing golangci-lint, auto-fixing safe issues, or investigating CI lint failures. 
+tools: Bash, Read, Edit +model: sonnet +--- + +# Lint Agent + +Automated linting and code quality enforcement for this operator. + +## Responsibilities + +### Primary Tasks +- Run formatting checks (`go fmt`) +- Execute golangci-lint with repo configuration +- Auto-fix safe linting issues +- Preserve existing code style and patterns +- Report unfixable issues with context + +### Validation Flow +1. Check if Go files have changed +2. Run `gofmt -l .` to detect formatting issues +3. Auto-fix formatting: `go fmt ./...` +4. Run `make go-check` (golangci-lint) +5. Attempt auto-fixes: `golangci-lint run --fix` +6. Report remaining issues with file:line references + +### Auto-Fix Criteria +Safe to auto-fix: +- Formatting (gofmt) +- Unused imports +- Simplifiable code (gosimple) +- Ineffectual assignments +- Trailing whitespace + +DO NOT auto-fix: +- Potential bugs (govet errors) +- Security issues (gosec warnings) +- Cyclomatic complexity violations +- API breaking changes + +## Usage + +Invoke when: +- Pre-commit validation needed +- After code generation +- Before creating PR +- CI lint failures need investigation + +## Commands + +```bash +# Format check only (gofmt lists unformatted files; `go fmt` has no -l flag) +gofmt -l . | grep -v "^$" + +# Format and fix +go fmt ./... 
+ +# Full lint (as in CI) +make go-check + +# Lint with auto-fix +golangci-lint run --fix --config=boilerplate/openshift/golang-osd-operator/golangci.yml + +# Lint specific files +golangci-lint run --config=boilerplate/openshift/golang-osd-operator/golangci.yml +``` + +## Configuration + +Lint config: `boilerplate/openshift/golang-osd-operator/golangci.yml` + +Key rules: +- `govet`: Go static analysis +- `gosec`: Security scanning +- `staticcheck`: Bug detection +- `gocyclo`: Complexity checks +- `gofmt`: Formatting +- `goimports`: Import management + +## Output Format + +Report issues in this format: +```text +[FILE:LINE] [LINTER] Issue description +Example: controllers/deadmanssnitchintegration/deadmanssnitchintegration_controller.go:42 [govet] unreachable code +``` + +## Escalation Conditions + +Escalate to human when: +- Security warnings from gosec +- Cyclomatic complexity >15 (requires refactoring) +- API compatibility issues +- Multiple unfixable errors (>5) +- Linter configuration issues + +## Integration Points + +- Runs as part of `prek run golangci-lint` +- Mirrors Tekton CI lint job +- Should complete in <30s on typical changeset +- Uses same config as CI (no drift) diff --git a/.claude/agents/security-agent.md b/.claude/agents/security-agent.md new file mode 100644 index 00000000..c1bfdef8 --- /dev/null +++ b/.claude/agents/security-agent.md @@ -0,0 +1,235 @@ +--- +name: security-agent +description: Security scanning and policy enforcement. Use when scanning for secrets, validating RBAC (no wildcards), checking insecure patterns, or investigating security violations in CI. +tools: Bash, Read, Grep, Edit +model: sonnet +--- + +# Security Agent + +Security scanning and policy enforcement for this operator. 
+ +## Responsibilities + +### Primary Tasks +- Scan for hardcoded secrets and credentials +- Validate RBAC configurations (no wildcards) +- Check for insecure patterns in code +- Detect dangerous operations +- Enforce security policies + +### Security Checks + +#### 1. Secret Scanning +```bash +# Gitleaks (runs in pre-commit) +prek run gitleaks + +# Manual scan +gitleaks detect --source . --verbose +``` + +**Detect:** +- AWS keys (access key ID, secret access key) +- GitHub tokens +- API keys +- Private keys (PEM, SSH) +- Passwords in code or config +- Database connection strings with credentials +- High-entropy strings (potential secrets) + +#### 2. RBAC Wildcard Check +```bash +# Pre-commit hook enforces this +make rbac-wildcard-check +``` + +**Forbidden patterns in `deploy/*.yaml`:** +- `resources: ["*"]` +- `verbs: ["*"]` +- `apiGroups: ["*"]` (usually) +- Multi-line format: `- '*'` + +**Enforcement:** +- ALWAYS specify exact resource types +- ALWAYS specify exact verbs +- Wildcard permissions are NEVER acceptable + +#### 3. Code Security Patterns + +**Dangerous patterns to detect:** +```go +// Secrets in code +password := "hardcoded-secret" // FORBIDDEN +apiKey := os.Getenv("API_KEY") // OK if not logged + +// Logging secrets +logger.Info("token: " + token) // FORBIDDEN +logger.Info("request authenticated") // OK + +// Command injection +exec.Command("sh", "-c", userInput) // DANGEROUS +exec.Command("kubectl", "get", "pods", podName) // OK if podName validated + +// Unsafe YAML/JSON unmarshaling +yaml.Unmarshal(untrustedInput, &obj) // Validate schema first + +// File path traversal +filepath.Join(baseDir, userInput) // Validate userInput doesn't contain ".." +``` + +#### 4. 
Dependency Vulnerabilities +```bash +# Check for known vulnerabilities in dependencies +go list -json -m all | nancy sleuth + +# Scan go.mod for outdated dependencies with CVEs +# (This requires external tooling not in current repo) +``` + +## Usage + +Invoke when: +- Before committing code +- RBAC manifests modified +- Secret handling code changed +- CI/CD pipelines modified +- Dockerfile updated +- Network policy changed + +## Commands + +```bash +# Full security scan +prek run gitleaks --all-files +make rbac-wildcard-check +make go-check # includes gosec + +# Individual checks +gitleaks detect --source . --verbose +golangci-lint run --enable gosec +grep -r "password\s*:=\s*\"" --include="*.go" . +``` + +## High-Risk File Detection + +Files requiring extra scrutiny: +- `deploy/*.yaml` (RBAC, NetworkPolicy) +- `*_rbac.go` (authorization logic) +- `controllers/deadmanssnitchintegration/deadmanssnitchintegration_controller.go` (main reconciler with secret handling) +- `.tekton/*.yaml` (CI/CD pipelines) +- `build/Dockerfile` (container security) + +## Security Policy Enforcement + +### Secrets +- ✅ Use Kubernetes Secrets with references +- ✅ Use environment variables (with care) +- ✅ Use external secret management (Vault, etc.) 
+- ❌ Never hardcode secrets +- ❌ Never log secrets +- ❌ Never commit `.env` files with secrets + +### RBAC +- ✅ Specify exact resources and verbs +- ✅ Use Role for namespace-scoped permissions +- ✅ Use ClusterRole sparingly +- ❌ Never use wildcard permissions +- ❌ Never grant `cluster-admin` + +### Network Policies +- ✅ Default deny all traffic +- ✅ Explicitly allow required connections +- ✅ Document each ingress/egress rule +- ❌ Don't create overly permissive policies + +### Container Security +- ✅ Use minimal base images +- ✅ Run as non-root user +- ✅ Set read-only root filesystem +- ✅ Drop unnecessary capabilities +- ❌ Don't use `latest` tag +- ❌ Don't run as root + +## Gitleaks Configuration + +Custom allowlist in `.gitleaks.toml`: +- Known false positives +- Test fixtures with fake credentials +- Public key material (certificates) +- Non-secret high-entropy strings + +## Output Format + +Report findings in this format: +```text +[SEVERITY] [CATEGORY] Location: Issue +Example: [HIGH] [SECRET] pkg/dmsclient/dmsclient.go:42: Hardcoded API key detected +Example: [CRITICAL] [RBAC] deploy/role.yaml:15: Wildcard permission not allowed +``` + +Severity levels: +- **CRITICAL**: Immediate fix required (secrets committed, wildcard RBAC) +- **HIGH**: Security vulnerability (code injection, auth bypass) +- **MEDIUM**: Risky pattern (weak crypto, missing validation) +- **LOW**: Security hygiene (outdated dependency, missing security header) + +## Auto-Remediation + +Safe to auto-fix: +- Removing trailing whitespace from manifests +- Fixing YAML indentation + +NOT safe to auto-fix: +- Adding or modifying security context in manifests (requires manual review) +- Removing wildcards from RBAC (requires understanding requirements) +- Removing secrets from code (requires alternative solution) +- Changing authentication logic +- Modifying NetworkPolicies + +## Escalation Conditions + +Escalate immediately when: +- Secrets detected in commit +- Wildcard RBAC permissions found +- 
Authentication/authorization logic changed +- Network policy allows all traffic +- Dockerfile runs as root +- CI pipeline modified to skip security checks + +Escalate for review when: +- gosec warnings in security-critical code +- New dependency with known CVEs +- Crypto algorithm changes +- External network call added + +## Integration Points + +- **Pre-commit**: gitleaks runs automatically via prek +- **CI**: Tekton runs gitleaks and gosec +- **RBAC check**: Custom make target +- **Manual**: Run before modifying security-critical code + +## FIPS Compliance + +This operator requires FIPS 140-2 compliance: +- All crypto operations must use validated libraries +- No weak algorithms (MD5, SHA1, DES) +- TLS 1.2+ only +- FIPS-approved key lengths + +Check crypto usage: +```bash +grep -r "crypto/" --include="*.go" . | grep -v "crypto/tls" +grep -r "md5\|sha1\|des" --include="*.go" . +``` + +## False Positive Handling + +If gitleaks flags non-secret: +1. Verify it's truly not a secret +2. Add to `.gitleaks.toml` allowlist with justification +3. Document why it's safe +4. Review periodically + +Never disable gitleaks entirely or use `SKIP=gitleaks`. diff --git a/.claude/agents/test-agent.md b/.claude/agents/test-agent.md new file mode 100644 index 00000000..3c1d80c2 --- /dev/null +++ b/.claude/agents/test-agent.md @@ -0,0 +1,181 @@ +--- +name: test-agent +description: Automated testing and test quality assurance. Use when running targeted tests for changed code, analyzing test failures, debugging flaky tests, or ensuring test coverage. +tools: Bash, Read, Edit +model: sonnet +--- + +# Test Agent + +Automated testing and test quality assurance for this operator. + +## Responsibilities + +### Primary Tasks +- Run targeted unit tests for changed code +- Detect and report flaky test failures +- Suggest minimal fixes for test failures +- Ensure test coverage for new code +- Avoid unnecessary test reruns + +### Test Execution Strategy +1. 
**Incremental testing**: Run only affected packages +2. **Failure analysis**: Distinguish real bugs from flaky tests +3. **Minimal fixes**: Fix the test or the bug, not surrounding code +4. **Coverage validation**: Ensure new code has tests + +### Test Selection Logic + +```bash +# Changed Go files +CHANGED_FILES=$(git diff --name-only HEAD | grep "\.go$") + +# Extract packages +PACKAGES=$(echo "$CHANGED_FILES" | xargs -n1 dirname | sort -u | tr '\n' ' ') + +# Run targeted tests +for pkg in $PACKAGES; do + go test -v ./$pkg/... +done +``` + +## Usage + +Invoke when: +- Code changes committed +- Test failures in CI +- Before creating PR +- After code generation (mocks changed) + +## Commands + +```bash +# All tests +make go-test + +# Specific package +go test -v ./controllers/deadmanssnitchintegration/ + +# Specific test by name +go test -v -run TestReconcileClusterDeployment ./controllers/deadmanssnitchintegration/ + +# All packages +go test ./... + +# Coverage +go test -coverprofile=coverage.out ./... +go tool cover -html=coverage.out + +# Container-based (CI parity) +boilerplate/_lib/container-make go-test +``` + +## Failure Analysis + +### Real Failure Indicators +- Consistent failure across multiple runs +- Failed assertion with unexpected value +- Panic or runtime error +- Compilation error in test + +### Flaky Test Indicators +- Passes on retry without code changes +- Timeout issues +- Race condition symptoms +- Environment-dependent failures + +### Test Debugging + +```bash +# Run test multiple times to detect flakiness +for i in {1..5}; do go test ./controllers/deadmanssnitchintegration/ || break; done + +# Verbose output +go test -v ./controllers/deadmanssnitchintegration/ + +# Race detector +go test -race ./controllers/deadmanssnitchintegration/ +``` + +## Test Framework + +Tests use **testify/assert** for assertions and **go.uber.org/mock** (GoMock) for mocking the DMS API client. 
+ +The DMS API client mock is at `pkg/dmsclient/mock/mock_dmsclient.go`, generated from the `Client` interface in `pkg/dmsclient/dmsclient.go`. + +### Writing Tests with testify + +```go +import ( + "testing" + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestMyFeature(t *testing.T) { + result, err := MyFunction() + require.NoError(t, err) + assert.Equal(t, expected, result) +} +``` + +### Using GoMock + +```go +import ( + "go.uber.org/mock/gomock" + "github.com/openshift/deadmanssnitch-operator/pkg/dmsclient/mock" +) + +func TestReconcile(t *testing.T) { + ctrl := gomock.NewController(t) + defer ctrl.Finish() + + mockClient := mock.NewMockClient(ctrl) + mockClient.EXPECT().CreateSnitch(gomock.Any()).Return(&dmsclient.Snitch{}, nil) + // ... +} +``` + +## Fix Strategy + +**Test fails due to code bug:** +1. Identify failing assertion +2. Locate corresponding production code +3. Fix the bug +4. Verify fix with targeted test run +5. Run full suite to check for regressions + +**Test fails due to outdated mocks:** +1. Check if `pkg/dmsclient/dmsclient.go` interface changed +2. Regenerate mocks: `boilerplate/_lib/container-make generate` +3. Update test expectations if needed +4. Rerun tests + +**Test fails due to test bug:** +1. Review test logic +2. Fix test setup or assertions +3. Ensure test is deterministic +4. 
Avoid hardcoded timeouts or sleeps + +## Test Coverage Requirements + +New code MUST have: +- Unit tests for public functions +- Error path testing +- Edge case coverage +- Mock-based isolation from external APIs (DMS client) + +Don't test: +- Generated code (`zz_generated.*.go`, `mock_dmsclient.go`) +- Trivial getters/setters +- Third-party library wrappers (test your logic, not theirs) + +## Escalation Conditions + +Escalate to human when: +- Consistent test failures across multiple packages +- Flaky tests that can't be made deterministic +- Coverage drops significantly +- Tests require architectural changes +- Mock generation fails diff --git a/.claude/hooks/README.md b/.claude/hooks/README.md new file mode 100644 index 00000000..8626b26b --- /dev/null +++ b/.claude/hooks/README.md @@ -0,0 +1,207 @@ +# Claude Code Hooks + +Security and validation hooks for this operator development. + +## Overview + +This repository uses **prek** (git hook manager) for quality checks and validation. Claude Code hooks integrate with prek to provide immediate feedback during development. 
+ +## Architecture + +```text +┌─────────────────────────────────────┐ +│ Developer / Claude Code Agent │ +└──────────────┬──────────────────────┘ + │ + ▼ +┌─────────────────────────────────────┐ +│ Stop Hook (conditional) │ +│ - Default: runs only with changes │ +│ - Strict: runs every turn │ +│ - Blocks if issues found │ +│ - Claude fixes automatically │ +└──────────────┬──────────────────────┘ + │ + ▼ +┌─────────────────────────────────────┐ +│ Prek Hooks (CI config) │ +│ - golangci-lint (static analysis) │ +│ - RBAC wildcard check │ +│ - go build validation │ +│ - go mod tidy check │ +│ - file hygiene (trailing space) │ +└──────────────┬──────────────────────┘ + │ +┌──────────────▼──────────────────────┐ +│ Prek Hooks (full config) │ +│ + rh-pre-commit (InfoSec) │ +│ + gitleaks (secret scanning) │ +└──────────────┬──────────────────────┘ + │ + ▼ +┌─────────────────────────────────────┐ +│ Git Commit │ +└──────────────┬──────────────────────┘ + │ + ▼ +┌─────────────────────────────────────┐ +│ CI/CD (Tekton Pipelines) │ +└─────────────────────────────────────┘ +``` + +## Available Hooks + +### [stop-prek-validation.sh](./stop-prek-validation.sh) +**Purpose**: Run prek validation when Claude makes changes (or always, if configured) + +**Triggers**: On Claude Code session stop (Stop hook) + +**Default mode** (recommended): +- Only runs if there are uncommitted changes +- Skips validation for read-only queries (fast iteration) +- Validates when Claude modifies code (before commit) + +**Strict mode** (opt-in): +- Set environment variable: `export CLAUDE_LINT_ON_STOP=true` +- Always runs validation on every stop +- Slower but catches issues immediately + +**Common behavior**: +- Runs `prek run --config hack/prek.ci.toml` on changed files +- Uses CI-compatible config (skips network-dependent hooks like rh-pre-commit, gitleaks) +- Blocks Claude from stopping if issues found +- Feeds errors back to Claude for automatic fixes + +**Enable strict mode**: +```bash +# In 
your shell profile (~/.zshrc, ~/.bashrc) +export CLAUDE_LINT_ON_STOP=true +``` + +--- + +### [pre-edit.sh](./pre-edit.sh) +**Purpose**: Prevent editing generated files and warn about high-risk changes + +**Status**: Available for standalone use (not wired as a Claude Code hook) + +**Checks**: +- Generated files (`zz_generated.*.go`) +- Generated mocks (`pkg/dmsclient/mock/mock_dmsclient.go`) +- Vendored code (`vendor/`) +- Boilerplate files (managed upstream) +- High-risk security files (RBAC, auth, NetworkPolicy) +- CI/CD pipelines (`.tekton/*.yaml`) +- Dockerfiles + +**Manual Usage**: +```bash +.claude/hooks/pre-edit.sh path/to/file.go +``` + +--- + +## Prek Configuration + +This repository maintains **two prek configurations**: + +### 1. **prek.toml** (Full validation) +Used for local development with internal network access. + +**Hooks**: +- File hygiene (trailing whitespace, EOF, syntax checks) +- **rh-pre-commit**: Red Hat InfoSec security checks (requires `gitlab.cee.redhat.com` access) +- **gitleaks**: Secret detection (configured via `.gitleaks.toml`) +- **golangci-lint**: Static analysis +- **go-build**: Compile check +- **go-mod-tidy**: Dependency drift detection +- **rbac-wildcard-check**: RBAC validation + +**Usage**: +```bash +prek run # Uses prek.toml by default +``` + +### 2. **hack/prek.ci.toml** (CI-compatible) +Used by Claude Code stop hook and CI environments without internal network access. 
+ +**Excludes**: +- `rh-pre-commit` (requires Red Hat internal network) +- `gitleaks` (may not be available in all CI environments) + +**Usage**: +```bash +hack/ci.sh +# or +prek run --config hack/prek.ci.toml --all-files +``` + +## Setup + +### Prerequisites +```bash +# Install prek (choose one) +uv tool install prek # recommended +pipx install prek # alternative +pip install --user prek # fallback +``` + +### Install Git Hooks +```bash +prek install +``` + +## Security Guardrails + +### Secret Prevention +**Implementation**: gitleaks via prek +**Configuration**: `.gitleaks.toml` +**Action**: BLOCK commit + +### InfoSec Scanning +**Implementation**: rh-pre-commit via prek +**Action**: BLOCK commit on violations + +### RBAC Validation +**Implementation**: rbac-wildcard-check via prek +**Detects**: Wildcard resources/verbs in `deploy/*.yaml` +**Action**: BLOCK commit + +### File Edit Protection +**Implementation**: pre-edit.sh (standalone) +**Action**: BLOCK edit on generated/vendored files, WARN on high-risk files + +## Hook Performance + +**Targets:** +- Stop hook: <30s for full validation +- Pre-commit hook: <30s on typical changeset + +**Never bypass hooks:** +```bash +# FORBIDDEN +git commit --no-verify +SKIP=hook-id git commit +``` + +Security hooks (gitleaks, rh-pre-commit) must NEVER be bypassed. 
+ +## Version Management + +Prek version pinned in `.prek-version`: +```bash +cat .prek-version # v0.4.1 +``` + +Hook dependencies pinned in `prek.toml`: +- `rh-pre-commit-2.3.0` +- `v8.18.0` (gitleaks) +- `v2.0.2` (golangci-lint) + +## References + +- [Prek Documentation](https://prek.j178.dev/) +- [Gitleaks](https://github.com/gitleaks/gitleaks) +- [RH InfoSec Tools](https://gitlab.cee.redhat.com/infosec-public/developer-workbench/tools) +- [golangci-lint](https://golangci-lint.run/) +- [AGENTS.md](../../AGENTS.md) - Development guidelines diff --git a/.claude/hooks/pre-edit.sh b/.claude/hooks/pre-edit.sh new file mode 100755 index 00000000..dc9f8e43 --- /dev/null +++ b/.claude/hooks/pre-edit.sh @@ -0,0 +1,163 @@ +#!/usr/bin/env bash +# +# Pre-Edit Hook for this operator +# Prevents editing generated files, vendored code, and high-risk files without warning +# +# Usage: .claude/hooks/pre-edit.sh path/to/file.go (standalone; not wired as a Claude Code hook) +# + +set -euo pipefail + +FILE="${1:-}" + +if [[ -z "$FILE" ]]; then + echo "Usage: $0 <file>" + exit 1 +fi + +REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || echo ".") + +if [[ "$FILE" = /* ]]; then + if [[ ! 
"$FILE" == "$REPO_ROOT"/* ]]; then + echo "❌ ERROR: File path is outside repository: $FILE" + exit 1 + fi + FILE="${FILE#"$REPO_ROOT"/}" +fi + +if command -v realpath >/dev/null 2>&1; then + CANONICAL=$(realpath -m --relative-to="$REPO_ROOT" "$FILE" 2>/dev/null || realpath --relative-to="$REPO_ROOT" "$FILE" 2>/dev/null || echo "") +else + CANONICAL="" +fi + +if [[ -z "$CANONICAL" ]]; then + if command -v python3 >/dev/null 2>&1; then + CANONICAL=$(python3 -c "import os.path; print(os.path.relpath(os.path.normpath('$FILE'), '$REPO_ROOT'))" 2>/dev/null || echo "") + elif command -v python >/dev/null 2>&1; then + CANONICAL=$(python -c "import os.path; print(os.path.relpath(os.path.normpath('$FILE'), '$REPO_ROOT'))" 2>/dev/null || echo "") + fi +fi + +if [[ -z "$CANONICAL" ]] || [[ "$CANONICAL" == *".."* ]]; then + echo "❌ ERROR: Invalid file path (contains traversal): $FILE" + exit 1 +fi +FILE="$CANONICAL" +FILE="${FILE#./}" + +confirm_or_exit() { + local prompt="$1" + echo "$prompt" + + if [[ ! -t 0 ]]; then + echo "❌ ERROR: Non-interactive environment detected" + echo " This operation requires manual confirmation" + exit 1 + fi + + read -r response + if [[ ! "$response" =~ ^[Yy]$ ]]; then + exit 1 + fi +} + +# ============================================================================= +# GENERATED FILES - BLOCK EDITS +# ============================================================================= + +if [[ "$FILE" == *"zz_generated."* ]]; then + echo "❌ BLOCKED: Cannot edit generated file: $FILE" + echo " This file is auto-generated by controller-gen." + echo " To regenerate: boilerplate/_lib/container-make generate" + exit 1 +fi + +# Mock at pkg/dmsclient/mock/mock_dmsclient.go +if [[ "$FILE" == *"/mock/mock_"* ]]; then + echo "❌ BLOCKED: Cannot edit generated mock: $FILE" + echo " This file is auto-generated by mockgen." 
+ echo " To regenerate: boilerplate/_lib/container-make generate" + exit 1 +fi + +if [[ "$FILE" == deploy/crds/* ]] && [[ "$FILE" == *.yaml ]]; then + echo "⚠️ WARNING: Editing generated CRD manifest: $FILE" + echo " CRDs are generated from API types." + echo " Consider editing api/v1alpha1/*.go instead." + echo " To regenerate CRDs: make manifests" + echo "" + confirm_or_exit " Continue? (y/N)" +fi + +# ============================================================================= +# LOCKFILES - WARNING +# ============================================================================= + +if [[ "$FILE" == "go.sum" ]]; then + echo "⚠️ WARNING: Editing go.sum directly" + echo " This file is managed by 'go mod tidy'." + confirm_or_exit " Are you sure you want to edit it manually? (y/N)" +fi + +# ============================================================================= +# VENDORED CODE - BLOCK EDITS +# ============================================================================= + +if [[ "$FILE" == vendor/* ]]; then + echo "❌ BLOCKED: Cannot edit vendored code: $FILE" + echo " Vendor directory is managed by go modules." + echo " Update dependencies in go.mod instead." + exit 1 +fi + +if [[ "$FILE" == boilerplate/* ]] && [[ "$FILE" != boilerplate/update* ]]; then + echo "⚠️ WARNING: Editing boilerplate file: $FILE" + echo " Boilerplate is managed upstream." + echo " Local changes may be overwritten by 'make boilerplate-update'." + confirm_or_exit " Continue? 
(y/N)" +fi + +# ============================================================================= +# HIGH-RISK FILES - WARNING +# ============================================================================= + +HIGH_RISK_PATTERNS=( + "*/rbac.go" + "*/auth*.go" + "*_rbac.yaml" + "*/networkpolicy*.go" + "*[Cc]luster[Rr]ole*.yaml" + ".tekton/*.yaml" + "build/Dockerfile" +) + +for pattern in "${HIGH_RISK_PATTERNS[@]}"; do + # shellcheck disable=SC2053 + if [[ "$FILE" == $pattern ]]; then + echo "⚠️ HIGH-RISK FILE: $FILE" + echo " This file affects security or CI/CD." + echo " Changes require:" + echo " - Careful review" + echo " - Test coverage" + echo " - Security validation" + echo "" + confirm_or_exit " Continue? (y/N)" + break + fi +done + +# ============================================================================= +# LARGE DIFFS - WARNING +# ============================================================================= + +if [[ -f "$FILE" ]]; then + LINES=$(wc -l < "$FILE") + if (( LINES > 500 )); then + echo "⚠️ LARGE FILE: $FILE ($LINES lines)" + echo " Prefer targeted edits over broad refactors." + confirm_or_exit " Continue? (y/N)" + fi +fi + +exit 0 diff --git a/.claude/hooks/session-start-prek-setup.sh b/.claude/hooks/session-start-prek-setup.sh new file mode 100755 index 00000000..a6684bdd --- /dev/null +++ b/.claude/hooks/session-start-prek-setup.sh @@ -0,0 +1,58 @@ +#!/usr/bin/env bash +# +# Session Start Hook: Prek Setup +# +# Ensures prek (pre-commit) is installed and configured when Claude Code starts +# +# What it does: +# 1. Checks if prek is installed +# 2. If installed, ensures git hooks are wired up (prek install) +# 3. Provides helpful guidance if prek is missing +# +set -uo pipefail + +# Ensure we're running from the git repository root +REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) +if [[ -z "$REPO_ROOT" ]]; then + exit 0 +fi +cd "$REPO_ROOT" || exit 0 + +# Check if prek is installed +if ! 
command -v prek &> /dev/null; then + cat >&2 <<'EOF' +┌────────────────────────────────────────────────────────────────┐ +│ ⚠️ prek (pre-commit hooks) is not installed │ +│ │ +│ This project uses prek for code quality validation. │ +│ │ +│ Install it: │ +│ uv tool install prek # recommended │ +│ pipx install prek # alternative │ +│ pip install --user prek # fallback │ +│ │ +│ Then wire up git hooks: prek install │ +└────────────────────────────────────────────────────────────────┘ + +EOF + exit 0 +fi + +# Resolve the actual hooks directory — in a worktree, .git is a file, +# not a directory, so we use git rev-parse to find the hooks path. +GIT_HOOKS_DIR=$(git rev-parse --git-path hooks 2>/dev/null) + +# Check if git hooks are configured with prek +if [[ ! -f "$GIT_HOOKS_DIR/pre-commit" ]] || ! grep -q "prek" "$GIT_HOOKS_DIR/pre-commit" 2>/dev/null; then + echo "⚙️ Setting up prek pre-commit hooks..." >&2 + + if prek install >/dev/null 2>&1; then + echo "✅ Pre-commit enabled" >&2 + else + echo "⚠️ Failed to install prek hooks - you may need to run 'prek install' manually" >&2 + fi +else + echo "✅ Pre-commit enabled" >&2 +fi + +exit 0 diff --git a/.claude/hooks/stop-prek-validation.sh b/.claude/hooks/stop-prek-validation.sh new file mode 100755 index 00000000..21f17c75 --- /dev/null +++ b/.claude/hooks/stop-prek-validation.sh @@ -0,0 +1,86 @@ +#!/usr/bin/env bash +# +# Stop Hook: Prek Validation +# +# Runs prek validation when Claude Code stops with smart triggering: +# +# Default mode (CLAUDE_LINT_ON_STOP not set): +# - Only runs when there are uncommitted changes +# - Skips validation for read-only queries (fast iteration) +# - Validates when Claude modifies code (catch issues before commit) +# +# Strict mode (export CLAUDE_LINT_ON_STOP=true): +# - Always runs validation on every stop +# - Use when you want maximum quality enforcement +# - Slower but catches issues immediately +# +set -uo pipefail + +REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) +if [[ -z 
"$REPO_ROOT" ]]; then + jq -n '{"decision": "block", "reason": "Not in a git repository. Cannot run prek validation."}' + exit 0 +fi +cd "$REPO_ROOT" || exit 1 + +if ! command -v jq &> /dev/null; then + cat <<'EOF' +{"decision": "block", "reason": "jq is not installed — required for hook processing.\n\nInstall it:\n brew install jq # macOS\n apt-get install jq # Debian/Ubuntu\n yum install jq # RHEL/CentOS\n\nRetry the action once installed."} +EOF + exit 0 +fi + +HOOK_INPUT=$(cat) + +STOP_HOOK_ACTIVE=$(echo "$HOOK_INPUT" | jq -r '.stop_hook_active // false') +if [[ "$STOP_HOOK_ACTIVE" == "true" ]]; then + exit 0 +fi + +FORCE_LINT="${CLAUDE_LINT_ON_STOP:-false}" + +if [[ "$FORCE_LINT" != "true" ]]; then + if git diff-index --quiet HEAD -- 2>/dev/null && [[ -z "$(git ls-files --others --exclude-standard)" ]]; then + exit 0 + fi +fi + +if ! command -v prek &> /dev/null; then + jq -n \ + --arg reason "prek is not installed — required for quality checks before stopping. + +Install it: + uv tool install prek # recommended + pipx install prek # alternative + pip install --user prek # fallback + +Then wire up the git hook: prek install + +Retry the action once installed so validation can run." \ + '{"decision": "block", "reason": $reason}' + exit 0 +fi + +# Collect changed files (staged + unstaged + untracked) +mapfile -t CHANGED_FILES < <( + git diff --name-only --diff-filter=d HEAD 2>/dev/null + git ls-files --others --exclude-standard 2>/dev/null +) + +if [[ ${#CHANGED_FILES[@]} -eq 0 ]]; then + PREK_OUTPUT=$(prek run --all-files --config hack/prek.ci.toml 2>&1) +else + # Pass files via null-delimited list to handle names with spaces + PREK_OUTPUT=$(printf '%s\0' "${CHANGED_FILES[@]}" | xargs -0 prek run --config hack/prek.ci.toml --files 2>&1) +fi +PREK_EXIT=$? + +if [[ $PREK_EXIT -eq 0 ]]; then + exit 0 +fi + +jq -n \ + --arg reason "prek validation failed. 
Fix the issues below, then try again: + +$PREK_OUTPUT" \ + '{"decision": "block", "reason": $reason}' diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 00000000..4ba7395d --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,84 @@ +{ + "permissions": { + "allow": [ + "Bash(make go-build)", + "Bash(make go-test)", + "Bash(make go-check)", + "Bash(make lint)", + "Bash(make run)", + "Bash(make generate)", + "Bash(make coverage)", + "Bash(make validate)", + "Bash(go build ./...)", + "Bash(go test ./...)", + "Bash(go test *)", + "Bash(go fmt ./...)", + "Bash(go mod tidy)", + "Bash(prek run)", + "Bash(prek run *)", + "Bash(prek install)", + "Bash(prek --version)", + "Bash(boilerplate/_lib/container-make)", + "Bash(boilerplate/_lib/container-make *)", + "Bash(git status *)", + "Bash(git diff *)", + "Bash(git log *)", + "Bash(git branch *)", + "Bash(grep *)", + "Bash(find *)", + "Bash(ls *)", + "Bash(cat *)" + ], + "ask": [ + "Bash(git commit *)", + "Bash(git push *)", + "Bash(git reset *)", + "Bash(git rebase *)", + "Bash(git push --force-with-lease *)", + "Bash(git push * --force-with-lease)", + "Bash(git push * --force-with-lease *)", + "Bash(make docker-build)", + "Bash(kubectl *)", + "Bash(oc *)" + ], + "deny": [ + "Bash(git commit --no-verify)", + "Bash(git commit --no-verify *)", + "Bash(git commit -n)", + "Bash(git commit -n *)", + "Bash(git push --force)", + "Bash(git push --force *)", + "Bash(git push -f)", + "Bash(git push -f *)", + "Bash(git push * --force)", + "Bash(git push * --force *)", + "Bash(git push * -f)", + "Bash(git push * -f *)", + "Bash(rm -rf /)", + "Bash(chmod 777 *)" + ] + }, + "hooks": { + "SessionStart": [ + { + "hooks": [ + { + "type": "command", + "command": "bash \"$(git rev-parse --show-toplevel)/.claude/hooks/session-start-prek-setup.sh\"", + "async": true + } + ] + } + ], + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "bash \"$(git rev-parse 
--show-toplevel)/.claude/hooks/stop-prek-validation.sh\"" + } + ] + } + ] + } +} diff --git a/.claude/skills/README.md b/.claude/skills/README.md new file mode 100644 index 00000000..d7e220ce --- /dev/null +++ b/.claude/skills/README.md @@ -0,0 +1,88 @@ +# Claude Skills + +Reusable workflow skills for developing this operator. + +## Available Skills + +### [prow-ci](./prow-ci/SKILL.md) +**Purpose**: Access and analyze OpenShift Prow CI results + +**When to use**: +- Investigating CI failures +- Checking test results +- Analyzing build logs +- Debugging failed PR checks + +**Key capabilities**: +- Access Prow dashboard and job results +- Retrieve build logs and artifacts +- Debug test failures +- Compare local vs CI results + +**Resources**: +- [Prow Dashboard](https://prow.ci.openshift.org/) +- [CI Search](https://github.com/openshift/ci-search) + +## Usage + +Skills are reusable workflows that combine multiple tools and knowledge to accomplish specific tasks. + +### Invoking Skills + +Skills can be referenced in Claude conversations: +```text +"Use the prow-ci skill to investigate the failed test in PR #123" +"Check Prow CI results for the latest build" +``` + +### Skill Components + +Each skill typically includes: +- **Purpose**: What the skill does +- **Usage**: When to invoke it +- **Commands**: Specific commands to run +- **Troubleshooting**: Common issues and solutions +- **Integration**: How it works with other tools + +## Creating New Skills + +To add a new skill: + +1. Create a subdirectory: `skillname/` in this directory +2. Create `SKILL.md` inside the subdirectory +3. Use frontmatter with metadata: + ```yaml + --- + name: skillname + description: Brief description of what this skill does + trigger: skill triggers, slash command synonyms + --- + ``` +4. Document commands and workflows in the markdown body +5. Update this README +6. 
Test the skill workflow + +**Directory structure**: +```text +.claude/skills/ +├── README.md +└── skillname/ + ├── SKILL.md # Required: skill definition + └── reference/ # Optional: supporting docs +``` + +## Integration with Other Components + +**Skills vs Agents**: +- **Agents**: Autonomous actors with specific responsibilities +- **Skills**: Reusable workflows that agents or humans execute + +**Skills vs Hooks**: +- **Hooks**: Automated enforcement (runs automatically) +- **Skills**: On-demand workflows (runs when invoked) + +## References + +- [AGENTS.md](../../AGENTS.md) - Agent behavioral rules +- [.claude/agents/](../agents/) - Specialized agents +- [.claude/hooks/](../hooks/) - Security and validation hooks diff --git a/.claude/skills/prow-ci/SKILL.md b/.claude/skills/prow-ci/SKILL.md new file mode 100644 index 00000000..96668eb0 --- /dev/null +++ b/.claude/skills/prow-ci/SKILL.md @@ -0,0 +1,146 @@ +--- +name: prow-ci +description: Fetch and analyze OpenShift Prow CI job failures with automated artifact download and failure pattern detection +trigger: prow, prow-ci, /prow-ci, ci results, check ci, analyze ci failure +--- + +# Prow CI Analysis for deadmanssnitch-operator + +This skill fetches Prow CI job artifacts from Google Cloud Storage and provides automated failure analysis. + +## Prerequisites + +Before using this skill, verify gcloud CLI is installed: +```bash +which gcloud +``` + +If not installed, see: https://cloud.google.com/sdk/docs/install + +**Note**: The `test-platform-results` GCS bucket is publicly accessible — no authentication required. + +## Quick Start + +```bash +# Check PR status and get Prow job URLs +gh pr checks + +# Analyze a failed job +/prow-ci + +# Or ask naturally: +"Analyze the lint failure in PR " +"Check why the validate job failed" +``` + +## Implementation + +When invoked, this skill: + +1. 
**Fetches artifacts** using `fetch_prow_artifacts.py`: + - Downloads **prowjob.json** (job metadata) + - Downloads **build-log.txt** (complete build output with all errors) + - Saves to `.work/prow-artifacts//` + +2. **Analyzes failures** using `analyze_failure.py`: + - Parses build-log.txt for error patterns + - Detects common failure patterns (lint, build, timeout, OOM) + - Extracts error messages and stack traces + +3. **Generates report**: + - Markdown format with failure summary + - Pattern detection results + - Actionable failure details + +## Usage Instructions + +### Step 1: Get Prow Job URL + +```bash +# View PR checks to find failed jobs +gh pr checks + +# Or get detailed status +gh pr view --json statusCheckRollup --jq '.statusCheckRollup[] | select(.state == "FAILURE")' +``` + +Example Prow job URL: +```text +https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_deadmanssnitch-operator//pull-ci-openshift-deadmanssnitch-operator-master-lint/ +``` + +### Step 2: Fetch Artifacts + +```bash +# From repository root +python3 .claude/skills/prow-ci/fetch_prow_artifacts.py "" -o .work/prow-artifacts +``` + +### Step 3: Analyze Failures + +```bash +python3 .claude/skills/prow-ci/analyze_failure.py .work/prow-artifacts/ -f markdown +``` + +## Common Job Names + +**Prow CI Jobs** (configured in openshift/release): +- `pull-ci-openshift-deadmanssnitch-operator-master-coverage` - Code coverage +- `pull-ci-openshift-deadmanssnitch-operator-master-lint` - Linting checks +- `pull-ci-openshift-deadmanssnitch-operator-master-test` - Unit tests +- `pull-ci-openshift-deadmanssnitch-operator-master-validate` - Validation checks + +**Tekton Pipelines** (configured in `.tekton/`): +- `deadmanssnitch-operator-pull-request` - Main PR pipeline +- `deadmanssnitch-operator-e2e-pull-request` - E2E testing pipeline +- `deadmanssnitch-operator-pko-pull-request` - PKO pipeline + +## Reproducing Failures Locally + +```bash +# For unit tests (matches: 
pull-ci-...-test) +make go-test + +# For linting (matches: pull-ci-...-lint) +make go-check + +# For validation (matches: pull-ci-...-validate) +make validate + +# For coverage (matches: pull-ci-...-coverage) +make coverage + +# For container builds (Tekton pipelines) +make docker-build +``` + +## Prow Resources + +**Main Dashboard**: https://prow.ci.openshift.org/ +**CI Search**: https://github.com/openshift/ci-search +**Job History**: https://prow.ci.openshift.org/?repo=openshift%2Fdeadmanssnitch-operator + +## Troubleshooting + +### Can't find job results? +- Check both Prow AND Tekton; this repo uses both systems +- Prow jobs: `pull-ci-openshift-deadmanssnitch-operator-master-*` +- Tekton jobs: Show as pipeline names in PR checks +- Verify repo name format in Prow artifact paths: `openshift_deadmanssnitch-operator` (underscore between org and repo; the repo name keeps its dash) + +### Tekton pipeline failures? +- Check the pipeline link in PR checks (links to Konflux/AppStudio UI) +- Common issues: + - Image build failures → Check Dockerfile syntax + - Pipeline timeout → Check for slow steps or network issues +- Local validation: + ```bash + kubectl apply --dry-run=client -f .tekton/ + podman build -f build/Dockerfile -t test:local . + ``` + +## References + +- [Prow Dashboard](https://prow.ci.openshift.org/) +- [CI Search Tool](https://github.com/openshift/ci-search) +- [OpenShift CI Documentation](https://docs.ci.openshift.org/) diff --git a/.claude/skills/prow-ci/analyze_failure.py b/.claude/skills/prow-ci/analyze_failure.py new file mode 100644 index 00000000..1daddb55 --- /dev/null +++ b/.claude/skills/prow-ci/analyze_failure.py @@ -0,0 +1,196 @@ +#!/usr/bin/env python3 +""" +Analyze Prow CI job failures from downloaded artifacts. 
+""" + +import argparse +import json +import os +import re +import sys + + +def analyze_build_log(log_file): + """Analyze build-log.txt for common failure patterns.""" + if not os.path.exists(log_file): + return None + + analysis = { + 'errors': [], + 'failures': [], + 'warnings': [], + 'patterns': {} + } + + patterns = { + 'compilation_error': re.compile(r'(?:compilation failed|build failed|cannot find package)', re.IGNORECASE), + 'test_failure': re.compile(r'(?:FAIL:|Test failed:|tests failed)', re.IGNORECASE), + 'lint_error': re.compile(r'(?:golangci-lint|gofmt|go vet) .* failed', re.IGNORECASE), + 'timeout': re.compile(r'(?:timeout|timed out|deadline exceeded)', re.IGNORECASE), + 'oom': re.compile(r'(?:out of memory|OOMKilled|killed by signal)', re.IGNORECASE), + 'image_pull': re.compile(r'(?:Failed to pull image|ErrImagePull|ImagePullBackOff)', re.IGNORECASE), + 'permission_denied': re.compile(r'(?:permission denied|forbidden|unauthorized)', re.IGNORECASE), + } + + for pattern_name in patterns: + analysis['patterns'][pattern_name] = 0 + + with open(log_file, 'r', encoding='utf-8', errors='replace') as f: + for line in f: + line_stripped = line.strip() + + for pattern_name, pattern_regex in patterns.items(): + if pattern_regex.search(line): + analysis['patterns'][pattern_name] += 1 + + if len(analysis['errors']) < 10 and re.search(r'\bERROR\b', line, re.IGNORECASE): + analysis['errors'].append(line_stripped) + if len(analysis['failures']) < 10 and re.search(r'\bFAIL(ED)?\b', line, re.IGNORECASE): + analysis['failures'].append(line_stripped) + if len(analysis['warnings']) < 5 and re.search(r'\bWARNING\b', line, re.IGNORECASE): + analysis['warnings'].append(line_stripped) + + analysis['patterns'] = {k: v for k, v in analysis['patterns'].items() if v > 0} + + return analysis + + +def analyze_prowjob(prowjob_file): + """Extract key information from prowjob.json.""" + if not os.path.exists(prowjob_file): + return None + + try: + with open(prowjob_file, 'r') as f: + 
data = json.load(f) + except (json.JSONDecodeError, OSError) as e: + print(f"Error: Could not parse prowjob from {prowjob_file}: {e}", file=sys.stderr) + return None + + status = data.get('status', {}) + spec = data.get('spec', {}) + + return { + 'state': status.get('state', 'unknown'), + 'start_time': status.get('startTime'), + 'completion_time': status.get('completionTime'), + 'url': status.get('url', ''), + 'job_name': spec.get('job', 'unknown'), + 'type': spec.get('type', 'unknown'), + 'refs': spec.get('refs', {}), + } + + +def generate_analysis_report(artifacts_dir): + """Generate comprehensive failure analysis report.""" + report = { + 'prowjob': None, + 'build_log': None, + 'summary': '' + } + + prowjob_file = os.path.join(artifacts_dir, 'prowjob.json') + report['prowjob'] = analyze_prowjob(prowjob_file) + + build_log_file = os.path.join(artifacts_dir, 'build-log.txt') + report['build_log'] = analyze_build_log(build_log_file) + + summary_parts = [] + + if report['prowjob']: + pj = report['prowjob'] + summary_parts.append(f"Job: {pj['job_name']}") + summary_parts.append(f"State: {pj['state']}") + + if report['build_log'] and report['build_log']['patterns']: + summary_parts.append("\nDetected Patterns:") + for pattern, count in report['build_log']['patterns'].items(): + summary_parts.append(f" - {pattern}: {count} occurrences") + + if report['build_log'] and report['build_log']['errors']: + summary_parts.append(f"\nTop Errors ({len(report['build_log']['errors'])}):") + for err in report['build_log']['errors'][:3]: + summary_parts.append(f" - {err[:150]}") + + report['summary'] = '\n'.join(summary_parts) + + return report + + +def format_markdown_report(report): + """Format analysis as Markdown.""" + lines = ["# Prow CI Failure Analysis\n"] + + if report['prowjob']: + pj = report['prowjob'] + lines.append("## Job Information") + lines.append(f"- **Job**: {pj['job_name']}") + lines.append(f"- **State**: {pj['state']}") + lines.append(f"- **Type**: {pj['type']}") 
+ if pj.get('url'): + lines.append(f"- **URL**: {pj['url']}") + lines.append("") + + if report['build_log']: + bl = report['build_log'] + + if bl['patterns']: + lines.append("## Detected Patterns") + for pattern, count in sorted(bl['patterns'].items(), key=lambda x: x[1], reverse=True): + lines.append(f"- **{pattern}**: {count} occurrences") + lines.append("") + + if bl['errors']: + lines.append("## Errors") + for err in bl['errors']: + lines.append(f"- {err}") + lines.append("") + + if bl['failures']: + lines.append("## Failures") + for fail in bl['failures'][:5]: + lines.append(f"- {fail}") + lines.append("") + + return '\n'.join(lines) + + +def main(): + parser = argparse.ArgumentParser(description='Analyze Prow CI job failures') + parser.add_argument('artifacts_dir', help='Directory containing downloaded artifacts') + parser.add_argument('-f', '--format', choices=['text', 'json', 'markdown'], + default='markdown', help='Output format') + parser.add_argument('-o', '--output', help='Output file (default: stdout)') + + args = parser.parse_args() + + if not os.path.exists(args.artifacts_dir): + print(f"Error: Artifacts directory not found: {args.artifacts_dir}", file=sys.stderr) + return 1 + + report = generate_analysis_report(args.artifacts_dir) + + if report.get('build_log') is None: + print(f"Error: Missing required build-log.txt in {args.artifacts_dir}", file=sys.stderr) + print("The artifacts directory must contain build-log.txt for analysis.", file=sys.stderr) + return 1 + + if args.format == 'json': + output = json.dumps(report, indent=2) + elif args.format == 'markdown': + output = format_markdown_report(report) + else: + output = report['summary'] + + if args.output: + with open(args.output, 'w') as f: + f.write(output) + print(f"Analysis saved to: {args.output}") + else: + print(output) + + return 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/.claude/skills/prow-ci/fetch_prow_artifacts.py 
b/.claude/skills/prow-ci/fetch_prow_artifacts.py new file mode 100644 index 00000000..f054caad --- /dev/null +++ b/.claude/skills/prow-ci/fetch_prow_artifacts.py @@ -0,0 +1,138 @@ +#!/usr/bin/env python3 +""" +Fetch Prow CI job artifacts from Google Cloud Storage. +""" + +import argparse +import json +import os +import re +import subprocess +import sys + + +def parse_prow_url(url): + """ + Parse Prow job URL to extract GCS path components. + + Returns dict with: + - gcs_base_path: Full GCS path (gs://...) + - bucket_path: Path within bucket + - build_id: Numeric build ID + - job_name: Name of the Prow job + """ + if 'test-platform-results' not in url: + raise ValueError("URL must contain 'test-platform-results'") + + match = re.search(r'test-platform-results/(.+?)(?:\?|$)', url) + if not match: + raise ValueError("Could not parse test-platform-results path") + + bucket_path = match.group(1).rstrip('/') + + build_match = re.search(r'/(\d{10,})/?', bucket_path) + if not build_match: + raise ValueError("Could not find build ID (10+ digits) in URL") + + build_id = build_match.group(1) + + job_match = re.search(r'/([^/]+)/\d{10,}/?', bucket_path) + if not job_match: + raise ValueError("Could not extract job name from URL") + + job_name = job_match.group(1) + + gcs_base_path = f"gs://test-platform-results/{bucket_path}" + + return { + 'gcs_base_path': gcs_base_path, + 'bucket_path': bucket_path, + 'build_id': build_id, + 'job_name': job_name + } + + +def download_from_gcs(gcs_path, local_path): + """Download a file from GCS using gcloud storage cp.""" + try: + os.makedirs(os.path.dirname(local_path), exist_ok=True) + cmd = [ + 'gcloud', 'storage', 'cp', + gcs_path, + local_path, + '--no-user-output-enabled' + ] + subprocess.run(cmd, check=True, capture_output=True) + return True + except subprocess.CalledProcessError as e: + print(f"Warning: Could not download {gcs_path}: {e.stderr.decode()}", file=sys.stderr) + return False + + +def fetch_prowjob_json(gcs_base_path, 
output_dir): + """Fetch prowjob.json and return parsed JSON.""" + gcs_path = f"{gcs_base_path}/prowjob.json" + local_path = os.path.join(output_dir, 'prowjob.json') + + if download_from_gcs(gcs_path, local_path): + try: + with open(local_path, 'r') as f: + return json.load(f) + except json.JSONDecodeError as e: + print(f"Error: Could not parse JSON from {local_path}: {e}", file=sys.stderr) + return None + return None + + +def fetch_build_log(gcs_base_path, output_dir): + """Fetch build-log.txt.""" + gcs_path = f"{gcs_base_path}/build-log.txt" + local_path = os.path.join(output_dir, 'build-log.txt') + return download_from_gcs(gcs_path, local_path) + + +def main(): + parser = argparse.ArgumentParser(description='Fetch Prow CI job artifacts from GCS') + parser.add_argument('url', help='Prow job URL (gcsweb or direct GCS)') + parser.add_argument('-o', '--output', default='.work/prow-artifacts', + help='Output directory (default: .work/prow-artifacts)') + + args = parser.parse_args() + + try: + parsed = parse_prow_url(args.url) + except ValueError as e: + print(f"Error: {e}", file=sys.stderr) + return 1 + + print(f"Prow Job: {parsed['job_name']}") + print(f"Build ID: {parsed['build_id']}") + print(f"GCS Path: {parsed['gcs_base_path']}") + print() + + output_dir = os.path.join(args.output, parsed['build_id']) + os.makedirs(output_dir, exist_ok=True) + + had_errors = False + + print("Fetching prowjob.json...") + prowjob = fetch_prowjob_json(parsed['gcs_base_path'], output_dir) + if prowjob is not None: + print("✓ prowjob.json downloaded") + else: + print("⚠ Could not fetch prowjob.json (optional artifact)") + + print("Fetching build-log.txt...") + if fetch_build_log(parsed['gcs_base_path'], output_dir): + print("✓ build-log.txt downloaded") + else: + print("✗ Could not fetch build-log.txt") + had_errors = True + + print(f"\nArtifacts saved to: {output_dir}") + + return 1 if had_errors else 0 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/.gitignore 
b/.gitignore index d93c959d..5b2e08a7 100644 --- a/.gitignore +++ b/.gitignore @@ -84,3 +84,7 @@ tags ### Local testing of app-sre pipeline creates these .docker/ saas-deadmanssnitch-operator-bundle + +### Claude Code runtime artifacts +.claude/worktrees/ +.work/ diff --git a/.gitleaks.toml b/.gitleaks.toml new file mode 100644 index 00000000..227298e6 --- /dev/null +++ b/.gitleaks.toml @@ -0,0 +1,94 @@ +# Gitleaks Configuration for deadmanssnitch-operator +# https://github.com/gitleaks/gitleaks +# +# Purpose: Detect hardcoded secrets, credentials, and sensitive data +# Integration: Runs in pre-commit hook (prek) and Tekton CI +# +# Usage: +# gitleaks detect --source . --verbose +# prek run gitleaks +# + +title = "gitleaks config for deadmanssnitch-operator" + +# ============================================================================= +# GLOBAL ALLOWLIST +# ============================================================================= + +[allowlist] +description = "Global allowlist for deadmanssnitch-operator" + +# Test fixtures with fake credentials, generated code, and vendored code +paths = [ + '''testdata/.*\.go''', + '''pkg/.*/testdata/.*\.go''', + '''boilerplate/.*''', + '''vendor/.*''', + '''zz_generated\..*\.go''', +] + +# Allow specific test values that look like secrets but aren't +regexes = [ + '''(?i)fake[_-]?token''', + '''(?i)test[_-]?secret''', + '''(?i)example[_-]?key''', + '''(?i)dummy[_-]?password''', + '''(?i)placeholder''', + '''AKIAIOSFODNN7EXAMPLE''', # AWS example from docs +] + +commits = [] + +stopwords = [ + "example", + "test", + "fake", + "dummy", + "placeholder", + "sample", + "mock", +] + +# ============================================================================= +# CUSTOM RULES (deadmanssnitch-operator specific) +# ============================================================================= + +[[rules]] +id = "dms-api-key" +description = "Dead Man's Snitch API key" +regex = 
'''(?i)dms[_-]?api[_-]?key\s*[:=]\s*['"]?[a-zA-Z0-9]{20,}''' +tags = ["token", "dms", "critical"] + +[[rules]] +id = "openshift-pull-secret" +description = "OpenShift pull secret" +regex = '''(?i)pull[_-]?secret.*auth.*[a-zA-Z0-9+/]{30,}={0,2}''' +tags = ["secret", "openshift", "high"] + +[[rules]] +id = "kubeconfig-embedded" +description = "Embedded kubeconfig with credentials" +regex = '''client-certificate-data:\s*[a-zA-Z0-9+/]{30,}={0,2}''' +tags = ["kubeconfig", "certificate", "critical"] + +[[rules]] +id = "private-key-pem" +description = "PEM-encoded private key" +regex = '''-----BEGIN\s+(RSA\s+)?PRIVATE KEY-----''' +tags = ["private-key", "pem", "critical"] + +# ============================================================================= +# NOTES +# ============================================================================= + +# 1. Without an [extend] table setting useDefault = true, this config replaces the gitleaks default rules rather than extending them +# 2. False positives should be added to the allowlist with justification +# 3. Never disable gitleaks entirely (security critical) +# 4. Review the allowlist periodically for stale entries +# 5. All allowlist additions should be documented in the PR + +# To test this config: +# gitleaks detect --source . --config .gitleaks.toml --verbose +# +# To scan specific commit: +# gitleaks detect --source . --log-opts diff --git a/.prek-version b/.prek-version new file mode 100644 index 00000000..5aff472d --- /dev/null +++ b/.prek-version @@ -0,0 +1 @@ +v0.4.1 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..6573fe0f --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,152 @@ +# Contributing to deadmanssnitch-operator + +Thank you for your interest in contributing to this project. + +## Quick Start + +1. **Setup**: Install Go 1.25.4+ +2. **Install prek**: `uv tool install prek && prek install` +3. **Build**: `make go-build` +4. **Test**: `make go-test` +5. 
**Lint**: `make go-check` + +See [DEVELOPMENT.md](./DEVELOPMENT.md) for detailed setup instructions.
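The Quick Start names a minimum Go version, which is easy to get wrong on a machine with several toolchains. One way that prerequisite could be checked is sketched below. It assumes `go version` prints a token of the form `go1.25.4`; `MIN_GO`, `parse_go_version` and `meets_minimum` are hypothetical names, not part of this repository.

```python
import re

# Minimum version taken from the Quick Start step "Install Go 1.25.4+".
MIN_GO = (1, 25, 4)


def parse_go_version(output: str):
    """Extract (major, minor, patch) from `go version` output, or None."""
    match = re.search(r"\bgo(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "go1.25") is treated as patch 0.
    return (int(major), int(minor), int(patch or 0))


def meets_minimum(output: str, minimum=MIN_GO) -> bool:
    """Return True if the reported Go version is at least `minimum`."""
    version = parse_go_version(output)
    return version is not None and version >= minimum
```

In practice the input would be the captured output of running `go version`; tuple comparison handles the major, minor and patch ordering.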
+
+## Before Submitting a PR
+
+All contributions must pass:
+
+1. **Formatting & linting**: `prek run --all-files`
+2. **Unit tests**: `make go-test`
+3. **Build verification**: `make go-build`
+4. **Security scan**: Automatic via prek (gitleaks)
+
+## Development Workflow
+
+### Human Contributors
+
+```bash
+# Create a feature branch
+git checkout -b feature/my-change
+
+# Make changes, following existing code patterns
+# Add/update tests for your changes
+
+# Run validation locally
+prek run --all-files
+make go-test
+
+# Commit with descriptive message
+git commit -m "feat: add support for X"
+
+# Push and create PR
+git push origin feature/my-change
+```
+
+### AI-Assisted Development
+
+When using AI coding agents (Claude Code, GitHub Copilot, Cursor, etc.):
+
+**Agents MUST:**
+- Run `prek run` on changed files before committing
+- Execute relevant tests after code changes: `make go-test`
+- Preserve existing code style and patterns
+- Avoid editing generated files (`**/zz_generated.*.go`, `pkg/dmsclient/mock/mock_dmsclient.go`)
+- Never bypass hooks with `--no-verify`
+- Never commit secrets, tokens, or credentials
+- Reuse existing utilities and abstractions
+- Make incremental, focused changes
+
+**Validation expectations:**
+1. Format check: `go fmt ./...`
+2. Lint: `make go-check` (or `prek run golangci-lint`)
+3. Type safety: Verified by `go build ./...` in pre-commit
+4. Tests: `make go-test` for affected packages
+5.
Secret scan: Automatic via prek gitleaks hook
+
+**Required checks before PR:**
+- [ ] All prek hooks pass
+- [ ] Unit tests pass for modified packages
+- [ ] No new linter warnings introduced
+- [ ] No secrets or credentials in diff
+- [ ] Mocks regenerated if interfaces changed: `boilerplate/_lib/container-make generate`
+
+## Code Style
+
+Follow existing patterns:
+- Standard Go formatting (`gofmt`)
+- golangci-lint rules in `boilerplate/openshift/golang-osd-operator/golangci.yml`
+- testify/assert for test assertions
+- GoMock (`go.uber.org/mock`) for interface mocking
+
+## Testing Requirements
+
+- **Unit tests required** for all new functionality
+- Use standard Go testing with testify/assert for assertions
+- Mock the DMS API client with GoMock (mock at `pkg/dmsclient/mock/`)
+- Aim for meaningful test coverage, not just metrics
+
+See [TESTING.md](./TESTING.md) for testing guidelines.
+
+## Regenerating Code
+
+After modifying API types or interfaces:
+
+```bash
+# Regenerate deepcopy, OpenAPI, mocks (in container for consistency)
+boilerplate/_lib/container-make generate
+```
+
+## Security
+
+**Never commit:**
+- API keys, tokens, passwords
+- AWS credentials, kubeconfig files
+- Private keys, certificates
+- `.env` files with secrets
+- Debug statements printing sensitive data
+
+The prek gitleaks hook will block commits containing secrets.
+
+**High-risk changes** (requiring extra review):
+- Authentication/authorization logic
+- RBAC manifests with wildcard permissions
+- Network policies
+- CI/CD pipeline modifications
+- Dockerfile changes
+
+## Commit Message Format
+
+Use conventional commits style:
+
+```text
+<type>: <description>
+
+
+
+```
+
+Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`, `ci`
+
+Examples:
+- `feat: add support for fleet notification filtering`
+- `fix: correct RBAC permissions for service monitor`
+- `test: add unit tests for snitch reconciliation`
+
+## Pull Request Process
+
+1. **Title**: Clear, descriptive summary
+2.
**Description**: Explain what changed and why
+3. **Testing**: Describe how you tested the changes
+4. **CI**: All Tekton pipeline checks must pass
+5. **Review**: Address review feedback promptly
+
+## Questions?
+
+- Review similar PRs for patterns
+- Ask in PR comments for clarification
+
+## License
+
+All contributions are licensed under Apache 2.0. See [LICENSE](./LICENSE).
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
new file mode 100644
index 00000000..7d3016c9
--- /dev/null
+++ b/DEVELOPMENT.md
@@ -0,0 +1,116 @@
+# Development
+
+- [Development](#development)
+  - [Development Environment Setup](#development-environment-setup)
+    - [golang](#golang)
+    - [prek (pre-commit hooks)](#prek-pre-commit-hooks)
+  - [Makefile](#makefile)
+  - [Code Generation](#code-generation)
+  - [Build using boilerplate container](#build-using-boilerplate-container)
+  - [Mocks](#mocks)
+
+This document covers everything you need to develop this operator locally.
+
+## Development Environment Setup
+
+### golang
+
+Go 1.25.4 or newer is required (see `go.mod`).
+
+```bash
+$ go version
+go version go1.25.4 linux/amd64
+```
+
+**Note**: `FIPS_ENABLED=true` is set by the Makefile, which requires `GOEXPERIMENT=boringcrypto` and may fail outside the CI container. For local Go builds, use `go build .` directly or `make container-test` to build inside the boilerplate container.
+
+### prek (pre-commit hooks)
+
+This project uses [prek](https://prek.j178.dev/) for git hook management.
+
+```bash
+# Install prek
+uv tool install prek # recommended
+# or: pipx install prek
+
+# Wire up git hooks
+prek install
+```
+
+## Makefile
+
+Common make targets:
+
+```bash
+# Build, lint, and test (default)
+make
+
+# Individual targets
+make go-check   # golangci-lint + other static analysis
+make go-test    # Unit tests (requires envtest)
+make go-build   # Build binary (FIPS-enabled, may fail outside container)
+make lint       # YAML validation + go-check
+make validate   # Ensure generated code is committed
+make generate   # CRDs, deepcopy, openapi-gen, mocks
+make coverage   # Code coverage report
+make run        # Run operator locally (requires kubeconfig with Hive CRDs)
+
+# Container-based targets (run inside boilerplate container, matches CI)
+make container-test
+make container-lint
+make container-validate
+make container-all
+```
+
+## Code Generation
+
+After modifying CRD types (`api/v1alpha1/`) or the DMS client interface (`pkg/dmsclient/dmsclient.go`):
+
+```bash
+# Regenerate CRDs, deepcopy, openapi, mocks (use container for CI parity)
+boilerplate/_lib/container-make generate
+
+# Verify generated files are committed (what CI runs)
+make generate-check
+```
+
+Generated files that must be committed:
+- `api/v1alpha1/zz_generated.deepcopy.go`
+- `api/v1alpha1/zz_generated.openapi.go`
+- `pkg/dmsclient/mock/mock_dmsclient.go`
+- `deploy/crds/*.yaml`
+
+## Build using boilerplate container
+
+To run lint, test and build in the boilerplate container (matches CI environment):
+
+```bash
+boilerplate/_lib/container-make TARGET
+```
+
+Examples:
+
+```bash
+# Run unit tests
+boilerplate/_lib/container-make go-test
+
+# Run lint
+boilerplate/_lib/container-make go-check
+
+# Run coverage
+boilerplate/_lib/container-make coverage
+
+# Run all validation
+boilerplate/_lib/container-make container-all
+```
+
+## Mocks
+
+The DMS API client mock lives at `pkg/dmsclient/mock/mock_dmsclient.go`, generated from the
+`Client` interface in `pkg/dmsclient/dmsclient.go` using
`go.uber.org/mock/mockgen`.
+
+**Do not edit the mock directly.** Regenerate it with:
+
+```bash
+boilerplate/_lib/container-make generate
+```
diff --git a/TESTING.md b/TESTING.md
new file mode 100644
index 00000000..98dc107a
--- /dev/null
+++ b/TESTING.md
@@ -0,0 +1,269 @@
+# Testing Guide
+
+Testing guidelines for the deadmanssnitch-operator.
+
+## Framework
+
+- **testify/assert**: Assertions and test helpers (`github.com/stretchr/testify`)
+- **GoMock**: Interface mocking (`go.uber.org/mock/gomock`)
+- **controller-runtime fake client**: Kubernetes API simulation for controller tests
+- **envtest**: Kubernetes API server for integration-style tests
+
+## Quick Commands
+
+```bash
+# Run all tests
+make go-test
+
+# Run specific package
+go test -v ./controllers/deadmanssnitchintegration/
+
+# Run a single test by name
+go test -v -run TestReconcileClusterDeployment ./controllers/deadmanssnitchintegration/
+
+# Run all packages
+go test ./...
+
+# Container-based (CI parity)
+boilerplate/_lib/container-make go-test
+```
+
+## Writing Tests
+
+### Test Structure
+
+Tests use standard Go `testing.T` with testify assertions:
+
+```go
+package deadmanssnitchintegration_test
+
+import (
+    "testing"
+
+    "github.com/stretchr/testify/assert"
+    "github.com/stretchr/testify/require"
+)
+
+func TestMyFeature(t *testing.T) {
+    result, err := MyFunction()
+    require.NoError(t, err)
+    assert.Equal(t, expected, result)
+}
+```
+
+### Mocking Interfaces
+
+The DMS API client is mocked with GoMock. The mock is pre-generated at
+`pkg/dmsclient/mock/mock_dmsclient.go`.
+
+```go
+import (
+    "testing"
+    "go.uber.org/mock/gomock"
+    "github.com/openshift/deadmanssnitch-operator/pkg/dmsclient/mock"
+)
+
+func TestReconcileCreate(t *testing.T) {
+    ctrl := gomock.NewController(t)
+    defer ctrl.Finish()
+
+    mockClient := mock.NewMockClient(ctrl)
+    mockClient.EXPECT().
+        CreateSnitch(gomock.Any()).
+        Return(&dmsclient.Snitch{CheckInURL: "https://nosnch.in/abc"}, nil)
+
+    // inject mock into reconciler and test...
+}
+```
+
+The `setupDefaultMocks()` helper in the controller test creates a standard set of
+test objects (DMSI CR, ClusterDeployment, Secrets) for use across tests.
+
+**Regenerate all mocks:**
+```bash
+boilerplate/_lib/container-make generate
+```
+
+## Test Organization
+
+### Unit Tests
+- Test individual functions and methods
+- Mock external dependencies (DMS API client)
+- Fast execution (<1s per package)
+- Located alongside source code
+
+### Controller Tests
+- Test reconciliation logic end-to-end
+- Use controller-runtime's fake client
+- Test custom resource lifecycle (create, update, delete, finalizers)
+- Located in `controllers/deadmanssnitchintegration/`
+
+### PKO Template Tests
+- Located in `pkg/pko/template_test.go`
+- **Snapshot tests**: golden files in `deploy_pko/.test-fixtures/`, validated by `kubectl-package validate`
+- **Structural tests**: Go assertions on rendered template output (kind, annotations, conditional fields)
+
+## Agent-Driven Validation
+
+When AI agents modify code:
+
+**Minimal validation:**
+```bash
+# After changing controllers/deadmanssnitchintegration/
+go test ./controllers/deadmanssnitchintegration/
+```
+
+**Full validation before commit:**
+```bash
+make go-test
+```
+
+**If tests fail:**
+1. Read test output carefully
+2. Fix the underlying issue (don't skip tests)
+3. Rerun to confirm fix
+4.
Regenerate mocks if interface changed: `boilerplate/_lib/container-make generate`
+
+## Common Patterns
+
+### Testing Controllers
+
+```go
+func TestReconcileClusterDeployment(t *testing.T) {
+    ctrl := gomock.NewController(t)
+    defer ctrl.Finish()
+
+    // Create mock client and set expectations
+    mockClient := mock.NewMockClient(ctrl)
+    mockClient.EXPECT().FindSnitch(gomock.Any()).Return(nil, nil)
+    mockClient.EXPECT().CreateSnitch(gomock.Any()).Return(&dmsclient.Snitch{}, nil)
+
+    // Create fake k8s client with test objects
+    scheme := setupScheme()
+    fakeClient := fake.NewClientBuilder().WithScheme(scheme).WithObjects(testObjects...).Build()
+
+    // Run reconciler
+    r := &DeadmansSnitchIntegrationReconciler{
+        Client:    fakeClient,
+        DmsClient: mockClient,
+    }
+    result, err := r.Reconcile(context.TODO(), req)
+    assert.NoError(t, err)
+    assert.False(t, result.Requeue)
+}
+```
+
+### Testing Error Conditions
+
+```go
+func TestReconcileCreateError(t *testing.T) {
+    ctrl := gomock.NewController(t)
+    defer ctrl.Finish()
+
+    mockClient := mock.NewMockClient(ctrl)
+    mockClient.EXPECT().CreateSnitch(gomock.Any()).Return(nil, fmt.Errorf("API error"))
+
+    // ...verify error is returned and handled correctly
+}
+```
+
+### Using testify Matchers
+
+```go
+// Equality
+assert.Equal(t, expected, actual)
+
+// Nil checks
+require.NoError(t, err)
+assert.Nil(t, obj)
+
+// Collections
+assert.Contains(t, slice, "item")
+assert.Len(t, slice, 3)
+assert.Empty(t, slice)
+
+// Booleans
+assert.True(t, condition)
+assert.False(t, condition)
+```
+
+## Coverage
+
+Generate coverage report:
+```bash
+go test -coverprofile=coverage.out ./...
+go tool cover -html=coverage.out -o coverage.html
+```
+
+**Note**: Aim for meaningful coverage, not arbitrary percentages.
+- Test critical paths and error handling
+- Don't test generated code or trivial getters/setters
+
+## PKO Template Tests
+
+```bash
+# Validate against existing fixtures
+kubectl-package validate deploy_pko/
+
+# Regenerate fixtures after template changes
+rm -rf deploy_pko/.test-fixtures/
+kubectl-package validate deploy_pko/
+
+# Run Go-level template assertions
+go test ./pkg/pko/...
+```
+
+## Debugging Tests
+
+```bash
+# Verbose output
+go test -v ./controllers/deadmanssnitchintegration/
+
+# Run single test
+go test -v -run TestReconcileCreate ./controllers/deadmanssnitchintegration/
+
+# Race detector
+go test -race ./...
+```
+
+## CI Expectations
+
+Tests run in Tekton pipeline with:
+- Fresh environment
+- No cached dependencies
+- Strict timeout limits
+
+**Local CI parity:**
+```bash
+boilerplate/_lib/container-make go-test
+```
+
+## Common Issues
+
+**Mock not found or outdated:**
+```bash
+# Regenerate mocks
+boilerplate/_lib/container-make generate
+```
+
+**envtest not installed:**
+```bash
+make setup-envtest
+```
+
+**Test passes locally, fails in CI:**
+```bash
+# Run in container environment
+boilerplate/_lib/container-make go-test
+
+# Check for:
+# - Time-dependent tests
+# - Environment-specific assumptions
+# - File path dependencies
+```
+
+## Further Reading
+
+- [testify Documentation](https://github.com/stretchr/testify)
+- [GoMock Guide](https://pkg.go.dev/go.uber.org/mock/gomock)
+- [controller-runtime Testing](https://book.kubebuilder.io/reference/testing.html)
diff --git a/hack/ci.sh b/hack/ci.sh
new file mode 100755
index 00000000..9b556ce9
--- /dev/null
+++ b/hack/ci.sh
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if ! command -v prek &>/dev/null; then
+  echo "Error: prek is not installed.
Install with: uv tool install prek" >&2
+  exit 1
+fi
+
+prek run --config hack/prek.ci.toml --all-files
diff --git a/hack/prek.ci.toml b/hack/prek.ci.toml
new file mode 100644
index 00000000..8316d57a
--- /dev/null
+++ b/hack/prek.ci.toml
@@ -0,0 +1,63 @@
+# Prek Configuration for CI
+# Excludes hooks requiring internal network access or that may not be available in CI
+# https://prek.j178.dev/
+
+# File hygiene and syntax validation
+[[repos]]
+repo = "builtin"
+hooks = [
+  { id = "trailing-whitespace", args = ["--markdown-linebreak-ext=md"], exclude = "^(boilerplate/|\\.pre-commit-config\\.yaml)" },
+  { id = "end-of-file-fixer", exclude = "^(boilerplate/|\\.pre-commit-config\\.yaml)" },
+  { id = "check-added-large-files", args = ["--maxkb=1024"] },
+  { id = "check-case-conflict" },
+  { id = "check-merge-conflict" },
+  { id = "check-json" },
+  { id = "check-yaml", args = ["--allow-multiple-documents"] },
+  { id = "check-toml" },
+]
+
+# golangci-lint static analysis
+[[repos]]
+repo = "https://github.com/golangci/golangci-lint"
+rev = "v2.0.2"
+hooks = [
+  { id = "golangci-lint", args = [
+    "--config=boilerplate/openshift/golang-osd-operator/golangci.yml",
+    "--timeout=120s"
+  ] },
+]
+
+# Local custom hooks
+[[repos]]
+repo = "local"
+hooks = [
+  # Go build check
+  {
+    id = "go-build",
+    name = "go build",
+    language = "system",
+    entry = "bash -c 'T=$(command -v timeout || command -v gtimeout || echo); ${T:+$T 30s} go build ./...'",
+    types = ["go"],
+    pass_filenames = false
+  },
+
+  # Go mod tidy check
+  {
+    id = "go-mod-tidy",
+    name = "go mod tidy",
+    language = "system",
+    entry = "bash -c 'T=$(command -v timeout || command -v gtimeout || echo); ${T:+$T 60s} go mod tidy && git diff --exit-code go.mod go.sum'",
+    files = '(\.go$|go\.(mod|sum)$)',
+    pass_filenames = false
+  },
+
+  # RBAC wildcard check
+  {
+    id = "rbac-wildcard-check",
+    name = "RBAC wildcard permissions",
+    language = "system",
+    entry = "bash -c 'make rbac-wildcard-check'",
+    files =
'^deploy/.*\.ya?ml$',
+    pass_filenames = false
+  },
+]
diff --git a/prek.toml b/prek.toml
new file mode 100644
index 00000000..98f8bbbe
--- /dev/null
+++ b/prek.toml
@@ -0,0 +1,78 @@
+# Prek Configuration for deadmanssnitch-operator
+# https://prek.j178.dev/
+
+# File hygiene and syntax validation
+[[repos]]
+repo = "builtin"
+hooks = [
+  { id = "trailing-whitespace", args = ["--markdown-linebreak-ext=md"], exclude = "^(boilerplate/|\\.pre-commit-config\\.yaml)" },
+  { id = "end-of-file-fixer", exclude = "^(boilerplate/|\\.pre-commit-config\\.yaml)" },
+  { id = "check-added-large-files", args = ["--maxkb=1024"] },
+  { id = "check-case-conflict" },
+  { id = "check-merge-conflict" },
+  { id = "check-json" },
+  { id = "check-yaml", args = ["--allow-multiple-documents"] },
+  { id = "check-toml" },
+]
+
+# Red Hat InfoSec security scanning
+[[repos]]
+repo = "https://gitlab.cee.redhat.com/infosec-public/developer-workbench/tools.git"
+rev = "rh-pre-commit-2.3.0"
+hooks = [
+  { id = "rh-pre-commit", stages = ["pre-commit"] },
+]
+
+# Gitleaks secret scanning
+[[repos]]
+repo = "https://github.com/gitleaks/gitleaks"
+rev = "v8.18.0"
+hooks = [
+  { id = "gitleaks", args = ["--config=.gitleaks.toml"] },
+]
+
+# golangci-lint static analysis
+[[repos]]
+repo = "https://github.com/golangci/golangci-lint"
+rev = "v2.0.2"
+hooks = [
+  { id = "golangci-lint", args = [
+    "--config=boilerplate/openshift/golang-osd-operator/golangci.yml",
+    "--timeout=120s"
+  ] },
+]
+
+# Local custom hooks
+[[repos]]
+repo = "local"
+hooks = [
+  # Go build check
+  {
+    id = "go-build",
+    name = "go build",
+    language = "system",
+    entry = "bash -c 'T=$(command -v timeout || command -v gtimeout || echo); ${T:+$T 30s} go build ./...'",
+    types = ["go"],
+    pass_filenames = false
+  },
+
+  # Go mod tidy check
+  {
+    id = "go-mod-tidy",
+    name = "go mod tidy",
+    language = "system",
+    entry = "bash -c 'T=$(command -v timeout || command -v gtimeout || echo); ${T:+$T 60s} go mod tidy && git diff --exit-code
go.mod go.sum'",
+    files = '(\.go$|go\.(mod|sum)$)',
+    pass_filenames = false
+  },
+
+  # RBAC wildcard check
+  {
+    id = "rbac-wildcard-check",
+    name = "RBAC wildcard permissions",
+    language = "system",
+    entry = "bash -c 'make rbac-wildcard-check'",
+    files = '^deploy/.*\.ya?ml$',
+    pass_filenames = false
+  },
+]
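
The `go-build` and `go-mod-tidy` hook entries above wrap their commands in an optional timeout: `T` holds the path to `timeout` (or `gtimeout`, as installed by coreutils on macOS), and `${T:+$T 30s}` expands to a timeout prefix only when one was found. Below is a minimal sketch of how that idiom behaves, assuming bash. The function name `run_with_optional_timeout` is hypothetical, used here for illustration only, and is not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch of the optional-timeout idiom used by the go-build and go-mod-tidy hooks.
# run_with_optional_timeout is a hypothetical helper, not part of this repository.
run_with_optional_timeout() {
  local limit="$1"
  shift
  local T
  # Prefer `timeout`, fall back to `gtimeout`, otherwise leave T empty.
  T=$(command -v timeout || command -v gtimeout || echo)
  # ${T:+...} expands to "<timeout binary> <limit>" only when T is non-empty,
  # so the command still runs (without a time limit) when neither tool exists.
  # The expansion is deliberately unquoted so it splits into two words.
  ${T:+$T $limit} "$@"
}

run_with_optional_timeout 5s echo "build ok"
```

Because the expansion relies on word splitting, the idiom assumes the resolved binary path contains no spaces. The same assumption applies to the hook entries themselves.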