diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..07e3cb7999
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,65 @@
+# AI Agent Instructions for library-go
+
+## What This Repo Is
+
+library-go is a shared Go helper library consumed by dozens of OpenShift operators and components. It provides reusable building blocks: operator controller framework, certificate management, configuration observers, encryption/KMS integration, manifest-based clients, and more.
+
+**This is a library, not an application.** There is no main binary. Changes here affect every OpenShift component that vendors this repo.
+
+## Critical Rules
+
+1. **Never add imports from `k8s.io/kubernetes` or `openshift/origin`.** This is an absolute constraint — the library must remain independent of those repos.
+2. **Always run `go mod tidy && go mod vendor` after any dependency change.** The vendor directory is checked in and CI validates it.
+3. **Run `make verify` and `make test-unit` before considering any change complete.**
+4. **Do not introduce breaking API changes** to existing public functions or types without explicit direction. Dozens of repos depend on these interfaces.
+
+## Repository Structure
+
+```text
+pkg/
+├── operator/                   # Controller framework (33 subpackages) — the core of the library
+│   ├── certrotation/           # Certificate lifecycle and rotation controllers
+│   ├── configobserver/         # Watches config inputs, produces RawExtension outputs
+│   ├── encryption/             # KMS/encryption state machine (11 subdirs) — very complex
+│   ├── staticpod/              # Static pod management with atomic directory swaps
+│   ├── resourcesynccontroller/ # Cross-namespace secret/configmap sync
+│   └── ...
+├── crypto/                     # Low-level TLS, certificates, key generation, cipher suites
+├── pki/                        # High-level API-driven PKI profiles (newer)
+├── config/                     # Configuration management, serving, leader election
+├── manifestclient/             # Manifest-based client operations (offline-capable)
+├── controller/                 # Controller utilities, factory, file observer
+├── assets/                     # Asset creation and templating
+└── ...                         # 35+ top-level packages total
+test/
+├── e2e-encryption/             # Encryption end-to-end tests
+├── e2e-monitoring/             # Monitoring end-to-end tests
+└── library/                    # Shared test helpers
+```
+
+## Key Patterns to Follow
+
+- **ConfigObserver pattern**: Controllers watch multiple config inputs and produce a single `RawExtension` output. Changes are detected by comparing the merged observer output against the existing observed config. Follow this pattern for any new observer.
+- **Preconditions over defaults**: `StaticResourceController` uses preconditions (feature gates, platform checks) to decide behavior. Do not try to handle all combinations with defaults.
+- **Controller naming**: Controllers follow the pattern `NewXxxController(...)` returning a `factory.Controller`. Follow existing naming and constructor conventions.
+
+## High-Risk Areas — Proceed with Caution
+
+- **`pkg/operator/encryption/`** — State machine with complex preconditions, crypto providers, KMS plugin integration, and etcd encryption config management. Do not modify without deep understanding.
+- **`pkg/crypto/`** and **`pkg/operator/certrotation/`** — Certificate rotation triggers at 80% of validity (4/5 of cert lifetime) and handles multi-CA chain support. Subtle bugs here cause cluster outages. Note: CSR handling is in `pkg/operator/csr/`, not in certrotation.
+- **`pkg/operator/staticpod/`** — Atomic directory swaps use `renameat2(RENAME_EXCHANGE)` on Linux; non-Linux platforms are not supported and return an error. Platform-specific code needs careful testing.
+
+## Build and Test
+
+```bash
+make build       # Compile all packages
+make test-unit   # Run unit tests
+make verify      # Linters, gofmt, vet
+```
+
+## What NOT to Do
+
+- Do not add new top-level packages without a strong reason — the bar for inclusion is high.
+- Do not modify OWNERS or OWNERS_ALIASES files.
+- Do not use Kubernetes code generators (deepcopy-gen, client-gen, informer-gen). Some CRD manifests are synced from `openshift/api` via the Makefile, but the Go code is hand-written.
+- Do not add test dependencies on real cloud providers or external services in unit tests. Use fakes and mocks from `k8s.io/client-go/kubernetes/fake`.
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000000..71081e9a9b
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,98 @@
+# Architecture: library-go
+
+## Overview
+
+library-go is the shared foundation library for OpenShift. It provides reusable components that are vendored by dozens of OpenShift operators and control plane components. The library is organized around several major subsystems, each addressing a core operational concern.
+
+**Design philosophy:** Code must have concrete use cases in at least two separate OpenShift repositories. The library must not depend on `k8s.io/kubernetes` or `openshift/origin`, keeping it lightweight and broadly reusable.
+
+## Major Subsystems
+
+### Operator Controller Framework (`pkg/operator/`)
+
+The largest subsystem (~33 subpackages). Provides the building blocks that most OpenShift operators are built on.
+
+Key components:
+
+- **ConfigObserver** (`configobserver/`) — Watches multiple configuration inputs (configmaps, secrets, API resources) and synthesizes them into a single `RawExtension` output. Change detection works by comparing the merged observer output against the existing observed config. This is the standard pattern for operators that need to react to configuration changes from multiple sources.
+
+- **ResourceSyncController** (`resourcesynccontroller/`) — Synchronizes secrets and configmaps across namespaces. Supports partial sync (specific keys only). Used by operators that need configuration or credentials from one namespace available in another.
+
+- **StaticPodController** (`staticpod/`) — Manages static pod lifecycle with revision tracking and atomic directory swaps. Uses `renameat2(RENAME_EXCHANGE)` on Linux for atomic operations; non-Linux platforms are not supported and return an error. Includes installers, pruners, and readiness checks.
+
+- **RevisionController** (`revisioncontroller/`) — Tracks configuration revisions to enable rollback and audit trails for static pod operators.
+
+- **StatusController** (`status/`) — Aggregates status conditions from multiple sources into a unified `ClusterOperator` status. Handles degraded, progressing, and available conditions.
+
+- **ManagementStateController** (`managementstatecontroller/`) — Handles the `Managed`/`Unmanaged`/`Removed` lifecycle states for operators.
+
+### Encryption and KMS (`pkg/operator/encryption/`)
+
+Complex subsystem (11 subdirectories) that manages encryption of Kubernetes resources at rest in etcd.
+
+Architecture:
+
+- **State machine** (`statemachine/`) — Drives encryption through states: unencrypted → key exists → migration in progress → encrypted. Preconditions gate transitions.
+- **Controllers** (`controllers/`) — Coordinate key creation, migration, and pruning.
+- **Crypto providers** (`crypto/`) — Pluggable encryption providers (AES-CBC, AES-GCM, KMS v1/v2, secretbox).
+- **Deployer** (`deployer/`) — Applies encryption configuration to API server pods.
+
+The KMS integration supports both KMS v1 and v2 protocols, with preflight checks to validate provider connectivity before enabling encryption.
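The state progression described above (unencrypted → key exists → migration in progress → encrypted, with preconditions gating each transition) can be pictured with a toy model. The following is only a sketch under stated assumptions: `State`, `Precondition`, and `Advance` are hypothetical names invented here, and the actual `statemachine/` package is considerably more involved.

```go
package main

import "fmt"

// State is a hypothetical enumeration of the four stages named above.
type State int

const (
	Unencrypted State = iota
	KeyExists
	MigrationInProgress
	Encrypted
)

var stateNames = [...]string{"unencrypted", "key exists", "migration in progress", "encrypted"}

func (s State) String() string {
	if s < Unencrypted || s > Encrypted {
		return "unknown"
	}
	return stateNames[s]
}

// Precondition reports whether the next transition is currently allowed.
// Hypothetical type for illustration; not library-go's API.
type Precondition func() (bool, error)

// Advance moves exactly one state forward when the precondition holds.
// If the precondition is not met, or it fails, the state is unchanged.
func Advance(s State, ready Precondition) (State, error) {
	if s >= Encrypted {
		return s, nil // terminal state, nothing left to do
	}
	ok, err := ready()
	if err != nil {
		return s, fmt.Errorf("precondition check failed in state %q: %w", s, err)
	}
	if !ok {
		return s, nil
	}
	return s + 1, nil
}

func main() {
	alwaysReady := func() (bool, error) { return true, nil }
	s := Unencrypted
	for s != Encrypted {
		next, err := Advance(s, alwaysReady)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s -> %s\n", s, next)
		s = next
	}
}
```

Keeping the precondition as an injected function means a failed or unmet check leaves the state untouched, which matches the "preconditions gate transitions" description without assuming how the real checks are implemented.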
+
+### Certificate Management
+
+Dual-layer design:
+
+- **`pkg/crypto/`** — Low-level primitives: TLS certificate generation, key pair creation (RSA 2048+, ECDSA P256/P384/P521), cipher suite management, certificate filtering and rotation logic. Enforces TLS adherence policies.
+
+- **`pkg/operator/certrotation/`** — High-level controller that manages certificate lifecycle. Monitors expiry and triggers rotation at 80% of validity (4/5 of the cert lifetime). Maintains certificate chains across rotation events. Note: CSR handling is in `pkg/operator/csr/`, not here.
+
+- **`pkg/pki/`** — Newer API-driven PKI profile system. Provides a `PKIProfile` abstraction that allows cluster-wide key algorithm policies to be applied consistently across all certificate operations.
+
+### Manifest-Based Client (`pkg/manifestclient/`)
+
+An alternative to traditional generated Kubernetes clients. Operates on embedded manifests with a discovery reader, enabling offline operation and simpler dependency chains. Used in contexts where full API server connectivity is not available or desirable.
+
+### Configuration (`pkg/config/`)
+
+Provides configuration management utilities: serving info setup, cluster operator status handling, leader election configuration, and configuration validation. Used by operators during initialization.
+
+## Dependency Architecture
+
+```text
+library-go
+├── depends on
+│   ├── k8s.io/api, apimachinery, client-go, apiserver
+│   ├── github.com/openshift/api (OpenShift type definitions)
+│   ├── github.com/openshift/client-go (generated OpenShift clients)
+│   └── go.etcd.io/etcd/client/v3 (encryption/KMS operations)
+│
+├── consumed by
+│   ├── cluster-image-registry-operator
+│   ├── cluster-authentication-operator
+│   ├── cluster-kube-apiserver-operator
+│   ├── cluster-openshift-apiserver-operator
+│   └── ... (dozens of OpenShift operators)
+│
+└── must NOT depend on
+    ├── k8s.io/kubernetes
+    └── openshift/origin
+```
+
+## Design Decisions
+
+| Decision | Rationale |
+|----------|-----------|
+| No `k8s.io/kubernetes` dependency | Keeps the library vendorable without pulling in the entire Kubernetes monorepo. Maintains a manageable dependency tree. |
+| ConfigObserver produces RawExtension | Decouples configuration observation from consumption. Observers can be composed independently without knowing each other's schemas. |
+| Atomic static pod swaps | Prevents partial state during pod updates. A failed swap leaves the previous revision intact rather than producing a broken intermediate state. |
+| Dual crypto/PKI layers | `pkg/crypto` handles raw operations while `pkg/pki` adds policy. Separating these allows operators to use low-level crypto without buying into the full PKI profile system. |
+| Vendor directory checked in | Ensures reproducible builds across CI and developer machines without network access to module proxies. Standard practice for OpenShift repos. |
+| No Kubernetes code generators | Does not use deepcopy-gen, client-gen, or informer-gen. Some CRD manifests are synced from openshift/api via the Makefile. Keeps the library explicit and auditable. |
+
+## Testing Architecture
+
+- **Unit tests** — Colocated with source (`*_test.go`). Use Kubernetes fake clientsets for API simulation.
+- **E2E encryption tests** (`test/e2e-encryption/`) — Validate full encryption lifecycle including KMS provider interaction. Require etcd and KMS sidecar infrastructure.
+- **E2E monitoring tests** (`test/e2e-monitoring/`) — Validate metrics collection and alerting integration.
+- **Shared test helpers** (`test/library/`) — Reusable test utilities for encryption, metrics, and API server testing.
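The ConfigObserver design decision listed above (independent observers whose outputs are merged, with a change detected by comparing the merged result against the existing observed config) can be sketched in a few lines. This is an illustrative model only: it uses plain `map[string]any` values in place of `RawExtension`, and `Observer`, `mergeInto`, `observe`, and the config keys are hypothetical names, not library-go's API.

```go
package main

import (
	"fmt"
	"reflect"
)

// Observer is a hypothetical stand-in for one config observer. Each call is
// assumed to return a freshly built map, since mergeInto may alias it.
type Observer func() map[string]any

// mergeInto recursively merges src into dst; on conflict the later value wins.
func mergeInto(dst, src map[string]any) {
	for k, v := range src {
		srcMap, srcIsMap := v.(map[string]any)
		dstMap, dstIsMap := dst[k].(map[string]any)
		if srcIsMap && dstIsMap {
			mergeInto(dstMap, srcMap)
			continue
		}
		dst[k] = v
	}
}

// observe merges every observer's output and reports whether the merged
// result differs from the existing observed config.
func observe(existing map[string]any, observers ...Observer) (map[string]any, bool) {
	merged := map[string]any{}
	for _, o := range observers {
		mergeInto(merged, o())
	}
	return merged, !reflect.DeepEqual(existing, merged)
}

func main() {
	// Illustrative keys only; real observed config schemas differ per operator.
	existing := map[string]any{
		"servingInfo": map[string]any{"minTLSVersion": "VersionTLS12"},
	}
	tls := func() map[string]any {
		return map[string]any{"servingInfo": map[string]any{"minTLSVersion": "VersionTLS12"}}
	}
	audit := func() map[string]any {
		return map[string]any{"auditConfig": map[string]any{"enabled": true}}
	}

	_, changed := observe(existing, tls)
	fmt.Println("changed with one observer:", changed)

	_, changed = observe(existing, tls, audit)
	fmt.Println("changed with two observers:", changed)
}
```

Because each observer only contributes its own keys, observers can be added or removed without knowing each other's schemas, which is the decoupling the table describes.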
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000000..c4245e99f1
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,81 @@
+# Contributing to library-go
+
+library-go is a shared Go helper library for OpenShift. It provides reusable, production-grade components for operators, API servers, certificate management, and more. Because it is imported by dozens of OpenShift repositories, changes here have wide impact and are held to a high bar.
+
+## Before You Start
+
+- **High bar for inclusion.** New code must have concrete use cases in at least two separate OpenShift repositories and be of reasonable complexity.
+- **No forbidden imports.** This repo must not depend on `k8s.io/kubernetes` or `openshift/origin`. PRs that add either will be rejected.
+
+## Development Workflow
+
+1. Fork the repo and clone your fork.
+2. Create a feature branch from `master`.
+3. Make your changes, add or update tests.
+4. Run verification locally before pushing:
+
+```bash
+make build       # Compile all packages
+make test-unit   # Run unit tests
+make verify      # Run linters, gofmt, vet
+```
+
+5. If you changed dependencies, update the vendor directory:
+
+```bash
+go mod tidy && go mod vendor
+```
+
+The vendor directory is checked in. Never skip this step — CI will fail if vendor is stale.
+
+6. Push your branch and open a PR against `openshift/library-go:master`.
+
+## Pull Request Guidelines
+
+- Keep PRs focused. One logical change per PR.
+- Write clear commit messages. Follow existing conventions:
+  - `UPSTREAM: <carry>:` or `UPSTREAM: <drop>:` for upstream-related changes
+  - Reference Jira tickets where applicable (e.g., `OCPBUGS-12345: fix cert rotation race`)
+- Include unit tests for new functionality. For operator controller changes, consider e2e test coverage.
+- PRs require approval from at least one approver listed in the `OWNERS` file.
+
+## PR Review Rules
+
+- All PRs require `/lgtm` from a reviewer and `/approve` from an approver (OWNERS file). These are separate roles — the approver confirms the change belongs in the repo, the reviewer confirms correctness.
+- Prow enforces required labels: `lgtm` and `approved` must both be present before merge.
+- CI checks (`make verify`, `make test-unit`) must pass. Reviewers should not `/lgtm` a PR with failing CI.
+- Review for backward compatibility — this is a shared library. Ask: "will this break any downstream consumer?" If unclear, request the author demonstrate that existing callers still compile and pass tests.
+- Changes to high-risk areas (encryption, certrotation, staticpod) should be reviewed by someone familiar with that subsystem. Check the per-directory OWNERS files for the right reviewers.
+- Carry patches (`UPSTREAM: <carry>:`) require extra scrutiny — they must be rebased on every upstream rebase and justified in the commit message.
+- Do not approve PRs that add `k8s.io/kubernetes` or `openshift/origin` imports under any circumstance.
+
+## Testing
+
+| Command | What it runs |
+|---------|-------------|
+| `make test-unit` | Unit tests across `./pkg/...` |
+| `make test-e2e-encryption` | Encryption/KMS end-to-end tests |
+| `make test-e2e-monitoring` | Monitoring end-to-end tests |
+| `make verify` | Linters, format checks, vet |
+
+Unit tests live alongside source files (`*_test.go`). Shared test helpers are in `test/library/`.
+
+## Code Conventions
+
+- Follow standard Go conventions (gofmt, govet).
+- Use the existing patterns in the package you are modifying — library-go has well-established patterns for controllers, observers, and resource sync.
+- Keep API-facing changes backward compatible. Breaking changes require discussion with approvers.
+
+## Areas Requiring Extra Care
+
+- **Encryption / KMS** (`pkg/operator/encryption/`): Complex state machine with preconditions, observer patterns, and KMS plugin integration. Changes here need deep familiarity with the subsystem.
+- **Certificate rotation** (`pkg/crypto/`, `pkg/operator/certrotation/`): Involves expiry checks (rotation at 80% of validity) and multi-CA chain handling. CSR logic is in `pkg/operator/csr/`. Test thoroughly.
+- **Static pod management** (`pkg/operator/staticpod/`): Uses `renameat2(RENAME_EXCHANGE)` for atomic directory swaps on Linux; non-Linux is not supported. Be careful with platform-specific code.
+
+## CI
+
+CI runs via OpenShift's CI infrastructure (Prow / ci-operator). The build root image is defined in `.ci-operator.yaml`. All `make verify` and `make test-unit` checks must pass for a PR to merge.
+
+## Questions?
+
+If you are unsure whether a change belongs in library-go, open an issue first to discuss. The approvers can help determine if this is the right home for your contribution.