-
Notifications
You must be signed in to change notification settings - Fork 266
[WIP] CNTRLPLANE-3234: health reporter writer #2318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,9 +18,9 @@ require ( | |
| github.com/onsi/gomega v1.38.2 | ||
| github.com/opencontainers/go-digest v1.0.0 | ||
| github.com/opencontainers/selinux v1.13.0 | ||
| github.com/openshift/api v0.0.0-20260521125114-09730f85d883 | ||
| github.com/openshift/api v0.0.0-20260618083218-a3c8dea7f8bc | ||
| github.com/openshift/build-machinery-go v0.0.0-20250530140348-dc5b2804eeee | ||
| github.com/openshift/client-go v0.0.0-20260512113608-deb4dc54551a | ||
| github.com/openshift/client-go v0.0.0-20260618131434-17fd91ed6167 | ||
|
Comment on lines
+21
to
+23
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
# Verify that the local filesystem replace directive has been removed
# Check for any replace directives in go.mod
echo "=== Checking for replace directives ==="
rg -n "^replace " go.mod || echo "No replace directives found"
# Check all github.com/openshift dependencies for pseudo-versions
echo ""
echo "=== Checking all openshift dependency versions ==="
rg -n "github.com/openshift" go.mod | grep -E "^\s*github.com/openshift"
# Extract the actual versions being used
echo ""
echo "=== Extracting exact versions ==="
grep "github.com/openshift" go.mod | awk '{print $1, $2}'Repository: openshift/library-go Length of output: 459 Replace pseudo-version dependencies with stable released versions or explicitly justify their necessity. The local filesystem However, the three Either:
🤖 Prompt for AI AgentsSource: Coding guidelines |
||
| github.com/pkg/errors v0.9.1 | ||
| github.com/pkg/profile v1.7.0 | ||
| github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.74.0 | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,15 +6,13 @@ import ( | |
| "regexp" | ||
| "time" | ||
|
|
||
| "github.com/openshift/library-go/pkg/operator/v1helpers" | ||
| "github.com/spf13/cobra" | ||
| "github.com/spf13/pflag" | ||
|
|
||
| "k8s.io/apimachinery/pkg/util/sets" | ||
| "k8s.io/apimachinery/pkg/util/wait" | ||
| "k8s.io/apiserver/pkg/server" | ||
| k8senvelopekmsv2 "k8s.io/apiserver/pkg/storage/value/encrypt/envelope/kmsv2" | ||
| "k8s.io/client-go/rest" | ||
| "k8s.io/client-go/tools/clientcmd" | ||
| "k8s.io/klog/v2" | ||
| ) | ||
|
|
@@ -35,21 +33,21 @@ type options struct { | |
| NodeName string | ||
| Kubeconfig string | ||
|
|
||
| newOperatorClient func(*rest.Config) (v1helpers.OperatorClient, error) | ||
| newWriter NewEncryptionStatusWriterFunc | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Before discussing the details, I think it would be very useful to have a PoC PR in respective operators to see how callers will call this (https://github.com/openshift/cluster-kube-apiserver-operator/blob/4e7619777cb6e5bb7853836d07656da3e1eb14df/cmd/cluster-kube-apiserver-operator/main.go#L70-L74) |
||
| } | ||
|
|
||
| type Config struct { | ||
| operatorClient v1helpers.OperatorClient | ||
| prober *prober | ||
| writeStatus EncryptionStatusWriter | ||
| prober *prober | ||
|
|
||
| interval time.Duration | ||
| writeTimeout time.Duration | ||
| nodeName string | ||
| } | ||
|
|
||
| func NewCommand(ctx context.Context, newOperatorClient func(*rest.Config) (v1helpers.OperatorClient, error)) *cobra.Command { | ||
| func NewCommand(ctx context.Context, newWriter NewEncryptionStatusWriterFunc) *cobra.Command { | ||
| o := &options{ | ||
| newOperatorClient: newOperatorClient, | ||
| newWriter: newWriter, | ||
| } | ||
|
|
||
| cmd := &cobra.Command{ | ||
|
|
@@ -126,9 +124,11 @@ func (o *options) Config(ctx context.Context) (*Config, error) { | |
| return nil, fmt.Errorf("build rest config: %w", err) | ||
| } | ||
|
|
||
| operatorClient, err := o.newOperatorClient(restCfg) | ||
| // fieldManager is the per-node ownership identity. | ||
| fieldManager := "kms-health-reporter-" + o.NodeName | ||
| writeStatus, err := o.newWriter(restCfg, fieldManager) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("build operator client: %w", err) | ||
| return nil, fmt.Errorf("build encryption status writer: %w", err) | ||
| } | ||
|
|
||
| plugins, err := buildPlugins(ctx, o.KMSSockets, o.ReadTimeout) | ||
|
|
@@ -137,21 +137,25 @@ func (o *options) Config(ctx context.Context) (*Config, error) { | |
| } | ||
|
|
||
| return &Config{ | ||
| operatorClient: operatorClient, | ||
| prober: newProber(plugins), | ||
| interval: o.Interval, | ||
| writeTimeout: o.WriteTimeout, | ||
| nodeName: o.NodeName, | ||
| writeStatus: writeStatus, | ||
| prober: newProber(plugins), | ||
| interval: o.Interval, | ||
| writeTimeout: o.WriteTimeout, | ||
| nodeName: o.NodeName, | ||
| }, nil | ||
| } | ||
|
|
||
| func (c *Config) Run(ctx context.Context) error { | ||
| wait.JitterUntilWithContext(ctx, func(ctx context.Context) { | ||
| // Each Status RPC enforces the read timeout internally (set at dial | ||
| // time); ctx here only carries shutdown cancellation. | ||
| conditions := c.prober.probeAll(ctx) | ||
| // TODO: hand conditions to the writer once it lands; logging is a placeholder. | ||
| klog.InfoS("kms plugin health", "conditions", conditions) | ||
| reports := c.prober.probeAll(ctx) | ||
|
|
||
| writeCtx, cancel := context.WithTimeout(ctx, c.writeTimeout) | ||
| defer cancel() | ||
| if err := c.writeStatus(writeCtx, buildEncryptionStatus(c.nodeName, reports)); err != nil { | ||
| klog.ErrorS(err, "failed to publish kms plugin health") | ||
| } | ||
| }, c.interval, 0.1, false) | ||
|
|
||
| return nil | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,10 +1,14 @@ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| package health | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import ( | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| "context" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| "testing" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| "time" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| "github.com/stretchr/testify/require" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| applyoperatorv1 "github.com/openshift/client-go/operator/applyconfigurations/operator/v1" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| kmsservice "k8s.io/kms/pkg/service" | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // validOptions returns an options value that passes validate. Each test case | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -119,3 +123,33 @@ func TestValidate(t *testing.T) { | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| }) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // TestRunReportsOnce checks the loop wiring: Run probes, builds the status, and | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // hands it to the reporter. The reporter cancels the context so the loop ends | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // after a single tick. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| func TestRunReportsOnce(t *testing.T) { | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ctx, cancel := context.WithCancel(context.Background()) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| var have *applyoperatorv1.KMSEncryptionStatusApplyConfiguration | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| c := &Config{ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| nodeName: "node-1", | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| interval: time.Hour, // never reached; cancelled after the first tick | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| writeTimeout: time.Second, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| prober: &prober{ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| plugins: []pluginClient{ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| {keyID: "1", service: &fakeService{resp: &kmsservice.StatusResponse{Healthz: "ok", KeyID: "kek-abc"}}}, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| }, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| now: func() time.Time { return time.Unix(0, 0).UTC() }, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| }, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| writeStatus: func(_ context.Context, status *applyoperatorv1.KMSEncryptionStatusApplyConfiguration) error { | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| have = status | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| cancel() | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return nil | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| }, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| require.NoError(t, c.Run(ctx)) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Comment on lines
+131
to
+151
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add a timeout-bound context to prevent stuck CI on regressions. At Line [131], using 💡 Proposed fix- ctx, cancel := context.WithCancel(context.Background())
+ ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ t.Cleanup(cancel)📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| require.Len(t, have.HealthReports, 1) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| require.Equal(t, "node-1", *have.HealthReports[0].NodeName) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| require.Equal(t, "1", *have.HealthReports[0].KeyId) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| package health | ||
|
|
||
| import ( | ||
| "context" | ||
|
|
||
| operatorv1 "github.com/openshift/api/operator/v1" | ||
| applyoperatorv1 "github.com/openshift/client-go/operator/applyconfigurations/operator/v1" | ||
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
| "k8s.io/client-go/rest" | ||
| ) | ||
|
|
||
| // NewEncryptionStatusWriterFunc builds the EncryptionStatusWriter for a target | ||
| // apiserver operator status CR. fieldManager sets the owner in the | ||
| // managedFields when doing SSA. | ||
| type NewEncryptionStatusWriterFunc func(restConfig *rest.Config, fieldManager string) (EncryptionStatusWriter, error) | ||
|
|
||
| // EncryptionStatusWriter is capable of applying the | ||
| // KMSEncryptionStatusApplyConfiguration at the correct place in the operator's | ||
| // status. | ||
| type EncryptionStatusWriter func(ctx context.Context, status *applyoperatorv1.KMSEncryptionStatusApplyConfiguration) error | ||
|
|
||
| // buildEncryptionStatus builds the KMSEncryptionStatusApplyConfiguration to be | ||
| // applied by the operator. | ||
| func buildEncryptionStatus(nodeName string, reports []pluginHealthReport) *applyoperatorv1.KMSEncryptionStatusApplyConfiguration { | ||
| healthReports := make([]*applyoperatorv1.KMSPluginHealthReportApplyConfiguration, 0, len(reports)) | ||
| for _, r := range reports { | ||
| hr := applyoperatorv1.KMSPluginHealthReport(). | ||
| WithNodeName(nodeName). | ||
| WithKeyId(r.KeyID). | ||
| WithStatus(mapStatus(r.Status)). | ||
| WithLastCheckedTime(metav1.NewTime(r.LastChecked)) | ||
|
|
||
| // kekId/detail have MinLength=1; setting "" would fail validation. | ||
| if r.KEKID != "" { | ||
| hr = hr.WithKEKId(r.KEKID) | ||
| } | ||
| if r.Detail != "" { | ||
| hr = hr.WithDetail(r.Detail) | ||
| } | ||
|
|
||
| healthReports = append(healthReports, hr) | ||
| } | ||
|
|
||
| return applyoperatorv1.KMSEncryptionStatus().WithHealthReports(healthReports...) | ||
| } | ||
|
|
||
| // mapStatus defaults to Error so an unknown value never becomes an empty, | ||
| // invalid enum. | ||
| func mapStatus(s string) operatorv1.KMSPluginHealthStatus { | ||
| switch s { | ||
| case statusHealthy: | ||
| return operatorv1.KMSPluginHealthStatusHealthy | ||
| case statusUnhealthy: | ||
| return operatorv1.KMSPluginHealthStatusUnhealthy | ||
| default: | ||
| return operatorv1.KMSPluginHealthStatusError | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,67 @@ | ||
| package health | ||
|
|
||
| import ( | ||
| "testing" | ||
| "time" | ||
|
|
||
| "github.com/stretchr/testify/require" | ||
|
|
||
| operatorv1 "github.com/openshift/api/operator/v1" | ||
| applyoperatorv1 "github.com/openshift/client-go/operator/applyconfigurations/operator/v1" | ||
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
| ) | ||
|
|
||
| func TestBuildEncryptionStatus(t *testing.T) { | ||
| // Fixed UTC time dodges Go's monotonic clock and timezone drift. | ||
| checked := time.Unix(0, 0).UTC() | ||
| reports := []pluginHealthReport{ | ||
| {KeyID: "1", KEKID: "kek-abc", Status: statusHealthy, LastChecked: checked}, | ||
| {KeyID: "2", Status: statusUnhealthy, Detail: "not ok", LastChecked: checked}, | ||
| {KeyID: "3", Status: statusError, Detail: "DeadlineExceeded", LastChecked: checked}, | ||
| } | ||
|
|
||
| have := buildEncryptionStatus("node-1", reports) | ||
|
|
||
| // Each entry stamps nodeName; kekId only on healthy, detail only on the | ||
| // unhealthy/error entries, status mapped to the API enum. | ||
| want := applyoperatorv1.KMSEncryptionStatus().WithHealthReports( | ||
| applyoperatorv1.KMSPluginHealthReport(). | ||
| WithNodeName("node-1"). | ||
| WithKeyId("1"). | ||
| WithStatus(operatorv1.KMSPluginHealthStatusHealthy). | ||
| WithLastCheckedTime(metav1.NewTime(checked)). | ||
| WithKEKId("kek-abc"), | ||
| applyoperatorv1.KMSPluginHealthReport(). | ||
| WithNodeName("node-1"). | ||
| WithKeyId("2"). | ||
| WithStatus(operatorv1.KMSPluginHealthStatusUnhealthy). | ||
| WithLastCheckedTime(metav1.NewTime(checked)). | ||
| WithDetail("not ok"), | ||
| applyoperatorv1.KMSPluginHealthReport(). | ||
| WithNodeName("node-1"). | ||
| WithKeyId("3"). | ||
| WithStatus(operatorv1.KMSPluginHealthStatusError). | ||
| WithLastCheckedTime(metav1.NewTime(checked)). | ||
| WithDetail("DeadlineExceeded"), | ||
| ) | ||
|
|
||
| require.Equal(t, want, have) | ||
| } | ||
|
|
||
| func TestMapStatus(t *testing.T) { | ||
| tests := []struct { | ||
| in string | ||
| want operatorv1.KMSPluginHealthStatus | ||
| }{ | ||
| {statusHealthy, operatorv1.KMSPluginHealthStatusHealthy}, | ||
| {statusUnhealthy, operatorv1.KMSPluginHealthStatusUnhealthy}, | ||
| {statusError, operatorv1.KMSPluginHealthStatusError}, | ||
| {"unexpected", operatorv1.KMSPluginHealthStatusError}, | ||
| } | ||
|
|
||
| for _, tc := range tests { | ||
| t.Run(tc.in, func(t *testing.T) { | ||
| require.Equal(t, tc.want, mapStatus(tc.in)) | ||
| }) | ||
| } | ||
| } |
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
Is there a released version of github.com/openshift/api with KMSPluginHealthStatus enum?💡 Result:
No—based on the current openshift/api repository contents, I did not find a released/defined version of github.com/openshift/api that contains the KMSPluginHealthStatus enum. In the surfaced openshift/api changes, the KMS-related API work focuses on KMS encryption/configuration validation (vault KMS plugin configuration, including fields like kmsPluginImage, vaultAddress, approleSecret, etc.) rather than a KMSPluginHealthStatus enum being added or referenced [1][2][3]. One of the related efforts includes adding a KMS “health” subcommand in cluster-openshift-apiserver-operator, but that is in a different repository and does not indicate a KMSPluginHealthStatus enum within openshift/api [4]. Caveat: this conclusion is based on search results for the exact symbol name “KMSPluginHealthStatus” within openshift/api; if the enum exists under a different name (or in a generated protobuf/types file not indexed by the search), it may not have appeared in these results.
Citations:
🏁 Script executed:
Repository: openshift/library-go
Length of output: 951
🏁 Script executed:
Repository: openshift/library-go
Length of output: 309
🏁 Script executed:
# Check what versions of openshift/api are available and whether any released version has KMSPluginHealthStatus git ls-files go.modRepository: openshift/library-go
Length of output: 71
🏁 Script executed:
Repository: openshift/library-go
Length of output: 6496
🏁 Script executed:
Repository: openshift/library-go
Length of output: 103
🏁 Script executed:
Repository: openshift/library-go
Length of output: 294
🏁 Script executed:
Repository: openshift/library-go
Length of output: 46
Replace pseudo-version with a released version or justify the necessity.
The
openshift/apiversionv0.0.0-20260618083218-a3c8dea7f8bcis a pseudo-version (unreleased development commit), which violates the supply chain security guideline: "No pre-release or yanked versions in production."The
KMSPluginHealthStatusenum used inwriter.godoes exist and is compatible in the target version. However, using unreleased commits introduces maintenance and security risks. Either wait foropenshift/apito release a stable version with this enum, or provide justification if this dependency must remain unreleased.🤖 Prompt for AI Agents
Source: Coding guidelines