feat(capture): Add --cleanup-after-upload flag for automatic resource cleanup #2367
Merged
Retina Code Coverage Report
Total coverage increased from

| Impacted Files | Coverage | |
|---|---|---|
| pkg/controllers/daemon/namespace/namespace_controller.go | 76.24% ... 78.46% (2.22%) | ⬆️ |
| cli/cmd/capture/create.go | 35.99% ... 73.86% (37.87%) | ⬆️ |
| pkg/controllers/operator/capture/controller.go | 0.0% ... 13.38% (13.38%) | ⬆️ |
| cli/cmd/capture/delete.go | 75.0% ... 82.8% (7.8%) | ⬆️ |
SRodi reviewed May 20, 2026
mereta reviewed May 21, 2026
Description
Adds a `--cleanup-after-upload` flag to `kubectl retina capture create` that enables automatic cleanup of capture jobs, secrets, and host-path files after successful upload to remote storage (blob, S3, or PVC).

Key behaviors:
- When `--cleanup-after-upload` is combined with `--no-wait=true` and a remote destination, jobs get a `TTLSecondsAfterFinished` (5 min) so Kubernetes garbage-collects them automatically.
- `CaptureReconciler` now deletes the Capture resource once all jobs succeed and `CleanUpAfterUpload` is true with remote storage configured (blob, S3, or PVC).
- Jobs get an active deadline (`duration + 30min`) to prevent indefinite hangs.
- The `waitUntilJobsComplete` deadline is now `duration + 5min` (floored at 5min), fixing premature timeouts for short captures.
- `deleteSecret` and `capture delete` now tolerate `NotFound` errors caused by ownerRef garbage collection racing with explicit deletion.

Related Issue
N/A — feature request for safer automated capture workflows.
Checklist
git commit -S -s ...).

Screenshots (if applicable) or Testing Completed
Unit Tests
All tests pass in the affected packages. The new logic is 100% covered by the newly added CLI and Operator controller tests.
Manual Testing
- Test 1 (`--cleanup-after-upload` without remote): `./bin/kubectl-retina capture create --name test1 --node-names aks-nodepool1-... --host-path /tmp/captures --cleanup-after-upload --duration 5s`
- Test 2: `./bin/kubectl-retina capture create --name test2 --node-names aks-nodepool1-... --blob-upload "<SAS URL>" --cleanup-after-upload --no-wait=true --duration 10s`
  `kubectl get job -o yaml` (TTL=300), `kubectl get secret -o yaml` (ownerReferences present), job auto-deleted after ~5 min.
- Test 3: `./bin/kubectl-retina capture create --name test3 --node-names aks-nodepool1-... --blob-upload "<SAS URL>" --cleanup-after-upload --no-wait=false --duration 10s`
  `kubectl get jobs` showed no jobs remaining, blob appeared in storage account, secret deleted.
- Test 4: `./bin/kubectl-retina capture create --name test4 --node-names aks-nodepool1-... --blob-upload "<SAS URL>" --no-wait=false --duration 10s`
- Test 5: `./bin/kubectl-retina capture create --name test5 --node-names aks-nodepool1-... --host-path /tmp/captures --no-wait=true --duration 5s`
  `kubectl get job -o yaml` showed no TTLSecondsAfterFinished.

Wait timeout fix verified: Test 3 used `--duration 10s` and the CLI completed within ~75s (10s capture + upload time), confirming the new `duration + 5min` deadline works correctly for short captures (previously it would time out at 2×duration = 20s, before the upload finished).

NotFound tolerance verified: In test 2, running `kubectl retina capture delete --name test2` after TTL had already garbage-collected the job and secret returned success (no "secret not found" error).

Additional Notes
- The `CleanUpAfterUpload` field is added to the CRD spec as optional with `+kubebuilder:default=false`. Existing Captures are unaffected.
- `JobActiveDeadlineBufferSeconds` (30 min) is generous to accommodate large captures uploading over slow links or Windows nodes with slow image pulls.
- `BlobUpload`, `S3Upload`, and `PersistentVolumeClaim`.
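The reconciler rule from the Description (delete the Capture once all jobs succeed, `CleanUpAfterUpload` is true, and remote storage is configured) might look roughly like the sketch below. The types are simplified stand-ins written for this example, not the real Retina CRD types: `outputConfig`, `captureSpec`, `hasRemoteStorage`, and `shouldDeleteCapture` are hypothetical names, and the string fields are an assumption made to keep the example self-contained.

```go
package main

import "fmt"

// outputConfig is a hypothetical, trimmed-down stand-in for the Capture
// output configuration. Only the field names mirror the PR notes.
type outputConfig struct {
	HostPath              string
	BlobUpload            string
	S3Upload              string
	PersistentVolumeClaim string
}

// captureSpec is a hypothetical stand-in for the Capture spec.
type captureSpec struct {
	CleanUpAfterUpload bool // optional, defaults to false per the PR notes
	Output             outputConfig
}

// hasRemoteStorage reports whether any remote destination is set.
// A host path alone does not count as remote storage.
func hasRemoteStorage(o outputConfig) bool {
	return o.BlobUpload != "" || o.S3Upload != "" || o.PersistentVolumeClaim != ""
}

// shouldDeleteCapture returns true only when cleanup was requested,
// a remote destination exists, and every job has succeeded.
func shouldDeleteCapture(spec captureSpec, jobsSucceeded []bool) bool {
	if !spec.CleanUpAfterUpload || !hasRemoteStorage(spec.Output) || len(jobsSucceeded) == 0 {
		return false
	}
	for _, ok := range jobsSucceeded {
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	remote := captureSpec{CleanUpAfterUpload: true, Output: outputConfig{BlobUpload: "<SAS URL>"}}
	hostOnly := captureSpec{CleanUpAfterUpload: true, Output: outputConfig{HostPath: "/tmp/captures"}}

	fmt.Println(shouldDeleteCapture(remote, []bool{true, true}))  // true
	fmt.Println(shouldDeleteCapture(remote, []bool{true, false})) // false
	fmt.Println(shouldDeleteCapture(hostOnly, []bool{true}))      // false
}
```

Keeping the check as a pure function of the spec and job states makes it straightforward to cover with table-driven unit tests, independent of a live cluster.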