Add Redis-backed netd team bandwidth limits#437

Merged: laotoutou merged 2 commits into main from feat/netd-team-bandwidth on Jun 11, 2026

@laotoutou

Contributor

Summary

  • add optional Redis-backed, cluster-scoped team egress/ingress bandwidth limits in netd
  • wire team bandwidth settings through Sandbox0Infra netd config, runtime config, generated CRDs, and Redis config injection
  • add an e2e scenario that enables built-in Redis and verifies that two same-team sandboxes share a netd bandwidth bucket
  • update self-hosted configuration docs for Redis-backed netd team limits

Closes #436

Tests

  • GOTOOLCHAIN=go1.25.0+auto go test ./netd/pkg/proxy ./infra-operator/internal/runtimeconfig ./infra-operator/internal/controller/services/netd ./infra-operator/internal/controller/services/redis ./infra-operator/internal/ownership
  • GOTOOLCHAIN=go1.25.0+auto go test ./netd/... ./infra-operator/internal/runtimeconfig ./infra-operator/internal/controller/services/netd ./infra-operator/internal/controller/services/redis ./infra-operator/internal/ownership ./tests/e2e/cases ./tests/e2e/scenarios/single-cluster -run TestCompileOnly
  • make manifests

Notes

Local e2e was not run; PR CI and the remote kind environment cover the e2e path.

@laotoutou laotoutou force-pushed the feat/netd-team-bandwidth branch from 3e7f625 to 39eff83 Compare June 11, 2026 11:41
@laotoutou laotoutou force-pushed the feat/netd-team-bandwidth branch from 39eff83 to b91225d Compare June 11, 2026 12:07
@laotoutou

Contributor Author

Remote validation update for PR head 1109fd7d38c84fe487180b464169aaef4526d40b:

  • Synced the PR head files to the Aliyun Singapore ECS remote test host.
  • Rebuilt sandbox0ai/infra:latest remotely from the synced source; resulting local image digest was sha256:0425671f75a8a7aaa66e09b288038f5b188c4994ec4e7d6747dd6d3be624d6b3.
  • Created a fresh 3-node kind cluster and preloaded sandbox0ai/otemplates:default-v0.2.0 into the kind nodes.
  • Verified the network-policy sample enables built-in Redis and team bandwidth settings:
    • redis: enabled in the sample.
    • teamIngressBandwidthBytesPerSecond: 65536.
    • teamBandwidthBurstBytes: 65536.
  • Ran the focused real e2e, not a unit test:
E2E_SINGLE_CLUSTER_SCENARIOS=network-policy go test -v -count=1 ./tests/e2e/scenarios/single-cluster \
  -run TestSingleCluster \
  -ginkgo.focus="API network policy mode.*enforces Redis-backed team bandwidth through netd" \
  -timeout=30m

The focused e2e claimed two real default sandbox pods and executed concurrent downloads through netd:

kubectl exec --namespace tpl-default rs-mrswmylvnr2a-default-mrjjj -c procd -- curl ... http://10.244.1.7:8080/large.bin
kubectl exec --namespace tpl-default rs-mrswmylvnr2a-default-m4v9k -c procd -- curl ... http://10.244.1.7:8080/large.bin

Result:

Ran 1 of 84 Specs in 416.768 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 83 Skipped
ok github.com/sandbox0-ai/sandbox0/tests/e2e/scenarios/single-cluster 416.798s

The earlier full PR e2e failure was in the existing SSH egress-auth spec, not the new Redis-backed team bandwidth spec. I reran the failed PR E2E job after this remote validation; it is currently in progress.

The remote ECS instance was stopped and returned to StopCharging mode after validation.

@laotoutou laotoutou merged commit cde5f4f into main Jun 11, 2026
9 of 10 checks passed
@laotoutou laotoutou deleted the feat/netd-team-bandwidth branch June 11, 2026 16:42

Development

Successfully merging this pull request may close these issues.

Add Redis-backed team-scoped netd bandwidth limits