Add release image BuildKit cache support #394

Draft
khluu wants to merge 1 commit into main from ci/release-image-cache

@khluu commented Jun 25, 2026

Member

Summary

Adds infra-side support for cached CUDA release image builds while keeping the vLLM release pipeline change small.

Changes:

  • Accept CACHE_FROM_BASE_BRANCH in docker/ci.hcl, keeping CACHE_FROM_BASE as a compatibility alias.
  • Add a private ECR vllm-release-cache repository with the existing 14-day cache lifecycle.
  • Add IAM read/write access for release/postmerge queues that already build and push release images.
  • Document the CI cache lane vs release cache lane in the README.

Why

The CUDA release image jobs currently build with raw docker build commands and do not import/export the registry BuildKit cache used by CI image builds. This gives release builds cold-cache behavior even though they share expensive layers with CI builds.

This PR intentionally only creates the infra target for a release cache. The vLLM-side release pipeline can then add direct BuildKit registry cache flags to the existing docker build commands without converting the release matrix to Bake.

The release cache is separate from vllm-ci-test-cache and vllm-ci-postmerge-cache so release-specific variants do not churn CI cache refs.

Consumer Contract

CUDA release image jobs should use refs like:

936637512419.dkr.ecr.us-east-1.amazonaws.com/vllm-release-cache:<arch>-cu<cuda>-ubuntu<ubuntu>

Suggested initial variant tags:

x86_64-cu130-ubuntu2204
aarch64-cu130-ubuntu2204
x86_64-cu129-ubuntu2204
aarch64-cu129-ubuntu2204
x86_64-cu130-ubuntu2404
aarch64-cu130-ubuntu2404
x86_64-cu129-ubuntu2404
aarch64-cu129-ubuntu2404
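
The tag scheme above can be derived mechanically from the three matrix dimensions. A minimal sketch, assuming versions are passed without dots (`130`, `2204`); the helper names `release_cache_ref` and `list_release_cache_refs` are hypothetical and not part of this PR, while the registry host and repository name are the ones given above:

```shell
set -eu

# Registry host and repository as named in this PR.
RELEASE_CACHE_REPO="936637512419.dkr.ecr.us-east-1.amazonaws.com/vllm-release-cache"

# Hypothetical helper: compose the cache ref for one release variant.
# Arguments: <arch> <cuda, no dot> <ubuntu, no dot>
release_cache_ref() {
  printf '%s:%s-cu%s-ubuntu%s\n' "$RELEASE_CACHE_REPO" "$1" "$2" "$3"
}

# Hypothetical helper: enumerate the eight suggested initial variants,
# in the same order as the list above.
list_release_cache_refs() {
  for ubuntu in 2204 2404; do
    for cuda in 130 129; do
      for arch in x86_64 aarch64; do
        release_cache_ref "$arch" "$cuda" "$ubuntu"
      done
    done
  done
}
```

Calling `release_cache_ref x86_64 130 2204` prints the ref for the first suggested variant. Deriving the tag from the matrix values keeps the import and export refs of a job from drifting apart.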

Use direct BuildKit flags on the existing docker build release commands, for example:

--cache-from type=registry,ref=936637512419.dkr.ecr.us-east-1.amazonaws.com/vllm-release-cache:x86_64-cu130-ubuntu2204 \
--cache-to type=registry,ref=936637512419.dkr.ecr.us-east-1.amazonaws.com/vllm-release-cache:x86_64-cu130-ubuntu2204,mode=max,compression=zstd

The vLLM-side change can also add extra --cache-from flags for the existing CI fallback refs if desired, but release jobs should only write release-specific layers to vllm-release-cache.
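
One way the vLLM-side change could keep this read/write split explicit is to generate the flags from a single helper. This is only a sketch: `release_cache_flags` is a hypothetical name, and the fallback refs are passed in by the caller because this PR does not specify the CI cache tags.

```shell
set -eu

# Registry host and repository as named in this PR.
RELEASE_CACHE_REPO="936637512419.dkr.ecr.us-east-1.amazonaws.com/vllm-release-cache"

# Hypothetical helper: print BuildKit cache flags for one release variant.
# $1 is the variant tag; any further arguments are read-only fallback refs.
release_cache_flags() {
  variant="$1"
  shift
  ref="${RELEASE_CACHE_REPO}:${variant}"
  # Import from and export to the release cache.
  # mode and compression are export options, so they appear only on --cache-to.
  printf '%s\n' "--cache-from type=registry,ref=${ref}"
  printf '%s\n' "--cache-to type=registry,ref=${ref},mode=max,compression=zstd"
  # Fallback refs are import-only, so release layers never land in CI repos.
  for fallback in "$@"; do
    printf '%s\n' "--cache-from type=registry,ref=${fallback}"
  done
}
```

A release step might splice the output into its existing command with an unquoted substitution, for example `docker build $(release_cache_flags x86_64-cu130-ubuntu2204 "$CI_FALLBACK_REF") ...`, where `CI_FALLBACK_REF` is an assumed variable holding a CI cache ref. Because only the release ref ever appears in `--cache-to`, the helper cannot write to the CI cache repositories.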

Validation

  • git diff --check origin/main...HEAD

Not run locally because this devbox does not have terraform, docker, tofu, or hclfmt installed:

  • terraform fmt -check terraform/aws/ecr.tf terraform/aws/iam.tf
  • docker buildx bake --print test-ci

Signed-off-by: vLLM CI <ci@vllm.ai>