deploy/aarch64: add reproducible mvm vmlinux build pipeline #256

Open
lkml-likexu wants to merge 1 commit into TencentCloud:master from lkml-likexu:master

@lkml-likexu
Collaborator

Cross-compiling the ARM64 mvm guest kernel has so far been a manual, host-environment-dependent process: contributors had to install the right cross toolchain, remember which OpenCloudOS-Kernel tag to use, hand-craft a .config, and then figure out where the resulting Image should be placed and which cmdline the shim expects. This made the build hard to reproduce across machines and easy to get subtly wrong.

Introduce a self-contained pipeline under deploy/aarch64/ that pins all of these inputs:

  • a Dockerfile providing the exact cross-compile toolchain image;
  • mvm.config, the boot-tested kernel configuration for mvm guests;
  • mvm.cmdline, the recommended kernel command line for the shim;
  • build_mvm_vmlinux.sh, an idempotent driver that builds the image, fetches the pinned kernel tag, runs the cross build inside the container as the invoking user, and emits a stable Image plus a tag/sha-stamped copy alongside a build log.

The script also surfaces the two manual follow-up steps (vmlinux placement and shim cmdline) so that downstream integration is unambiguous. With this in place, producing the mvm vmlinux is a single command and yields the same artifact on any host.

Assisted-by: Anthropic:claude-opus-4-7
Signed-off-by: Like Xu <likexu@tencent.com>

@fslongjin
Member

/cubebot review

@github-actions

github-actions Bot commented May 14, 2026

Claude finished @fslongjin's task in 59s.


Code Review: deploy/aarch64: add reproducible mvm vmlinux build pipeline

I've reviewed all 4 changed files. Overall the code quality is high — the script is robust, well-structured, and addresses the reproducibility problem effectively. Below are my findings.


deploy/aarch64/Dockerfile — Minor issues

Good: The package selection is comprehensive for kernel cross-compilation. Using BuildKit's --mount=type=cache for dnf is a good pattern.

1. Cache mount partially self-defeated (deploy/aarch64/Dockerfile:4,29-30)

The RUN instruction uses --mount=type=cache,target=/var/cache/dnf but then runs rm -rf /var/cache/dnf/* at the end. dnf writes its cached data into this directory during the install commands, so the explicit cleanup empties the cache mount and subsequent Docker builds get no benefit from it. The cleanup also saves no image space, because the contents of a cache mount are never committed to the image layer. Consider either:

# Option A: keep the cache mount and stop wiping it
# (remove `rm -rf /var/cache/dnf/*`; also replace `dnf clean all`, which purges
# cached packages as well, with a metadata-only clean)
dnf clean metadata && rm -rf /var/cache/yum/*

# Option B: Or drop the cache mount if offline builds aren't needed
RUN set -eux; \
    dnf install -yq epol-release; \
    ...

deploy/aarch64/build_mvm_vmlinux.sh — Well-structured build script

Good:

  • set -Eeuo pipefail — best practice for reliability
  • on_error trap with LINENO — helpful for debugging failures
  • Idempotent clone_source handling fresh clones, re-runs, and tag switches
  • Running the container as the invoking user (--user "${uid}:${gid}") — correct for file ownership
  • Stamped artifact copy with tag+SHA — excellent for traceability
  • post_hint function — helpful UX for next steps
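
As an illustration of the tag+SHA stamping noted above, a minimal sketch follows. The helper name stamped_image_name and the Image-<tag>-<sha> layout are assumptions made for this example; the naming the PR actually uses may differ.

```shell
#!/bin/sh
# Sketch only: stamped_image_name and the "Image-<tag>-<sha>" layout are
# assumptions for illustration, not the PR's actual naming.
stamped_image_name() {
    tag=$1
    sha=$2
    # Keep the first 12 characters of the commit SHA.
    short_sha=$(printf '%s' "$sha" | cut -c1-12)
    printf 'Image-%s-%s\n' "$tag" "$short_sha"
}
```

A driver could then copy the stable artifact with something like cp Image "$(stamped_image_name "$TAG" "$(git rev-parse HEAD)")", where TAG is whatever variable holds the pinned kernel tag.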

2. Unnecessary --network=host in build container (deploy/aarch64/build_mvm_vmlinux.sh:268)

The docker run passes --network=host, but the in-container script only runs make and cp, and neither requires network access. Dropping the flag would be a minor hardening improvement:

docker run --rm \
    --user "${uid}:${gid}" \
    ...

3. Duplicate safe.directory entries (deploy/aarch64/build_mvm_vmlinux.sh:196)

git config --global --add safe.directory "${SRC_DIR}" appends a new entry every time the script re-runs, accumulating duplicates in the global git config. Consider idempotent alternatives:

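# Option A: Replace any exact-match entries with a single one
# (adds the entry if none matches; --fixed-value needs Git 2.30 or later)
git config --global --fixed-value --replace-all safe.directory "${SRC_DIR}" "${SRC_DIR}"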

# Option B: Check first
if ! git config --global --get-all safe.directory | grep -Fx "${SRC_DIR}" >/dev/null 2>&1; then
    git config --global --add safe.directory "${SRC_DIR}"
fi

4. Global git config side effect (deploy/aarch64/build_mvm_vmlinux.sh:196)

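The script modifies the user's global git config. This is acceptable for a build script but should be documented. Consider scoping the change to the script, for example by pointing GIT_CONFIG_GLOBAL (Git 2.32 and later) at a build-private config file rather than /dev/null, which cannot be written to. At minimum, add a log line indicating that this is happening: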

log "Marking ${SRC_DIR} as a safe git directory (global config)."
git config --global --add safe.directory "${SRC_DIR}" 2>/dev/null || true
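
Findings 3 and 4 can be addressed together. The following is a minimal sketch, not the PR's code: log and mark_safe_directory are hypothetical helpers, and the log prefix is an assumption. It announces the global-config change and appends the entry only when an identical one is not already present.

```shell
#!/bin/sh
# Sketch only: log and mark_safe_directory are hypothetical helpers.
log() {
    printf '[build] %s\n' "$*" >&2
}

mark_safe_directory() {
    dir=$1
    # -F and -x make grep require an exact, literal, full-line match.
    if git config --global --get-all safe.directory 2>/dev/null \
        | grep -Fx -- "$dir" >/dev/null; then
        return 0
    fi
    log "Marking ${dir} as a safe git directory (global config)."
    git config --global --add safe.directory "$dir"
}
```

Calling mark_safe_directory "${SRC_DIR}" repeatedly leaves a single entry, and the log line appears only on the run that actually changes the config.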

5. Edge case: JOBS validation (deploy/aarch64/build_mvm_vmlinux.sh:43)

JOBS is derived from nproc or defaults to 4, but the --jobs flag accepts any string including invalid values like 0 or negative numbers. Consider adding validation:

--jobs)
    JOBS="${2:?}"
    if ! [[ "${JOBS}" =~ ^[1-9][0-9]*$ ]]; then
        err "Invalid --jobs value: must be a positive integer"
        exit 2
    fi
    shift 2
    ;;
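
The same check can be factored into a small helper that is reusable for any numeric flag. This is a sketch under assumptions: is_positive_int and default_jobs are hypothetical names, and the case pattern is a POSIX-portable equivalent of the regex ^[1-9][0-9]*$ shown above.

```shell
#!/bin/sh
# Sketch only: is_positive_int and default_jobs are hypothetical helper names.

# Succeeds only for a positive decimal integer without a leading zero,
# mirroring the regex ^[1-9][0-9]*$.
is_positive_int() {
    case "${1:-}" in
        '' | 0* | *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Use nproc when available, otherwise fall back to 4.
default_jobs() {
    nproc 2>/dev/null || echo 4
}
```

In the argument parser, is_positive_int "${JOBS}" || { err "Invalid --jobs value"; exit 2; } would reject 0, negative numbers, and non-numeric strings before they reach make -j.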

deploy/aarch64/mvm.cmdline — Correct

Standard serial console configuration for ARM64 VMs (console=ttyAMA0,115200). No issues.


deploy/aarch64/mvm.config — Standard kernel config

10,928 lines is typical for a Linux kernel .config. I did not do a line-by-line review of all config options, but the file integrates correctly with the build pipeline via olddefconfig.


Summary

| Severity | Count | Key items |
|----------|-------|-----------|
| High     | 0     |           |
| Medium   | 0     |           |
| Low      | 4     | Cache mount cleanup, host-network hardening, safe.directory duplicates, missing JOBS validation |
| Nit      | 1     | Missing log line for global git config modification |

The pipeline is well-architected and ready for use. The issues above are minor — none block merging, but addressing them would improve hygiene.
