feat(gcp-cvm): add GCP confidential image build kit #5
Build a hardened, measured, attestable GCP (Intel TDX) confidential image from a stock Ubuntu 24.04 cloud image.

- build-gcp-tapp.sh: one-command pipeline (base provisioning + kernel swap + cryptpilot-convert + ESP sync + security hardening)
- prepare-gcp-tapp.sh / fix-esp-grub.sh: stage-B-only and ESP-only helpers
- cryptpilot-gcp-boot-fix.md: root-cause analysis, full SOP, integrity notes, and the convert-side issues to confirm with the cryptpilot maintainers
- security hardening: purge ssh/cloud-init/google-guest-agent/osconfig/startup-scripts/snapd/etc., mask console getty, MAC-agnostic netplan
- README: point the confidential-instance section to gcp-cvm/ for the GCP variant

Binaries (*.deb, *.qcow2) are gitignored; tapp-server is pulled from release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
✅ Reviewed. Pure shell scripts + docs, no Rust code changes — no compilation impact.
One suggestion: LGTM.
…e grubenv saved_entry The proper fix belongs in cryptpilot-fde load_kernel_artifacts (src/cmd/fde/disk.rs:395-397), which currently errors when grubenv has no saved_entry. Freshly built / never-booted images have an empty grubenv and boot the grub.cfg default (set default="0" → first menuentry), so the reference extractor should resolve that fallback instead of erroring. Documents the convert-side saved_entry injection as a local workaround only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
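The fallback this commit asks for can be sketched in shell. This is an illustration only, not the cryptpilot-fde code (which is Rust): `resolve_default_entry` is a hypothetical helper, it assumes grubenv is the usual text block of `key=value` lines, and it hard-codes the `0` fallback rather than reading `set default=` from grub.cfg.

```bash
#!/usr/bin/env bash
# Hypothetical helper: resolve which menu entry GRUB would boot.
# Prints saved_entry from the given grubenv file if present; otherwise
# falls back to "0" (the first menuentry), mirroring `set default="0"`.
resolve_default_entry() {
  local grubenv=$1 saved=""
  if [ -r "$grubenv" ]; then
    # grubenv is a padded text block of key=value lines.
    saved=$(sed -n 's/^saved_entry=//p' "$grubenv" | head -n 1)
  fi
  if [ -n "$saved" ]; then
    printf '%s\n' "$saved"
  else
    printf '0\n'
  fi
}
```

A never-booted image has no `saved_entry` line, so the helper prints `0` instead of failing, which is the behavior the reference extractor needs.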
…nce-value

Document how to compute remote-attestation reference values from the built image with `cryptpilot-fde show-reference-value --disk`, including:

- the prerequisite cryptpilot-fde fix (openanolis/cryptpilot#126) so it works on never-booted images (empty grubenv) without the convert workaround
- how to build the fixed cryptpilot-fde from the 0gfoundation fork branch
- flags (--disk, --hash-algo, --stage) and the AAEL reference-value outputs

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- tapp-server: default release v0.0.5 -> v0.1.0 (build-gcp-tapp.sh URL, README, doc)
- cryptpilot saved_entry fix: reference #128 (against current master 0.8.0, cryptpilot-fde/src/disk/grub.rs) instead of the superseded #126 (stale 0.2.7)
- §12 build steps updated for 0.8.0: cargo build -p cryptpilot-fde produces cryptpilot-fde-host/-guest; add cryptsetup-devel build dep; branch fix/srv-default-entry
- build-gcp-tapp.sh: add HARDEN toggle (HARDEN=1 hardened / HARDEN=0 dev image)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…reparation

- cryptpilot-gcp-boot-fix.md and README.md fully translated to English
- add "§0 Preparation": how temp-fixed.qcow2 is produced from the official Ubuntu noble cloud image + GCP gVNIC driver (gve-dkms), plus the prebuilt image download link

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t-fde deb The cryptpilot-fde deb is a build-time prerequisite for the conversion step (§9), unrelated to producing the base image (temp-fixed.qcow2 = Ubuntu + gVNIC). Drop it from the Preparation materials. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
§0 only lists the Ubuntu cloud image as the base-image material. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The stock Ubuntu noble cloud image ships a ~3.5 GiB disk, too small for the kernel + app + Docker layers. Document resizing the root partition to 20 GiB via qemu-img resize + growpart + resize2fs, and warn against virt-resize --expand which renumbers partitions (sda1 -> sda4) and breaks the sda1=rootfs / sda16=/boot assumptions the build flow depends on. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
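The resize sequence described above could look roughly like the following. It is a sketch under assumptions: the image file name and the `run` / `resize_base_image` helpers are illustrative, and running `growpart` / `resize2fs` through `virt-customize --run-command` assumes both tools are present in the guest image. By default the script only prints the commands (`DRY_RUN=1`).

```bash
#!/usr/bin/env bash
# Sketch: grow the stock image to 20 GiB without renumbering partitions.
# IMAGE and SIZE are illustrative defaults, not values from the build kit.
IMAGE=${IMAGE:-noble-server-cloudimg-amd64.img}
SIZE=${SIZE:-20G}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

resize_base_image() {
  # 1. Grow the virtual disk itself.
  run qemu-img resize "$IMAGE" "$SIZE"
  # 2. Grow partition 1 (rootfs) in place, then the filesystem on it.
  #    Partition numbers stay untouched, unlike virt-resize --expand.
  run virt-customize -a "$IMAGE" \
    --run-command 'growpart /dev/sda 1' \
    --run-command 'resize2fs /dev/sda1'
}

resize_base_image
```

Set `DRY_RUN=0` (as root, on a host with libguestfs-tools) to actually apply the resize. Growing partition 1 in place keeps the sda1 = rootfs / sda16 = /boot layout that the build flow relies on.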
A freshly built image (from the truly-official Ubuntu cloud image) could not be reached over SSH, while images built from the legacy temp-fixed.qcow2 base could. Root cause: temp-fixed shipped google-guest-agent, which injects the instance SSH public key from the metadata server (via 169.254.169.254, no DNS) into ~ubuntu/.ssh/authorized_keys; the official image does not. The dev variant (HARDEN=0) now reinstalls google-guest-agent and adds the 169.254.169.254 metadata.google.internal mapping to /etc/hosts (needed because the build pins resolv.conf to public DNS, so the agent cannot resolve the metadata hostname otherwise). The hardened variant intentionally omits it — google-guest-agent is a back-door-class component and the hardened image is not SSH-reachable by design. Also document in §0 that temp-fixed.qcow2 deviates from the official image (20 GiB resize + google-guest-agent). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
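The /etc/hosts mapping for the dev variant can be added idempotently with a few lines of shell. This is a minimal sketch, not the code in build-gcp-tapp.sh: `add_metadata_host` and the `ROOTFS` mount point are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch: idempotently add the metadata-server mapping to a hosts file.
add_metadata_host() {
  local hosts_file=$1
  local entry='169.254.169.254 metadata.google.internal'
  # Append only if no line already mentions the metadata hostname.
  if ! grep -qE '(^|[[:space:]])metadata\.google\.internal([[:space:]]|$)' \
      "$hosts_file" 2>/dev/null; then
    printf '%s\n' "$entry" >> "$hosts_file"
  fi
}

# Example (ROOTFS is a hypothetical mount point of the image being built):
#   add_metadata_host "${ROOTFS:-/mnt/rootfs}/etc/hosts"
```

With this entry in place the guest agent reaches the metadata server by name even though resolv.conf points at public DNS.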
The three build scripts (build-gcp-tapp.sh, prepare-gcp-tapp.sh, fix-esp-grub.sh) still had Chinese comments and echo strings. Translate them all to English; no logic changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ocs)

- .gitmodules: SSH → HTTPS url so clone/submodule update works without an SSH key (review blocker #1).
- register-shared-as.sh: guard against placeholder (_TODO) reference files; trap-clean the temp file on any exit; and detect a no-op injection by checking leftover qrv() on ref_* lines only (the qrv(key) helper definition legitimately keeps its own; checking the whole policy would false-positive). (review #2, #5, #6)
- policy.rego: document the AR4SI/EAR claim tiers and the non-affirming defaults (executables 33 / hardware 97 / configuration 36). (review #3)
- docs: fix verify.rs path to tapp-common/src/verify.rs. (review #4)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0g-tapp had no quick local way to check that a converted GCP confidential image actually boots before uploading it to GCP. Add boot-smoke-test.sh, which boots the image under QEMU/OVMF (UEFI) in the qemux/qemu container (KVM if available, else TCG) and scans the serial console for the full boot chain: grub -> gcp kernel -> cryptpilot-fde (dm-verity + zram + dm-snapshot) -> /sysroot mount -> switch-root -> multi-user / tapp-server.service. Validates everything except the TDX-only bits (RTMR extend, remote attestation), which require real hardware; it is a pre-flight check, not a replacement for on-hardware testing. Documented in the gcp-cvm README. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
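The core of such a serial-console scan is an ordered marker check. The sketch below shows the idea only; it is not taken from boot-smoke-test.sh, `check_boot_chain` is a hypothetical name, and the marker strings in the usage example are placeholders for whatever the real console output contains.

```bash
#!/usr/bin/env bash
# Sketch: verify that boot-stage markers appear, in order, in a serial log.
# Usage: check_boot_chain <logfile> <marker>...
check_boot_chain() {
  local log=$1; shift
  local last=0 marker line
  for marker in "$@"; do
    # First line after the previous match that contains this marker.
    line=$(awk -v start="$last" -v m="$marker" \
      'NR > start && index($0, m) { print NR; exit }' "$log")
    if [ -z "$line" ]; then
      echo "FAIL: '$marker' missing or out of order" >&2
      return 1
    fi
    last=$line
  done
  echo "PASS: all $# stages seen in order"
}

# Example with placeholder markers:
#   check_boot_chain serial.log "GRUB" "Linux version" "verity" \
#     "switch-root" "tapp-server.service"
```

Requiring each marker on a later line than the previous one catches a boot that reaches a late stage through an unexpected path, not just a boot that stops early.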
…m build scripts The local boot smoke test is a verification tool, not part of the build pipeline; keep it under gcp-cvm/test/ rather than alongside the build scripts. Update the README references accordingly. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ixed base

Remove deployment-specific values that were baked in as defaults:

- OWNER_ADDRESS (a concrete wallet address) and KBS_URLS (concrete KBS node IPs) are now REQUIRED; the build aborts if unset, so no specific deployment value is ever committed or silently baked into config.toml.
- §0 of the doc no longer points at the opaque, non-reproducible temp-fixed.qcow2 download; the base is built reproducibly from the official Ubuntu cloud image (resize + gVNIC). The SSH/google-guest-agent note is reframed around the dev vs hardened build instead of temp-fixed.

README documents the now-required variables.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
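An "abort if unset" guard of this kind is usually a few lines at the top of the script. The following is a generic sketch, not the actual check in build-gcp-tapp.sh; `require_vars` is a hypothetical helper, and it treats an empty value the same as an unset one.

```bash
#!/usr/bin/env bash
# Sketch: fail early when deployment-specific variables are not provided.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name-} is bash indirect expansion; empty counts as unset here.
    if [ -z "${!name-}" ]; then
      echo "ERROR: $name must be set (no default is provided)" >&2
      missing=1
    fi
  done
  return "$missing"
}

# In a build script this would typically run before any work starts:
#   require_vars OWNER_ADDRESS KBS_URLS || exit 1
```

Reporting every missing variable before exiting saves the caller from fixing them one failed run at a time.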
…cible by others

A third party could not follow the docs end-to-end: the conversion host requirement was implicit and the package sources were missing. Add an explicit Prerequisites section (README + doc §0.1):

- conversion host must be Anolis / Alibaba Cloud Linux 3 (al8); cryptpilot-convert is not packaged for Ubuntu hosts
- install cryptpilot-convert via the al8 RPM from openanolis/cryptpilot v0.7.0; the target-image runtime is the matching .deb from the same release
- host tooling: libguestfs-tools, qemu-img, nbd module, LIBGUESTFS_BACKEND=direct, root; docker only for the smoke test

Also make the §0.2 resize step a copy-pasteable virt-customize command.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Read through all three scripts + Worth confirming before treating any output as a production image
Non-blocking / follow-up
Naming nit (in favor):
…#21, Phase 1) Add ENABLE_SYSBOX=1 to build-gcp-tapp.sh: installs sysbox-ce and registers sysbox-runc as a dockerd runtime so in-container root is user-namespace remapped (a sandbox kernel CVE is no longer host-equivalent). Because the cryptpilot rootfs writable overlay is RAM-backed (zram) and ephemeral, docker/sysbox data must not live on it: the build pins docker data-root to /data/docker and adds a docker.service RequiresMountsFor=/data drop-in + an fstab LABEL=tapp-data entry, so docker fails loud if the persistent /data disk is not mounted (never silently writes to the RAM root). Phase 1 is isolation only; /data confidentiality (KBS-bound dm-crypt) is deferred to Phase 2. Default build is unchanged (ENABLE_SYSBOX defaults to 0). boot-smoke-test.sh gains an optional CHECK_SYSBOX=1 static image check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
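The three pieces of configuration named above (docker data-root, the docker.service drop-in, and the fstab entry) could be staged into a rootfs roughly as follows. This is a sketch under assumptions rather than the code in build-gcp-tapp.sh: `stage_docker_data_config`, the drop-in file name, the ext4 filesystem type, and the `/usr/bin/sysbox-runc` path are illustrative choices.

```bash
#!/usr/bin/env bash
# Sketch: stage docker/sysbox configuration under a rootfs directory.
# Usage: stage_docker_data_config <rootfs-dir>
stage_docker_data_config() {
  local root=$1
  mkdir -p "$root/etc/docker" "$root/etc/systemd/system/docker.service.d"

  # Keep image layers and container data off the RAM-backed root overlay.
  cat > "$root/etc/docker/daemon.json" <<'EOF'
{
  "data-root": "/data/docker",
  "runtimes": {
    "sysbox-runc": { "path": "/usr/bin/sysbox-runc" }
  }
}
EOF

  # docker.service fails to start unless /data is mounted.
  cat > "$root/etc/systemd/system/docker.service.d/10-require-data.conf" <<'EOF'
[Unit]
RequiresMountsFor=/data
EOF

  # Mount the persistent disk by label (filesystem type is an assumption).
  printf 'LABEL=tapp-data /data ext4 defaults 0 2\n' >> "$root/etc/fstab"
}
```

`RequiresMountsFor=/data` is what makes the failure loud: if the labeled disk is absent, systemd cannot satisfy the mount dependency and docker does not start, so nothing is written to the zram-backed root.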
Build a hardened, measured, remotely-attestable GCP (Intel TDX) confidential image for 0g-tapp from a stock Ubuntu 24.04 cloud image, using cryptpilot for measured FDE.
What's here
- `gcp-cvm/build-gcp-tapp.sh`: one-command pipeline: base provisioning (tapp-server + Docker + Intel SGX/libtdx-attest + config + DNS) → kernel swap (generic → gcp, for RTMR extend) → `cryptpilot-convert` → ESP grub sync → optional security hardening. Toggle with `HARDEN=1|0`.
- `gcp-cvm/prepare-gcp-tapp.sh` / `gcp-cvm/fix-esp-grub.sh`: stage-B-only (kernel + convert + ESP) and ESP-only helpers, reused by the main pipeline.
- `gcp-cvm/cryptpilot-gcp-boot-fix.md`: root-cause analysis + full reproducible SOP + integrity notes, covering the four problems we hit (boot crash from GCP's dual grub.cfg, read-only rootfs / RTMR-not-extended, runtime RTMR extend failure, broken DNS) and the convert-side issues to confirm with the cryptpilot maintainers.
- `gcp-cvm/README.md`, `gcp-cvm/config_dir/fde.toml`, root `README.md` pointer, `.gitignore` for `*.deb` / `*.qcow2`.

Two image variants (same base, same pipeline)
The variants are hardened (`HARDEN=1`) and dev (`HARDEN=0`); they differ in `google-guest-agent` (GCP SSH key injection).

`google-guest-agent` is a back-door-class component (it can push changes into the instance from outside the measured app), so the hardened image is intentionally not SSH-reachable; the dev image reinstalls it (plus a `169.254.169.254 metadata.google.internal` entry in `/etc/hosts`, needed because the build pins resolv.conf to public DNS) to restore SSH.

§0 reproducibility note
The regression was run from the truly-official Ubuntu cloud image, which revealed that the previously-used `temp-fixed.qcow2` base was not a pristine "official + gVNIC" image: it also carried (1) the root partition resized to 20 GiB and (2) `google-guest-agent`. Both are now documented in §0 and handled by the build, so the image is reproducible from the official image alone.

Companion fix
The cryptpilot `show-reference-value` fix for never-booted images (empty grubenv / missing `saved_entry`) is openanolis/cryptpilot#128.

Binaries are gitignored; tapp-server is pulled from release v0.1.0.
🤖 Generated with Claude Code