Skip to content

feat: append builder info to detect builds made with uncommitted changes in the builder#666

Draft
sayakpaul wants to merge 7 commits into
mainfrom
dirty
Draft

feat: append builder info to detect builds made with uncommitted changes in the builder#666
sayakpaul wants to merge 7 commits into
mainfrom
dirty

Conversation

@sayakpaul

Copy link
Copy Markdown
Member

Fixes #648

Considerations:

  • The Nix sandbox has no .git, so this information has to be passed in explicitly. So, builder-info related stuff become command-line arguments embedded in the derivation's build script at eval time. Is there a better way to handle it?
  • nix build / nix flake show writes an untracked flake.lock, which makes the kernel tree dirty aka kernel.dirty=true unless the lock is committed. I have made a note about it in the docs. Is there a better way to handle it?

Example metadata.json with builder-info:

{
    "name": "relu",
    "id": "_relu_cuda_9f8a0ff",
    "version": 1,
    "license": "Apache-2.0",
    "source": "https://github.com/myorg/relu",
    "python-depends": [],
    "backend": {
      "type": "cuda",
      "archs": ["7.0", "7.5", "8.0", "8.6", "8.9", "9.0+PTX"]
    },
    "digest": {
      "algorithm": "sha256",
      "files": {
        "__init__.py": "xLMbARTcTl8L/m1kJLc/h/QL4Kzt772F872a46pfRGI=",
        "_relu_cuda_9f8a0ff.abi3.so": "vtdzzToloH38HZkVs7sFEf69QFDxROuPsBAond3Jic0=",
        "_ops.py": "Hrp5aF4o0eHSttw4sQGsbBAXFqvLJ42Y9YJ2KkqvZhg=",
        "relu/__init__.py": "DFYPlrhXwYjEqCl/8n0SmWGZV8NFml5DPhMjKfv98GY="
      }
    },
    "build-info": {
      "kernel-builder": {
        "version": "0.16.0-dev0",
        "sha": "03795978d693a6e0872f6a36187de55888744cfd",
        "dirty": false
      },
      "kernel": {
        "sha": "9f8a0ff1dcb2e1d34d46950eadf0714050f9af04",
        "dirty": false
      }
    }
  }

We throw a warning during upload to the Hub and also show a warning on the Hub card that the corresponding builds were made with a "dirty" version. Kernel loading also logs similar warnings.

Since the real changes affect the above three surfaces, the total changes are a bit bigger than I expected (including all the docs and tests). I can break the PR down further:

  • Add builder info in the kernel-builder CLI
  • kernels related changes
  • Append card-related changes

@sayakpaul sayakpaul marked this pull request as draft June 19, 2026 07:33
@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Adds the --kernel-sha/--kernel-dirty and --kernel-builder-sha/
--kernel-builder-dirty flags to the generated CLI reference, fixing the
check-cli-docs CI job.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixes the `nix fmt --ci` formatting check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@sayakpaul sayakpaul marked this pull request as ready for review June 19, 2026 09:19
Comment on lines +49 to +50
git_info: Option<GitInfo>,
kernel_builder: Option<KernelBuilderInfo>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't overload this struct, it's purely for uniquely identifying a kernel (e.g. for ops name generation). Let's keep it small and atomic.

Comment on lines +36 to +41
// Prefer externally-provided `kernel-builder` provenance (e.g. from Nix),
// falling back to the provenance baked in at compile time.
let kernel_builder = kernel_id
.kernel_builder()
.cloned()
.unwrap_or_else(kernel_builder_info);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems the wrong place to do this. The builder's hash/version should be burned into the builder, so we should adjust the derivation to give the build access to it.

Comment on lines +49 to +50
kernel_sha: Option<String>,
kernel_dirty: bool,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can lead to incorrect metadata in case someone generates pyproject.toml and then adds more commits. So doing this from CMake (and then giving the kernel build sandbox access to git files) seems like the way to do it. For the Nix build (since we probably do not want to copy all the git objects to the sandbox), we could pass the SHA to CMake. Not sure how important this is though.

This is currently also an issue for the sha hash we use in the ops name. We are generating it when making the project files, but it should probably be done inside CMake (but would better be a separate PR to avoid PRs becoming too big).

build_dir.display()
);

let dirty_build = detect_dirty_build(&variants);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split the dirty warnings into a separate PR? I think it's best to iterate on the metadata first as a smaller unit.

Comment thread kernel-builder/build.rs
emit_git_info();
}

fn emit_git_info() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth using built over our own custom code:

https://crates.io/crates/built

Also uses git2 rather than shelling out, which is generally more reliable than shelling out (e.g. git may not be available in a build sandbox).

Also seems to be used by at least one very high-profile crate (rav1e).


# Extra `create-pyproject` flags recording git provenance (commit SHA and
# dirty state) of the kernel source and `kernel-builder`.
provenanceArgs ? "",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nicer if the args are just the atomic data, not flags.

Comment thread nix-builder/lib/build.nix
rev,
doGetKernelCheck,
stripRPath ? false,
provenanceArgs ? "",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why set defaults? Makes it easy to miss.


# Git provenance of the `kernel-builder` itself (the flake revision it was
# evaluated from), recorded in the build metadata. `null`/`false` when
# `kernel-builder` is used from a non-git source (e.g. a local `path:`).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A local path can also be a git source. I think in flakes it must even be.

Comment on lines +17 to +18
builderRev ? null,
builderDirty ? false,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it kernel-overridable, I don't think we want to expose this in the public API. We should probably let the function take two attrsets. Also, I think we can pass through the builder flake and get it from there, rather than separate attributes. Finding these in flake.nix seems like the wrong place.

Comment on lines +49 to +69
provenanceArgs =
let
kernelSha =
if self == null then
null
else if self ? rev then
self.rev
else if self ? dirtyRev then
lib.removeSuffix "-dirty" self.dirtyRev
else
null;
kernelDirty = self != null && !(self ? rev);
builderSha = if builderRev == null then null else lib.removeSuffix "-dirty" builderRev;
parts =
lib.optional (kernelSha != null) "--kernel-sha ${kernelSha}"
++ lib.optional (kernelSha != null && kernelDirty) "--kernel-dirty"
++ lib.optional (builderSha != null) "--kernel-builder-sha ${builderSha}"
++ lib.optional (builderSha != null && builderDirty) "--kernel-builder-dirty";
in
lib.concatStringsSep " " parts;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above in arch.nix, this is a very local concern, doing this gen-flake-outputs.nix seems like the wrong level of abstraction.

@sayakpaul

Copy link
Copy Markdown
Member Author

Will break the PR into multiple parts, starting with appending the builder info, as discussed.

Will wait for the metadata versioning PR, first before continuing. This so that we don't have to make the provenance data optional,

@sayakpaul sayakpaul marked this pull request as draft June 19, 2026 12:29
@github-actions

Copy link
Copy Markdown

Coverage report — kernels/

Measured on: Python 3.10 / Torch 2.12.0.
Other CI configurations are not included in this number.
Hardware-gated code paths (ROCm/XPU/NPU/Darwin/Windows) are excluded or unreachable on the Linux+CUDA runner.

Total coverage: 84.5% — threshold: 80% — ✅

Per-file breakdown
Name Stmts Miss Cover Missing
src/kernels/__init__.py 10 0 100%
src/kernels/_system.py 6 1 83% 10
src/kernels/_versions.py 63 7 89% 46, 49, 52-53, 56-57, 100
src/kernels/backends.py 194 55 72% 40, 44, 48-51, 68, 90, 108, 117, 121, 125-127, 148, 170, 181, 188-191, 201, 205-225, 233, 256-276
src/kernels/compat.py 8 1 88% 5
src/kernels/deps.py 54 4 93% 58-59, 95, 98
src/kernels/layer/__init__.py 6 0 100%
src/kernels/layer/_interval_tree.py 103 4 96% 23, 52, 147, 150
src/kernels/layer/device.py 48 14 71% 42, 47-49, 91, 96-98, 101, 149, 152, 155-157
src/kernels/layer/func.py 93 7 92% 72, 100, 154, 257, 263, 272, 290
src/kernels/layer/globals.py 5 0 100%
src/kernels/layer/kernelize.py 73 8 89% 255, 273, 281-282, 288, 292, 308-310
src/kernels/layer/layer.py 174 15 91% 166, 209, 215, 228, 320-321, 333, 342, 350, 361, 390, 394, 407, 460, 490
src/kernels/layer/mode.py 14 0 100%
src/kernels/layer/repos.py 130 34 74% 27, 33, 36-41, 61-62, 68, 71-74, 88, 92, 101-102, 108, 111-114, 121-122, 128, 131-134, 141-142, 148, 151-154, 235
src/kernels/lockfile.py 71 46 35% 37-104, 108-131
src/kernels/status.py 49 2 96% 23, 81
src/kernels/utils.py 315 55 83% 65, 77-81, 87-88, 242, 246, 249, 311, 319, 358-359, 397, 428, 433, 468, 697, 700, 702, 708, 721-722, 743-755, 759-766, 774, 778-788, 792-799, 837, 841, 860, 862
src/kernels/variants.py 262 19 93% 56, 87, 108, 138, 247-248, 289, 291, 371-378, 384-390, 421-427, 439-445, 534-536
src/kernels/verify.py 88 1 99% 32
TOTAL 1766 273 85%

Updated by the Test kernels workflow on commit 2f1855613fb3b7f93c9a95b08f568369622d6266.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Signal when a kernel build comes from an uncommitted (dirty) kernel-builder version

3 participants