Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Adds the --kernel-sha/--kernel-dirty and --kernel-builder-sha/ --kernel-builder-dirty flags to the generated CLI reference, fixing the check-cli-docs CI job. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixes the `nix fmt --ci` formatting check. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| git_info: Option<GitInfo>, | ||
| kernel_builder: Option<KernelBuilderInfo>, |
There was a problem hiding this comment.
We shouldn't overload this struct, it's purely for uniquely identifying a kernel (e.g. for ops name generation). Let's keep it small and atomic.
| // Prefer externally-provided `kernel-builder` provenance (e.g. from Nix), | ||
| // falling back to the provenance baked in at compile time. | ||
| let kernel_builder = kernel_id | ||
| .kernel_builder() | ||
| .cloned() | ||
| .unwrap_or_else(kernel_builder_info); |
There was a problem hiding this comment.
This seems the wrong place to do this. The builder's hash/version should be burned into the builder, so we should adjust the derivation to give the build access to it.
| kernel_sha: Option<String>, | ||
| kernel_dirty: bool, |
There was a problem hiding this comment.
This can lead to incorrect metadata in case someone generates pyproject.toml and then adds more commits. So doing this from CMake (and then giving the kernel build sandbox access to git files) seems like the way to do it. For the Nix build (since we probably do not want to copy all the git objects to the sandbox), we could pass the SHA to CMake. Not sure how important this is though.
This is currently also an issue for the sha hash we use in the ops name. We are generating it when making the project files, but it should probably be done inside CMake (but would better be a separate PR to avoid PRs becoming too big).
| build_dir.display() | ||
| ); | ||
|
|
||
| let dirty_build = detect_dirty_build(&variants); |
There was a problem hiding this comment.
Can we split the dirty warnings into a separate PR? I think it's best to iterate on the metadata first as a smaller unit.
| emit_git_info(); | ||
| } | ||
|
|
||
| fn emit_git_info() { |
There was a problem hiding this comment.
Might be worth using built over our own custom code:
https://crates.io/crates/built
Also uses git2 rather than shelling out, which is generally more reliable than shelling out (e.g. git may not be available in a build sandbox).
Also seems to be used by at least one very high-profile crate (rav1e).
|
|
||
| # Extra `create-pyproject` flags recording git provenance (commit SHA and | ||
| # dirty state) of the kernel source and `kernel-builder`. | ||
| provenanceArgs ? "", |
There was a problem hiding this comment.
I think it's nicer if the args are just the atomic data, not flags.
| rev, | ||
| doGetKernelCheck, | ||
| stripRPath ? false, | ||
| provenanceArgs ? "", |
There was a problem hiding this comment.
Why set defaults? Makes it easy to miss.
|
|
||
| # Git provenance of the `kernel-builder` itself (the flake revision it was | ||
| # evaluated from), recorded in the build metadata. `null`/`false` when | ||
| # `kernel-builder` is used from a non-git source (e.g. a local `path:`). |
There was a problem hiding this comment.
A local path can also be a git source. I think in flakes it must even be.
| builderRev ? null, | ||
| builderDirty ? false, |
There was a problem hiding this comment.
This makes it kernel-overridable, I don't think we want to expose this in the public API. We should probably let the function take two attrsets. Also, I think we can pass through the builder flake and get it from there, rather than separate attributes. Finding these in flake.nix seems like the wrong place.
| provenanceArgs = | ||
| let | ||
| kernelSha = | ||
| if self == null then | ||
| null | ||
| else if self ? rev then | ||
| self.rev | ||
| else if self ? dirtyRev then | ||
| lib.removeSuffix "-dirty" self.dirtyRev | ||
| else | ||
| null; | ||
| kernelDirty = self != null && !(self ? rev); | ||
| builderSha = if builderRev == null then null else lib.removeSuffix "-dirty" builderRev; | ||
| parts = | ||
| lib.optional (kernelSha != null) "--kernel-sha ${kernelSha}" | ||
| ++ lib.optional (kernelSha != null && kernelDirty) "--kernel-dirty" | ||
| ++ lib.optional (builderSha != null) "--kernel-builder-sha ${builderSha}" | ||
| ++ lib.optional (builderSha != null && builderDirty) "--kernel-builder-dirty"; | ||
| in | ||
| lib.concatStringsSep " " parts; | ||
|
|
There was a problem hiding this comment.
As mentioned above in arch.nix, this is a very local concern, doing this gen-flake-outputs.nix seems like the wrong level of abstraction.
|
Will break the PR into multiple parts, starting with appending the builder info, as discussed. Will wait for the metadata versioning PR, first before continuing. This so that we don't have to make the provenance data optional, |
Coverage report —
|
| Name | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| src/kernels/__init__.py | 10 | 0 | 100% | |
| src/kernels/_system.py | 6 | 1 | 83% | 10 |
| src/kernels/_versions.py | 63 | 7 | 89% | 46, 49, 52-53, 56-57, 100 |
| src/kernels/backends.py | 194 | 55 | 72% | 40, 44, 48-51, 68, 90, 108, 117, 121, 125-127, 148, 170, 181, 188-191, 201, 205-225, 233, 256-276 |
| src/kernels/compat.py | 8 | 1 | 88% | 5 |
| src/kernels/deps.py | 54 | 4 | 93% | 58-59, 95, 98 |
| src/kernels/layer/__init__.py | 6 | 0 | 100% | |
| src/kernels/layer/_interval_tree.py | 103 | 4 | 96% | 23, 52, 147, 150 |
| src/kernels/layer/device.py | 48 | 14 | 71% | 42, 47-49, 91, 96-98, 101, 149, 152, 155-157 |
| src/kernels/layer/func.py | 93 | 7 | 92% | 72, 100, 154, 257, 263, 272, 290 |
| src/kernels/layer/globals.py | 5 | 0 | 100% | |
| src/kernels/layer/kernelize.py | 73 | 8 | 89% | 255, 273, 281-282, 288, 292, 308-310 |
| src/kernels/layer/layer.py | 174 | 15 | 91% | 166, 209, 215, 228, 320-321, 333, 342, 350, 361, 390, 394, 407, 460, 490 |
| src/kernels/layer/mode.py | 14 | 0 | 100% | |
| src/kernels/layer/repos.py | 130 | 34 | 74% | 27, 33, 36-41, 61-62, 68, 71-74, 88, 92, 101-102, 108, 111-114, 121-122, 128, 131-134, 141-142, 148, 151-154, 235 |
| src/kernels/lockfile.py | 71 | 46 | 35% | 37-104, 108-131 |
| src/kernels/status.py | 49 | 2 | 96% | 23, 81 |
| src/kernels/utils.py | 315 | 55 | 83% | 65, 77-81, 87-88, 242, 246, 249, 311, 319, 358-359, 397, 428, 433, 468, 697, 700, 702, 708, 721-722, 743-755, 759-766, 774, 778-788, 792-799, 837, 841, 860, 862 |
| src/kernels/variants.py | 262 | 19 | 93% | 56, 87, 108, 138, 247-248, 289, 291, 371-378, 384-390, 421-427, 439-445, 534-536 |
| src/kernels/verify.py | 88 | 1 | 99% | 32 |
| TOTAL | 1766 | 273 | 85% |
Updated by the Test kernels workflow on commit 2f1855613fb3b7f93c9a95b08f568369622d6266.
Fixes #648
Considerations:
.git, so this information has to be passed in explicitly. So, builder-info related stuff become command-line arguments embedded in the derivation's build script at eval time. Is there a better way to handle it?nix build/nix flake showwrites an untrackedflake.lock, which makes the kernel tree dirty akakernel.dirty=trueunless the lock is committed. I have made a note about it in the docs. Is there a better way to handle it?Example
metadata.jsonwith builder-info:{ "name": "relu", "id": "_relu_cuda_9f8a0ff", "version": 1, "license": "Apache-2.0", "source": "https://github.com/myorg/relu", "python-depends": [], "backend": { "type": "cuda", "archs": ["7.0", "7.5", "8.0", "8.6", "8.9", "9.0+PTX"] }, "digest": { "algorithm": "sha256", "files": { "__init__.py": "xLMbARTcTl8L/m1kJLc/h/QL4Kzt772F872a46pfRGI=", "_relu_cuda_9f8a0ff.abi3.so": "vtdzzToloH38HZkVs7sFEf69QFDxROuPsBAond3Jic0=", "_ops.py": "Hrp5aF4o0eHSttw4sQGsbBAXFqvLJ42Y9YJ2KkqvZhg=", "relu/__init__.py": "DFYPlrhXwYjEqCl/8n0SmWGZV8NFml5DPhMjKfv98GY=" } }, "build-info": { "kernel-builder": { "version": "0.16.0-dev0", "sha": "03795978d693a6e0872f6a36187de55888744cfd", "dirty": false }, "kernel": { "sha": "9f8a0ff1dcb2e1d34d46950eadf0714050f9af04", "dirty": false } } }We throw a warning during upload to the Hub and also show a warning on the Hub card that the corresponding builds were made with a "dirty" version. Kernel loading also logs similar warnings.
Since the real changes affect the above three surfaces, the total changes are a bit bigger than I expected (including all the docs and tests). I can break the PR down further:
kernel-builderCLIkernelsrelated changes