[fix(docker)]: build fine-tune env from source instead of the unpublished release wheel by devangpratap · Pull Request #2028 · kvcache-ai/ktransformers

devangpratap · 2026-05-31T00:04:29Z

What does this PR do?

The docker/Dockerfile sft build downloads a prebuilt ktransformers wheel:

https://github.com/kvcache-ai/ktransformers/releases/download/v0.5.3/ktransformers-0.5.3+cu128torch28fancy-cp312-cp312-linux_x86_64.whl

No ktransformers release (v0.5.0 through v0.6.2) publishes wheel assets, so that URL returns 404 and curl -fsSL aborts the build.
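As a rough illustration of the failure, the sketch below rebuilds the asset URL from its parts and fetches it the way the old step did. The helper names `wheel_url` and `wheel_exists` are hypothetical and are not part of the Dockerfile. With `-f`, curl turns an HTTP error such as 404 into a non-zero exit status (22), which is what stops a `RUN` step.

```shell
# Hypothetical helper: rebuilds the release-asset URL the old sft build fetched.
# Arguments: version, local version tag, CPython tag.
wheel_url() {
    version="$1"; local_tag="$2"; py_tag="$3"
    printf 'https://github.com/kvcache-ai/ktransformers/releases/download/v%s/ktransformers-%s+%s-%s-%s-linux_x86_64.whl\n' \
        "$version" "$version" "$local_tag" "$py_tag" "$py_tag"
}

# Hypothetical helper: fetches the URL and discards the body.
# -f makes curl exit non-zero (code 22) on HTTP errors such as 404.
wheel_exists() {
    curl -fsSL -o /dev/null "$1"
}
```

Calling `wheel_exists "$(wheel_url 0.5.3 cu128torch28fancy cp312)"` needs network access and would be expected to fail for as long as the release carries no wheel asset.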

On current main the top-level ktransformers package is a thin metapackage (kt-kernel plus [sft] extras), and the repo is already cloned into the image. This installs it from that source instead of the missing wheel, and aligns the fine-tune env with the dependencies current kt-kernel pins.

Changes

Remove the KTRANSFORMERS_VERSION / KTRANSFORMERS_WHEEL ARGs and the dead wheel download.
Build the local kt-kernel from source in the fine-tune env (same as the serve env), then install ktransformers[sft] from the cloned /workspace/ktransformers. This uses the checked-out C++ kernels and avoids pip falling back to a PyPI kt-kernel that may not exist for an unreleased version.
Bump the fine-tune env to torch==2.9.1, which is what kt-kernel (pulled in by [sft]) pins.
Switch FLASH_ATTN_WHEEL to the matching torch2.9 build from the same v2.8.3 release.
Add editables to the build tools, required by the editable --no-build-isolation install of LLaMA-Factory.
Drop the removed wheel from the cleanup step.
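The resulting install order for the fine-tune env can be sketched as a dry-runnable script. This is an approximation under stated assumptions, not the Dockerfile itself: `LLAMA_FACTORY_DIR` and the default `FLASH_ATTN_WHEEL` value are placeholders, mirror and index options are omitted, and `install.sh` is assumed to run with the fine-tune env active.

```shell
# Sketch of the fine-tune install order (sft build only).
# Assumptions: LLAMA_FACTORY_DIR and FLASH_ATTN_WHEEL are placeholders, and
# install.sh is expected to be run with the fine-tune conda env activated.
PIP=/opt/miniconda3/envs/fine-tune/bin/pip
KT_SRC=/workspace/ktransformers
LLAMA_FACTORY_DIR="${LLAMA_FACTORY_DIR:-/workspace/LLaMA-Factory}"
FLASH_ATTN_WHEEL="${FLASH_ATTN_WHEEL:-flash_attn-torch2.9-placeholder.whl}"

# With DRY_RUN=1 each step is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_fine_tune_env() {
    run "$PIP" install torch==2.9.1 &&
    run "$PIP" install editables &&
    run "$PIP" install --no-build-isolation -e "${LLAMA_FACTORY_DIR}[torch,metrics]" &&
    run cd "$KT_SRC/kt-kernel" &&
    run env CPUINFER_BUILD_ALL_VARIANTS=1 ./install.sh build &&
    run "$PIP" install "${KT_SRC}[sft]" &&
    run "$PIP" install "$FLASH_ATTN_WHEEL"
}
```

Running `DRY_RUN=1; install_fine_tune_env` prints the seven steps without touching the environment. The point of the ordering is that kt-kernel is built from the checked-out source before `ktransformers[sft]` is resolved.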

What was tested

Replicated the fine-tune env dependency resolution in a nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04 container (CUDA 12.8, Python 3.12), running the same commands in the same order as the Dockerfile fine-tune stage:

torch==2.9.1 (cu128), then LLaMA-Factory[torch,metrics], then ktransformers[sft], then the flash_attn torch2.9 wheel.
All install with exit code 0. torch stays at 2.9.1+cu128 after every step (no silent up/downgrade).
kt_kernel, llamafactory, transformers (transformers-kt 5.6.0), and accelerate (accelerate-kt 1.14.0) import.
pip check reports no broken requirements.
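The checks listed above could be scripted roughly as follows. This is a minimal sketch, not the commands actually used in the test run: the function names are hypothetical, and the interpreter path is an assumption derived from the fine-tune env's `pip` path (override it with `PY`).

```shell
# Returns 0 when the reported torch version is still the 2.9.1 pin,
# with or without a local build suffix such as +cu128.
torch_is_pinned() {
    case "$1" in
        2.9.1|2.9.1+*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical check to run after each install step.
# Assumption: the env's interpreter sits next to its pip.
check_fine_tune_env() {
    py="${PY:-/opt/miniconda3/envs/fine-tune/bin/python}"
    version="$("$py" -c 'import torch; print(torch.__version__)')" || return 1
    torch_is_pinned "$version" || { echo "torch drifted to $version" >&2; return 1; }
    "$py" -c 'import kt_kernel, llamafactory, transformers, accelerate' || return 1
    "$py" -m pip check
}
```

Calling `check_fine_tune_env` after every install step would surface a silent torch up/downgrade at the step that caused it rather than at the end of the build.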

This confirms the torch/flash_attn/editables alignment and that ktransformers[sft] plus LLaMA-Factory coexist under torch 2.9.1. In that run kt-kernel came from PyPI as a stand-in for dependency resolution.

What was not tested

A full docker build of this Dockerfile was not run. The base image (docker.1ms.run) and the tsinghua conda/pip mirrors are not reachable from the test machine, and the single framework stage also compiles DeepEP, kt-kernel (all CPU variants), and sglang for the serve env, which this PR does not touch.
The local kt-kernel source build in the fine-tune env (CPUINFER_BUILD_ALL_VARIANTS=1 ./install.sh build) was not run here. It mirrors the existing serve-env build step exactly, but it is a heavy C++/CUDA compile that was not executed in this verification.
The editables addition depends on the hatchling version resolved at build time. It was required to reproduce a clean editable install here; it is harmless if the mirror-resolved hatchling would not have needed it.

Before submitting

Read the contributor guideline.
Did you write any new necessary tests? No automated test added; a Dockerfile change is verified by reproducing the install sequence as described above.

…shed release wheel

The sft build downloaded a prebuilt ktransformers wheel from a release URL that 404s (no ktransformers release publishes wheel assets), so docker build aborted at the curl step. Install ktransformers[sft] from the already-cloned source instead, and align the fine-tune env with what current kt-kernel pins: torch 2.9.1, the matching flash_attn torch2.9 wheel, and editables for the editable no-build-isolation install of LLaMA-Factory.

gemini-code-assist

Code Review

This pull request updates the Dockerfile to install ktransformers from the cloned source tree instead of downloading a prebuilt wheel, and upgrades PyTorch to version 2.9.1 along with the corresponding flash_attn wheel. A critical issue was identified where installing ktransformers[sft] directly without building the local kt-kernel package first will cause pip to download kt-kernel from PyPI, ignoring local C++ kernel modifications. It is recommended to build and install the local kt-kernel in the fine-tune environment prior to installing ktransformers.

gemini-code-assist · 2026-05-31T00:05:32Z

 RUN --mount=type=cache,target=/root/.cache/pip \
    if [ "$FUNCTIONALITY" = "sft" ]; then \
-        /opt/miniconda3/envs/fine-tune/bin/pip install /workspace/${KTRANSFORMERS_WHEEL}; \
+        /opt/miniconda3/envs/fine-tune/bin/pip install "/workspace/ktransformers[sft]"; \
    fi


Installing ktransformers[sft] directly from /workspace/ktransformers[sft] without first building and installing the local kt-kernel package in the fine-tune environment will cause pip to resolve and download kt-kernel from PyPI (or the configured mirror). This means any local modifications to the C++ kernels in kt-kernel/ will be completely ignored during fine-tuning, and it could also fail if the matching version of kt-kernel is not yet published.

To ensure consistency and that local changes are correctly built and used, we should build and install the local kt-kernel in the fine-tune environment using the same install.sh script, just as is done for the serve environment.

RUN --mount=type=cache,target=/root/.cache/pip \
    if [ "$FUNCTIONALITY" = "sft" ]; then \
        . /opt/miniconda3/etc/profile.d/conda.sh && conda activate fine-tune \
        && cd /workspace/ktransformers/kt-kernel \
        && CPUINFER_BUILD_ALL_VARIANTS=1 ./install.sh build \
        && pip install "/workspace/ktransformers[sft]"; \
    fi

…ng ktransformers

Per review feedback: building kt-kernel from the cloned source (as the serve env already does) means the locally checked-out C++ kernels are used and pip does not fall back to a PyPI kt-kernel, which may not exist for an unreleased version.

devangpratap · 2026-06-05T22:57:21Z

@jdai0 sorry for the ping, this is ready for review, just needs the run-ci label. Thanks

jdai0 · 2026-06-08T07:29:45Z

Thanks, I've added the run-ci label.

gemini-code-assist Bot reviewed May 31, 2026


jdai0 added the run-ci label Jun 8, 2026

devangpratap wants to merge 2 commits into kvcache-ai:main from devangpratap:fix/1762-docker-source-install

devangpratap commented May 31, 2026 • edited by jdai0
