[fix(docker)]: build fine-tune env from source instead of the unpublished release wheel#2028
devangpratap wants to merge 2 commits
…shed release wheel
The sft build downloaded a prebuilt ktransformers wheel from a release URL that returns 404 (no ktransformers release publishes wheel assets), so docker build aborted at the curl step. Install ktransformers[sft] from the already-cloned source instead, align the fine-tune env with what current kt-kernel pins (torch 2.9.1 and the matching flash_attn torch2.9 wheel), and add editables, which the editable --no-build-isolation install of LLaMA-Factory requires.
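The failure mode in this commit message can be illustrated with a small sketch. With -f, curl exits non-zero on an HTTP error such as 404, which is why a RUN step that chains the download aborts the build. The helper name wheel_available, the CURL override, and the placeholder URL are hypothetical and used only for illustration; none of them is part of the Dockerfile.

```shell
#!/bin/sh
# Hypothetical preflight helper: succeeds only if the URL can be fetched.
# CURL is overridable so the logic can be exercised without network access.
CURL="${CURL:-curl}"

wheel_available() {
    # -f: exit non-zero on HTTP errors such as 404 (curl exit code 22)
    # -sS: stay quiet but still print errors; -L: follow redirects
    # -o /dev/null: discard the body, only the exit status matters
    "$CURL" -fsSL -o /dev/null "$1"
}

# Example use (the URL is a placeholder, not a real release asset):
# wheel_available "https://example.invalid/ktransformers.whl" \
#     || echo "wheel not published; install from source instead"
```

A check like this only reports whether the asset exists; the change itself avoids the problem entirely by dropping the download.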
Code Review
This pull request updates the Dockerfile to install ktransformers from the cloned source tree instead of downloading a prebuilt wheel, and upgrades PyTorch to version 2.9.1 along with the corresponding flash_attn wheel. A critical issue was identified where installing ktransformers[sft] directly without building the local kt-kernel package first will cause pip to download kt-kernel from PyPI, ignoring local C++ kernel modifications. It is recommended to build and install the local kt-kernel in the fine-tune environment prior to installing ktransformers.
  RUN --mount=type=cache,target=/root/.cache/pip \
      if [ "$FUNCTIONALITY" = "sft" ]; then \
-         /opt/miniconda3/envs/fine-tune/bin/pip install /workspace/${KTRANSFORMERS_WHEEL}; \
+         /opt/miniconda3/envs/fine-tune/bin/pip install "/workspace/ktransformers[sft]"; \
      fi
Installing ktransformers[sft] directly from /workspace/ktransformers[sft] without first building and installing the local kt-kernel package in the fine-tune environment will cause pip to resolve and download kt-kernel from PyPI (or the configured mirror). This means any local modifications to the C++ kernels in kt-kernel/ will be completely ignored during fine-tuning, and it could also fail if the matching version of kt-kernel is not yet published.
To ensure consistency and that local changes are correctly built and used, we should build and install the local kt-kernel in the fine-tune environment using the same install.sh script, just as is done for the serve environment.
RUN --mount=type=cache,target=/root/.cache/pip \
    if [ "$FUNCTIONALITY" = "sft" ]; then \
        . /opt/miniconda3/etc/profile.d/conda.sh && conda activate fine-tune \
        && cd /workspace/ktransformers/kt-kernel \
        && CPUINFER_BUILD_ALL_VARIANTS=1 ./install.sh build \
        && pip install "/workspace/ktransformers[sft]"; \
    fi
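A cheap guard could make the ordering requirement explicit by refusing to install ktransformers[sft] unless kt-kernel is already present in the environment. This is a minimal sketch and not part of the suggested change: the function name require_installed and the PIP override are hypothetical, and it relies only on pip show exiting non-zero when a package is not installed.

```shell
#!/bin/sh
# Hypothetical guard: fail early if a package is missing from the env.
# PIP is overridable (e.g. /opt/miniconda3/envs/fine-tune/bin/pip).
PIP="${PIP:-pip}"

require_installed() {
    # `pip show` exits non-zero when the named package is not installed.
    if "$PIP" show "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: $1 is not installed; build it from source first" >&2
    return 1
}

# Example use before the metapackage install:
# require_installed kt-kernel && "$PIP" install "/workspace/ktransformers[sft]"
```

The guard checks presence only, not provenance, so it cannot tell a locally built kt-kernel from one that pip fetched from PyPI earlier.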
…ng ktransformers
Per review feedback: building kt-kernel from the cloned source (as the serve env already does) means the locally checked-out C++ kernels are used and pip does not fall back to a PyPI kt-kernel, which may not exist for an unreleased version.
@jdai0 sorry for the ping, this is ready for review, just needs the run-ci label. Thanks

Thanks, I've added the run-ci label.
What does this PR do?
Fixes #1762.
The docker/Dockerfile sft build downloads a prebuilt ktransformers wheel from a release URL. No ktransformers release (v0.5.0 through v0.6.2) publishes wheel assets, so that URL returns 404 and curl -fsSL aborts the build.

On current main the top-level ktransformers package is a thin metapackage (kt-kernel plus [sft] extras), and the repo is already cloned into the image. This PR installs it from that source instead of the missing wheel, and aligns the fine-tune env with the dependencies current kt-kernel pins.

Changes
- Remove the KTRANSFORMERS_VERSION/KTRANSFORMERS_WHEEL ARGs and the dead wheel download.
- Build kt-kernel from source in the fine-tune env (same as the serve env), then install ktransformers[sft] from the cloned /workspace/ktransformers. This uses the checked-out C++ kernels and avoids pip falling back to a PyPI kt-kernel that may not exist for an unreleased version.
- Pin torch==2.9.1, which is what kt-kernel (pulled in by [sft]) pins.
- Update FLASH_ATTN_WHEEL to the matching torch2.9 build from the same v2.8.3 release.
- Add editables to the build tools, required by the editable --no-build-isolation install of LLaMA-Factory.

What was tested
Replicated the fine-tune env dependency resolution in an nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04 container (CUDA 12.8, Python 3.12), running the same commands in the same order as the Dockerfile fine-tune stage: torch==2.9.1 (cu128), then LLaMA-Factory[torch,metrics], then ktransformers[sft], then the flash_attn torch2.9 wheel.
- torch stays at 2.9.1+cu128 after every step (no silent up/downgrade).
- kt_kernel, llamafactory, transformers (transformers-kt 5.6.0), and accelerate (accelerate-kt 1.14.0) all import.
- pip check reports no broken requirements.

This confirms the torch/flash_attn/editables alignment and that ktransformers[sft] plus LLaMA-Factory coexist under torch 2.9.1. In that run kt-kernel came from PyPI as a stand-in for dependency resolution.

What was not tested
- A full docker build of this Dockerfile was not run. The base image (docker.1ms.run) and the tsinghua conda/pip mirrors are not reachable from the test machine, and the single framework stage also compiles DeepEP, kt-kernel (all CPU variants), and sglang for the serve env, which this PR does not touch.
- The kt-kernel source build in the fine-tune env (CPUINFER_BUILD_ALL_VARIANTS=1 ./install.sh build) was not run here. It mirrors the existing serve-env build step exactly, but it is a heavy C++/CUDA compile that was not executed in this verification.
- The editables addition depends on the hatchling version resolved at build time. It was required to reproduce a clean editable install here; it is harmless if the mirror-resolved hatchling would not have needed it.

Before submitting