moshi: make bitsandbytes optional (fix #385) #415

Open
Kymi808 wants to merge 1 commit into kyutai-labs:main from Kymi808:fix/optional-bitsandbytes

Kymi808 commented May 19, 2026

Fixes #385.

Summary

bitsandbytes is only used in moshi.utils.quantize.QLinear, which is constructed exclusively under explicit quantization paths (replace_linear_with_qlinear in lm.py and transformer.py, gated by a quantize flag). Upstream bitsandbytes ships only x86_64-Linux and Windows wheels (no aarch64-Linux wheels), so listing it in the main dependencies broke pip install moshi on aarch64 platforms (e.g. NVIDIA DGX Spark, ARM cloud instances), even for users who weren't using quantization at all.

The same install problem surfaced in #357, where you wrote:

"Fwiw you could try removing the bitsandbytes dependency as it's only used when fine-tuning a model."

This PR follows that direction — moving bitsandbytes to an opt-in extra.

Changes

  • moshi/pyproject.toml — move bitsandbytes from main dependencies to a new [project.optional-dependencies] group called quant. Users who need quantized linear layers install via pip install "moshi[quant]". The version pin (>= 0.45, < 0.50.0; sys_platform == 'linux') is preserved unchanged.
  • moshi/moshi/utils/quantize.py — wrap the lazy bitsandbytes imports in a small _import_bitsandbytes() helper that surfaces a clear ImportError pointing at the new extras install command, so users who instantiate QLinear without the extra get an actionable error rather than a bare ModuleNotFoundError.

Verified locally

  • pyproject.toml parses as valid TOML; bitsandbytes no longer in dependencies; the new quant extra exposes the same constraint.
  • import moshi.utils.quantize succeeds with bitsandbytes unavailable.
  • Instantiating QLinear(...) without bitsandbytes installed raises ImportError with the documented install hint:

    "bitsandbytes is required for quantized linear layers but is not installed. Install it via pip install "moshi[quant]". Note: bitsandbytes only ships x86_64 Linux / Windows wheels; aarch64 Linux is unsupported upstream."

  • flake8 is clean on the changed file. (Pyright reports 2 pre-existing reportPrivateImportUsage warnings on the existing torch.float references — out of scope for this PR.)

Diff is +26 / -3.

CLA

I, Kymi808, confirm that I have read and understood the terms of the CLA of Kyutai-labs, as outlined in the repository's CONTRIBUTING.md, and I agree to be bound by these terms.

bitsandbytes is only used in moshi.utils.quantize.QLinear, which is
constructed exclusively under explicit quantization paths
(replace_linear_with_qlinear in lm.py and transformer.py, gated by a
`quantize` flag). Upstream bitsandbytes ships only x86_64-linux and
Windows wheels — no aarch64-linux wheels — so keeping it as a hard
dependency broke `pip install moshi` on aarch64 platforms (e.g. DGX
Spark) for users who weren't using quantization at all. The same
issue blocked dockerized installs in kyutai-labs#357, where the maintainer's
suggestion was to remove the dependency since "it's only used when
fine-tuning a model."

Changes
-------

- pyproject.toml: move bitsandbytes from `dependencies` to a new
  `[project.optional-dependencies]` group `quant`. Users who need
  quantized linear layers can install via `pip install "moshi[quant]"`.
- moshi/utils/quantize.py: wrap the lazy bitsandbytes imports in a
  small `_import_bitsandbytes` helper that surfaces a clear ImportError
  pointing at the new extras install command, so users who instantiate
  `QLinear` without the extra installed get an actionable message
  rather than a bare `ModuleNotFoundError`.

Verified locally
----------------

- pyproject.toml parses as valid TOML; bitsandbytes no longer in main
  dependencies; new `quant` extra exposes the same constraint
  previously pinned in `dependencies`.
- `import moshi.utils.quantize` succeeds with bitsandbytes unavailable.
- Instantiating `QLinear(...)` without bitsandbytes installed raises
  ImportError with the documented install hint.
- `flake8` clean on the changed file. (Pyright reports 2 pre-existing
  `reportPrivateImportUsage` warnings on the `torch.float` references
  that already existed in the file; not in this PR's scope.)

I, Kymi808, confirm that I have read and understood the terms of the
CLA of Kyutai-labs, as outlined in the repository's CONTRIBUTING.md,
and I agree to be bound by these terms.
Development

Successfully merging this pull request may close these issues.

[BUG] bitsandbytes version
