Text Diffusion Quantization

Squeezing text-diffusion models onto your laptop. ⚡

An open-source effort to make diffusion-based language and vision-language models run efficiently on consumer hardware through quantization, optimization, and memory-efficient inference — one model at a time.

Why

Diffusion-based (V)LMs are built for high-end GPUs and servers. This project asks a simpler question: how small can we make them before they stop being useful? Every model we tackle gets the same treatment — measure its real footprint, quantize it, prove it still works, and document exactly what fits where.

Research Areas

4-bit and 8-bit quantization
Memory-efficient diffusion inference
Vision encoder compression
ONNX, TensorRT, and OpenVINO optimization
CPU and integrated GPU acceleration
Apple Silicon support
Low-RAM deployment techniques
Benchmarking quality vs. performance tradeoffs
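
To make the first item concrete, here is a minimal sketch of block-wise symmetric (absmax) 4-bit quantization in plain Python. It is an illustration only, not the project's implementation: the function names, the block size of 4, and the sample weights are all hypothetical, and real quantizers work on tensors with far larger blocks.

```python
# Illustrative only: block-wise symmetric (absmax) 4-bit quantization.
# Function names, block size, and sample weights are assumptions for this sketch.

def quantize_absmax_4bit(weights, block_size=4):
    """Map floats to signed integer codes in [-7, 7], with one scale per block."""
    codes, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Scale so the largest magnitude in the block maps to code 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-7, min(7, round(w / scale))) for w in block)
    return codes, scales


def dequantize(codes, scales, block_size=4):
    """Recover approximate floats from the codes and per-block scales."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


weights = [0.12, -0.50, 0.33, 0.07, 1.40, -0.90, 0.02, 0.65]
codes, scales = quantize_absmax_4bit(weights)
restored = dequantize(codes, scales)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", codes)
print("max round-trip error:", max_error)
```

The round-trip error of each weight is bounded by half of its block's scale, which is why smaller blocks (more scales) trade extra storage for lower error.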

Models

Model	Status	Result
Nemotron-Labs-Diffusion-VLM-8B	✅ 4-bit proven	5.6 GiB checkpoint, runs in 8.3 GiB (fits a 16 GB laptop), 0-point accuracy drop on MMLU + ScienceQA

More models coming. Each one follows the same workflow: footprint → quantize → verify → benchmark.
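
The footprint step of that workflow boils down to simple arithmetic, sketched below. The parameter count (a nominal 8 billion), the group size of 64, and the 16-bit scales are assumptions chosen for illustration; they are not taken from the project's reports, and `projected_weight_gib` is a hypothetical helper.

```python
# Illustrative only: project weight storage at different precisions.
# NUM_PARAMS, the group size, and the scale width are assumptions, not measured values.

GIB = 1024 ** 3


def projected_weight_gib(num_params, bits_per_weight, group_size=None, scale_bits=16):
    """Return projected weight storage in GiB for a uniform precision.

    If group_size is given, one scale of scale_bits is added per group of weights.
    """
    total_bits = num_params * bits_per_weight
    if group_size:
        total_bits += (num_params / group_size) * scale_bits
    return total_bits / 8 / GIB


NUM_PARAMS = 8_000_000_000  # assumption: nominal "8B" parameter count

for label, bits, group in [("BF16", 16, None), ("INT8", 8, None), ("4-bit", 4, 64)]:
    size = projected_weight_gib(NUM_PARAMS, bits, group_size=group)
    print(f"{label:>5}: {size:.2f} GiB")
```

A real checkpoint can come out larger than such a projection, for example when some modules are kept at higher precision, so the measured numbers in reports/weight_footprint.md take precedence over this estimate.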

Mission

Push the boundaries of local AI by bringing state-of-the-art diffusion models to everyday laptops — and documenting every breakthrough along the way.

Status

🚧 Experimental Research Project

Contributions, benchmarks, optimization ideas, and reproducible results are welcome.

Development

docs/WORKING_NOTES.md — environment setup, model quirks, how to run each phase, troubleshooting, and the roadmap.
reports/weight_footprint.md — measured weight footprint and per-precision projections.
reports/quantization_results.md — BF16 vs 4-bit results (3× smaller, roughly half the memory, no meaningful quality loss).
reports/benchmark.md — speed/memory benchmark (python -m text_diffusion_quantization.benchmark).
reports/eval.md — accuracy on MMLU + ScienceQA proving 4-bit ≈ BF16 (python -m text_diffusion_quantization.evaluate).
