Flow Map Language Models:
One-step Language Modeling via Continuous Denoising

Chanhyuk Lee¹, Jaehoon Yoo¹, Manan Agarwal², Sheel Shah², Jerry Huang²,
Aditi Raghunathan², Seunghoon Hong¹, Nicholas M. Boffi²†, Jinwoo Kim¹†

¹KAIST ²Carnegie Mellon University †Equal advising

[Project Page] | [Paper] | [Blog]

News

[2026-04] We released LM1B/OpenWebText checkpoints for FLM and FMLM.

TL;DR

We introduce the Flow Language Model (FLM) and its flow-map-distilled variant, the Flow Map Language Model (FMLM), which enable one-step parallel text generation through continuous denoising.

Overview

FLM brings the benefits of continuous image generation to discrete state spaces by encoding text as one-hot vectors and using flow matching to map noise directly to one-hot data. Unlike discrete diffusion, FLM gradually denoises all tokens in parallel, which lets it represent a superposition of sequences while capturing correlations between tokens. Capturing these correlations is a fundamental bottleneck for discrete diffusion in the few-step regime.
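To make the one-hot encoding and the noise-to-data path concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the linear interpolant, the Gaussian noise source, the convention that t = 0 is noise and t = 1 is data, and all function names are assumptions chosen for illustration.

```python
import random

def one_hot(token_id, vocab_size):
    """Encode a token id as a one-hot vector of length vocab_size."""
    return [1.0 if i == token_id else 0.0 for i in range(vocab_size)]

def interpolate(noise, data, t):
    """Assumed linear path: x_t = (1 - t) * noise + t * data."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]

def decode(x):
    """Map a continuous vector back to a token id via argmax."""
    return max(range(len(x)), key=lambda i: x[i])

def noisy_sequence(token_ids, vocab_size, t, rng):
    """Corrupt every position of the sequence at the same time t."""
    out = []
    for tok in token_ids:
        # Assumption: independent standard Gaussian noise per vocabulary entry.
        noise = [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]
        out.append(interpolate(noise, one_hot(tok, vocab_size), t))
    return out

rng = random.Random(0)
tokens = [2, 0, 3]

# At t = 1 the path ends exactly on the one-hot data, so argmax recovers the tokens.
x1 = noisy_sequence(tokens, vocab_size=4, t=1.0, rng=rng)
decoded = [decode(x) for x in x1]
assert decoded == tokens
```

At intermediate times every position holds a dense vector over the vocabulary rather than a single committed token, which is the sense in which a continuous state can represent a superposition of sequences.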

How to Run

Install Dependencies

# Quote the requirement so the shell does not treat ">" as an output redirect
pip install "torch>=2.3.0"
pip install -r requirements.txt
# Install flash-attn separately matching your python / torch version (see https://github.com/Dao-AILab/flash-attention/releases)
pip install flash-attn==2.8.3 --no-build-isolation

Our DiT backbone supports torch.compile with max-autotune for faster training. Enable it by setting the following environment variable before running any script:

export DIT_USE_COMPILE=TRUE

With this option enabled, we can train the OpenWebText experiments with a batch size of 512 on 8 H100 GPUs (80 GB VRAM each) without gradient accumulation.
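As a quick sanity check when adapting this setup to different hardware, the per-device batch size can be worked out as below. This sketch assumes that 512 is the global batch size and that it is split evenly across devices under data parallelism; the helper name is hypothetical and not part of the repository.

```python
def per_device_batch_size(global_batch_size, num_devices, grad_accum_steps=1):
    """Samples each device processes per forward pass, assuming an even split."""
    denom = num_devices * grad_accum_steps
    if global_batch_size % denom != 0:
        raise ValueError(
            "global batch size must divide evenly across devices and accumulation steps"
        )
    return global_batch_size // denom

# 8 devices, no gradient accumulation: 512 / 8 = 64 samples per device.
print(per_device_batch_size(512, 8))  # 64
```

With fewer or smaller GPUs, raising `grad_accum_steps` keeps the same global batch size while lowering the per-device memory footprint.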

Training

Before running, update data.cache_dir in the scripts to point to your dataset location. If the directory is empty, the dataset will be automatically downloaded and preprocessed.

Set algo.teacher_path to your pre-trained FLM checkpoint before running FMLM distillation.

Model	Dataset	Script
FLM	LM1B	scripts/train_lm1b_flm.sh
FMLM	LM1B	scripts/train_lm1b_fmlm_denoiser.sh
FLM	OpenWebText	scripts/train_owt_flm.sh
FMLM	OpenWebText	scripts/train_owt_fmlm_denoiser.sh
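Both settings above are easy to overlook before a long run. The following is a minimal pre-flight check, offered as a sketch only: `preflight` is a hypothetical helper that does not exist in the repository, and it assumes `algo.teacher_path` points to a single checkpoint file.

```python
from pathlib import Path

def preflight(cache_dir, teacher_path=None):
    """Return a list of problems found before launching a training script."""
    problems = []
    cache = Path(cache_dir)
    # A missing or empty cache directory is fine: the dataset is downloaded into it.
    if cache.exists() and not cache.is_dir():
        problems.append(f"data.cache_dir is not a directory: {cache}")
    # Only FMLM distillation needs a teacher; pass None when training FLM.
    if teacher_path is not None and not Path(teacher_path).is_file():
        problems.append(f"algo.teacher_path does not point to a file: {teacher_path}")
    return problems
```

For FLM training, call `preflight(cache_dir)`; for FMLM distillation, also pass the path to the pre-trained FLM checkpoint and fix any reported problem before launching the script.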

Evaluation

Set CKPT_PATH in the script to your trained checkpoint before running.

Model	Dataset	Script
FLM	LM1B	scripts/gen_ppl_lm1b_flm.sh
FMLM	LM1B	scripts/gen_ppl_lm1b_fmlm.sh
FLM	OpenWebText	scripts/gen_ppl_owt_flm.sh
FMLM	OpenWebText	scripts/gen_ppl_owt_fmlm.sh

Checkpoints

Pretrained Checkpoints

Pretrained FLM and FMLM checkpoints are available here.

Model	Dataset	Checkpoint
FLM	LM1B	`lm1b_flm.ckpt`
FMLM	LM1B	`lm1b_fmlm.ckpt`
FLM	OpenWebText	`owt_flm.ckpt`
FMLM	OpenWebText	`owt_fmlm.ckpt`

Set eval.checkpoint_path (or algo.teacher_path for distillation) to the downloaded checkpoint path when running evaluation or distillation scripts.

Baseline Checkpoints

Reproduced baseline checkpoints are available here.

For other checkpoints, mostly for OpenWebText, refer to the Duo, SDTT, RDLM, and di4c repositories.

BibTeX

@article{lee2026flow,
    title={Flow Map Language Models: One-step Language Modeling via Continuous Denoising},
    author={Chanhyuk Lee and Jaehoon Yoo and Manan Agarwal
            and Sheel Shah and Jerry Huang
            and Aditi Raghunathan and Seunghoon Hong
            and Nicholas M. Boffi and Jinwoo Kim},
    journal={arXiv preprint arXiv:2602.16813},
    year={2026},
}

Acknowledgements

This codebase builds upon Duo and ReDi.
