vllm-vcr

Record, play, and inspect vLLM V1 engine-core traces. One binary, three subcommands (the VCR metaphor):

record taps a live vLLM frontend ↔ engine-core link (a transparent ZMQ proxy) and writes a JSONL trace.
play runs a mock engine-core backend that speaks the real ZMQ + msgpack protocol, replaying a trace or simulating from a latency model. No model weights, no GPU. With the nixl feature it also moves simulated KV-cache bytes between prefill and decode over NIXL.
inspect converts benchmark reports, summarizes traces, renders Perfetto timelines, and runs calibration.
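Because the trace is JSONL (one JSON value per line), generic shell tools are enough for a quick sanity check on a recorded file. The sketch below assumes nothing about the record schema, which this README does not describe. `count_records` is a hypothetical helper name, not part of vllm-vcr.

```shell
# Hypothetical helper: count the non-empty lines of a JSONL file.
# In JSONL each non-empty line is one record, so this is the record count.
count_records() {
  grep -c . "$1"
}
```

Calling `count_records path/to/trace.jsonl` (a placeholder path) prints the number of records, and `head -n 1` on the same file shows the first one.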

It runs behind vLLM's Rust or Python frontend unchanged: the frontend still owns tokenization, chat templates, tool calling, streaming, and OpenAI-compatible request handling; vllm-vcr replaces only the model backend.

Documentation

📖 Full docs: https://neuralmagic.github.io/vllm-vcr/

The site covers architecture, install, the quick start, trace replay and calibration, versioning and conformance, and operations. Source lives in docs/ and is built with mdBook.

Install

Requires Rust 1.85 or newer. From a checkout:

cargo install --path . --locked

That installs the single vllm-vcr binary. See the Install guide for the NIXL-enabled build, the container image, and installing from Git.
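Since the build requires Rust 1.85 or newer, it can help to check the toolchain before running the install command above. This is a minimal sketch: `version_ge` is a hypothetical helper that compares only the major and minor components, and it assumes `rustc --version` prints the version number as its second field.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Only the major and minor components are compared.
version_ge() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  need_major=${2%%.*}; need_rest=${2#*.}; need_minor=${need_rest%%.*}
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Report whether the rustc on PATH meets the 1.85 minimum.
if command -v rustc >/dev/null 2>&1; then
  have=$(rustc --version | awk '{print $2}')
  if version_ge "$have" 1.85; then
    echo "rustc $have is new enough"
  else
    echo "rustc $have is older than 1.85; update the toolchain first" >&2
  fi
else
  echo "rustc not found on PATH" >&2
fi
```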

Quick start

# Run the mock engine; point a vLLM frontend at the same handshake address.
vllm-vcr play --handshake-address tcp://127.0.0.1:29550 --log-requests

The Quick start guide has the full walkthrough: frontend wiring, a prefill/decode smoke test, and capture and replay.

Contributing

Run cargo fmt and cargo clippy --all --benches --tests --examples --all-features before sending a change. CI runs the same checks plus the per-vLLM-line conformance suite (see .github/workflows).
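The two local checks named above can be wrapped in one shell function so they always run together. `presubmit` is a hypothetical name for illustration and is not something the repository is known to provide. The conformance suite is left to CI here.

```shell
# Hypothetical wrapper around the two checks named in this section.
# Stops at the first failing command.
presubmit() {
  cargo fmt &&
    cargo clippy --all --benches --tests --examples --all-features
}
```

Run `presubmit` from the repository root. A non-zero exit status means one of the two commands failed.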

License

Dual-licensed under Apache-2.0 or MIT, at your option.
