NVIDIA/srt-slurm

srtctl

Command-line tool for distributed LLM inference benchmarks on SLURM clusters using TensorRT LLM, SGLang and vLLM. Replace complex shell scripts and 50+ CLI flags with declarative YAML configuration.

Quick Start

# Clone and install
git clone https://github.com/your-org/srtctl.git
cd srtctl
pip install -e .

# One-time setup (downloads NATS and etcd, creates srtslurm.yaml)
make setup ARCH=aarch64  # or ARCH=x86_64

Documentation

Full documentation: https://srtctl.gitbook.io/srtctl-docs/

Commands

# Submit job(s)
srtctl apply -f config.yaml

# Submit with custom setup script
srtctl apply -f config.yaml --setup-script custom-setup.sh

# Submit with tags for filtering
srtctl apply -f config.yaml --tags experiment,baseline

# Dry-run (validate without submitting)
srtctl dry-run -f config.yaml

# Launch analysis dashboard
uv run streamlit run analysis/dashboard/app.py
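
The dry-run and apply commands above can be chained so that a job is submitted only after its config validates. The following is a minimal sketch, not part of srtctl itself: the helper name `submit_if_valid` is hypothetical, and it assumes that `srtctl dry-run` exits with a non-zero status when validation fails, which this README does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a config with dry-run, then submit it.
# Assumption: `srtctl dry-run` returns a non-zero exit status on an invalid config.
submit_if_valid() {
    local config="${1:?usage: submit_if_valid CONFIG [extra apply args...]}"
    shift
    if srtctl dry-run -f "$config"; then
        # Any remaining arguments (e.g. --tags ...) are passed through to apply.
        srtctl apply -f "$config" "$@"
    else
        echo "dry-run failed for $config; not submitting" >&2
        return 1
    fi
}
```

After sourcing the snippet, `submit_if_valid config.yaml --tags experiment,baseline` would run the dry-run first and call apply with the same file and tags only if the dry-run succeeds.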

About

NVIDIA Inference Benchmarks provide ready-to-use recipe templates for evaluating platform speed. Use them to validate your platform on specific AI use cases across hardware and software combinations.
