tokens-per-second

Here are 8 public repositories matching this topic...

enescingoz / mac-llm-bench

Community benchmark database for running LLMs on Apple Silicon Macs

benchmark inference apple-silicon llm llama-cpp local-llm llm-benchmark tokens-per-second

Updated Apr 22, 2026
Shell

rikulauttia / ai-gpu-playground-mac

Hands-on CPU vs GPU benchmarks for Apple Silicon (M-series): PyTorch MPS, TensorFlow-Metal, MLX, and llama.cpp to measure TFLOP/s & tokens/sec and learn why GPUs accelerate training.

education benchmark deep-learning metal tensorflow gpu pytorch matrix-multiplication mps mlx hands-on apple-silicon llamacpp tflops tokens-per-second

Updated Aug 17, 2025
Python

SonnyTaylor / pi-tps

Pi extension that measures and displays model TPS in the status bar

performance extension ai tps pi coding-agent tokens-per-second pi-coding-agent pi-package

Updated Jun 15, 2026
TypeScript

IAcriolla / typhon-stress-test

Local LLM inference benchmarker. Measures TPS, TTFT, and VRAM pressure across context sizes — from 2K to 256K tokens. Works with llama.cpp, Ollama, LM Studio, and any OpenAI-compatible server.

python cli gpu benchmarks inference-optimization llm llama-cpp local-ai ollama tokens-per-second context-window-optimization

Updated May 13, 2026
Python

xavier-hernandez / compare-token-gen-ollama

Lightweight shell script to benchmark token generation speed (tok/s) across Ollama models running in Docker. Auto-discovers all installed models or accepts a custom list via CLI. Uses Ollama's internal eval_duration timing for accurate results — no dependencies beyond curl and awk.

shell docker benchmarking benchmark gpu inference unraid compare token tokenization llm ollama llm-benchmark tokens-per-second ollama-benchmark

Updated Apr 18, 2026
Shell
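
The compare-token-gen-ollama entry above says it relies on Ollama's `eval_duration` timing. As a rough sketch of that idea (not the repository's actual script, which is shell-based), the snippet below derives tok/s from the `eval_count` and `eval_duration` fields that Ollama reports in a generate response, with `eval_duration` given in nanoseconds. The function name `tokens_per_second` and the sample values are assumptions made up for illustration.

```python
def tokens_per_second(response: dict) -> float:
    """Derive generation speed from an Ollama-style response dict.

    Assumes the dict carries `eval_count` (tokens generated) and
    `eval_duration` (generation time in nanoseconds).
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical sample values, not measured results.
sample = {"eval_count": 290, "eval_duration": 4_709_213_000}
print(f"{tokens_per_second(sample):.2f} tok/s")
```

Because these figures come from the server's own timing of the generation phase, they exclude model load and prompt processing time, which is presumably why the description calls them more accurate than wall-clock measurements.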

myAIspeed / cli

Speedtest for AI. Test latency to every major AI provider from your terminal.

Updated Mar 9, 2026
JavaScript

EPSILON0-dev / inference-benchmark

Simple tool for measuring inference engine performance under multi-user load

benchmark inference llm tokens-per-second

Updated May 23, 2026
Python

myAIspeed / action

Test AI provider latency (TTFB, TTFT, TPS) in your CI/CD pipeline. Benchmark OpenAI, Anthropic, Google, and more.

Updated Mar 9, 2026
JavaScript
