Internal chat playground for testing unreleased "model-raising" (novel safety-pretrained) checkpoints on a single A100. Includes a one-click Claude-Code (Opus) auditor that probes each model with ~100 questions across 10 categories — plus two specialized subagents: canaries (checks for emergent surfacing of training-time canary values) and persona_trigger (EPE-only; checks whether <assistant> / charter-section openers push the model into its reflection persona) — and writes a short summary of weird/safer/broken behaviors.
- vLLM — one OpenAI-compatible subprocess per model on a fixed port. Only one model loaded at a time on the GPU; switches are serialized on an in-flight counter.
- NiceGUI — dashboard with per-model chat, audit launcher, and add-model form.
- claude-agent-sdk — drives an Opus auditor with category subagents and an in-process MCP
ask_modeltool that maintains true multi-turn conversations with the model under test. - OmegaConf + per-file YAML — model registry; new models are added from the dashboard and persisted to
conf/models/*.yaml. - pyngrok — public URL on startup, gated by ngrok's built-in OAuth.
# 1. Install
uv sync # or: pip install -e .
# 2. Authenticate Claude Code with your Max subscription (one-time)
claude login
# 3. Create a .env file in the repo root (auto-loaded by scripts/start.sh)
echo 'NGROK_AUTHTOKEN=...' > .env # optional — without it, no public tunnel
# echo 'DASHBOARD_STORAGE_SECRET=...' >> .env # optional — stable NiceGUI storage key
# 4. Start the dashboard (also opens an ngrok tunnel if the token is set)
bash scripts/start.shThe startup script prints the public URL. Open it, click Load on a model, then Chat or Audit.
Either drop a YAML file into conf/models/ (see baseline_safelm_1p7b.yaml for the schema) or use the Add Model form in the dashboard footer.
conf/ # config.yaml + per-model YAMLs
data/ # audit_questions.yaml + canaries.yaml + audits/{id}.json summaries
logs/ # vllm.log + per-audit Claude logs
src/model_raising_chat/
config.py # Pydantic schemas, registry I/O
supervisor.py # vLLM subprocess lifecycle
audit.py # Claude-Code audit runner + MCP ask_model tool
tokenize.py # local tokenizer that preserves registered special tokens
state.py # module-level singletons
dashboard/
app.py # NiceGUI entry-point + ngrok
layout.py # shared page frame
theme.py # dashboard theming
pages/ # home, chat, audit
scripts/start.sh