Languages: English (current) · 中文
Full manuals: Full English manual · 完整中文手冊
Turn "I have an idea" into "runnable code + a 5-analyst risk review." An AI-native multi-agent research engine for investment research, SaaS product analysis, agent architecture evaluation, and academic paper reproduction.
- Input: a one-line idea, an existing project directory, or a paper title
- Output: a structured analysis report (consensus / disagreement / risks / kill criteria) plus runnable code
- How: five specialist AI analysts plus a quality gate that halts the pipeline when evidence is too thin (instead of confidently producing garbage)
| Audience | Question | Mode |
| --- | --- | --- |
| 🪙 Quant traders / researchers | "I have a strategy idea; is it worth pursuing?" | Quant mode: research + backtest + tearsheet |
| 🛠️ SaaS / product builders | "I want a differentiated product; where are the real pain points?" | SaaS mode: market + competitor + adoption barriers |
| 📚 Academic researchers | "I want to reproduce and ablate this paper." | Scientist mode: algorithm + reproducibility + benchmark |
| 🤖 AI agent developers | "I'm building an agent; is the architecture sound?" | Agent mode: state boundaries + replay safety |
Already have a project to fix? Any of the four research modes can be paired with Project Path mode — point Crucible at a directory and it will analyse, fix bugs, and apply additive changes only (existing APIs preserved).
→ Full mode reference: README_FULL.md#pipeline-modes
git clone https://github.com/Starlight143/crucible.git
cd crucible
pip install -r requirements.txt
Then double-click launch_webui.bat (Windows); the browser opens automatically.
First-run setup: open the Settings page and paste one API key:
- OpenRouter — recommended; multi-model routing with USD cost tracking
- Alibaba Coding Plan — token-only cost tracking
- Or run Ollama locally: set LLM_PROVIDER=ollama; no key needed, zero ongoing cost
How do I get an API key?
- OpenRouter (recommended): create an account at openrouter.ai → Keys → Create Key → copy it → paste it into Crucible's Settings page (or set OPENROUTER_API_KEY in your .env). Add credit to use paid models; several models have free tiers.
- Alibaba Coding Plan: sign up at Model Studio, create an API key, and paste it in Settings.
- Ollama (no key): install Ollama, run ollama pull <model>, and set LLM_PROVIDER=ollama. Fully local, no key, no per-token cost.
Your provider key stays in your local .env; Crucible only ever sends it to the provider you chose.
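As a rough illustration of the provider options above, the sketch below resolves a provider and key from an environment mapping. `LLM_PROVIDER` and `OPENROUTER_API_KEY` come from this README; the function name `resolve_provider`, the value `openrouter`, and its use as the default are assumptions made for illustration, not Crucible's actual code.

```python
from typing import Mapping, Optional, Tuple


def resolve_provider(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (provider, api_key); api_key is None for keyless providers."""
    # Assumption: "openrouter" is a valid LLM_PROVIDER value and the default.
    provider = env.get("LLM_PROVIDER", "openrouter").strip().lower()

    if provider == "ollama":
        # Local models: no key is needed.
        return provider, None

    if provider == "openrouter":
        key = env.get("OPENROUTER_API_KEY", "").strip()
        if not key:
            raise ValueError(
                "OPENROUTER_API_KEY is not set; add it to .env or the Settings page"
            )
        return provider, key

    # Other providers (for example Alibaba Coding Plan) are left out because
    # their environment variable names are not given in this README.
    raise ValueError(f"provider not covered by this sketch: {provider!r}")
```

Calling `resolve_provider(os.environ)` at startup would surface a missing key before any model call is attempted.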
Optional: contribute to / read the shared cloud insight corpus
Crucible can mirror its run-insight ledger to a shared, Cloudflare-backed corpus so contributors accumulate signal together. This is opt-in and access is by request — it is not required to run Crucible (the default backend keeps everything on your own disk).
Request an ingest token (contribute your runs) or a read token (fetch the distilled summaries) via the issue tracker. You'll be given a token to put in .env:
CRUCIBLE_RUN_INSIGHTS_BACKEND=dual
CRUCIBLE_RUN_INSIGHTS_API_URL=<the Worker URL you are given>
CRUCIBLE_RUN_INSIGHTS_API_TOKEN=<your issued token>
Ingest tokens are write-only (you cannot read others' raw data); read tokens only ever see curated, distilled output.
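A small pre-flight check for these settings might look like the following. The three variable names and the `dual` value are taken from this section; the helper name `missing_insight_settings` and the rule that `dual` requires both the URL and the token are assumptions, not documented Crucible behaviour.

```python
from typing import List, Mapping

# Variables this sketch assumes are mandatory once the backend is "dual".
REQUIRED_FOR_DUAL = (
    "CRUCIBLE_RUN_INSIGHTS_API_URL",
    "CRUCIBLE_RUN_INSIGHTS_API_TOKEN",
)


def missing_insight_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of cloud settings that are required but unset."""
    backend = env.get("CRUCIBLE_RUN_INSIGHTS_BACKEND", "").strip().lower()
    if backend != "dual":
        # Default backend: everything stays on local disk, nothing to check.
        return []
    return [name for name in REQUIRED_FOR_DUAL if not env.get(name, "").strip()]
```

An empty list means either the cloud mirror is off or it is fully configured.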
That's it. Pick a mode, type your idea (or paste a project path), hit Run.
Prefer the CLI? Click to expand.
# Interactive mode
python run_crucible.py
# Dry-run: scan context without calling LLMs
python run_crucible.py --dry-run
# Self-check only
python run_crucible.py --self-check
# With Direction Debate
python run_crucible.py --direction-debate
# Full production scope with cost tracking
python run_crucible.py --scope production --cost-trace --cost-report
Full flag reference: README_FULL.md, or python run_crucible.py --help.
Production deployment (Gunicorn). Click to expand.
pip install gunicorn
gunicorn --config gunicorn_config.py "webui.app:app"
Key env overrides: GUNICORN_BIND (default 0.0.0.0:8080), GUNICORN_WORKERS, GUNICORN_TIMEOUT (default 300s; must exceed your longest pipeline run).
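For orientation, a Gunicorn config file that honours these overrides could be sketched as below. This is not the repository's `gunicorn_config.py`. `bind`, `workers`, and `timeout` are standard Gunicorn settings, and the bind and timeout defaults are the ones stated above. The helper `load_settings` and the fallback of 2 workers are assumptions, since this README gives no worker default.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str]) -> dict:
    """Read Gunicorn settings from the environment, with fallbacks."""
    return {
        # Address and port to listen on (default stated in the README).
        "bind": env.get("GUNICORN_BIND", "0.0.0.0:8080"),
        # Worker process count; 2 is a placeholder, not a documented default.
        "workers": int(env.get("GUNICORN_WORKERS", "2")),
        # Seconds a worker may stay silent before being restarted; keep this
        # above the duration of the longest pipeline run.
        "timeout": int(env.get("GUNICORN_TIMEOUT", "300")),
    }


# Gunicorn reads recognised setting names from module-level variables.
_settings = load_settings(os.environ)
bind = _settings["bind"]
workers = _settings["workers"]
timeout = _settings["timeout"]
```

Setting `GUNICORN_TIMEOUT=900` in the environment before launch would raise the worker timeout to fifteen minutes without editing the file.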
graph LR
A["Your input<br/>idea / path / paper"] --> B["Stage 0<br/>Web research<br/>+ citations"]
B --> C["Stage 1<br/>3 parallel research lanes<br/>Market · Technical · Competitor"]
C --> D{"Stage 2<br/>Direction Debate<br/>(optional)"}
D --> E["Stage 3<br/>5 specialist analysts<br/>+ Gate Controller"]
E -->|insufficient evidence| F["⛔ Pipeline halts"]
E -->|pass| G["Stage 4<br/>CodeGen + Quality Loop"]
G --> H["📦 Report<br/>+ runnable code"]
Five stages. Any stage can halt the pipeline if its quality gate fails — by design.
→ Stage-by-stage breakdown: README_FULL.md#stage-details
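The halt-on-failed-gate behaviour can be pictured with a minimal sketch. All names here (`GateResult`, `PipelineHalted`, `run_pipeline`, the two toy stages, and the two-citation threshold) are hypothetical and chosen for illustration; the real stages and gate criteria are described in README_FULL.md.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GateResult:
    passed: bool
    reason: str = ""


class PipelineHalted(Exception):
    """Raised when a stage's quality gate fails."""


# A stage takes the accumulated state and returns (new_state, gate_result).
Stage = Callable[[dict], Tuple[dict, GateResult]]


def run_pipeline(stages: List[Tuple[str, Stage]], state: dict) -> dict:
    """Run stages in order; stop at the first gate that does not pass."""
    for name, stage in stages:
        state, gate = stage(state)
        if not gate.passed:
            raise PipelineHalted(f"{name}: {gate.reason}")
    return state


def web_research(state: dict) -> Tuple[dict, GateResult]:
    # Hypothetical gate: require at least two citations before continuing.
    enough = len(state.get("citations", [])) >= 2
    return state, GateResult(enough, "" if enough else "insufficient evidence")


def codegen(state: dict) -> Tuple[dict, GateResult]:
    # Toy stand-in for code generation; always passes its gate.
    return {**state, "code": "print('hello')"}, GateResult(True)
```

With `{"citations": []}` as input, `run_pipeline` raises `PipelineHalted` at the research stage and `codegen` never runs, which mirrors the "halt rather than guess" design.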
- ✅ Every claim is traceable to a cited source: the Research Synthesizer demotes unsupported claims to unknowns or records them under hallucination_flags
- ✅ Every decision is traceable to a specific analyst finding: the full evidence chain is preserved in output artifacts
- ✅ The pipeline halts automatically when evidence is insufficient — no confident-but-wrong outputs
- ✅ 3,255+ automated tests, 100% passing, covering injection, SSRF, redaction, numerical stability, and cross-process races
- ✅ Backtest data integrity is enforced by default: synthetic fallback data is rejected unless you explicitly opt in (BACKTEST_REQUIRE_REAL_DATA=1 is the default)
- ✅ Pydantic-validated outputs at every stage: downstream stages never parse free text
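To make the claim-traceability guarantee concrete, here is a sketch of how claims might be sorted by citation status. It uses standard-library dataclasses instead of Pydantic to stay dependency-free. The field names `unknowns` and `hallucination_flags` come from the list above; the `Claim` and `Synthesis` types, the `triage` function, and the rule that separates the two buckets are assumptions, not the Research Synthesizer's actual logic.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Claim:
    text: str
    source_ids: List[str] = field(default_factory=list)


@dataclass
class Synthesis:
    supported: List[Claim] = field(default_factory=list)
    unknowns: List[Claim] = field(default_factory=list)
    hallucination_flags: List[Claim] = field(default_factory=list)


def triage(claims: List[Claim], retrieved_ids: Set[str]) -> Synthesis:
    """Sort claims by whether their citations match retrieved sources."""
    result = Synthesis()
    for claim in claims:
        if not claim.source_ids:
            # Assumed rule: no citation at all means the claim is an unknown.
            result.unknowns.append(claim)
        elif not set(claim.source_ids) <= retrieved_ids:
            # Assumed rule: citing a source that was never retrieved is flagged.
            result.hallucination_flags.append(claim)
        else:
            result.supported.append(claim)
    return result
```

Only claims in `supported` would then be passed to later stages as evidence.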
- Free for personal, open-source, and academic use — AGPL v3
- Commercial use (closed-source distribution, proprietary SaaS that does not publish source, embedded use, or any deployment that cannot satisfy AGPL-3.0) requires a commercial license
Commercial inquiries: supervenus928@gmail.com · See COMMERCIAL_LICENSE.md
- 📘 Full English manual · 完整中文手冊
- 🏗️ Architecture
- 📝 CHANGELOG
- 🤝 Contributing
- ⚙️ Configuration reference
See CHANGELOG.md for the full version history.
