Your code is cooked. Pragma knows it. And now, so do you.
Maybe you vibed your way through a project. Maybe you paid someone to build your business website. Maybe ChatGPT wrote 80% of it and it works — kinda — and you're not asking questions.
The problem? That code is probably a security nightmare. Hardcoded secrets, SQL injection waiting to happen, auth that breaks if you look at it wrong. Nobody checked it. And honestly, neither did the AI that generated it.
Pragma fixes that.
Pragma is an agentic security auditor that scans your repo, finds the vulnerabilities, and explains what's wrong — in plain English, not in "refer to CWE-89 for further details." Then it tells you exactly how to fix it, with a working code snippet you can paste in directly.
No security degree required. No dev on speed dial required.
If you can't read code:
- You own a restaurant, a shop, a startup — someone built you a website and you just... hope it's fine
- You hired a freelancer and have no way to verify their work is secure
- You're about to launch something and the words "data breach" keep you up at night
If you can:
- Vibe coders who ship first and ask questions never
- Beginners using Cursor, Copilot, or any AI tool to build real projects
- Anyone who's looked at their own codebase and thought "I hope nobody checks this"
If you've ever deployed something and immediately closed the tab out of fear — this is for you.
your repo
│
▼
[ Semgrep ] ────────────────────────────────────────── deterministic scan
│ no hallucinations,
│ actual findings only
▼
[ RAG Pipeline ] ───────────────────────────────────── pulls the exact
│ ChromaDB · AST-aware chunking code chunks relevant
│ OWASP + CWE knowledge base to each finding
▼
[ Judge Model ] ────────────────────────────────────── explains the bug,
│ Groq · Flash / Pro routing gives you the fix,
│ writes the fixed code
▼
[ Report ] ──────────────────────────────────────────── 3 modes:
CEO · Public · Technical
- Drop your repo in via ZIP upload or GitHub URL
- Semgrep scans it deterministically — not guessing, actually finding
- The agent kicks in — for each finding, retrieves relevant code + security knowledge and hands it to the judge
- You get a clean report — plain English, concrete fixes, copy-paste code snippets
No noise. No "potential vulnerability detected in 47 files, good luck." Just what's wrong, why it matters, and how to fix it.
Pragma generates three personas for every finding so the right person gets the right information:
| Mode | Audience | What you get |
|---|---|---|
| 🛠 Technical | Developers | Root cause, exact fix, corrected code block |
| 💼 CEO | Non-technical founders | Business risk in plain English, high-level action, code snippet to hand off |
| 🌍 Public | Everyone else | Simple analogy, approachable explanation, beginner-friendly fix |
| Layer | Technology |
|---|---|
| API | FastAPI |
| Scanning | Semgrep (p/security-audit · p/python · p/secrets) |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 (knowledge) · gemini-embedding-001 (code) |
| Vector DB | ChromaDB — two collections, no dimension conflicts |
| Knowledge Base | 46 OWASP cheat sheets · 15 CWE definitions (635 chunks, static) |
| LLM | Groq — Flash for small repos, Pro routing for larger ones |
| Chunking | AST-aware — functions and classes as atomic units, not arbitrary token windows |
| Method | Endpoint | What it does |
|---|---|---|
POST |
/ingest/github |
Ingest a repo by GitHub URL |
POST |
/ingest/zip |
Ingest a repo from a ZIP upload |
POST |
/query |
Query the RAG pipeline directly |
POST |
/audit |
Scan + parse, returns raw findings JSON |
GET |
/audit/report/html |
Full agent loop → persona report in browser |
GET |
/audit/report/download |
Full agent loop → persona report as file download |
🟢 Alpha complete. Frontend next.
| Module | Status |
|---|---|
| AST-aware code chunker | ✅ |
| Embedding + ChromaDB storage | ✅ |
| GitHub + ZIP ingestion | ✅ |
| Semgrep scanner (multi-ruleset) | ✅ |
| OWASP + CWE static knowledge base | ✅ |
| Agent loop — scan → RAG → explain → fix | ✅ |
| Persona-based HTML reports + MD download | ✅ |
| Finding deduplication (highest severity per line) | ✅ |
| Frontend dashboard | 🚧 |
| Snyk dependency scanning | 🚧 |
This is not production-ready. But the core pipeline is real, it runs, and it works.
# Clone and install
git clone https://github.com/msjabata25/pragma
cd pragma
uv sync
# Add your keys
cp .env.example .env
# Fill in GROQ_API_KEY and GEMINI_API_KEY
# Ingest the knowledge base (one-time)
uv run python app/knowledge/ingest.py
# Start the server
uv run fastapi dev app/main.pyPython version:
>=3.12,<3.13— 3.14 breaks ChromaDB + Pydantic. Pin it.
#pragma is a compiler directive — it tells the compiler what to do before it starts running.
Pragma tells you what to do before your code goes live.
Shipping in public. Progress, demos, and the occasional existential crisis.
Built by a robotics freshman who got tired of watching people deploy broken code and call it a feature.