Multi-Agent AI Cybersecurity Platform - Autonomous Vulnerability Detection & Security Analysis
Sentinel uses a centralized LLM powering specialized security agents to scan, simulate attacks, analyze threats, and generate actionable security recommendations, all orchestrated in a production-grade multi-agent pipeline.
Sentinel is an autonomous cybersecurity platform that combines code scanning, attack simulation, and threat intelligence into a unified multi-agent system. One powerful LLM orchestrates five specialized agents that execute sequentially to deliver comprehensive security insights (from vulnerability discovery to patch recommendations) in minutes, not weeks.
✨ Multi-Agent Architecture
- 5 specialized agents coordinated by one orchestrator
- Sequential workflow: Scanner → Threat → Attack → Patch → Report
- Centralized LLM orchestration for consistency and token optimization
🔍 Scanner Agent
- Detects 7+ vulnerability types (SQL Injection, XSS, RCE, etc.)
- CWE/OWASP mappings with exploitability scoring
- Realistic code analysis patterns (Bandit + Semgrep simulation)
⚔️ Attack Simulation Agent
- Plans realistic attack scenarios with MITRE ATT&CK framework
- Calculates success probability and impact scoring
- Maps vulnerabilities to specific exploitation techniques
🎯 Threat Intelligence Agent
- Classifies threats by severity and category
- Maps findings to security frameworks
- Generates exploitability metrics
🛠️ Patch Generation Agent
- Creates remediation patches with code examples
- Estimates complexity and risk of fixes
- Supports auto-apply recommendations
📊 Report Generation Agent
- Compiles executive summaries
- Risk scoring (0-100 scale)
- Patch coverage analysis
- Remediation effort estimation
🎨 Modern Frontend
- Next.js 15 with TypeScript
- Real-time scan monitoring
- Interactive dashboards
- Protected routes with JWT authentication
📡 Production-Ready Backend
- FastAPI with async/await architecture
- SQLAlchemy ORM with async support
- PostgreSQL + Redis ready
- Complete REST API
┌─────────────────────────────────────────────────────────────────┐
│                       Frontend (Next.js)                        │
│            Real-time Scan Monitoring & Dashboard UI             │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                           HTTP/REST API
                                 │
┌────────────────────────────────▼────────────────────────────────┐
│                     FastAPI Backend Server                      │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │               API Gateway & Authentication                │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────▼─────────────────────────────┐  │
│  │                   Orchestrator Service                    │  │
│  │         (Manages multi-agent workflow execution)          │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                ▼                                │
│          Scanner Agent → Threat Agent → Attack Agent →          │
│                   Patch Agent → Report Agent                    │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 │
┌────────────────────────────────▼────────────────────────────────┐
│                         Database Layer                          │
│  ┌────────────────┐   ┌────────────────┐   ┌────────────────┐   │
│  │   PostgreSQL   │   │     Redis      │   │     SQLite     │   │
│  │  (Production)  │   │    (Cache)     │   │ (Development)  │   │
│  └────────────────┘   └────────────────┘   └────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
[1] SCANNER AGENT (Progress: 5% → 20%)
├─ Input: Repository URL
├─ Process: Code analysis (Bandit + Semgrep simulation)
├─ Output: 7+ Findings with CWE/OWASP mappings
└─ Data: Saved to `findings` table
[2] THREAT AGENT (Progress: 20% → 40%)
├─ Input: Findings from Scanner
├─ Process: Threat classification & severity mapping
├─ Output: Threat types, exploitability scores
└─ Data: Enhanced findings with threat metadata
[3] ATTACK AGENT (Progress: 40% → 60%)
├─ Input: Findings + Threats from previous agents
├─ Process: MITRE ATT&CK mapping, attack planning
├─ Output: Attack scenarios with success probability
└─ Data: Saved to `attacks` table
[4] PATCH AGENT (Progress: 60% → 80%)
├─ Input: Findings from Scanner
├─ Process: Patch generation, complexity assessment
├─ Output: Code fixes with confidence scores
└─ Data: Saved to `patches` table
[5] REPORT AGENT (Progress: 80% → 100%)
├─ Input: Findings, Attacks, Patches
├─ Process: Aggregation, risk scoring, metrics
├─ Output: Executive summary + recommendations
└─ Data: Saved to `reports` table
- Sequential Processing: Each agent builds on previous results
- State Preservation: Database maintains context between agents
- Error Resilience: One agent failure doesn't halt pipeline
- Progress Tracking: Real-time frontend updates (0-100%)
- Async Execution: Non-blocking background jobs
- Token Optimization: Single LLM orchestrates multiple specialized tasks
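The sequencing, progress tracking, and error-resilience points above can be sketched as a small async loop. This is a minimal illustration, not the project's `ScanOrchestrator`: the `run_pipeline` function, the in-memory `context` dict (standing in for the database), and the three stub agents are all hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical agent signature: receives the shared context, returns its result.
AgentFn = Callable[[dict], Awaitable[dict]]


async def run_pipeline(agents: list[tuple[str, AgentFn, int]], context: dict) -> dict:
    """Run agents in order, record progress, and continue if one agent fails."""
    context.setdefault("errors", {})
    context["progress"] = 5  # the workflow above starts the scanner at 5%
    for name, agent, progress_after in agents:
        try:
            context[name] = await agent(context)
        except Exception as exc:
            # Record the failure and move on so later agents still run.
            context["errors"][name] = str(exc)
        context["progress"] = progress_after
    return context


async def scanner(ctx: dict) -> dict:
    return {"findings": ["finding-001"]}


async def threat(ctx: dict) -> dict:
    raise RuntimeError("threat data unavailable")  # simulated failure


async def report(ctx: dict) -> dict:
    # Later agents read earlier results from the shared context.
    return {"finding_count": len(ctx.get("scanner", {}).get("findings", []))}


PIPELINE = [("scanner", scanner, 20), ("threat", threat, 40), ("report", report, 100)]
result = asyncio.run(run_pipeline(PIPELINE, {}))
print(result["progress"], result["errors"])
```

In the real service the context would presumably be persisted per scan ID rather than held in a dict, so a polling frontend can read `progress` while agents are still running.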
# Pseudo-code showing LLM routing
class Agent:
    def __init__(self, llm, system_prompt):
        self.llm = llm                      # Shared LLM client instance
        self.system_prompt = system_prompt  # Agent-specific instructions

    async def execute(self, data):
        # The LLM takes its role from system_prompt and its task from the input data
        response = await self.llm.chat([
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": data},
        ])
        return response

Benefits:
- Cost Efficient: One LLM instance vs. five separate models
- Consistent: Same underlying reasoning across all agents
- Flexible: Easy to add/modify agent behaviors via prompts
- Stateful: Context flows between agents via database
- Scalable: Single LLM handles all reasoning
Purpose: Detect vulnerabilities in source code
Inputs:
- Repository URL
- Scan ID
Process:
- Bandit analysis (hardcoded secrets, dangerous functions)
- Semgrep analysis (weak crypto, dangerous execution)
- Deduplication to remove duplicate findings reported by both tools
- CWE/OWASP mapping
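One way the deduplication step could work is to keep a single finding per location and weakness. The sketch below assumes a finding is identified by file path, line number, and CWE ID, and that the higher-confidence report wins; the `tool` field and the `deduplicate` helper are illustrative, not taken from the codebase.

```python
def deduplicate(findings: list[dict]) -> list[dict]:
    """Keep one finding per (file, line, CWE), preferring the highest confidence."""
    best: dict[tuple, dict] = {}
    for finding in findings:
        key = (finding["file_path"], finding["line_number"], finding["cwe_id"])
        current = best.get(key)
        if current is None or finding.get("confidence", 0.0) > current.get("confidence", 0.0):
            best[key] = finding
    return list(best.values())


raw = [
    {"tool": "bandit", "file_path": "src/database.py", "line_number": 45,
     "cwe_id": "CWE-89", "confidence": 0.80},
    {"tool": "semgrep", "file_path": "src/database.py", "line_number": 45,
     "cwe_id": "CWE-89", "confidence": 0.95},
    {"tool": "bandit", "file_path": "src/auth.py", "line_number": 12,
     "cwe_id": "CWE-798", "confidence": 0.90},
]
unique = deduplicate(raw)
print(len(unique))
```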
Outputs:
- Finding records with:
- Vulnerability type
- Severity (Critical/High/Medium/Low)
- File path & line number
- Code snippet
- CWE ID & OWASP category
- Exploitability score (0-1)
- Confidence score (0-1)
Example Finding:
{
"id": "finding-001",
"vulnerability_type": "SQL Injection",
"severity": "critical",
"file_path": "src/database.py",
"line_number": 45,
"cwe_id": "CWE-89",
"cwe_name": "Improper Neutralization of Special Elements used in an SQL Command",
"owasp_category": "A03:2021 - Injection",
"exploitability_score": 0.95,
"code_snippet": "query = f\"SELECT * FROM users WHERE id = {user_id}\""
}
Purpose: Classify threats and assess impact
Inputs:
- Findings from Scanner Agent
- CWE/OWASP framework data
Process:
- Match finding type to threat category
- Calculate exploitability (0-1)
- Calculate impact (1-10)
- Assign threat classification
Outputs:
- Finding enrichment with:
- Threat type (injection, auth, crypto, etc.)
- Exploitability metric
- Impact rating
- Threat metadata
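The enrichment step might look like a lookup from CWE ID to threat category and impact rating. The table contents, the fallback values, and the `enrich` helper below are assumptions made for illustration; the README does not state the real mapping.

```python
# Hypothetical lookup: CWE ID -> (threat type, impact on the 1-10 scale).
THREAT_MAP = {
    "CWE-89": ("injection", 9),
    "CWE-79": ("injection", 6),
    "CWE-798": ("auth", 8),
    "CWE-327": ("crypto", 5),
}


def enrich(finding: dict) -> dict:
    """Return a copy of the finding with threat metadata attached."""
    threat_type, impact = THREAT_MAP.get(finding["cwe_id"], ("unknown", 5))
    return {**finding, "threat_type": threat_type, "impact": impact}


enriched = enrich({"id": "finding-001", "cwe_id": "CWE-89", "exploitability_score": 0.95})
print(enriched["threat_type"], enriched["impact"])
```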
Purpose: Plan realistic attack scenarios
Inputs:
- Findings with threat data
- MITRE ATT&CK framework
Process:
- Map vulnerabilities to attack templates
- Calculate success probability (0-1)
- Assign MITRE technique
- Plan attack steps
- Estimate impact
Outputs:
- Attack records with:
- Attack type (SQL Injection, RCE, etc.)
- Attack vector
- Success probability (0-1)
- Impact score (0-10)
- MITRE technique (T1190, T1110, etc.)
- Attack steps (list)
- Prerequisites
- Mitigation strategies
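Mapping a finding to an attack record could be sketched as a template lookup plus a probability estimate. The template table, the `plan_attack` helper, and the formula (exploitability multiplied by confidence, clamped to 0-1) are assumptions for this example, not the agent's documented scoring.

```python
from typing import Optional

# Hypothetical templates: vulnerability type -> MITRE ATT&CK technique and tactic.
ATTACK_TEMPLATES = {
    "SQL Injection": {"mitre_technique": "T1190", "mitre_tactic": "Initial Access"},
    "Brute Force Login": {"mitre_technique": "T1110", "mitre_tactic": "Credential Access"},
}


def plan_attack(finding: dict) -> Optional[dict]:
    """Build an attack record for a finding, or None if no template matches."""
    template = ATTACK_TEMPLATES.get(finding["vulnerability_type"])
    if template is None:
        return None
    # Assumed formula: exploitability weighted by scanner confidence.
    probability = finding["exploitability_score"] * finding.get("confidence", 1.0)
    probability = min(max(probability, 0.0), 1.0)
    return {
        "attack_type": finding["vulnerability_type"],
        "success_probability": round(probability, 2),
        **template,
    }


attack = plan_attack(
    {"vulnerability_type": "SQL Injection", "exploitability_score": 0.8, "confidence": 0.5}
)
print(attack)
```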
Example Attack:
{
"id": "attack-001",
"attack_type": "SQL Injection",
"success_probability": 0.85,
"impact_score": 9.5,
"mitre_technique": "T1190",
"mitre_tactic": "Initial Access",
"attack_steps": [
"Identify SQL injection parameter",
"Craft malicious SQL payload",
"Execute against database",
"Extract sensitive data"
]
}
Purpose: Generate security patches
Inputs:
- Findings from Scanner
- Vulnerability patterns
Process:
- Match finding to patch template
- Generate code fix
- Estimate complexity (simple/moderate/complex)
- Calculate risk (0-1)
- Determine auto-applicability
Outputs:
- Patch records with:
- Original code (vulnerable)
- Patched code (secure)
- Explanation
- Complexity rating
- Risk score
- Can auto-apply flag
- Confidence (0-1)
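To see why a parameterized-query patch closes an SQL injection hole, the following self-contained demo runs both the vulnerable and the patched form against an in-memory SQLite database. It uses the standard-library `sqlite3` module, whose placeholder is `?`; drivers such as psycopg use `%s` instead. The table and the payload are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "alice"), (2, "bob")])

user_id = "1 OR 1=1"  # attacker-controlled input

# Vulnerable: the input is interpolated into the SQL text and rewrites the query.
vulnerable_rows = conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchall()

# Patched: the input is bound as a single value and is never parsed as SQL.
patched_rows = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()

print(len(vulnerable_rows), len(patched_rows))
conn.close()
```

The interpolated query returns every row because `OR 1=1` becomes part of the statement, while the bound query compares `id` against the literal string and matches nothing.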
Example Patch:
{
"id": "patch-001",
"vulnerability_type": "SQL Injection",
"original_code": "query = f\"SELECT * FROM users WHERE id = {user_id}\"",
"patched_code": "query = \"SELECT * FROM users WHERE id = %s\"\ncursor.execute(query, (user_id,))",
"explanation": "Use parameterized queries to prevent SQL injection",
"apply_complexity": "simple",
"can_auto_apply": true,
"confidence": 0.98
}
Purpose: Compile comprehensive security report
Inputs:
- All findings, attacks, patches
- Aggregated metrics
Process:
- Aggregate findings by severity
- Calculate risk score (0-100)
- Estimate patch coverage
- Calculate remediation effort
- Generate executive summary
- Create recommendations
Outputs:
- Report record with:
- Overall risk score
- Severity breakdown (critical/high/medium/low)
- Patch coverage %
- Remediation effort (hours)
- Executive summary (text)
- Detailed findings (JSON)
- Recommendations (list)
- Metadata
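The aggregation metrics could be computed along these lines. The severity weights, the cap at 100, and the `finding_id` field linking a patch to its finding are all assumptions; the README does not specify the actual scoring formula.

```python
# Assumed severity weights; the real formula is not documented here.
SEVERITY_WEIGHTS = {"critical": 40, "high": 20, "medium": 8, "low": 2}


def risk_score(findings: list[dict]) -> int:
    """Sum severity weights and cap the result on the 0-100 scale."""
    raw = sum(SEVERITY_WEIGHTS.get(f["severity"], 0) for f in findings)
    return min(100, raw)


def patch_coverage(findings: list[dict], patches: list[dict]) -> float:
    """Percentage of findings that have at least one patch."""
    if not findings:
        return 100.0
    patched_ids = {p["finding_id"] for p in patches}
    covered = sum(1 for f in findings if f["id"] in patched_ids)
    return round(100.0 * covered / len(findings), 1)


findings = [
    {"id": "finding-001", "severity": "critical"},
    {"id": "finding-002", "severity": "high"},
    {"id": "finding-003", "severity": "low"},
    {"id": "finding-004", "severity": "low"},
]
patches = [{"finding_id": "finding-001"}, {"finding_id": "finding-002"},
           {"finding_id": "finding-003"}]

score = risk_score(findings)
coverage = patch_coverage(findings, patches)
print(score, coverage)
```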
- Framework: Next.js 15.5.18
- Language: TypeScript (strict mode)
- Styling: Tailwind CSS
- UI Components: Custom + shadcn/ui patterns
- State Management: React hooks + Context API
- HTTP Client: Fetch API + async/await
- Build: Webpack (Next.js default)
- Framework: FastAPI 0.104.1
- Language: Python 3.11+
- Async: asyncio + FastAPI async routes
- Database ORM: SQLAlchemy 2.0+ (async)
- DB Driver: aiosqlite (dev), asyncpg (prod)
- Auth: JWT tokens
- Validation: Pydantic v2
- API Docs: Swagger UI (auto-generated)
- Development: SQLite with aiosqlite
- Production: PostgreSQL with asyncpg
- Caching: Redis (optional)
- ORM Models: SQLAlchemy with async support
- Orchestration: Custom ScanOrchestrator service
- Agent Base: Abstract BaseAgent class
- LLM Integration: Ready for OpenAI/Claude/Ollama
- Frameworks: MITRE ATT&CK, CWE, OWASP mapping
- Container: Docker + Docker Compose
- Port (Frontend): 3000
- Port (Backend): 8000
- Environment: .env configuration
- Python 3.11+
- Node.js 18+
- npm or yarn
- PostgreSQL 12+ (production)
- Redis (optional)
git clone https://github.com/yourusername/sentinel.git
cd sentinel
# Create Python virtual environment
cd backend
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create .env file
cp .env.example .env
# Edit .env with your configuration
nano .env
# Run database migrations (if applicable)
# alembic upgrade head
# Start backend server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# In new terminal, from project root
cd frontend
# Install dependencies
npm install
# Create .env.local file
cp .env.example .env.local
# Edit environment variables
nano .env.local
# Start frontend dev server
npm run dev
# Check backend health
curl http://localhost:8000/health
# Frontend should be accessible at
# http://localhost:3000
# Swagger API docs at
# http://localhost:8000/docs
# Database
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/sentinel
# For development: sqlite+aiosqlite:///./sentinel.db
# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=true
# JWT
JWT_SECRET=your-super-secret-key-change-this-in-production
JWT_ALGORITHM=HS256
JWT_EXPIRATION_HOURS=24
# LLM Configuration (when integrating)
OPENAI_API_KEY=sk-your-key-here
LLM_MODEL=gpt-4-turbo
# Redis (optional)
REDIS_URL=redis://localhost:6379/0
# CORS
CORS_ORIGINS=http://localhost:3000,http://localhost:8000
# Logging
LOG_LEVEL=INFO
# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:8000
# Feature Flags
NEXT_PUBLIC_ENABLE_DEMO=true
Terminal 1 - Backend:
cd backend
source venv/bin/activate
uvicorn main:app --reload
Terminal 2 - Frontend:
cd frontend
npm run dev
Then visit: http://localhost:3000
# From project root
docker-compose up -d
# View logs
docker-compose logs -f
# Stop
docker-compose down
POST /api/auth/login - Login with credentials
POST /api/auth/register - Register new account
GET /api/auth/me - Get current user
GET /api/scans - List all scans (paginated)
POST /api/scans - Create new scan (triggers orchestration)
GET /api/scans/{scan_id} - Get scan details with metrics
GET /api/scans/{scan_id}/findings - Get findings for scan
GET /api/findings/{finding_id} - Get specific finding
GET /api/scans/{scan_id}/attacks - Get attack scenarios
GET /api/attacks/{attack_id} - Get specific attack
GET /api/scans/{scan_id}/patches - Get patches for scan
GET /api/patches/{patch_id} - Get specific patch
GET /api/scans/{scan_id}/report - Get comprehensive report
GET /health - Health check
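A client can follow a scan by polling `GET /api/scans/{scan_id}` until it finishes. The sketch below takes the HTTP call as an injected `fetch` function so it runs without a server; the `status` and `progress` response fields and the `completed`/`failed` values are assumptions about the response shape, not documented here.

```python
import time
from typing import Callable


def wait_for_scan(fetch: Callable[[str], dict], scan_id: str,
                  interval: float = 2.0, max_polls: int = 150) -> dict:
    """Poll the scan endpoint until the scan reaches a terminal status."""
    path = f"/api/scans/{scan_id}"
    for _ in range(max_polls):
        scan = fetch(path)
        if scan.get("status") in ("completed", "failed"):
            return scan
        time.sleep(interval)
    raise TimeoutError(f"scan {scan_id} did not finish after {max_polls} polls")


# Stand-in for a real HTTP call: replays three canned responses.
responses = iter([
    {"status": "running", "progress": 20},
    {"status": "running", "progress": 60},
    {"status": "completed", "progress": 100},
])
requested_paths = []


def fake_fetch(path: str) -> dict:
    requested_paths.append(path)
    return next(responses)


final = wait_for_scan(fake_fetch, "scan-123", interval=0)
print(final)
```

With a running backend, `fetch` would wrap an authenticated HTTP GET against `http://localhost:8000` and return the decoded JSON body.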
sentinel/
├── frontend/                          # Next.js frontend
│   ├── src/
│   │   ├── app/                       # Pages (dashboard, scan, threat, etc.)
│   │   ├── components/                # React components
│   │   ├── hooks/                     # Custom React hooks
│   │   ├── lib/                       # Utilities
│   │   ├── services/                  # API service layer
│   │   └── types/                     # TypeScript interfaces
│   ├── public/                        # Static assets
│   ├── package.json
│   ├── next.config.ts
│   ├── tsconfig.json
│   ├── tailwind.config.js
│   └── .env.example
│
├── backend/                           # FastAPI backend
│   ├── app/
│   │   ├── agents/                    # Agent implementations
│   │   │   ├── base.py                # BaseAgent abstract class
│   │   │   ├── scanner_agent.py
│   │   │   ├── threat_agent.py
│   │   │   ├── attack_agent.py
│   │   │   ├── patch_agent.py
│   │   │   ├── report_agent.py
│   │   │   └── __init__.py
│   │   ├── api/                       # API routes
│   │   │   ├── scans.py
│   │   │   ├── findings.py
│   │   │   ├── attacks.py
│   │   │   ├── patches.py
│   │   │   ├── reports.py
│   │   │   ├── auth.py
│   │   │   └── __init__.py
│   │   ├── models/                    # Database models
│   │   │   ├── orm.py                 # SQLAlchemy models
│   │   │   ├── schemas.py             # Pydantic schemas
│   │   │   └── __init__.py
│   │   ├── services/                  # Business logic
│   │   │   ├── orchestrator.py        # Multi-agent orchestration
│   │   │   ├── auth_service.py
│   │   │   └── __init__.py
│   │   ├── utils/                     # Utilities
│   │   │   ├── helpers.py
│   │   │   ├── cwe_mapping.py
│   │   │   ├── mitre_mapping.py
│   │   │   └── __init__.py
│   │   ├── database.py                # Database connection
│   │   ├── config.py                  # Configuration
│   │   └── __init__.py
│   ├── main.py                        # Entry point
│   ├── requirements.txt               # Python dependencies
│   ├── .env.example
│   └── pytest.ini
│
├── docs/                              # Documentation
│   ├── ARCHITECTURE.md                # System architecture
│   ├── INSTALLATION.md                # Setup guide
│   ├── API_DOCUMENTATION.md           # API reference
│   ├── MULTI_AGENT_EXPLANATION.md     # Agent details
│   └── JUDGE_VALIDATION_GUIDE.md      # Hackathon guide
│
├── screenshots/                       # UI screenshots
├── demo/                              # Demo scripts
├── docker-compose.yml                 # Docker configuration
├── Dockerfile                         # Docker image
├── README.md                          # This file
├── .gitignore                         # Git ignore rules
├── .env.example                       # Environment template
└── LICENSE                            # MIT License
- ✅ Scans code for vulnerabilities
- ✅ Simulates realistic attacks
- ✅ Generates patches
- ✅ Creates reports
- ❌ Execute actual attacks
- ❌ Modify source code without approval
- ❌ Store sensitive credentials
- ❌ Bypass authentication
- Environment Variables: All secrets in .env (gitignored)
- JWT Tokens: Short expiration + refresh tokens
- Database: Use PostgreSQL with SSL in production
- CORS: Restricted to trusted origins only
- Input Validation: Pydantic schema validation on all inputs
- API Keys: Rotate regularly, use scoped keys
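The short-expiration point for JWTs can be illustrated with a standard-library HS256 sketch (HS256 matches `JWT_ALGORITHM` in the sample `.env`). This is only a demonstration of signing and expiry checking; the helper names are hypothetical, and a real backend would more likely rely on a maintained JWT library than on hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: str, ttl_seconds: int) -> str:
    """Create an HS256 token whose `exp` claim is ttl_seconds from now."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "exp": int(time.time()) + ttl_seconds}
    signing_input = (_b64url(json.dumps(header).encode()) + "."
                     + _b64url(json.dumps(payload).encode()))
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: str) -> dict:
    """Check the signature and expiry; return the payload or raise ValueError."""
    if token.count(".") != 2:
        raise ValueError("malformed token")
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected).encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload


token = sign_token({"sub": "user-1"}, "dev-only-secret", ttl_seconds=900)
claims = verify_token(token, "dev-only-secret")
print(claims["sub"])
```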
- Run with test credentials (see .env.example)
- No real data is harmed
- Everything is reversible
- See JUDGE_VALIDATION_GUIDE.md for details
- Real LLM integration (OpenAI GPT-4, Anthropic Claude)
- Advanced ML-based vulnerability detection
- Automated patch application with rollback
- CI/CD pipeline integration
- Team collaboration features
- Compliance reporting (SOC 2, ISO 27001)
- Custom rule engine
- Mobile app
- Slack/Teams notifications
- Historical trend analysis
Challenge: Ensuring agents execute in sequence while maintaining state
Solution: Built ScanOrchestrator service with database-backed context
Challenge: Preventing runaway token usage with 5 agents
Solution: Centralized LLM with specialized system prompts instead of separate models
Challenge: SQLAlchemy async with multiple agents running simultaneously
Solution: AsyncSession management with proper connection pooling
Challenge: Tracking progress from background agent execution
Solution: REST API with polling + WebSocket ready architecture
- Agent Orchestration: Proper sequencing matters more than parallel execution
- LLM Efficiency: One powerful model beats five specialized ones
- State Management: Database as agent communication layer works well
- Async Architecture: Critical for handling multiple agents at scale
- DevX: Good error messages and logging save debugging time
- Async/await throughout (no blocking I/O)
- Connection pooling (50+ concurrent requests)
- Redis caching for frequent queries
- Database indexing on scan_id, finding_id, status
- Pagination on list endpoints
- Docker containers with health checks
- Kubernetes orchestration ready
- Auto-scaling based on scan queue
- CDN for static frontend assets
- Separate read replicas for reporting
For Security Teams:
- Automated vulnerability discovery
- Realistic attack simulation
- Actionable remediation patches
- Executive reporting
For Developers:
- Easy to understand codebase
- Extensible agent framework
- Production-ready FastAPI backend
- Modern React frontend
For Enterprises:
- Scalable to thousands of scans
- Compliant architecture (SOC 2 ready)
- Cost-effective (single LLM vs. multiple tools)
- Integrates with existing pipelines
Built collaboratively by:
- Sambhav Jain
- Yug Agrawal
- Divi Chopra
Made with ❤️ for the Hackathon Community
Stars ⭐ are appreciated! Fork and contribute!