SecRAG-X integrates enterprise assets, CVEs, CWEs, and MITRE ATT&CK techniques into a unified Neo4j knowledge graph and combines it with FAISS vector search and Ollama-hosted LLMs for context-aware cybersecurity reasoning and risk assessment.

```mermaid
graph TD
    A[User / Browser Dashboard] --> B[Flask API - server.py]
    B --> C[Reasoning Engine - explane.py]
    C --> D[(Neo4j Knowledge Graph)]
    C --> E[FAISS Vector Store]
    C --> F[Ollama LLM + Embeddings]
    D --> G[CVEs / CWEs / CPEs]
    D --> H[Assets / Network Topology]
    D --> I[MITRE ATT&CK Techniques]
```

```mermaid
graph LR
    UserQuery["User Query"] --> LLM["Llama 3 (Ollama)"]
    LLM --> KG[("Neo4j Knowledge Graph")]
    LLM --> VS[("FAISS Vector Store")]
    KG --> RAG["RAG Reasoning Response"]
    VS --> RAG
```

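The flow in the diagram above (graph results and vector-store results feeding one grounded response) can be sketched as a prompt-assembly step. This is a minimal illustration, not the code in `explane.py`: the `Evidence` class, the `build_grounded_prompt` function, the `graph`/`docs` source tags, and the sample facts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str  # "graph" or "docs" (hypothetical tags for this sketch)
    text: str


def build_grounded_prompt(question: str, graph_facts: list[str], doc_snippets: list[str]) -> str:
    """Merge graph facts and document snippets into one numbered, source-tagged prompt."""
    evidence = [Evidence("graph", fact) for fact in graph_facts]
    evidence += [Evidence("docs", snippet) for snippet in doc_snippets]
    if evidence:
        context = "\n".join(
            f"[{i}] ({item.source}) {item.text}" for i, item in enumerate(evidence, start=1)
        )
    else:
        # Make missing evidence explicit instead of leaving the model to guess.
        context = "(no evidence retrieved)"
    return (
        "Answer using only the numbered evidence below. "
        "If the evidence is insufficient, say so.\n\n"
        f"Evidence:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Toy inputs; in a real run these would come from Neo4j and the vector store.
prompt = build_grounded_prompt(
    "Why is SRV-042 risky?",
    graph_facts=["SRV-042 runs software affected by a CVE (toy fact)"],
    doc_snippets=["Toy documentation snippet about patching."],
)
print(prompt)
```

Numbering each piece of evidence and tagging its source makes it easier to check afterwards which retrieved item an answer relied on.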
| Feature | Description |
|---|---|
| Knowledge Graph | Neo4j graph of assets, software, CVEs, CWEs, network topology, and MITRE ATT&CK |
| Hybrid Retrieval | FAISS vector search + graph traversal for accurate, contextual answers |
| Local LLM | Ollama-backed reasoning; fully offline, no API keys needed |
| Intent Detection | Safe handling of vague, unsafe, or out-of-scope security queries |
| Live Dashboard | Browser UI with graph visualization, risk summaries, and asset drilldowns |
| Test Suite | Tests for API, graph schema, alignment, reasoning, and no-graph fallback |

| Feature | Traditional Tools | SecRAG-X |
|---|---|---|
| Vulnerability Analysis | Isolated | Graph-based contextual |
| Attack Mapping | Limited | Integrated MITRE ATT&CK |
| Query Handling | Manual filtering | Natural language |
| Semantic Retrieval | Not available | FAISS-based |
| AI Reasoning | Not available | Ollama-powered |
| Visualization | Basic dashboards | Interactive graph |
| Layer | Technology | Purpose |
|---|---|---|
| Language Model | Llama 3 (via Ollama) | Local cybersecurity reasoning & explanation |
| Embeddings | Nomic Embed Text | Semantic embeddings for local document retrieval |
| Graph Database | Neo4j | Knowledge Graph for CVEs, CWEs, assets, and MITRE ATT&CK |
| Vector Store | FAISS | Semantic similarity retrieval over offline documentation |
| Backend Framework | Flask (Python) | REST API endpoints for reasoning and queries |
| Frontend UI | HTML5 / CSS3 / Vanilla JS | Interactive browser dashboard with D3.js graph visualization |
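
To make the "semantic similarity retrieval" row concrete, here is a small pure-Python stand-in for what a vector store does: rank stored vectors by cosine similarity to a query vector. It is not FAISS and not the project's `vector_store.py`; the three-dimensional vectors and document names are made up, and real embeddings from an embedding model have far more dimensions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    q = normalize(query)
    scored = []
    for doc_id, vec in corpus.items():
        v = normalize(vec)
        scored.append((doc_id, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; names and values are illustrative only.
corpus = {
    "patching-guide": [0.9, 0.1, 0.0],
    "ransomware-notes": [0.1, 0.9, 0.2],
    "network-hardening": [0.2, 0.2, 0.9],
}
results = top_k([0.0, 1.0, 0.1], corpus, k=2)
print([doc_id for doc_id, _ in results])  # ['ransomware-notes', 'network-hardening']
```

A dedicated vector index does the same ranking over many more vectors and dimensions, which is why the brute-force loop here is only suitable for illustration.
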
The system has been benchmarked and verified against a comprehensive cybersecurity dataset:
| Metric | Target Value | Verified Status |
|---|---|---|
| Vulnerability Nodes (CVEs) | ~60,000 | 59,210 populated |
| Weakness Nodes (CWEs) | ~1,000 | 969 populated |
| Attack Techniques (MITRE ATT&CK) | ~700 | 691 populated |
| Enterprise Assets | 50 | 50 mock assets linked |
| Relationships (Edges) | ~120,000+ | 122,877 edges mapped |
| Intent Detection Accuracy | >95% | 100% in reliability tests |
| Multi-Hop Reasoning Depth | Up to 4 Hops | Asset → Software → CVE → CWE → ATT&CK |
| Reliability Test Suite Pass Rate | 100% | 186/186 test cases passing |
| Hallucination Rate | 0.0% | Zero (restricted to graph-grounded evidence) |

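The four-hop chain in the table (Asset → Software → CVE → CWE → ATT&CK) can be illustrated with a bounded path expansion over a toy adjacency map. This is only a sketch of the idea: the project stores the graph in Neo4j, and every node identifier below, as well as the `expand_paths` helper, is invented for the example.

```python
# Toy adjacency map; all identifiers are placeholders, not real graph data.
EDGES = {
    "asset:SRV-EXAMPLE": ["software:example-db"],
    "software:example-db": ["cve:CVE-EXAMPLE"],
    "cve:CVE-EXAMPLE": ["cwe:CWE-EXAMPLE"],
    "cwe:CWE-EXAMPLE": ["attack:T-EXAMPLE"],
}


def expand_paths(start, edges, max_hops=4):
    """Return every path from `start` that ends at a dead end or after max_hops edges."""
    complete = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        hops = len(path) - 1
        # Skip nodes already on the path so cycles cannot loop forever.
        neighbours = [n for n in edges.get(path[-1], []) if n not in path]
        if hops == max_hops or not neighbours:
            complete.append(path)
            continue
        for neighbour in neighbours:
            stack.append(path + [neighbour])
    return complete


for path in expand_paths("asset:SRV-EXAMPLE", EDGES):
    print(" -> ".join(path))
```

Capping the hop count keeps the expansion bounded on dense graphs, and each returned path doubles as the evidence trail for an answer.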
```
secRAG-X/
├── static/               # Browser dashboard (HTML/CSS/JS)
├── server.py             # Flask API and graph endpoints
├── explane.py            # Main reasoning and intent engine
├── data_ingest.py        # Neo4j ingestion pipeline
├── build_knowledge.py    # FAISS knowledge base builder
├── vector_store.py       # Embedding and vector search helpers
├── rag_engine.py         # Lightweight RAG wrapper
├── mapping_engine.py     # Graph mapping utilities
├── asset.py              # Mock enterprise asset generator
├── network_topology.py   # Mock topology/SBOM generator
├── test_*.py             # Validation and regression tests
├── requirements.txt      # Python dependencies
├── .env.example          # Environment variable template
└── LICENSE               # MIT License
```

Clone the repository

```bash
git clone https://github.com/JENITH47/secRAG-X.git
cd secRAG-X
```

Install dependencies

```bash
pip install -r requirements.txt
```

Start Neo4j and configure credentials

```bash
docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your_password_here \
  neo4j:latest
cp .env.example .env
# Edit .env with your Neo4j credentials
```

Pull Ollama models and build the knowledge graph

```bash
ollama pull llama3
ollama pull nomic-embed-text
python data_ingest.py
python build_knowledge.py
```

Launch the server

```bash
python server.py
```

Open http://localhost:5000 in your browser.

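Before running the ingestion scripts, it can help to confirm that `.env` is filled in. The sketch below parses `KEY=VALUE` lines and reports empty or missing entries. The variable names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` are assumptions made for this example; the authoritative list is whatever `.env.example` defines.

```python
# Hypothetical variable names; check .env.example for the real ones.
REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


# Sample content with placeholder values; the password is left empty on purpose.
sample = """
# Neo4j connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=
"""
print(missing_keys(parse_env(sample)))  # ['NEO4J_PASSWORD']
```

To check the real file instead of the sample string, pass `pathlib.Path(".env").read_text()` to `parse_env`.
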
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/ask` | Answer a natural language security question |
| GET | `/api/summary` | Graph totals: assets, CVEs, weaknesses, attacks |
| GET | `/api/risks` | Highest-risk assets |
| GET | `/api/attacks` | Likely attack techniques |
| GET | `/api/exposure` | Attack exposure metrics |
| GET | `/api/asset/<id>` | Detailed context for a single asset |

Example request:

```bash
curl -X POST http://localhost:5000/api/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "Which systems are most vulnerable?"}'
```

Example questions:

- Which systems are most vulnerable?
- Why is SRV-042 risky?
- Is SRV-010 at risk from ransomware?
- Compare the risk between SRV-010 and SRV-015.
- Which software creates the most exposure?
- Show me the top 10 risks.
- Is MySQL causing issues?
- What is our attack exposure?

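The same `/api/ask` call can be made from Python with only the standard library. This is a hedged sketch: `build_ask_request`, `asset_url`, and `send` are hypothetical helper names, and `send` assumes the server replies with JSON, which this README does not specify.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:5000"


def build_ask_request(question, base_url=BASE_URL):
    """Build the POST /api/ask request with a JSON body, mirroring the curl example."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def asset_url(asset_id, base_url=BASE_URL):
    """Build the GET /api/asset/<id> URL, percent-encoding the asset ID."""
    return f"{base_url}/api/asset/{quote(asset_id, safe='')}"


def send(request, timeout=30):
    """Send a request and decode the reply; assumes the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_ask_request("Which systems are most vulnerable?")
print(req.get_method(), req.full_url)
print(asset_url("SRV-042"))
```

With the server running, `send(req)` performs the call; building the request separately keeps the example runnable without a live server.
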
Run the test suite after Neo4j is populated:

```bash
python test_api.py
python test_graph_schema.py
python test_alignment.py
python test_no_graph.py
```

For the full 11-section reliability test:

```bash
python test_full_system.py
```

- Real-time threat intelligence integration
- Live intrusion detection support
- Automated cybersecurity response mechanisms
- Large-scale distributed deployment
- Real-time network traffic analysis

- Large/generated datasets and vector index files are excluded from git.
- Keep production credentials out of source control; use `.env` for local configuration.
- The `.env.example` file shows all required environment variables.

- Fork the repository
- Create a feature branch (`git checkout -b feat/your-feature`)
- Commit your changes (`git commit -m 'feat: add new feature'`)
- Push to the branch (`git push origin feat/your-feature`)
- Open a Pull Request

Jenith

⭐ Star this repo if you find it useful!