🧠 Cortex

An agentic assistant that thinks over your own documents — retrieve, reason, and calculate.

Cortex is a small but complete agentic RAG application. An LLM sits inside a LangGraph loop and decides, turn by turn, whether to answer directly, search a local knowledge base, or run a calculation — then keeps looping until it has a final answer. A Streamlit front-end shows every tool call live, so you can actually see the agent reason.

Built as a hands-on study of how tool-calling agents and retrieval fit together in one graph.

What it does

📚 Retrieval over your own docs — drop .txt, .pdf, or .docx files into a folder, embed them once, and the agent can semantically search them.
🧮 Calculator tools — add, multiply, divide (with a divide-by-zero guard).
🔁 Self-directed agent loop — the model picks tools on its own and stops when it's done; a hard cap prevents runaway loops.
👀 Glass-box UI — a sidebar trace lists each tool the agent called, the arguments it passed, and what came back.
⌨️ Headless CLI — prefer the terminal? Run it without the UI, interactively or one-shot.

How it works

        ┌──────────────┐
        │  user input  │
        └──────┬───────┘
               ▼
        ┌──────────────┐  wants a tool?  ┌──────────────┐
   ┌───▶│   llm_call   ├────────────────▶│   tool_node  │
   │    │   (ChatGroq) │                 │ search/math  │
   │    └──────┬───────┘◀────────────────┴──────────────┘
   │           │            tool result
   │  no tool  │
   └───────────┘
               ▼
        ┌──────────────┐
        │ final answer │
        └──────────────┘
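
The loop in the diagram can be sketched in plain Python. This is a simplified illustration, not the code in agent.py: `run_agent`, `ModelReply`, `ToolCall`, `scripted_model`, and the `MAX_STEPS` value are all hypothetical names, and a scripted stand-in replaces the real Groq model so the snippet runs without API keys.

```python
from dataclasses import dataclass, field

MAX_STEPS = 8  # hypothetical cap; the project's actual limit is not stated here


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class ModelReply:
    text: str = ""
    tool_calls: list = field(default_factory=list)


def run_agent(question, model, tools, max_steps=MAX_STEPS):
    """Alternate between the model and its tools until the model stops asking for tools."""
    messages = [("user", question)]
    trace = []  # one (name, args, result) entry per tool call, like the sidebar trace
    for _ in range(max_steps):
        reply = model(messages)
        if not reply.tool_calls:
            return reply.text, trace  # "no tool" edge: this is the final answer
        for call in reply.tool_calls:
            result = tools[call.name](**call.args)
            trace.append((call.name, call.args, result))
            messages.append(("tool", f"{call.name} -> {result}"))
    return "Stopped: step limit reached.", trace


def scripted_model(messages):
    """Stand-in for the LLM: requests one multiplication, then answers."""
    tool_results = [content for role, content in messages if role == "tool"]
    if not tool_results:
        return ModelReply(tool_calls=[ToolCall("multiply", {"a": 42, "b": 7})])
    return ModelReply(text=f"The result is {tool_results[-1].split(' -> ')[1]}.")


answer, trace = run_agent(
    "What's 42 multiplied by 7?",
    scripted_model,
    {"multiply": lambda a, b: a * b},
)
print(answer)
print(trace)
```

The step limit is what keeps a model that never stops requesting tools from looping indefinitely: once `max_steps` model turns have passed, the loop returns a fallback message along with whatever trace it collected.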

Reasoning / tool choice → Groq (openai/gpt-oss-20b) for fast inference.
Embeddings → Google Gemini (gemini-embedding-001).
Vector store → FAISS, persisted to disk so you embed once and reuse the index across runs.
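
To make "semantic search" concrete, here is a brute-force version of what a top-3 lookup does. It is only a sketch: the three-dimensional vectors are made-up stand-ins for real Gemini embeddings, `cosine` and `top_k` are hypothetical helpers, and cosine similarity is an assumption (the metric actually used depends on how the FAISS index is built).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=3):
    """Return the k chunk texts whose vectors are most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "index": (chunk text, embedding) pairs with invented 3-d vectors.
indexed = [
    ("RAG combines retrieval with generation.", [0.9, 0.1, 0.0]),
    ("Transformers use self-attention.", [0.1, 0.9, 0.0]),
    ("FAISS stores vectors for similarity search.", [0.7, 0.2, 0.1]),
    ("Bananas are rich in potassium.", [0.0, 0.1, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], indexed)
for text in results:
    print(text)
```

A vector library such as FAISS answers the same question without scoring every chunk in a Python loop, which matters once the knowledge base grows beyond a handful of documents.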

Tools the agent can reach

| Tool | What it does |
| --- | --- |
| `search_docs` | Semantic search over the FAISS knowledge base (returns the top 3 matches) |
| `add` | Adds two integers |
| `multiply` | Multiplies two integers |
| `divide` | Divides two integers, with a guard against division by zero |
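
The calculator tools are small enough to sketch in full. The version below is an illustration rather than a copy of agent.py: it assumes the divide-by-zero guard returns an error message instead of raising, and it leaves out whatever registration step exposes the functions to the model.

```python
from typing import Union


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


def divide(a: int, b: int) -> Union[float, str]:
    """Divide a by b; return an error message instead of raising when b is zero."""
    if b == 0:
        return "Error: cannot divide by zero."
    return a / b


print(multiply(42, 7))
print(divide(10, 4))
print(divide(1, 0))
```

Returning a message rather than raising lets the model read the failure as an ordinary tool result and explain it to the user, instead of the whole turn crashing.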

Project layout

.
├── agent.py        # The LangGraph agent: state, tools, and the graph wiring
├── app.py          # Streamlit chat UI with a live tool-trace sidebar
├── ingest.py       # Build the FAISS index from your documents (run once)
├── main.py         # Terminal CLI — interactive chat or single-shot question
├── sample_docs/    # Source documents (.txt / .pdf / .docx)
├── faiss_index/    # Generated vector index (created by ingest.py)
├── pyproject.toml  # Dependencies (managed with uv)
└── .env.example    # Template for your API keys

Getting started

Requirements

Python 3.12+
uv for dependency management (or plain pip)

1 · Install

git clone https://github.com/Ranu92/Cortex.git
cd Cortex
uv sync

2 · Add your API keys

Copy the template and fill in your own keys:

cp .env.example .env

GOOGLE_API_KEY=...   # from https://aistudio.google.com/apikey
GROQ_API_KEY=...     # from https://console.groq.com/keys

.env is git-ignored — your keys stay local and never get committed.
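
If a key is missing, the failure usually shows up later as an opaque authentication error. A startup check along these lines can surface it earlier. This is a hypothetical helper, not part of the project: `missing_keys` is an invented name, and the sketch only inspects environment variables, so it assumes the values from .env have already been loaded into the environment.

```python
import os

REQUIRED_KEYS = ("GOOGLE_API_KEY", "GROQ_API_KEY")


def missing_keys(env=os.environ):
    """Return the required key names that are unset or blank in the given mapping."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
```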

3 · Build the knowledge base

Put your documents in sample_docs/, then embed them:

uv run python ingest.py

This reads every .txt, .pdf, and .docx in sample_docs/, splits them into ~500-character chunks on paragraph boundaries, embeds them with Gemini, and writes the FAISS index to faiss_index/.
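
Splitting on paragraph boundaries means whole paragraphs are packed into a chunk until the next one would push it past the size limit. The standard-library sketch below shows the idea; `chunk_text` is a hypothetical function, ingest.py may well use a library splitter instead, and details such as chunk overlap are not covered.

```python
def chunk_text(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate  # the paragraph still fits in the open chunk
            continue
        if current:
            chunks.append(current)  # close the open chunk before starting a new one
        # A single oversized paragraph is hard-split so no chunk exceeds the limit.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks


sample = "a" * 300 + "\n\n" + "b" * 300 + "\n\n" + "c" * 100
for chunk in chunk_text(sample):
    print(len(chunk))
```

Keeping paragraphs intact where possible means each embedded chunk tends to cover one coherent idea, which generally makes retrieved passages easier for the model to use.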

4 · Run it

Web UI:

uv run streamlit run app.py

Then open http://localhost:8501.

Or the terminal:

uv run python main.py                  # interactive chat
uv run python main.py "What is RAG?"   # one-shot question
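
The two terminal modes differ only in whether a question is passed on the command line. A minimal sketch of that dispatch, assuming hypothetical names (`question_from_argv`, `main`, the `you> ` prompt, and the `exit`/`quit` commands are illustrative, and `ask` stands in for the real agent call):

```python
def question_from_argv(argv):
    """Return the one-shot question, or None when interactive chat should start."""
    words = argv[1:]
    return " ".join(words) if words else None


def main(argv, ask):
    """Answer once if a question was given; otherwise chat until the user quits."""
    question = question_from_argv(argv)
    if question is not None:
        print(ask(question))
        return
    while True:
        try:
            line = input("you> ").strip()
        except EOFError:
            break  # end of input (e.g. Ctrl-D) ends the session
        if line.lower() in {"exit", "quit"}:
            break
        if line:
            print(ask(line))
```

Called as `main(sys.argv, ask=...)`, this answers once and exits when arguments are present, and otherwise keeps prompting until the input ends.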

Try asking

"What is RAG?" or "Explain how transformers work" → triggers search_docs
"What's 42 multiplied by 7?" → triggers multiply
"If an embedding has 768 dimensions, how many numbers is that across 5 documents?" → search and math in one turn

In the web UI, watch the sidebar fill in with each tool call as the agent works.

Managing documents

Supported formats: .txt, .pdf, .docx.

To add, remove, or update knowledge:

1. Change the files in sample_docs/.
2. Re-run `uv run python ingest.py`.

ingest.py rebuilds the index from scratch on every run. Any change to sample_docs/, including deleting a file, takes effect only after you re-run it; until then, faiss_index/ still holds the old vectors.

ℹ️ Scanned/image-only PDFs won't yield text (they'd need OCR). Files with no extractable text are skipped and reported during ingest.
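
The filtering described in this section (keep supported formats, skip files with no extractable text, report what was skipped) can be sketched as follows. `load_documents` is a hypothetical helper, the text extractor is passed in as a parameter so the snippet needs no PDF or Word libraries, and scanning only the top level of the folder is an assumption.

```python
from pathlib import Path

SUPPORTED = {".txt", ".pdf", ".docx"}


def load_documents(folder, extract_text):
    """Return (loaded, skipped): text keyed by file name, and names that yielded no text."""
    loaded, skipped = {}, []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue  # unsupported formats are ignored entirely
        text = extract_text(path).strip()
        if text:
            loaded[path.name] = text
        else:
            skipped.append(path.name)  # e.g. a scanned PDF with no text layer
    return loaded, skipped
```

Printing the `skipped` list at the end of an ingest run shows which files never made it into the index.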

Tech stack

| Package | Role |
| --- | --- |
| `langgraph` | Agent graph & control flow |
| `langchain` core | LLM and tool abstractions |
| `langchain-groq` | Groq chat model |
| `langchain-google-genai` | Gemini embeddings |
| `langchain-community` | FAISS integration |
| `faiss-cpu` | Vector similarity search |
| `pypdf` / `docx2txt` | PDF and Word text extraction |
| `streamlit` | Web interface |
| `python-dotenv` | Loads keys from `.env` |
