A FastAPI service that accepts a GitHub repository URL and returns an LLM-generated summary: what the project does, what technologies it uses, and how it's structured.
- Python 3.10+
git clone <your-repo-url>
cd <repo-folder>
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
pip install -r requirements.txt
export NEBIUS_API_KEY=your_key_here # macOS/Linux
set NEBIUS_API_KEY=your_key_here # Windows
Optionally, set a GitHub token to raise the API rate limit from 60 to 5000 requests/hour:
export GITHUB_TOKEN=your_github_token
uvicorn main:app --reload
The server starts at http://localhost:8000.
curl -X POST http://localhost:8000/summarize \
-H "Content-Type: application/json" \
-d '{"github_url": "https://github.com/psf/requests"}'
Expected response:
{
"summary": "...",
"technologies": ["Python", "..."],
"structure": "..."
}
Create a zip of the project (exclude .env and venv/):
zip -r solution.zip . --exclude ".env" --exclude "venv/*" --exclude ".git/*" --exclude "__pycache__/*"
Model used: deepseek-ai/DeepSeek-V3-0324-fast via Nebius Token Factory.
DeepSeek-V3 is the default model on Nebius Token Factory, with strong reasoning ability and reliable structured JSON output. The -fast variant gives lower latency without meaningful quality loss for summarization tasks.
Context Management: Instead of sending all files, we fetch the full recursive directory tree first. This gives the LLM a structural "map" of the project with minimal tokens.
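As a rough illustration of the "map" idea, the sketch below turns a flat list of repository paths into a compact indented outline. It assumes the paths have already been retrieved (for example from GitHub's Git Trees API with the recursive option); `render_tree` is a hypothetical helper name, not necessarily what this project uses.

```python
def render_tree(paths: list[str]) -> str:
    """Render flat repo paths as an indented outline, one entry per line."""
    # Build nested dicts: each key is a path component, each value its children.
    root: dict = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.setdefault(part, {})

    lines: list[str] = []

    def walk(node: dict, depth: int) -> None:
        # Sort names so the outline is stable regardless of input order.
        for name in sorted(node):
            children = node[name]
            suffix = "/" if children else ""  # mark directories with a trailing slash
            lines.append("  " * depth + name + suffix)
            walk(children, depth + 1)

    walk(root, 0)
    return "\n".join(lines)
```

An outline like this costs one short line per file, which is typically far cheaper in tokens than sending the file contents themselves.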
Heuristic Scoring: Files are ranked based on a tier system:
- Tier 0–1: README and dependency manifests (highest architectural signal)
- Tier 2–3: Entry points and source code
- Tier 4–5: Config files and everything else
Within each tier, shallower files rank higher — main.py beats src/api/v2/main.py.
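The tier-plus-depth ranking can be sketched as a sort key. The file-name and extension sets below are illustrative assumptions chosen for the example; the project's actual lists may differ, and `tier` and `rank` are hypothetical names.

```python
from pathlib import PurePosixPath

# Assumed example sets, not the project's real configuration.
README_NAMES = {"readme", "readme.md", "readme.rst", "readme.txt"}
MANIFEST_NAMES = {"requirements.txt", "pyproject.toml", "package.json", "go.mod"}
ENTRY_POINT_NAMES = {"main.py", "app.py", "index.js", "index.ts", "main.go"}
SOURCE_SUFFIXES = {".py", ".js", ".ts", ".go", ".rs", ".java"}
CONFIG_SUFFIXES = {".yml", ".yaml", ".toml", ".ini", ".cfg", ".json"}


def tier(path: str) -> int:
    """Return 0 (highest signal) through 5 (lowest) for a repo path."""
    p = PurePosixPath(path)
    name = p.name.lower()
    suffix = p.suffix.lower()
    if name in README_NAMES:
        return 0
    if name in MANIFEST_NAMES:
        return 1
    if name in ENTRY_POINT_NAMES:
        return 2
    if suffix in SOURCE_SUFFIXES:
        return 3
    if suffix in CONFIG_SUFFIXES:
        return 4
    return 5


def rank(paths: list[str]) -> list[str]:
    """Sort by tier first, then by directory depth (shallower first)."""
    return sorted(paths, key=lambda path: (tier(path), path.count("/"), path))
```

Using a tuple as the sort key keeps the tie-break explicit: depth only matters between files in the same tier, so `main.py` sorts ahead of `src/api/v2/main.py` while both still rank below the README and manifests.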
Smart Filtering: Binaries, lock files, generated/minified files, and noisy directories (node_modules, venv, .git) are excluded entirely before any ranking happens.
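A minimal sketch of such a pre-ranking filter is shown below. The directory names come from the description above; the lock-file, binary, and generated-file lists are assumptions for illustration, and `is_candidate` is a hypothetical name.

```python
from pathlib import PurePosixPath

NOISY_DIRS = {"node_modules", "venv", ".git"}
# Assumed examples of each excluded category.
LOCK_FILES = {"package-lock.json", "yarn.lock", "poetry.lock", "cargo.lock"}
BINARY_SUFFIXES = {".png", ".jpg", ".gif", ".pdf", ".zip", ".exe", ".so", ".pyc"}
GENERATED_ENDINGS = (".min.js", ".min.css", ".map")


def is_candidate(path: str) -> bool:
    """Return True if the path should be considered for ranking."""
    p = PurePosixPath(path)
    name = p.name.lower()
    # Drop anything that lives under a noisy directory at any depth.
    if any(part in NOISY_DIRS for part in p.parts[:-1]):
        return False
    if name in LOCK_FILES:
        return False
    if p.suffix.lower() in BINARY_SUFFIXES:
        return False
    # str.endswith accepts a tuple, so one call covers all generated endings.
    if name.endswith(GENERATED_ENDINGS):
        return False
    return True
```

Filtering before ranking means excluded paths never compete for a slot, so a large `node_modules` tree cannot crowd out real source files.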
Concurrency: Selected files are fetched in parallel using an async semaphore (max 5 at a time) — faster than sequential fetching while staying within GitHub's rate limits.
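The bounded-parallelism pattern described above might look like the following sketch. `fetch_one` stands in for whatever coroutine actually downloads a file; it and `fetch_all` are hypothetical names, so the example does not depend on any particular HTTP client.

```python
import asyncio
from typing import Awaitable, Callable

MAX_CONCURRENT = 5  # upper bound on in-flight requests


async def fetch_all(
    paths: list[str],
    fetch_one: Callable[[str], Awaitable[str]],
) -> dict[str, str]:
    """Fetch every path concurrently, with at most MAX_CONCURRENT in flight."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(path: str) -> tuple[str, str]:
        # Each task waits here until one of the semaphore slots is free.
        async with semaphore:
            return path, await fetch_one(path)

    # gather schedules all tasks at once; the semaphore throttles the work.
    results = await asyncio.gather(*(bounded(p) for p in paths))
    return dict(results)
```

All tasks are created up front, but only five can be inside the `async with` block at any moment, which caps the request burst without serializing the fetches.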