Skip to content

SprintStartProject/sprintstart-ai

Repository files navigation

sprintstart-ai

The AI and RAG pipeline service for SprintStart, an AI-assisted onboarding and knowledge-retrieval platform for software development teams.

Prerequisites

  • Python 3.12+
  • uv
  • Ollama running locally with the required models pulled:
ollama pull llama3.2
ollama pull nomic-embed-text

Getting Started

Local

# 1. Install dependencies
uv sync

# 2. Configure environment
cp .env.example .env
# Edit .env and fill in the values

# 3. Run the service
uv run python -m src.main

The service runs on port 8000. Interactive docs are available at /docs.

Docker

# 1. Configure environment
cp .env.example .env
# Edit .env and fill in the values

# 2. Start the service
docker-compose up --build

The service runs on port 8000.

OLLAMA_BASE_URL is automatically overridden to http://host.docker.internal:11434 inside the container, so no manual change is needed.

Environment Variables

Variable Description
LLM_BACKEND LLM backend to use. Currently only ollama is supported.
OLLAMA_BASE_URL Base URL of the Ollama instance. Use http://host.docker.internal:11434 when running via Docker with Ollama on the host.
OLLAMA_MODEL Chat model to use for generation.
OLLAMA_EMBED_MODEL Embedding model to use for ingestion and retrieval.
CHROMA_PATH Path for ChromaDB persistent storage. If unset, an in-memory store is used and data will not persist.

API Endpoints

Method Path Description
GET /api/v1/health Reports service health including LLM backend status. Returns 503 if Ollama is unreachable.
POST /api/v1/ingest Parses, chunks, and embeds a document and stores it in the vector store. Re-ingesting the same artifact_id replaces existing chunks.
POST /api/v1/chat Retrieves relevant chunks and streams a generated answer as Server-Sent Events (SSE).
POST /api/v1/title Generates a short descriptive title from a user prompt using an LLM and respecting the given max character length.

Chat SSE stream

The /api/v1/chat endpoint streams newline-delimited JSON events:

Event type Description
token A single token fragment of the answer
citation A source chunk used to generate the answer
done Signals the end of the stream
error Emitted on failure instead of the above

Running Tests

uv run pytest

With coverage:

uv run pytest --cov=src --cov-report=term-missing