Aafimalek/csv_chat

QueryCSV

QueryCSV is a privacy-first CSV analysis app. The dataset never leaves the browser: it is loaded and queried locally in Pyodide. The backend forwards only the schema and the user's question to an NVIDIA-hosted LLM, which generates the Python code to run.

What Changed

  • NVIDIA now powers code generation through the chat completions API.
  • The backend exposes a streamed POST /generate/stream endpoint for incremental generation status.
  • The frontend waits for final code before executing it in Pyodide.
  • Failed executions trigger automatic repair attempts against the backend.
  • The UI now shows a local schema snapshot, prompt suggestions, run metadata, and follow-up explanation actions.

Architecture

  1. The user uploads a CSV file.
  2. Pyodide loads it locally and creates a df DataFrame in the browser.
  3. The frontend sends only the column names (columns) and the user's question (question) to the backend.
  4. The backend calls NVIDIA, normalizes the response, validates the generated code, and streams generation events.
  5. The frontend executes the final code against the local dataframe.
  6. If execution fails, the frontend posts the traceback to /repair and re-runs the returned code, for up to two automatic fixes.
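
Steps 5 and 6 amount to a bounded retry loop. The sketch below illustrates that control flow in Python under stated assumptions; it is not the frontend's actual implementation. The names run_with_repair, execute, and repair are hypothetical: execute stands in for running code in Pyodide, and repair stands in for a POST to /repair.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures, not taken from the repository:
# execute(code) returns (output, traceback); traceback is None on success.
# repair(code, traceback) returns replacement code (a stand-in for POST /repair).
ExecuteFn = Callable[[str], Tuple[str, Optional[str]]]
RepairFn = Callable[[str, str], str]

MAX_REPAIRS = 2  # "up to two automatic fixes", per step 6


def run_with_repair(code: str, execute: ExecuteFn, repair: RepairFn) -> Tuple[str, str, int]:
    """Return (final_code, output, repairs_used); raise once repairs are exhausted."""
    repairs_used = 0
    while True:
        output, tb = execute(code)
        if tb is None:
            return code, output, repairs_used
        if repairs_used == MAX_REPAIRS:
            raise RuntimeError(f"still failing after {MAX_REPAIRS} repairs:\n{tb}")
        code = repair(code, tb)
        repairs_used += 1
```

With a limit of two repairs, the code is executed at most three times: the original attempt plus one run after each fix.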

API

POST /generate

Non-streaming compatibility endpoint that returns the final code as a single string.

Request:

{
  "columns": ["price", "category"],
  "question": "Show the average price by category"
}

Response:

{
  "code": "print(df.groupby('category')['price'].mean())"
}
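
For scripting or smoke tests, the endpoint can be called from Python with the standard library alone. This is a minimal sketch, assuming the backend is running at the default http://localhost:8000 and accepts a JSON body; build_generate_request and generate_code are hypothetical helper names, not part of the repository.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # default backend URL used in local development


def build_generate_request(columns, question, base_url=API_URL):
    """Build (but do not send) a POST /generate request with a JSON body."""
    body = json.dumps({"columns": columns, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_code(columns, question, base_url=API_URL, timeout=60):
    """Send the request and return the `code` string from the JSON response."""
    req = build_generate_request(columns, question, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["code"]
```

Only the column names and the question are serialized; no row data is part of the payload.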

POST /generate/stream

Streams application-level server-sent events (SSE):

  • start
  • code_delta
  • done
  • error
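
A client has to split the stream into individual events before it can react to them. The sketch below shows basic SSE framing (event: and data: fields, with a blank line ending each event) in Python. The function name parse_sse is hypothetical, and the payload carried in each data field is not documented here, so the parser returns it as an opaque string.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE lines.

    Mirrors standard SSE behavior: comment lines (starting with ':') are
    ignored, multiple data lines are joined with newlines, and an event
    with no data field is not emitted.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is part of the framing.
            data.append(value[1:] if value.startswith(" ") else value)
```

A consumer would typically accumulate code_delta events and wait for done before executing anything, which matches the frontend behavior of waiting for the final code.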

POST /repair

Accepts the original question, the current code, and a traceback. Returns repaired code.
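
The traceback sent to /repair is whatever the failed execution produced. As an illustration only, the sketch below captures a traceback from a failed exec call and assembles a repair payload. The helper names are hypothetical, and the field names question, code, and traceback are assumptions based on the description above; check the backend's request model for the actual schema.

```python
import traceback
from typing import Optional


def run_and_capture(code: str, namespace: dict) -> Optional[str]:
    """Execute code in `namespace`; return the traceback text, or None on success."""
    try:
        exec(code, namespace)
        return None
    except Exception:
        return traceback.format_exc()


def build_repair_payload(question: str, code: str, tb: str) -> dict:
    # Field names are assumed, not confirmed by the API description.
    return {"question": question, "code": code, "traceback": tb}
```

Sending the full traceback rather than only the final error line gives the model the failing line and the exception type together.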

POST /explain

Returns a plain-English explanation for generated code or observed results.

Backend Configuration

Set these in backend/.env or your deployment environment:

NVIDIA_API_KEY=your_rotated_key_here
NVIDIA_API_URL=https://integrate.api.nvidia.com/v1/chat/completions
NVIDIA_MODEL=qwen/qwen3.5-122b-a10b
NVIDIA_MAX_TOKENS=4096
NVIDIA_TEMPERATURE=0
NVIDIA_TOP_P=1
NVIDIA_ENABLE_THINKING=false
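
One way to read these variables into typed settings is sketched below. It is not the backend's actual configuration code: NvidiaSettings and load_settings are hypothetical names, and using the values listed above as fallbacks is an assumption. Note that os.environ does not read backend/.env by itself; a loader such as python-dotenv, or the deployment environment, has to populate it first.

```python
import os
from dataclasses import dataclass
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class NvidiaSettings:
    api_key: str
    api_url: str
    model: str
    max_tokens: int
    temperature: float
    top_p: float
    enable_thinking: bool


def load_settings(env: Mapping[str, str] = os.environ) -> NvidiaSettings:
    """Build settings from environment variables, failing fast without a key."""
    api_key = env.get("NVIDIA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("NVIDIA_API_KEY is not set")
    return NvidiaSettings(
        api_key=api_key,
        # Fallback values below are assumed defaults copied from the example.
        api_url=env.get("NVIDIA_API_URL", "https://integrate.api.nvidia.com/v1/chat/completions"),
        model=env.get("NVIDIA_MODEL", "qwen/qwen3.5-122b-a10b"),
        max_tokens=int(env.get("NVIDIA_MAX_TOKENS", "4096")),
        temperature=float(env.get("NVIDIA_TEMPERATURE", "0")),
        top_p=float(env.get("NVIDIA_TOP_P", "1")),
        enable_thinking=env.get("NVIDIA_ENABLE_THINKING", "false").strip().lower() in TRUTHY,
    )
```

Failing at startup when the key is missing surfaces a misconfigured deployment immediately instead of on the first request.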

Important:

  • Do not hardcode API keys in source files.
  • If a key was pasted into chat, logs, or committed anywhere, rotate it before reuse.

Local Development

Backend

cd backend
python -m venv venv
.\venv\Scripts\Activate.ps1  # Windows PowerShell; on macOS/Linux use: source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload

Frontend

cd frontend
npm install
npm run dev

The frontend reads the backend URL from NEXT_PUBLIC_API_URL and falls back to http://localhost:8000 when it is unset.

Deployment

render.yaml is configured for:

  • a Python backend service
  • a Node frontend service
  • NVIDIA environment variables on the backend

You still need to provide NVIDIA_API_KEY in the Render dashboard or another secret manager.

Current UX Features

  • Local schema snapshot with dtypes, null counts, and sample values
  • Prompt suggestions derived from the local schema
  • Streamed generation status
  • Automatic repair attempts for failed executions
  • Generated code viewer
  • Result and plot rendering
  • Follow-up actions for explaining code and results
  • Session persistence through localStorage and IndexedDB

Verification

Backend tests live under backend/tests.

Recommended checks:

cd backend
pytest

cd ../frontend
npm run lint
npm run build
