NL2SQL API

A REST API that converts natural language questions into SQL queries using RAG (Retrieval-Augmented Generation) and executes them against a MySQL database.

Overview

This application combines vector search with large language models to translate natural language queries into SQL statements. It uses ChromaDB for semantic search of database schema information and Ollama for SQL generation.

Features

  • Natural language to SQL conversion
  • Vector-based schema retrieval using ChromaDB
  • Integration with Ollama LLM (llama3.1:8b)
  • MySQL database query execution
  • RESTful API interface

Architecture

  1. Schema Indexing: The database schema is embedded and stored in ChromaDB
  2. Query Processing: Each natural language question is matched against the indexed schema with vector search to retrieve the relevant schema entries
  3. SQL Generation: The Ollama LLM generates a SQL query from the question and the retrieved schema context
  4. Execution: The generated SQL is executed against the MySQL database
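
The four steps above can be sketched as a small pipeline. This is only an illustration, not the code in app.py: the function names (`build_prompt`, `ollama_payload`, `answer`) and the prompt wording are hypothetical, and retrieval, generation, and execution are passed in as plain callables so the sketch runs without ChromaDB, Ollama, or MySQL. The payload shape assumes Ollama's non-streaming /api/generate endpoint.

```python
import json
from typing import Any, Callable, Dict, List

MODEL = "llama3.1:8b"  # model named in this README


def build_prompt(question: str, schema_docs: List[str]) -> str:
    """Combine the retrieved schema snippets and the question into one prompt.

    The wording is an assumption; the project's real prompt may differ.
    """
    schema_context = "\n".join(schema_docs)
    return (
        "You are a MySQL expert. Use only the tables and columns below.\n"
        f"Schema:\n{schema_context}\n\n"
        f"Question: {question}\n"
        "Return a single SQL query and nothing else."
    )


def ollama_payload(prompt: str) -> bytes:
    """JSON body for a non-streaming POST to Ollama's /api/generate endpoint."""
    body = {"model": MODEL, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def answer(
    question: str,
    retrieve: Callable[[str], List[str]],   # step 2: vector search over the schema
    generate: Callable[[str], str],         # step 3: LLM call, returns SQL text
    execute: Callable[[str], Any],          # step 4: run the SQL against MySQL
) -> Dict[str, Any]:
    """Run steps 2 to 4 for one question and return the API-style response."""
    schema_docs = retrieve(question)
    sql = generate(build_prompt(question, schema_docs)).strip()
    return {"sql": sql, "result": execute(sql)}
```

Passing the three stages in as callables keeps each one replaceable in tests: stub functions can stand in for the vector store, the model, and the database.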

Prerequisites

  • Python 3.9+
  • MySQL database
  • Ollama with the llama3.1:8b model
  • Virtual environment (recommended)

Installation

  1. Clone the repository:
     git clone "https://github.com/randhana/NL2SQL.git"
     cd NL2SQL
  2. Create and activate a virtual environment:
     python -m venv env
     source env/bin/activate  # On Windows: env\Scripts\activate
  3. Install dependencies:
     pip install -r requirements.txt
  4. Configure the database connection in config.py:
     MYSQL_CONFIG = {
         "host": "localhost",
         "user": "your_username",
         "password": "your_password",
         "database": "your_database"
     }
  5. Generate schema embeddings:
     cd Schema
     python generate_schema_docs.py
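
The contents of generate_schema_docs.py are not shown here, so the following is only a guess at the kind of text that might be embedded: one short description per table. The helper names and the sentence format are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical schema shape: table name -> list of (column name, column type).
Schema = Dict[str, List[Tuple[str, str]]]


def table_to_document(table: str, columns: List[Tuple[str, str]]) -> str:
    """Describe one table in a sentence that an embedding model can index."""
    column_text = ", ".join(f"{name} ({col_type})" for name, col_type in columns)
    return f"Table {table} has columns: {column_text}."


def schema_to_documents(schema: Schema) -> Tuple[List[str], List[str]]:
    """Return parallel lists of document ids (table names) and document texts."""
    ids = sorted(schema)
    documents = [table_to_document(table, schema[table]) for table in ids]
    return ids, documents


ids, documents = schema_to_documents(
    {"employees": [("id", "INT"), ("department", "VARCHAR(50)")]}
)
```

Parallel `ids` and `documents` lists match the shape that a ChromaDB collection's `add(ids=..., documents=...)` call accepts, which is one plausible way to store them.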

Usage

  1. Start the Flask API from the repository root:
     python app.py
  2. The API will be available at http://localhost:5001

API Endpoints

POST /nl2sql

Converts a natural language question to SQL, executes the query, and returns both the SQL and the result.

Request:

{
    "question": "Show me all employees in the sales department"
}

Response:

{
    "sql": "SELECT * FROM employees WHERE department = 'sales'",
    "result": [...]
}
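
A minimal client for this endpoint, using only the Python standard library, might look like the sketch below. The helper names `build_request` and `ask` are hypothetical; the URL, port, and JSON fields are the ones documented above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # default address from the Usage section


def build_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /nl2sql request without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/nl2sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> dict:
    """Send the question to a running API and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(question, base_url), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the API running:
#   reply = ask("Show me all employees in the sales department")
#   print(reply["sql"])
#   print(reply["result"])
```

The 60 second timeout is an arbitrary choice; local LLM generation can be slow, so a short timeout may cut off valid requests.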

GET /

Health check endpoint.

Response:

{
    "status": "NL2SQL API running"
}

Project Structure

NL2SQL/
├── app.py                       # Main Flask application
├── config.py                    # Configuration settings
├── db.py                        # Database connection utilities
├── requirements.txt             # Python dependencies
├── Schema/
│   ├── generate_schema_docs.py  # Schema embedding generator
│   └── chroma_store/            # Vector database storage
└── env/                         # Virtual environment

Dependencies

  • Flask: Web framework
  • ChromaDB: Vector database for schema storage
  • PyMySQL: MySQL database connector
  • Requests: HTTP client for Ollama API
  • python-dotenv: Environment variable management

Configuration

The application uses the following configuration files:

  • config.py: Database and API endpoint settings
  • requirements.txt: Python package dependencies
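
Since python-dotenv is listed as a dependency, one option is to keep credentials out of config.py and read them from the environment. The sketch below is an assumption about how that could look, not the project's actual config.py: the variable names (`MYSQL_HOST` and so on) are hypothetical, and the fallbacks are the placeholder values from the Installation section.

```python
import os
from typing import Dict, Mapping, Optional


def load_mysql_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build MYSQL_CONFIG from environment variables, with placeholder defaults.

    The variable names are illustrative; pick whatever names suit your setup.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "your_username"),
        "password": env.get("MYSQL_PASSWORD", "your_password"),
        "database": env.get("MYSQL_DATABASE", "your_database"),
    }


# Calling python-dotenv's load_dotenv() before this line would first copy the
# values from a local .env file into os.environ.
MYSQL_CONFIG = load_mysql_config()
```

Accepting the environment as a parameter makes the function easy to exercise with a plain dictionary instead of real environment variables.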

License

This project is licensed under the MIT License.
