A REST API that converts natural language questions into SQL queries using RAG (Retrieval-Augmented Generation) and executes them against a MySQL database.
This application combines vector search with large language models to translate natural language queries into SQL statements. It uses ChromaDB for semantic search of database schema information and Ollama for SQL generation.
- Natural language to SQL conversion
- Vector-based schema retrieval using ChromaDB
- Integration with Ollama LLM (llama3.1:8b)
- MySQL database query execution
- RESTful API interface
- Schema Indexing: Database schema is embedded and stored in ChromaDB
- Query Processing: Natural language questions are matched against relevant schema using vector search
- SQL Generation: Ollama LLM generates SQL queries based on the retrieved schema context
- Execution: Generated SQL is executed against the MySQL database
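The generation step above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the prompt wording and the helper names `build_prompt`, `build_ollama_payload`, and `extract_sql` are assumptions, and the schema snippets are presumed to come from the ChromaDB lookup. The URL is Ollama's default local `/api/generate` endpoint, which returns the generated text in the `response` field when `stream` is false.

```python
import json
import re

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "llama3.1:8b"  # model named in this README


def build_prompt(question, schema_snippets):
    """Combine retrieved schema text and the user's question into one prompt.

    The wording here is illustrative, not the prompt the project uses.
    """
    schema_context = "\n\n".join(schema_snippets)
    return (
        "You are a MySQL expert. Using only the schema below, "
        "write one SQL query that answers the question.\n\n"
        f"Schema:\n{schema_context}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


def build_ollama_payload(prompt):
    """Build the JSON body for a non-streaming Ollama generate request."""
    return json.dumps({"model": MODEL_NAME, "prompt": prompt, "stream": False})


def extract_sql(llm_text):
    """Strip the Markdown code fence that models often wrap around SQL."""
    fence = "`" * 3
    pattern = fence + r"(?:sql)?\s*(.*?)" + fence
    match = re.search(pattern, llm_text, re.DOTALL | re.IGNORECASE)
    sql = match.group(1) if match else llm_text
    return sql.strip()
```

The payload would be sent as a POST to `OLLAMA_URL`, and `extract_sql` applied to the `response` field of the reply before the query is handed to MySQL.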
- Python 3.9+
- MySQL database
- Ollama with llama3.1:8b model
- Virtual environment (recommended)
- Clone the repository:
git clone "https://github.com/randhana/NL2SQL.git"
cd NL2SQL
- Create and activate virtual environment:
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure database connection in config.py:
MYSQL_CONFIG = {
"host": "localhost",
"user": "your_username",
"password": "your_password",
"database": "your_database"
}
- Generate schema embeddings:
cd Schema
python generate_schema_docs.py
- Start the Flask API:
cd .. # Return to the repository root, where app.py lives
python app.py
- The API will be available at http://localhost:5001
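Once the server is running, a first request can be sent from Python. The sketch below uses only the standard library and the `question` field that the API expects. The route `/query` is a hypothetical placeholder, since the actual path is defined in `app.py`, and the helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # address given in this README
QUERY_PATH = "/query"  # hypothetical route; check app.py for the real one


def build_question_request(question, base_url=BASE_URL, path=QUERY_PATH):
    """Build a POST request carrying the question as a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, timeout=60):
    """Send the question and return the decoded JSON response."""
    request = build_question_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up and the route corrected, `ask("Show me all employees in the sales department")` should return a dictionary with `sql` and `result` keys.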
Convert natural language to SQL and execute query.
Request:
{
  "question": "Show me all employees in the sales department"
}
Response:
{
  "sql": "SELECT * FROM employees WHERE department = 'sales'",
  "result": [...]
}
Health check endpoint.
Response:
{
  "status": "NL2SQL API running"
}
NL2SQL/
├── app.py # Main Flask application
├── config.py # Configuration settings
├── db.py # Database connection utilities
├── requirements.txt # Python dependencies
├── Schema/
│ ├── generate_schema_docs.py # Schema embedding generator
│ └── chroma_store/ # Vector database storage
└── env/ # Virtual environment
- Flask: Web framework
- ChromaDB: Vector database for schema storage
- PyMySQL: MySQL database connector
- Requests: HTTP client for Ollama API
- python-dotenv: Environment variable management
The application uses the following configuration files:
- config.py: Database and API endpoint settings
- requirements.txt: Python package dependencies
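To keep credentials out of version control, `config.py` could read its values from environment variables instead of hard-coding them. The sketch below is one possible arrangement, not the project's current behavior: the variable names (`MYSQL_HOST` and so on) and the `load_mysql_config` helper are assumptions, and the defaults mirror the placeholder values used in the setup steps. Since python-dotenv is among the dependencies, its `load_dotenv()` could be called beforehand to populate the environment from a `.env` file.

```python
import os


def load_mysql_config(env=None):
    """Build the MySQL settings from environment variables.

    Variable names are illustrative; unset ones fall back to the
    README's placeholder values.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "your_username"),
        "password": env.get("MYSQL_PASSWORD", "your_password"),
        "database": env.get("MYSQL_DATABASE", "your_database"),
    }


# Same shape as the MYSQL_CONFIG dictionary shown in the setup steps.
MYSQL_CONFIG = load_mysql_config()
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.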
This project is licensed under the MIT License.