🤖 Hybrid AI Chatbot with Semantic Memory & LLM Learning

A production‑style hybrid AI chatbot that combines classical NLP, machine‑learning intent classification, semantic vector memory, and an LLM fallback used strictly as a teacher.

The system is designed to improve over time by learning from real conversations, discovering new intents, and retraining safely — without blindly trusting LLM outputs.

🎯 Project Goals

Build a realistic AI assistant architecture (not a toy chatbot)
Minimize LLM usage while still benefiting from it
Enable offline semantic memory and fast local inference
Create a safe learning loop from conversations
Follow professional ML + Git practices

✨ Key Features

Intent‑based chatbot (fast, local, inexpensive)
Semantic vector memory using FAISS (offline & persistent)
Knowledge base lookup for deterministic answers
Hinglish → English auto‑translation
LLM fallback only when the bot fails
LLM answers saved and reused for training
Automatic intent discovery from conversations
Safe retraining pipeline (no blind auto‑learning)
Memory importance scoring & controlled forgetting

This is a hybrid AI architecture, similar to how real assistants are built in production systems.

🧠 How the Chatbot Thinks (Decision Order)

1. Semantic Vector Memory (FAISS)
2. Knowledge Base Lookup
3. Intent Classification Model
4. LLM Fallback (Teacher Mode)

The LLM is not always on. It is called only when none of the earlier stages can respond with confidence.
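The decision order above can be sketched as a simple cascade. This is a minimal illustration, not the code in app/chatbot_core.py: the `respond` function, the stage functions, and the confidence thresholds are all hypothetical stand-ins chosen for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# A stage takes the user message and returns (answer, confidence),
# or None when it has nothing to offer. All names here are hypothetical.
Stage = Callable[[str], Optional[Tuple[str, float]]]


def respond(message: str,
            stages: Sequence[Tuple[str, Stage, float]],
            llm_fallback: Callable[[str], str]) -> Tuple[str, str]:
    """Try each local stage in order; call the LLM only if all of them fail."""
    for name, stage, threshold in stages:
        result = stage(message)
        if result is not None and result[1] >= threshold:
            return name, result[0]
    return "llm_fallback", llm_fallback(message)


# Toy stand-ins for the real components, for illustration only.
def vector_memory(msg: str):
    if "capital of france" in msg.lower():
        return ("Paris is the capital of France.", 0.91)
    return None


def knowledge_base(msg: str):
    if "opening hours" in msg.lower():
        return ("We are open 9am-5pm.", 1.0)
    return None


def intent_model(msg: str):
    if msg.lower().startswith("hi"):
        return ("Hello! How can I help?", 0.8)
    return ("", 0.1)  # low confidence: the model has no good intent


def fake_llm(msg: str) -> str:
    return f"[LLM answer to: {msg}]"


# (stage name, stage function, minimum confidence); thresholds are assumed values.
STAGES = [
    ("vector_memory", vector_memory, 0.75),
    ("knowledge_base", knowledge_base, 0.99),
    ("intent_model", intent_model, 0.6),
]

if __name__ == "__main__":
    for text in ["hi there", "What is the capital of France?", "explain quantum tunnelling"]:
        print(text, "->", respond(text, STAGES, fake_llm))
```

Returning the name of the stage that answered makes it easy to log how often the LLM is actually reached.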

📁 Project Structure

chatbot-ai/
│
├── app/                # Runtime chatbot logic
│   ├── chatbot_core.py
│   ├── vector_memory.py
│   ├── llm_fallback.py
│   ├── knowledge_graph.py
│   └── __init__.py
│
├── training/           # Offline learning & improvement
│   ├── train_chatbot.py
│   ├── discover_intents.py
│   ├── auto_append_intents.py
│   ├── auto_generate_responses.py
│   └── llm_to_intents.py
│
├── data/               # Knowledge & memory (partially git‑ignored)
│   ├── intents.json
│   ├── knowledge.json
│   └── README.md
│
├── model/              # Trained models (generated locally)
│   └── README.md
│
├── .gitignore
├── requirements.txt
└── README.md

🛠 Installation

python -m pip install -r requirements.txt

Some components require an additional spaCy model:

python -m spacy download en_core_web_sm

🚀 Running the Chatbot

From the project root:

python -m app.chatbot_text

To exit:

quit

🧪 LLM Fallback – Teacher Mode

Activated only when semantic memory, the knowledge base, and the intent model all fail to answer
Generates a response using an LLM
Stores the question and the answer in data/llm_memory.json
Stored examples are later converted into training data

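Environment variable required (the command below is for Windows and takes effect in new terminal sessions; on macOS or Linux, use export OPENAI_API_KEY="your_api_key_here" instead):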

setx OPENAI_API_KEY "your_api_key_here"

The LLM teaches the bot, then steps back.
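As a rough sketch of the logging step, the function below appends one question/answer pair to a JSON file. It assumes data/llm_memory.json holds a JSON list of objects; the actual schema used by app/llm_fallback.py is not documented here, and the `save_llm_answer` name and the `reviewed` field are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_llm_answer(question: str, answer: str,
                    path: Path = Path("data/llm_memory.json")) -> int:
    """Append one question/answer pair to the JSON log and return the entry count."""
    # Load the existing log, or start a new one if the file does not exist yet.
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    entries.append({
        "question": question,
        "answer": answer,
        "reviewed": False,  # hypothetical flag: approve before using for training
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8")
    return len(entries)
```

Keeping an explicit review flag on each entry is one way to make sure LLM output is checked before it reaches the training set.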

📚 Learning Pipeline (After Conversations)

After chatting with the bot, improve it using:

python training/discover_intents.py
python training/auto_append_intents.py
python training/auto_generate_responses.py
python training/train_chatbot.py

What happens in this pipeline

Unrecognized queries are clustered
New intents are created automatically
Safe, neutral responses are generated
The intent model is retrained on the updated intents
Future LLM usage decreases as more queries are handled locally
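To illustrate the clustering step, here is a small stand-in that groups similar queries by word overlap (Jaccard similarity). It is only a sketch: training/discover_intents.py may well use sentence embeddings or a different algorithm, and the function names, the 0.4 threshold, and the minimum cluster size are assumptions made for this example.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of words two queries have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(queries: List[str],
                    threshold: float = 0.4,
                    min_size: int = 2) -> List[List[str]]:
    """Greedy single-pass clustering against the first query of each cluster."""
    clusters: List[List[str]] = []
    for query in queries:
        for cluster in clusters:
            if jaccard(tokens(query), tokens(cluster[0])) >= threshold:
                cluster.append(query)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([query])
    # Only clusters with enough examples become candidate intents.
    return [c for c in clusters if len(c) >= min_size]


if __name__ == "__main__":
    unrecognized = [
        "how do I reset my password",
        "reset my password please",
        "what is the weather today",
        "how to reset password",
        "tell me a joke",
    ]
    for group in cluster_queries(unrecognized):
        print(group)
```

Dropping clusters below a minimum size keeps one-off queries from turning into new intents.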

🧠 Semantic Memory

Uses FAISS + sentence‑transformers
Fully offline after installation
Persistent across restarts
Importance‑weighted storage
Automatic forgetting of low‑value memories

This allows the chatbot to remember concepts, not just exact phrases.
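The importance scoring and forgetting ideas can be sketched without any dependencies. The example below swaps FAISS for a brute-force cosine search and uses hand-written vectors instead of sentence-transformers embeddings, so it shows the bookkeeping only. The `MemoryStore` class, its capacity rule, and the "recall raises importance" policy are assumptions, not the behaviour of app/vector_memory.py.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Memory:
    text: str
    vector: Sequence[float]
    importance: float = 1.0


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Brute-force stand-in for a FAISS index with importance-based forgetting."""

    def __init__(self, capacity: int = 3):
        self.items: List[Memory] = []
        self.capacity = capacity

    def add(self, text: str, vector: Sequence[float], importance: float = 1.0) -> None:
        self.items.append(Memory(text, vector, importance))
        self.forget()

    def recall(self, query_vector: Sequence[float], min_score: float = 0.7) -> Optional[str]:
        """Return the closest memory, or None if nothing is similar enough."""
        if not self.items:
            return None
        best = max(self.items, key=lambda m: cosine(query_vector, m.vector))
        if cosine(query_vector, best.vector) < min_score:
            return None
        best.importance += 1.0  # memories that get recalled become more valuable
        return best.text

    def forget(self) -> None:
        """Drop the least important memories once capacity is exceeded."""
        if len(self.items) > self.capacity:
            self.items.sort(key=lambda m: m.importance, reverse=True)
            del self.items[self.capacity:]
```

Because the match is made on vector similarity rather than string equality, a query phrased differently from the stored text can still recall it, provided the embeddings are close.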

🔐 Files Not Committed to Git

Runtime and personal data are excluded for safety:

chat history
unrecognized queries
LLM memory logs
FAISS index files
trained model files

This keeps the repository clean and safe to share.

📌 Notes

Designed for learning and experimentation
Follows real‑world AI system patterns
Emphasizes safety, reproducibility, and clarity
LLM improves the bot over time instead of replacing it

📄 License

MIT License
