abhishek130904/DatabricksHackathon

🌉 Vidya Setu: AI-Powered Adaptive Learning Platform

Adaptive quiz engine · 10 Indian languages · RAG-powered tutoring · Accessibility-first

🚀 Quick Start · 🎮 Demo · ✨ Features · 🏗️ Architecture


📖 What is Vidya Setu?

Vidya Setu (Bridge of Knowledge) is an AI-powered learning platform that aims to bridge gaps in education for disabled and underserved learners.

It combines an ML-powered adaptive quiz engine with a RAG-based personal tutor that reads your study material and answers questions in your language, adapted to your accessibility needs and learning style.

🏆 Built for the Databricks Hackathon 2026 to demonstrate real-world AI for social good in Indian education.


✨ Features

| Feature | Description |
|---|---|
| 🧠 Adaptive Quiz Engine | FAISS + Sentence Transformers select questions by difficulty and weak topics |
| 📈 Real-time Difficulty Adjustment | Automatically upgrades Easy → Medium → Hard based on performance |
| 🌐 10 Indian Languages | English, Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam, Punjabi |
| 🔊 Text-to-Speech | Sarvam AI reads questions and answers aloud in your chosen language |
| 📄 RAG-Based PDF Tutor | Upload any textbook/notes → ask questions → get cited answers |
| ♿ Accessibility Profiles | Adapted outputs for ADHD, Dyslexia, Visual Impairment, Hearing Impairment |
| 📊 Live Analytics Sidebar | Topic mastery, difficulty breakdown, accuracy, and focus recommendations |
| 🎯 Source Citations | Every RAG answer cites the exact page number from your PDF |
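
The analytics sidebar reports topic mastery and focus recommendations. As a rough illustration of how such numbers could be derived, the sketch below computes per-topic accuracy from a list of answered questions and lists the weakest topics first. The function names, the `(topic, is_correct)` record shape, and the 0.6 threshold are assumptions made for this example, not taken from `app.py`.

```python
from collections import defaultdict


def topic_mastery(history):
    """Return accuracy per topic from (topic, is_correct) pairs."""
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for topic, is_correct in history:
        attempts[topic] += 1
        if is_correct:
            correct[topic] += 1
    return {topic: correct[topic] / attempts[topic] for topic in attempts}


def weak_topics(history, threshold=0.6):
    """List topics below the (hypothetical) accuracy threshold, weakest first."""
    mastery = topic_mastery(history)
    weak = [topic for topic, accuracy in mastery.items() if accuracy < threshold]
    return sorted(weak, key=lambda topic: mastery[topic])


# Hypothetical answer history for one quiz session.
history = [
    ("Kinematics", True),
    ("Kinematics", False),
    ("Optics", True),
    ("Algebra", False),
]
print(topic_mastery(history))
print(weak_topics(history))
```

A question retriever could then bias selection toward the topics returned by `weak_topics`.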

๐Ÿ—๏ธ Architecture

flowchart TD
    User([๐Ÿ‘ค Student]) --> App[๐Ÿ–ฅ๏ธ Streamlit App]

    App --> QuizEngine[๐Ÿง  Adaptive Quiz Engine]
    App --> RAGEngine[๐Ÿ“„ RAG Learning Engine]
    App --> TTS[๐Ÿ”Š Sarvam AI TTS]

    subgraph QuizEngine[๐Ÿง  Adaptive Quiz Engine]
        direction LR
        ST[Sentence Transformers\nall-mpnet-base-v2] --> FAISS[FAISS\nVector Index]
        FAISS --> Retriever[Smart Question\nRetriever]
        Retriever --> Difficulty[Difficulty\nPredictor]
    end

    subgraph RAGEngine[๐Ÿ“„ RAG Learning Engine]
        direction LR
        PDF[PDF Upload\npypdf] --> Chunker[Text Chunker]
        Chunker --> BGE[BGE-small\nEmbeddings]
        BGE --> Similarity[Cosine\nSimilarity Search]
        Similarity --> LLM[LLaMA 3.3 70B\nDatabricks Serving]
    end

    subgraph Translation[๐ŸŒ Multilingual Layer]
        M2M100[M2M100 418M\nFacebook]
    end

    App --> Translation
    LLM --> App
    TTS --> App
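
The Text Chunker step in the RAG engine has to keep track of which page each chunk came from, otherwise answers could not cite page numbers. The sketch below shows one plausible way to do that with overlapping character windows. The chunk size, the overlap, and the `chunk_pages` name are assumptions for illustration; the per-page strings would typically come from calling `extract_text()` on each page of a `pypdf.PdfReader`.

```python
def chunk_pages(pages, chunk_size=500, overlap=50):
    """Split per-page text into overlapping character chunks, keeping page numbers.

    `pages` is a list of strings, one per PDF page (page 1 first).
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        text = " ".join(text.split())  # collapse runs of whitespace
        for start in range(0, len(text), step):
            chunks.append({"page": page_number, "text": text[start:start + chunk_size]})
            if start + chunk_size >= len(text):
                break  # the current window already reached the end of the page
    return chunks


# Hypothetical two-page document.
pages = ["a" * 120, "short page"]
for chunk in chunk_pages(pages, chunk_size=100, overlap=20):
    print(chunk["page"], len(chunk["text"]))
```

Each chunk dictionary can then be embedded and, when retrieved, its `page` value surfaced as the citation.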

🛠️ Tech Stack

🤖 AI / ML

| Component | Technology |
|---|---|
| Large Language Model | LLaMA 3.3 70B via Databricks Model Serving |
| Quiz Embeddings | sentence-transformers/all-mpnet-base-v2 |
| RAG Embeddings | BAAI/bge-small-en-v1.5 |
| Vector Search | FAISS (IndexFlatIP) |
| Translation | Facebook M2M100 418M |
| Text-to-Speech | Sarvam AI Bulbul v3 |
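
`IndexFlatIP` performs exact inner-product search, and on L2-normalized vectors the inner product equals cosine similarity. The pure-Python sketch below illustrates that ranking on toy vectors. It is a stand-in for the idea, not FAISS itself, and it assumes (without confirmation from the source) that the app normalizes embeddings before indexing.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, corpus, k=2):
    """Rank corpus vectors by inner product with the query after normalization."""
    q = normalize(query)
    scored = []
    for index, vec in enumerate(corpus):
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))  # cosine similarity
        scored.append((score, index))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(index, round(score, 4)) for score, index in scored[:k]]


# Toy 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([2.0, 0.0], corpus))
```

The query's magnitude does not affect the ranking, which is why normalization makes an inner-product index behave like a cosine-similarity index.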

๐Ÿ—๏ธ Platform & Infrastructure

Component Technology
Deployment Databricks Apps
Model Serving Databricks Model Serving Endpoints
Frontend Streamlit
PDF Processing pypdf

🚀 Setup & Run

Prerequisites

  • Python 3.9+
  • Databricks workspace with Model Serving enabled
  • Sarvam AI API key

1. Clone & Install

git clone https://github.com/abhishek130904/DatabricksHackathon.git
cd DatabricksHackathon
pip install -r requirements.txt

2. Configure API Keys

Edit app.py and set your credentials:

# Databricks (auto-configured inside Databricks Apps)
DATABRICKS_HOST  = "https://<your-workspace>.azuredatabricks.net"
DATABRICKS_TOKEN = "your_databricks_token"

# Sarvam AI
SARVAM_API_KEY = "your_sarvam_api_key"

💡 Inside Databricks Apps, authentication is handled automatically, so no token is needed.
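
Hardcoding secrets in `app.py` makes them easy to commit by accident. One alternative, sketched below, is to read the same three names from environment variables, which also matches the deployment step where `SARVAM_API_KEY` is set as an environment variable. The `load_settings` helper and its return shape are hypothetical, not part of the repository.

```python
import os


def load_settings(env=None):
    """Read credentials from environment variables instead of source code.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    settings = {
        # Left empty inside Databricks Apps, where authentication is automatic.
        "databricks_host": env.get("DATABRICKS_HOST", ""),
        "databricks_token": env.get("DATABRICKS_TOKEN", ""),
        "sarvam_api_key": env.get("SARVAM_API_KEY", ""),
    }
    if not settings["sarvam_api_key"]:
        raise RuntimeError("SARVAM_API_KEY is not set")
    return settings


# Demonstration with an explicit mapping; in the app, call load_settings().
demo = load_settings({"SARVAM_API_KEY": "demo-key"})
print(demo["sarvam_api_key"])
```

For a local run, export the variables in your shell before `streamlit run app.py`.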

3. Run Locally

streamlit run app.py

App opens at http://localhost:8501

4. Deploy on Databricks Apps

# Import project to Databricks workspace
databricks workspace import-dir . /Workspace/Users/<your-email>/vidya-setu

Then in the Databricks UI:

  1. Navigate to Databricks → Apps
  2. Click Create App
  3. Select /Workspace/Users/<your-email>/vidya-setu
  4. Set environment variables (SARVAM_API_KEY)
  5. Click Deploy and allow 8–10 minutes for large ML models to load

🎮 Demo Walkthrough

🧠 Demo 1: Adaptive Quiz Engine

1. Open app → Select subject (Physics / Math / Chemistry)
2. Answer 3 easy questions correctly
3. Watch difficulty auto-upgrade: EASY → MEDIUM
4. Switch language to Hindi using the Language dropdown
5. Click 🔊 "Read Aloud" to hear the question in Hindi
6. Check sidebar → see Topic Mastery and Weak Topic focus update live
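
The walkthrough above implies that three correct answers in a row trigger a difficulty upgrade. A minimal rule-based sketch of that behaviour follows. The app's actual Difficulty Predictor may work differently (for example, with a trained model), and the demotion rule after three misses is an assumption added here for symmetry.

```python
LEVELS = ["easy", "medium", "hard"]


def next_difficulty(current, recent_results, streak=3):
    """Move up after `streak` correct answers in a row, down after `streak` misses.

    `recent_results` is a list of booleans, oldest first.
    """
    index = LEVELS.index(current)
    last = recent_results[-streak:]
    if len(last) == streak and all(last):
        index = min(index + 1, len(LEVELS) - 1)  # cap at the hardest level
    elif len(last) == streak and not any(last):
        index = max(index - 1, 0)  # assumed demotion rule, not from the source
    return LEVELS[index]


print(next_difficulty("easy", [True, True, True]))
print(next_difficulty("easy", [True, False, True]))
```

In a Streamlit app, the current level and the result list would normally live in `st.session_state` so they persist across reruns.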

📄 Demo 2: RAG-Powered Personalized Learning

1. Click "Personalized Learning" tab
2. Upload any PDF (textbook chapter, lecture notes, etc.)
3. Select Accessibility Profile → "ADHD"
4. Ask: "What is the main concept in this document?"
5. Observe: answer broken into numbered steps (ADHD adaptation)
6. Click 🔊 "Listen" to play the audio with its duration shown
7. Expand 📚 Sources → see exact page numbers cited
8. Switch profile to "Dyslexia" → ask again → notice simpler language

⚡ Demo 3: Quick Prompts to Impress Judges

"Explain Newton's Laws in simple terms"
"What are the key formulas mentioned in this chapter?"
"Summarize the important points from page 3"
"Give me a step-by-step explanation of this concept"

📁 Project Structure

vidya-setu/
├── app.py                 # Main Streamlit application
├── pcmDataset.csv         # Physics/Chemistry/Math quiz dataset
├── requirements.txt       # Python dependencies
├── app.yaml               # Databricks Apps configuration
└── README.md

๐Ÿ› Troubleshooting

Issue Fix
LLM endpoint not found Run pip install --upgrade databricks-sdk and verify endpoint name
TTS returns 0:00 duration Raw PCM is auto-wrapped in WAV header โ€” ensure speech_sample_rate: 8000 in payload
Translation model slow M2M100 loads once and is cached โ€” wait for first load (~2 min)
Slow startup (8โ€“10 min) Normal โ€” large ML models (M2M100, Sentence Transformers) load on cold start
PDF embeddings fail Ensure pypdf and sentence-transformers are installed; try a text-based PDF
FAISS import error Run pip install faiss-cpu (or faiss-gpu on GPU instances)
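
The TTS row above refers to wrapping raw PCM in a WAV header. An audio player cannot infer duration from headerless PCM, which is a likely cause of the 0:00 display. The sketch below shows how such wrapping can be done with Python's standard `wave` module. It assumes 16-bit mono samples at the 8000 Hz rate mentioned in the table; the function names are illustrative and not taken from `app.py`.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=8000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the WAV bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


def wav_duration_seconds(wav_bytes):
    """Read the duration back from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


# One second of 16-bit mono silence at 8000 Hz.
silence = b"\x00\x00" * 8000
wav_bytes = pcm_to_wav(silence)
print(wav_bytes[:4], wav_duration_seconds(wav_bytes))
```

If the sample rate in the header does not match the rate the audio was generated at, playback speed and the reported duration will both be wrong, so the two values should be kept in sync.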

📦 requirements.txt

streamlit>=1.32.0
pandas
numpy
sentence-transformers
faiss-cpu
scikit-learn
transformers
torch
pypdf
databricks-sdk
requests

♿ Accessibility Profiles

| Profile | Adaptation Strategy |
|---|---|
| 🔵 Default | Balanced explanation with examples |
| 👁️ Visual Impairment | Rich text descriptions, no visual references, full verbal context |
| 👂 Hearing Impairment | Complete written output, no audio-dependent phrasing |
| 📖 Dyslexia | Short sentences, simple vocabulary, bullet points |
| 🧠 ADHD | Numbered steps, concise chunks, high-focus structure |
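
One straightforward way to apply these strategies is to turn each profile into an instruction that is prepended to the LLM prompt. The sketch below does this with a lookup table whose wording follows the table above. The `build_system_prompt` helper and the exact prompt text are assumptions, and the real app may adapt outputs differently.

```python
# Instruction text per profile, paraphrased from the table above.
PROFILE_INSTRUCTIONS = {
    "Default": "Give a balanced explanation with examples.",
    "Visual Impairment": "Use rich text descriptions, avoid visual references, "
                         "and give full verbal context.",
    "Hearing Impairment": "Give complete written output and avoid "
                          "audio-dependent phrasing.",
    "Dyslexia": "Use short sentences, simple vocabulary, and bullet points.",
    "ADHD": "Use numbered steps and concise chunks with a high-focus structure.",
}


def build_system_prompt(profile, language="English"):
    """Compose a system prompt for the tutor; unknown profiles fall back to Default."""
    instruction = PROFILE_INSTRUCTIONS.get(profile, PROFILE_INSTRUCTIONS["Default"])
    return (
        "You are a tutor answering from the provided study material. "
        f"Respond in {language}. {instruction} "
        "Cite the page number for each fact you use."
    )


print(build_system_prompt("ADHD", language="Hindi"))
```

Keeping the instructions in one dictionary makes it easy to add a profile without touching the prompt-building code.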

🌐 Supported Languages

| Language | Native Script | TTS Support |
|---|---|---|
| English | English | ✅ |
| Hindi | हिंदी | ✅ |
| Tamil | தமிழ் | ✅ |
| Telugu | తెలుగు | ✅ |
| Bengali | বাংলা | ✅ |
| Marathi | मराठी | ✅ |
| Gujarati | ગુજરાતી | ✅ |
| Kannada | ಕನ್ನಡ | ✅ |
| Malayalam | മലയാളം | ✅ |
| Punjabi | ਪੰਜਾਬੀ | ✅ |
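
Translation and TTS services generally identify languages by code rather than by display name. The sketch below maps the ten names above to their ISO 639-1 codes and builds a region-qualified tag such as `hi-IN`. Whether M2M100 and Sarvam AI accept these exact values for every language is an assumption that should be checked against their documentation; the helper name is hypothetical.

```python
# ISO 639-1 codes for the ten supported languages.
LANGUAGE_CODES = {
    "English": "en",
    "Hindi": "hi",
    "Tamil": "ta",
    "Telugu": "te",
    "Bengali": "bn",
    "Marathi": "mr",
    "Gujarati": "gu",
    "Kannada": "kn",
    "Malayalam": "ml",
    "Punjabi": "pa",
}


def language_tag(name, region="IN"):
    """Return a region-qualified tag (for example 'hi-IN') for a display name."""
    try:
        code = LANGUAGE_CODES[name]
    except KeyError:
        raise ValueError(f"Unsupported language: {name}") from None
    return f"{code}-{region}"


print(language_tag("Hindi"))
print(language_tag("Tamil"))
```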

👨‍💻 Contributors

  • Abhishek Raj, IIT Indore (@abhishek130904)
  • Purvi Jain, IIT Indore
  • Adarsh Rai, IIT Indore
  • Lakshya Rishi, IIT Indore


📄 License

Open-source for educational purposes. Built for Databricks Hackathon 2026.


If Vidya Setu helped you, please ⭐ star the repo!

Bridging knowledge gaps with AI, one student at a time.
