rajeshkanaka/OCR-Devnagari

॥ श्री गणेशाय नमः ॥

Salutations to Shri Ganesha





OCR-Devnagari Demo

Processing a 1000-page tantric manuscript with crash-safe resume capability




🌟 Why OCR-Devnagari?

😫 The Problem

Ancient Sanskrit and Hindi manuscripts—tantras, stotras, and sacred texts—are being lost to time. Existing OCR tools:

| Issue | Impact |
|---|---|
| ❌ Can't handle complex conjuncts | संयुक्ताक्षर destroyed |
| ❌ Destroys mantras | ॐ ह्रीं श्रीं corrupted |
| ❌ Costs a fortune | $10+ per manuscript |
| ❌ Crashes lose work | Hours of progress gone |

🎯 The Solution

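OCR-Devnagari combines free local OCR with Gemini AI accuracy. In hybrid mode, EasyOCR reads every page locally, and only pages that fall below the confidence threshold or contain mantras are sent to Gemini: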

┌─────────────────────────────┐
│                             │
│   📜 1000-page Manuscript   │
│                             │
│   Before: $10+ cost         │
│   After:  $1 cost           │
│                             │
│   ✨ 90% Savings ✨          │
│                             │
│   Zero data loss on         │
│   crash or interrupt        │
│                             │
└─────────────────────────────┘



✨ Features

| Feature | Description |
|---|---|
| 🔀 Multi-Engine Support | 5 OCR backends to choose from |
| 🧠 Smart Hybrid Mode | EasyOCR + Gemini for optimal results |
| 🕉️ Mantra Detection | Auto-detect and preserve sacred text |
| High Performance | Async concurrent workers |
| 💾 Crash-Safe | Resume from any interruption |
| 📊 Live Progress | Real-time tracking with ETA |
| 🛡️ Graceful Shutdown | Ctrl+C saves all work |
| 🧹 Memory Efficient | Handles 1000+ page PDFs |
| Response Validation | Rejects invalid OCR results |



⚡ Quick Start

📦 Installation

# Clone the repository
git clone https://github.com/rajeshkanaka/OCR-Devnagari.git
cd OCR-Devnagari

# Install with UV (recommended)
uv sync && uv pip install easyocr

# Or with pip
pip install -r requirements.txt && pip install easyocr

🔑 Configure API (for Gemini features)

# Option A: Vertex AI (Recommended for production)
export GOOGLE_CLOUD_PROJECT="your-project"
export GOOGLE_CLOUD_LOCATION="global"
export GOOGLE_GENAI_USE_VERTEXAI=1

# Option B: API Key (Quick setup)
export GEMINI_API_KEY="your-key"

🚀 Run!

# 🔥 Hybrid mode — 90% savings, maximum accuracy
python -m ocr_hindi ocr manuscript.pdf --pages "all"

# 🆓 100% FREE local processing
python -m ocr_hindi ocr manuscript.pdf -e easyocr

# 💎 Premium Gemini mode for critical documents
python -m ocr_hindi ocr manuscript.pdf -e gemini



💰 Cost Comparison

💸 How much can you save?

❌ Traditional Approach

| Metric | Value |
|---|---|
| 📄 Pages | 1000 |
| 💸 Cost | ~$10-15 |
| 🔄 API Calls | 1000 |
| ⏱️ Time | ~45 min |
| 🛡️ On Crash | LOSE ALL |

✅ With OCR-Devnagari

| Metric | Value |
|---|---|
| 📄 Pages | 1000 |
| 💸 Cost | ~$1-2 |
| 🔄 API Calls | ~100-150 |
| ⏱️ Time | ~90 min |
| 🛡️ On Crash | Resume ✓ |
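The savings figure follows from the drop in API calls. The sketch below works through that arithmetic under a simplifying assumption that is not stated in this README: every Gemini call costs the same flat amount, taken here as $0.01 (roughly $10 spread over 1000 calls). The function name `compare_costs` is hypothetical and is not part of the tool.

```python
def compare_costs(baseline_calls: int, hybrid_calls: int, cost_per_call: float):
    """Return (baseline_cost, hybrid_cost, fraction_saved).

    Assumes a flat price per API call, which is an illustrative
    simplification; real pricing depends on tokens per page.
    """
    baseline_cost = baseline_calls * cost_per_call
    hybrid_cost = hybrid_calls * cost_per_call
    fraction_saved = 1 - hybrid_cost / baseline_cost
    return baseline_cost, hybrid_cost, fraction_saved


# 1000 pages: every page sent to the API vs. about 100 pages escalated.
baseline, hybrid, saved = compare_costs(1000, 100, cost_per_call=0.01)
print(f"baseline ${baseline:.2f}, hybrid ${hybrid:.2f}, saved {saved:.0%}")
```

With 150 escalated pages instead of 100, the same calculation gives about 85% saved, so the 90% figure corresponds to the lower end of the API-call range.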

🏆 Engine Comparison

| Engine | Cost | Accuracy | Speed | Best For |
|---|---|---|---|---|
| 🔀 hybrid | ~$0.30/1K | ⭐⭐⭐⭐⭐ | ⚡⚡⚡ | Recommended |
| 🆓 easyocr | FREE | ⭐⭐⭐⭐ | ⚡⚡ | Budget-conscious |
| 🆓 marker | FREE | ⭐⭐⭐⭐⭐ | ⚡⚡⚡ | Structured PDFs |
| 🆓 tesseract | FREE | ⭐⭐⭐ | ⚡⚡⚡⚡ | Simple documents |
| 💎 gemini | ~$2/1K | ⭐⭐⭐⭐⭐ | ⚡⚡⚡⚡ | Critical accuracy |



🏗️ Architecture

"Write once, crash anywhere, resume everywhere"

                              ┌─────────────────────────────────────────┐
                              │           📄 PDF Input                  │
                              └───────────────────┬─────────────────────┘
                                                  │
                                                  ▼
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                              🔀 INTELLIGENT ROUTING                                 │
├─────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                     │
│    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐     │
│    │  hybrid  │    │ easyocr  │    │  marker  │    │tesseract │    │  gemini  │     │
│    │ DEFAULT  │    │   FREE   │    │   FREE   │    │   FREE   │    │ PREMIUM  │     │
│    └────┬─────┘    └──────────┘    └──────────┘    └──────────┘    └──────────┘     │
│         │                                                                           │
│         ▼                                                                           │
│    ┌─────────────────────────────────────────────────────────────────────────┐      │
│    │                     🧠 HYBRID DECISION ENGINE                           │      │
│    │                                                                         │      │
│    │   ┌─────────────┐         ┌─────────────────┐         ┌─────────────┐   │      │
│    │   │  EasyOCR    │ ──────▶ │ Confidence Check│ ──────▶ │   Mantra    │   │      │
│    │   │    FREE     │         │     < 85% ?     │         │  Detected?  │   │      │
│    │   └─────────────┘         └─────────────────┘         └──────┬──────┘   │      │
│    │                                    │                         │          │      │
│    │                                    ▼                         ▼          │      │
│    │                           ┌───────────────────────────────────────┐     │      │
│    │                           │        💎 Gemini 2.0 Flash            │     │      │
│    │                           │   • thinking_level: "low"             │     │      │
│    │                           │   • media_resolution: "high"          │     │      │
│    │                           │   • Token tracking for cost           │     │      │
│    │                           └───────────────────────────────────────┘     │      │
│    └─────────────────────────────────────────────────────────────────────────┘      │
│                                                                                     │
└─────────────────────────────────────────────────────────────────────────────────────┘
                                                  │
                                                  ▼
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                              🛡️ CRASH-SAFE PIPELINE                                 │
├─────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                     │
│    ┌──────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐      │
│    │   OCR    │────▶│    Cache     │────▶│   Progress   │────▶│   Release    │      │
│    │ Process  │     │ Atomic Write │     │   Update     │     │   Memory     │      │
│    └──────────┘     │ page_NNN.txt │     └──────────────┘     └──────────────┘      │
│                     └──────────────┘                                                │
│                                                                                     │
│    On interrupt (Ctrl+C) or crash:                                                  │
│    ┌─────────────────────────────────────────────────────────────────────────┐      │
│    │  ✓ All cached pages preserved    ✓ Resume skips completed pages         │      │
│    │  ✓ No duplicate API charges      ✓ Output merged from cache             │      │ 
│    └─────────────────────────────────────────────────────────────────────────┘      │
│                                                                                     │
└─────────────────────────────────────────────────────────────────────────────────────┘
                                                  │
                                                  ▼
                              ┌─────────────────────────────────────────┐
                              │   📝 Markdown Output + 💰 Cost Report    │
                              └─────────────────────────────────────────┘
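The hybrid decision step in the diagram can be read as a small routing rule. The following is a minimal sketch of that rule as drawn, not the project's actual code: the function name `route_page` and its parameters are hypothetical, and the 0.85 default mirrors the documented `--confidence` default.

```python
def route_page(confidence: float, has_mantra: bool,
               threshold: float = 0.85, verify_mantras: bool = True) -> str:
    """Decide which engine's result to use for one page.

    confidence  -- EasyOCR confidence for the page, in [0.0, 1.0]
    has_mantra  -- whether sacred text patterns were detected on the page
    """
    # Low-confidence pages are escalated to Gemini.
    if confidence < threshold:
        return "gemini"
    # Mantra pages are escalated too, unless verification is disabled.
    if verify_mantras and has_mantra:
        return "gemini"
    # Everything else keeps the free local result.
    return "easyocr"


print(route_page(0.95, has_mantra=False))  # local result is kept
print(route_page(0.60, has_mantra=False))  # escalated on confidence
print(route_page(0.95, has_mantra=True))   # escalated on mantra
```

Raising the threshold sends more pages to Gemini, which matches the note in the usage examples that a higher `--confidence` value means more Gemini verification.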



🕉️ Mantra Detection

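Pages that contain sacred text patterns are verified with Gemini by default (--verify-mantras); pass --no-verify-mantras to skip this step for faster processing. The detector looks for four kinds of pattern: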


बीज मन्त्र
Seed Syllables

ॐ    ह्रीं   श्रीं
क्लीं   ऐं    हुं

मन्त्र समाप्ति
Sacred Endings

स्वाहा   नमः   फट्
वौषट्   हुं   ठः

श्लोक चिह्न
Verse Markers

॥१॥  ॥२॥  ॥३॥
॥ इति ॥

विभाग सूचक
Section Indicators

विनियोग  न्यास
ध्यान   कवच
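One way to detect the four pattern groups above is a plain substring and regex scan. This is a hedged sketch only: the README does not describe the real detector, the names `MARKERS`, `VERSE_MARKER`, and `find_mantra_markers` are hypothetical, and naive substring matching can produce false positives inside longer words.

```python
import re
import unicodedata

# Marker lists copied from the four groups listed above.
MARKERS = {
    "seed_syllables": ["ॐ", "ह्रीं", "श्रीं", "क्लीं", "ऐं", "हुं"],
    "sacred_endings": ["स्वाहा", "नमः", "फट्", "वौषट्", "हुं", "ठः"],
    "section_indicators": ["विनियोग", "न्यास", "ध्यान", "कवच"],
}
# Verse markers such as ॥१॥ or ॥ इति ॥ (Devanagari digits or the word इति).
VERSE_MARKER = re.compile(r"॥\s*(?:[०-९]+|इति)\s*॥")


def find_mantra_markers(text: str) -> dict[str, list[str]]:
    """Return the markers found in text, grouped by category."""
    # Normalize so visually identical strings compare equal.
    text = unicodedata.normalize("NFC", text)
    found: dict[str, list[str]] = {}
    for category, markers in MARKERS.items():
        hits = [m for m in markers if unicodedata.normalize("NFC", m) in text]
        if hits:
            found[category] = hits
    verses = VERSE_MARKER.findall(text)
    if verses:
        found["verse_markers"] = verses
    return found


page = "ॐ ह्रीं नमः शिवाय स्वाहा ॥१॥"
print(find_mantra_markers(page))
```

A page would count as a mantra page when the returned dictionary is non-empty.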



📖 Usage Examples

🔀 Hybrid Mode (Recommended)

# Process entire manuscript with intelligent routing
python -m ocr_hindi ocr sacred_text.pdf --pages "all"

# Adjust confidence threshold (higher = more Gemini verification)
python -m ocr_hindi ocr sacred_text.pdf --confidence 0.90

# Disable mantra verification for faster processing
python -m ocr_hindi ocr sacred_text.pdf --no-verify-mantras

# Process specific page ranges
python -m ocr_hindi ocr sacred_text.pdf --pages "1-100,200-250"

# Use more workers for faster processing
python -m ocr_hindi ocr sacred_text.pdf --workers 10

🆓 Free Local Processing

# EasyOCR — Good Hindi/Devanagari support, no API needed
python -m ocr_hindi ocr book.pdf -e easyocr

# Marker — Best for structured books and PDFs
python -m ocr_hindi ocr book.pdf -e marker

# Tesseract — Fast, requires system installation
python -m ocr_hindi ocr book.pdf -e tesseract

💎 Premium Gemini Mode

# Maximum accuracy for critical manuscripts
python -m ocr_hindi ocr rare_manuscript.pdf -e gemini

# With high concurrency
python -m ocr_hindi ocr rare_manuscript.pdf -e gemini --workers 15

🛠️ Utility Commands

# List all available engines with details
python -m ocr_hindi engines

# Validate your setup (dependencies + authentication)
python -m ocr_hindi validate

# View PDF information
python -m ocr_hindi info manuscript.pdf

# Dry run — see what would be processed
python -m ocr_hindi ocr manuscript.pdf --dry-run

# Resume interrupted processing
python -m ocr_hindi ocr manuscript.pdf --resume



⚙️ Configuration

| Option | Description | Default |
|---|---|---|
| -e, --engine | OCR engine (hybrid, easyocr, marker, tesseract, gemini) | hybrid |
| -p, --pages | Page range (all, 1-50, 1,5,10-20) | interactive |
| -w, --workers | Concurrent workers (1-20) | 5 |
| -c, --confidence | Hybrid threshold (0.0-1.0) | 0.85 |
| --verify-mantras | Verify mantra pages with Gemini | true |
| -r, --resume | Resume from previous progress | false |
| -n, --dry-run | Preview without processing | false |
| --dpi | PDF rendering quality | 200 |
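The `--pages` values shown in the table (all, 1-50, 1,5,10-20) follow a common range syntax. As an illustration of how such a specification can be expanded into page numbers, here is a small sketch; `parse_pages` is a hypothetical helper, and how the real CLI treats out-of-range or malformed input is an assumption.

```python
def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Expand a spec like "1,5,10-20" or "all" into sorted 1-based page numbers."""
    spec = spec.strip()
    if spec.lower() == "all":
        return list(range(1, total_pages + 1))

    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Assumption: reject ranges outside the document instead of clamping.
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1,5,10-12", total_pages=20))
```

Collecting pages in a set means overlapping ranges such as "1-10,5-15" yield each page once.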



📁 Output Structure

your_manuscript/
├── 📄 manuscript.pdf                        # Original file
├── 📝 manuscript_unicode.md                 # ✨ Final output (Devanagari text)
├── 📋 ocr_manuscript_20240120_143022.log    # Processing log
├── 📊 .ocr_progress_manuscript.json         # Resume state
└── 📂 .ocr_cache_manuscript/                # 🛡️ Crash-safe cache
    ├── page_0001.txt                        #    Individual page cache
    ├── page_0001.meta.json                  #    Page metadata
    ├── page_0002.txt
    └── ...
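The per-page cache files are what make resuming possible. A common way to get crash-safe writes is to write to a temporary file and rename it into place, so a page file is either complete or absent. The sketch below illustrates that technique with the `page_NNNN.txt` naming shown above; the function names are hypothetical and this is not taken from the project's source.

```python
import os
import tempfile
from pathlib import Path


def page_path(cache_dir: Path, page: int) -> Path:
    return cache_dir / f"page_{page:04d}.txt"


def write_page_atomic(cache_dir: Path, page: int, text: str) -> Path:
    """Write one page so that a crash never leaves a half-written file."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    final = page_path(cache_dir, page)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, final)  # atomic rename into place
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final


def pending_pages(cache_dir: Path, pages: list[int]) -> list[int]:
    """Pages that still need OCR, i.e. those with no cached file yet."""
    return [p for p in pages if not page_path(cache_dir, p).exists()]
```

On resume, only the pages returned by `pending_pages` would need processing, which is how repeated API charges for completed pages can be avoided.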



📊 Performance Benchmarks

| Mode | 1000 Pages | Throughput | Cost | Notes |
|---|---|---|---|---|
| 🔀 Hybrid | ~90 min | ~11 ppm | ~$1 | Best value |
| 🆓 EasyOCR | ~120 min | ~8 ppm | $0 | 100% free |
| 🆓 Marker | ~60 min | ~16 ppm | $0 | Structured PDFs |
| 💎 Gemini | ~45 min | ~22 ppm | ~$10 | Max accuracy |

ppm = pages per minute • Tested on M1 MacBook Pro with 10 workers




🔧 Troubleshooting

❌ "poppler not found"

# macOS
brew install poppler

# Ubuntu/Debian
sudo apt-get install poppler-utils

# Windows - Download from:
# https://github.com/oschwartz10612/poppler-windows/releases

❌ "EasyOCR not installed"

uv pip install easyocr
# or
pip install easyocr

❌ "Tesseract not installed"

# macOS
brew install tesseract tesseract-lang

# Ubuntu/Debian
sudo apt install tesseract-ocr tesseract-ocr-hin tesseract-ocr-san

# Windows - Download installer from:
# https://github.com/UB-Mannheim/tesseract/wiki

❌ Authentication errors

# Verify Vertex AI setup
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID

# Or use API key instead
export GEMINI_API_KEY="your-api-key-here"

# Test authentication
python -m ocr_hindi validate

❌ Rate limiting (429 errors)

# Reduce concurrent workers
python -m ocr_hindi ocr book.pdf --workers 3

# The system will automatically retry with exponential backoff

❌ High memory usage

# Reduce workers (each worker holds images in memory)
python -m ocr_hindi ocr book.pdf --workers 2

# Or process in smaller batches
python -m ocr_hindi ocr book.pdf --pages "1-100"
python -m ocr_hindi ocr book.pdf --pages "101-200" --resume
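The rate-limiting entry mentions automatic retries with exponential backoff. For readers unfamiliar with the technique, here is a generic sketch of it. The helper `retry_with_backoff`, its defaults, and the jitter amount are illustrative assumptions and do not reflect the tool's actual retry settings.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       is_retryable=lambda exc: True, sleep=time.sleep):
    """Call `call()` and retry on failure, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on errors that should not be retried.
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            # A little random jitter keeps workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Usage: a stand-in for an API call that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(retry_with_backoff(flaky_request, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the example fast to run; in real use the default `time.sleep` would apply the delays.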



🤝 Contributing

Contributions are what make the open source community amazing!


🐛 Bug Reports
Open an Issue

💡 Feature Ideas
Start a Discussion

🔧 Pull Requests
Fork & Submit PR

📖 Documentation
Help improve docs


# Fork, clone, and create a branch
git clone https://github.com/YOUR_USERNAME/OCR-Devnagari.git
cd OCR-Devnagari
git checkout -b feature/amazing-feature

# Make your changes, then
git commit -m "Add amazing feature"
git push origin feature/amazing-feature

# Open a Pull Request 🎉



📜 License

MIT License — Free for personal and commercial use

See LICENSE for details




🙏 Acknowledgments

This project stands on the shoulders of giants


Gemini   EasyOCR   Tesseract   Marker



॥ सर्वे भवन्तु सुखिनः ॥

May all beings be happy





Built with ❤️ for the Sanskrit & Vaidik community, by RajeshKanaka





About

Production-grade OCR for Hindi, Sanskrit & Devanagari manuscripts
