📄 Smart OCR Tool

A modern, fast, and powerful web-based OCR (Optical Character Recognition) tool that allows you to extract text from images and PDF documents instantly. Supports 12 languages including English, Hindi, and Arabic. Built with a focus on ease of use, speed, and a premium user experience.

🌐 Live Demo: arungupta1526.github.io/ocr-tool/

📸 Screenshots

Upload Interface

Users can drag and drop images or PDF files directly into the upload area.

OCR Processing

Real-time progress tracking for each file being processed.

Extracted Text Results

View, copy, or download the extracted text after processing.

✨ Features

🖼️ Image OCR: Extract text from PNG, JPG, JPEG, and WebP images.
📄 PDF Support: Full support for multi-page PDF documents. Each page is processed individually.
🌍 Multi-Language Support: Run OCR in 12 different languages (English, Hindi, Arabic, French, German, Chinese, etc.).
📰 Multi-Column Layouts: Extract text from 2-column or 3-column PDFs and images while preserving reading order.
🚀 Real-time Progress: Track the OCR progress for each file with visual progress bars.
⛔ Cancel Processing: Cancel any individual file's OCR mid-way without stopping others.
💾 Download as Text: Download the extracted text as a .txt file for easy editing and sharing.
📋 Instant Copy: Copy extracted text to your clipboard with a single click (includes "Copied!" feedback).
✨ Modern UI: A clean, responsive interface with smooth animations and dark mode support.
🛠️ Privacy First: All processing happens locally in your browser using WebAssembly. Your files are never uploaded to a server.

🏗 Architecture

Smart OCR Tool runs entirely in the browser with no backend.

Architecture Diagram

Browser UI (React)
        ↓
PDF.js → Canvas
        ↓
Tesseract.js (WASM OCR)
        ↓
Extracted Text

Processing Flow

User Upload → Select Language & Layout → File Queue → OCR Engine → Extracted Text

User uploads images or PDFs.
User selects desired OCR language (e.g., English, Hindi) and Column Layout (1, 2, or 3 columns).
Files enter a processing queue.
If a file is a PDF:
- PDF.js renders pages to a canvas.
Canvas images are passed to Tesseract.js.
Tesseract performs OCR using WebAssembly.
Extracted text is displayed in the results panel.
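
One way to keep reading order on a multi-column page is to split the page into vertical strips and OCR each strip in turn. The sketch below only computes those strips and joins the per-column text. `Region`, `columnRegions`, and `joinColumns` are hypothetical names, equal-width columns are an assumption, and this is not necessarily how the tool implements its Column Layout option. Left-to-right joining also assumes a left-to-right script.

```typescript
// Hypothetical helper types and functions; not taken from the project's source.
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Split a page into equal-width column regions, ordered left to right.
function columnRegions(pageWidth: number, pageHeight: number, columns: 1 | 2 | 3): Region[] {
  const regions: Region[] = [];
  for (let i = 0; i < columns; i++) {
    // Round the boundaries so adjacent columns share an edge without gaps.
    const left = Math.round((i * pageWidth) / columns);
    const right = Math.round(((i + 1) * pageWidth) / columns);
    regions.push({ left, top: 0, width: right - left, height: pageHeight });
  }
  return regions;
}

// Join per-column text in array order, skipping columns that produced no text.
function joinColumns(columnTexts: string[]): string {
  return columnTexts
    .map((text) => text.trim())
    .filter((text) => text.length > 0)
    .join("\n\n");
}
```

Each region could then be handed to the OCR engine one at a time; Tesseract.js accepts a `rectangle` option on `recognize` that restricts recognition to such a region.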

User File
   ↓
Upload Queue
   ↓
PDF.js Rendering
   ↓
Canvas Image
   ↓
Tesseract.js OCR
   ↓
Extracted Text
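
The flow above can be sketched as a small dispatcher: PDFs are expanded into page images first, other files are treated as a single image, and every image goes through recognition. All names here (`QueuedFile`, `OcrDeps`, `processFile`, `processQueue`) are hypothetical, the rendering and recognition steps are injected stand-ins for PDF.js and Tesseract.js rather than real calls, and processing files one after another is an assumption.

```typescript
// Hypothetical shapes; the app's real types are not shown in this README.
interface QueuedFile {
  fileName: string;
  mimeType: string;
}

type PageImage = { source: string };

interface OcrDeps {
  // Stand-in for PDF.js rendering: one image per PDF page.
  renderPdfPages(file: QueuedFile): Promise<PageImage[]>;
  // Stand-in for a Tesseract.js call: returns the text of one image.
  recognizeImage(image: PageImage, language: string): Promise<string>;
}

async function processFile(file: QueuedFile, language: string, deps: OcrDeps): Promise<string> {
  const isPdf = file.mimeType === "application/pdf";
  // PDFs become one image per page; any other file is a single image.
  const images = isPdf ? await deps.renderPdfPages(file) : [{ source: file.fileName }];
  const pageTexts: string[] = [];
  for (const image of images) {
    pageTexts.push(await deps.recognizeImage(image, language));
  }
  return pageTexts.join("\n\n");
}

async function processQueue(
  files: QueuedFile[],
  language: string,
  deps: OcrDeps,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  for (const file of files) {
    results.set(file.fileName, await processFile(file, language, deps));
  }
  return results;
}
```

Injecting the two steps keeps the queue logic testable without loading a PDF renderer or an OCR model.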

⚡ Performance Considerations

Per-File Abort Control

Each OCR job uses an AbortController so individual files can be cancelled without stopping the entire queue.
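
A minimal sketch of per-file cancellation, assuming each job is keyed by a file id and checks its signal between pages. `jobControllers`, `runJob`, and `cancelJob` are hypothetical names, and `recognizePage` is a stand-in for the actual OCR call; the project's own wiring may differ.

```typescript
// Hypothetical registry: one AbortController per queued file id.
const jobControllers = new Map<string, AbortController>();

async function runJob(
  fileId: string,
  pages: string[],
  recognizePage: (page: string) => Promise<string>,
): Promise<string[]> {
  const controller = new AbortController();
  jobControllers.set(fileId, controller);
  const texts: string[] = [];
  try {
    for (const page of pages) {
      // Checked between pages, so a cancel takes effect at the next page boundary.
      if (controller.signal.aborted) {
        throw new Error(`OCR cancelled for ${fileId}`);
      }
      texts.push(await recognizePage(page));
    }
    return texts;
  } finally {
    // Always drop the controller, whether the job finished or was cancelled.
    jobControllers.delete(fileId);
  }
}

function cancelJob(fileId: string): void {
  // Aborts only this file's controller; other jobs keep running.
  jobControllers.get(fileId)?.abort();
}
```

Because every file owns its controller, calling `cancelJob` for one id leaves every other entry in the registry untouched.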

High Resolution Rendering

PDF pages are rendered at 2× scale before OCR to improve recognition accuracy.
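
To make the cost of 2× rendering concrete, the sketch below computes the pixel size of a page at a given scale. `renderSize` is a hypothetical helper, and it assumes that scale 1 maps one PDF point (1/72 inch) to one pixel, which is how a PDF.js viewport from `page.getViewport({ scale })` is sized.

```typescript
// Assumption: at scale 1, one PDF point (1/72 inch) becomes one pixel.
const POINTS_PER_INCH = 72;

// Hypothetical helper: pixel dimensions and effective DPI at a given scale.
function renderSize(widthPt: number, heightPt: number, scale: number) {
  return {
    width: Math.floor(widthPt * scale),
    height: Math.floor(heightPt * scale),
    effectiveDpi: POINTS_PER_INCH * scale,
  };
}

// A US Letter page is 612 × 792 points.
const letterAt2x = renderSize(612, 792, 2);
// letterAt2x is { width: 1224, height: 1584, effectiveDpi: 144 }
```

Doubling the scale quadruples the pixel count, so the gain in recognition accuracy is paid for in canvas memory and OCR time per page.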

Memory Management

Canvas bitmaps are released after processing each page to avoid memory leaks when processing large PDFs.
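
A common way to release a canvas bitmap is to shrink the canvas to zero size once its page has been processed. The sketch below shows that pattern; `SizedCanvas`, `releaseCanvas`, and `withCanvas` are hypothetical names, and whether the tool uses exactly this technique is an assumption.

```typescript
// Minimal structural type, so the sketch accepts any canvas-like object
// with writable width and height (for example an HTMLCanvasElement).
interface SizedCanvas {
  width: number;
  height: number;
}

function releaseCanvas(canvas: SizedCanvas): void {
  // Resizing to 0 × 0 discards the pixel data and lets the browser free the backing store.
  canvas.width = 0;
  canvas.height = 0;
}

// Run some work with a canvas and release it afterwards, even if the work throws.
async function withCanvas<T>(
  canvas: SizedCanvas,
  work: (canvas: SizedCanvas) => Promise<T>,
): Promise<T> {
  try {
    return await work(canvas);
  } finally {
    releaseCanvas(canvas);
  }
}
```

Putting the release in `finally` means a cancelled or failed page does not leave its bitmap behind.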

Local Processing

All OCR runs locally using WebAssembly, eliminating network latency and ensuring full privacy.

🚀 Tech Stack

Framework: React 19 + TypeScript
Build Tool: Vite
OCR Engine: Tesseract.js
PDF Handling: PDF.js
Styling: Tailwind CSS v4
Animations: Framer Motion
Icons: Lucide React

🛠️ Installation & Setup

Clone the repository:

```
git clone https://github.com/arungupta1526/ocr-tool.git
cd ocr-tool
```

Install dependencies:
```
npm install
```
Run the development server:
```
npm run dev
```
Build for production:
```
npm run build
```

📖 How to Use

Upload: Drag and drop your images or PDF files into the upload area, or click to browse.
Language & Layout: Select your language and the document's column layout (1, 2, or 3 columns) before processing.
Process: Once your files are in the queue, click the "Start OCR" button.
Cancel (optional): Click the ✕ button next to any file to cancel its processing individually.
Review: Switch to the "Results" tab to view the extracted text for each file.
Copy or Download: Click "Copy" to copy text to clipboard, or "Text" to download as a .txt file.

🎯 Use Cases

Extract text from scanned documents
Convert image-based PDFs to editable text
Quickly copy text from screenshots
OCR for research papers or notes

🤝 Contributing

Contributions are welcome! If you have ideas for improvements or new features, feel free to open an issue or submit a pull request.

🐳 Docker

Want to self-host or run this in a container? See the Docker Guide for full instructions including Dockerfile, build, run, and Docker Compose setup.

📜 License

Distributed under the MIT License. See LICENSE for more information.

📞 Commercial Support

If your company needs custom OCR features, integrations, accuracy improvements, or support for enterprise use, contact arungupta1526@gmail.com or reach out on LinkedIn.

Made with ❤️ by Arun Gupta
