A modern, fast, and powerful web-based OCR (Optical Character Recognition) tool that allows you to extract text from images and PDF documents instantly. Supports 12 languages including English, Hindi, and Arabic. Built with a focus on ease of use, speed, and a premium user experience.
Live Demo: arungupta1526.github.io/ocr-tool/
- Smart OCR Tool
Users can drag and drop images or PDF files directly into the upload area.
Real-time progress tracking for each file being processed.
View, copy, or download the extracted text after processing.
- Image OCR: Extract text from PNG, JPG, JPEG, and WebP images.
- PDF Support: Full support for multi-page PDF documents. Each page is processed individually.
- Multi-Language Support: Run OCR in 12 different languages (English, Hindi, Arabic, French, German, Chinese, etc.).
- Multi-Column Layouts: Extract text from 2-column or 3-column PDFs and images while preserving reading order.
- Real-time Progress: Track the OCR progress for each file with visual progress bars.
- Cancel Processing: Cancel any individual file's OCR mid-way without stopping others.
- Download as Text: Download the extracted text as a .txt file for easy editing and sharing.
- Instant Copy: Copy extracted text to your clipboard with a single click (includes "Copied!" feedback).
- Modern UI: A clean, responsive interface with smooth animations and dark mode support.
- Privacy First: All processing happens locally in your browser using WebAssembly. Your files are never uploaded to a server.
Smart OCR Tool runs entirely in the browser with no backend.
Browser UI (React)
↓
PDF.js → Canvas
↓
Tesseract.js (WASM OCR)
↓
Extracted Text
User Upload → Select Language & Layout → File Queue → OCR Engine → Extracted Text
- User uploads images or PDFs.
- User selects desired OCR language (e.g., English, Hindi) and Column Layout (1, 2, or 3 columns).
- Files enter a processing queue.
- If a file is a PDF:
  - PDF.js renders pages to a canvas.
  - Canvas images are passed to Tesseract.js.
- Tesseract performs OCR using WebAssembly.
- Extracted text is displayed in the results panel.
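One plausible way to honor the Column Layout setting is to split each page into equal-width vertical strips and run OCR on them from left to right. The sketch below only computes those strips. The `columnRegions` helper and the equal-width split are assumptions for illustration, not taken from the project's source.

```typescript
// Hypothetical helper: split a page into equal-width column regions,
// ordered left to right so the OCR output follows reading order.
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

function columnRegions(
  pageWidth: number,
  pageHeight: number,
  columns: 1 | 2 | 3,
): Region[] {
  const regions: Region[] = [];
  for (let i = 0; i < columns; i++) {
    // Round the boundaries so adjacent regions tile the page
    // without gaps or overlap, even when the width is not divisible.
    const left = Math.round((i * pageWidth) / columns);
    const right = Math.round(((i + 1) * pageWidth) / columns);
    regions.push({ left, top: 0, width: right - left, height: pageHeight });
  }
  return regions;
}
```

Tesseract.js accepts a `rectangle` option with this `left`/`top`/`width`/`height` shape in `recognize`, so each region could be recognized in turn and the results concatenated. Right-to-left documents would likely need the column order reversed.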
User File
↓
Upload Queue
↓
PDF.js Rendering
↓
Canvas Image
↓
Tesseract.js OCR
↓
Extracted Text
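The data flow above can be sketched as a single function. To keep the example readable outside a browser, PDF.js and Tesseract.js are represented by an injected `OcrDeps` object. That interface, `InputFile`, and `extractText` are hypothetical names for illustration; the project's actual code may be organized differently.

```typescript
// Hypothetical file shape; in the browser this would be a File object.
interface InputFile {
  name: string;
  type: string; // MIME type, e.g. "application/pdf"
}

// Hypothetical stand-ins for the two libraries.
interface OcrDeps<Img> {
  renderPdfPages(file: InputFile): Promise<Img[]>; // PDF.js: one canvas image per page
  loadImage(file: InputFile): Promise<Img>; // decode a PNG/JPG/WebP upload
  recognize(image: Img, language: string): Promise<string>; // Tesseract.js OCR
}

async function extractText<Img>(
  file: InputFile,
  language: string,
  deps: OcrDeps<Img>,
): Promise<string> {
  // PDFs become one image per page; a plain image is treated as a single page.
  const images =
    file.type === "application/pdf"
      ? await deps.renderPdfPages(file)
      : [await deps.loadImage(file)];

  const pageTexts: string[] = [];
  for (const image of images) {
    // Pages are recognized one at a time, in document order.
    pageTexts.push(await deps.recognize(image, language));
  }
  return pageTexts.join("\n\n");
}
```

Injecting the rendering and recognition steps also makes the flow easy to exercise with stubs in a unit test.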
Each OCR job uses an AbortController so individual files can be cancelled without stopping the entire queue.
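A minimal sketch of per-file cancellation, assuming each job holds its own `AbortController` and checks the signal between pages. The `OcrJob` shape and the `createJobs`, `cancelJob`, and `runJob` helpers are hypothetical; how the project wires the signal into Tesseract.js is not stated here.

```typescript
// Hypothetical job record: one AbortController per file.
interface OcrJob {
  fileName: string;
  controller: AbortController;
}

function createJobs(fileNames: string[]): Map<string, OcrJob> {
  const jobs = new Map<string, OcrJob>();
  for (const fileName of fileNames) {
    jobs.set(fileName, { fileName, controller: new AbortController() });
  }
  return jobs;
}

// Aborts only the named job; other jobs keep their own, untouched signals.
function cancelJob(jobs: Map<string, OcrJob>, fileName: string): boolean {
  const job = jobs.get(fileName);
  if (!job) return false;
  job.controller.abort();
  return true;
}

// Processes pages in order and stops at the next page boundary once cancelled.
async function runJob(
  job: OcrJob,
  pageCount: number,
  processPage: (page: number) => Promise<void>,
): Promise<"done" | "cancelled"> {
  for (let page = 1; page <= pageCount; page++) {
    if (job.controller.signal.aborted) return "cancelled";
    await processPage(page);
  }
  return "done";
}
```

Because every file has a separate controller, calling `cancelJob(jobs, "a.pdf")` leaves the signal for any other file unaborted, which is what lets the rest of the queue continue.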
PDF pages are rendered at 2× scale before OCR to improve recognition accuracy.
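To make the scale factor concrete, the sketch below computes the canvas size and raw RGBA footprint for a page at a given scale. It assumes page dimensions in PDF points (a US Letter page is 612 × 792 points) and that scale 1 maps one point to one canvas pixel. `canvasSizeForScale` is a hypothetical helper, not part of PDF.js or this project.

```typescript
// Hypothetical helper: canvas dimensions and raw RGBA size for a rendered page.
interface CanvasSize {
  width: number; // canvas pixels
  height: number; // canvas pixels
  rgbaBytes: number; // 4 bytes per pixel
}

function canvasSizeForScale(
  widthPt: number,
  heightPt: number,
  scale: number,
): CanvasSize {
  // Canvas dimensions must be whole pixels.
  const width = Math.floor(widthPt * scale);
  const height = Math.floor(heightPt * scale);
  return { width, height, rgbaBytes: width * height * 4 };
}

// US Letter at scale 2: 1224 x 1584 pixels, 7,755,264 bytes of RGBA data.
const letterAt2x = canvasSizeForScale(612, 792, 2);
```

Doubling the scale quadruples the pixel count, so each character gets four times as many pixels for the OCR engine to work with, at four times the memory cost per page.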
Canvas bitmaps are released after processing each page to avoid memory leaks when processing large PDFs.
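A common way to release a canvas's backing bitmap in browsers is to set its width and height to zero once the page has been recognized. The sketch below wraps that in a `try`/`finally` so the canvas is released even if OCR fails. `releaseCanvas` and `withCanvas` are hypothetical helpers, and whether the project uses this exact approach is an assumption.

```typescript
// Structural type so the helpers accept an HTMLCanvasElement, an
// OffscreenCanvas, or a plain object in tests.
interface ResizableCanvas {
  width: number;
  height: number;
}

function releaseCanvas(canvas: ResizableCanvas): void {
  // Zero dimensions leave the canvas with no pixel data to hold on to.
  canvas.width = 0;
  canvas.height = 0;
}

// Runs `use` with the canvas and always releases it afterwards.
async function withCanvas<C extends ResizableCanvas, T>(
  canvas: C,
  use: (canvas: C) => Promise<T>,
): Promise<T> {
  try {
    return await use(canvas);
  } finally {
    releaseCanvas(canvas);
  }
}
```

Releasing after every page keeps peak memory close to the size of a single rendered page, rather than growing with the page count of a large PDF.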
All OCR runs locally using WebAssembly, eliminating network latency and ensuring full privacy.
- Framework: React 19 + TypeScript
- Build Tool: Vite
- OCR Engine: Tesseract.js
- PDF Handling: PDF.js
- Styling: Tailwind CSS v4
- Animations: Framer Motion
- Icons: Lucide React
- Clone the repository:
      git clone https://github.com/arungupta1526/ocr-tool.git
      cd ocr-tool
- Install dependencies:
      npm install
- Run the development server:
      npm run dev
- Build for production:
      npm run build
- Upload: Drag and drop your images or PDF files into the upload area, or click to browse.
- Language & Layout: Select your language and the document's column layout (1, 2, or 3 columns) before processing.
- Process: Once your files are in the queue, click the "Start OCR" button.
- Cancel (optional): Click the cancel button next to any file to cancel its processing individually.
- Review: Switch to the "Results" tab to view the extracted text for each file.
- Copy or Download: Click "Copy" to copy the text to your clipboard, or "Text" to download it as a .txt file.
- Extract text from scanned documents
- Convert image-based PDFs to editable text
- Quickly copy text from screenshots
- OCR for research papers or notes
Contributions are welcome! If you have ideas for improvements or new features, feel free to open an issue or submit a pull request.
Want to self-host or run this in a container? See the Docker Guide for full instructions including Dockerfile, build, run, and Docker Compose setup.
Copyright (c) 2026 Arun Gupta
Distributed under the MIT License. See LICENSE for more information.
For custom OCR features, integrations, accuracy improvements, or enterprise use, contact: arungupta1526@gmail.com or LinkedIn
Made with ❤️ by Arun Gupta


