arabic-ocr

Here are 17 public repositories matching this topic...

craneset / ocr-data

This repository provides OCR-related datasets.

ocr datasets optical-character-recognition arabic ocr-arabic ocr-dataset arabic-ocr arabic-ocr-dataset

Updated Jan 6, 2026

MohammedNasserAhmed / arabic-pdf-chat

Arabic Chat with PDF is a user-friendly application that lets you interact with Arabic PDF documents. Powered by advanced language models, OCR, and vector search, it allows you to upload PDFs, ask questions, and receive accurate Arabic responses 🚀

ocr chat-application ocr-text-reader ocr-python rag rag-chatbot arabic-ocr

Updated Nov 20, 2024
Python

OmarSamirz / Fine-Tuning-an-Arabic-OCR-Model-using-Tesseract-5.0

This research fine-tunes an Arabic OCR model using Tesseract 5.0, improving text recognition accuracy through extensive data collection, preprocessing, and image generation. By leveraging advanced training techniques and data augmentation, we achieve significant reductions in word error rate (WER).

ocr tesseract tesseract-ocr ocr-model arabic-ocr arabic-ocr-model arabic-tesseract-ocr fine-tune-arabic-model fine-tune-arabic-tesseract-ocr-model fine-tune-arabic-ocr-model fine-tune-ocr

Updated Apr 4, 2025
Jupyter Notebook

OussamaBenSlama / Alef-OCR-Image2Html

Alef-OCR-Image2Html is an OCR model designed to transform Arabic documents, including historical texts, scanned pages, and handwritten materials, into structured, semantic HTML.

ocr ocr-recognition arabic-ocr arabic-ocr-model

Updated Nov 4, 2025
Jupyter Notebook

HasanBGit / Ketaba-OCR-LoRA

Official code for "Ketaba-OCR at AR-MS NakbaNLP 2026": QLoRA fine-tuning of a specialized HTR model with a Linear+Boost ensemble for Arabic manuscript recognition. Placed 1st per-line (CER 0.082) and 3rd on the official leaderboard at NakbaNLP 2026 (LREC 2026).

ensemble dora handwritten-text-recognition peft shared-task vision-language-model qlora arabic-ocr qwen2-vl lrec2026 arabic-manuscripts nakbanlp

Updated Mar 5, 2026
Python

PRADUMAN-KR / OCR_model-HugginFace

Optical Character Recognition, OCR pipeline, Arabic OCR, Deep Learning OCR, Computer Vision text extraction, Text recognition system, AI document processing, Multilingual OCR, Transformer OCR, OCR benchmarking, Bounding box detection, Ground truth evaluation.

opencv paddlepaddle paddleocr hugginface arabic-ocr ai-document-processing ocr-pipeline deep-learning-ocr computer-vision-text-extraction paddleocr-v5

Updated May 20, 2026
Python

HasanBGit / QARI-OCR-LoRA

Additional experimental model for NakbaNLP 2026 Shared Task (AR-MS) — LoRA/DoRA fine-tuning of Qari-OCR (Qwen2-VL-2B) for Arabic handwritten manuscript recognition on the Omar Al-Saleh Memoir Collection (1951-1965).

lora dora handwritten-text-recognition peft shared-task vision-language-model arabic-ocr qwen2-vl lrec2026 arabic-manuscripts nakbanlp

Updated Mar 5, 2026
Python

logiccrafterdz / nassij

Nassij V3: High-accuracy Arabic PDF-to-DOCX converter with direct digital extraction (NassijScanner) and cryptographic linguistic integrity verification (Merkle proofs).

python ocr document-conversion offline-first arabic-language pdf-processing word-document privacy-focused paddleocr tashkeel rtl-support pdf-to-docx arabic-ocr ligature-handling data-digitization

Updated May 17, 2026
Python

Abd-alrhman1 / multilingual-ocr-toolkit

Multilingual OCR with per-region script routing for Arabic + Latin. Built for MENA documents.

multilingual ocr computer-vision tesseract text-detection arabic-nlp mena streamlit easyocr script-detection arabic-ocr

Updated May 7, 2026
Python

lAvArt / arabic-book-corpus-platform

OCR-first Arabic book corpus platform with citation-grade APIs

ocr nextjs postgresql minio full-text-search computational-linguistics digital-humanities arabic text-corpus fastify arabic-language lexicography corpus-search bullmq document-ai arabic-ocr citation-search

Updated Feb 21, 2026
TypeScript

mohamedkhamis / AQMAR

Local Python pipeline + bilingual SPA archiving the @AqmarTofan Telegram channel — Telethon, ffmpeg, EasyOCR (Arabic+English), openpyxl, Alpine.js.

python github-pages spa ocr telegram ffmpeg arabic openpyxl telethon tailwindcss telegram-scraper alpine-js easyocr arabic-ocr aqmar-tofan

Updated Jun 2, 2026
Python

wiameadnane / arabic-handwriting-ocr

A deep learning-based handwritten Arabic OCR system using ResNet50 + BiLSTM + Attention with CTC decoding. Achieves 96.3% character accuracy and 80% word accuracy on the IFN/ENIT dataset, featuring a PyQt6 desktop GUI for real-time inference. Supports both greedy and beam search decoding.

computer-vision deep-learning lstm attention-mechanism resnet-50 ctc-loss arabic-ocr

Updated Apr 25, 2026
Jupyter Notebook

zenmakhlouf / arabic-bill-field-extractor

Local Arabic OCR field extraction for utility bills with PaddleOCR, FastAPI, CLI, and validation.

ocr computer-vision fastapi paddleocr document-ai utility-bills arabic-ocr

Updated Apr 25, 2026
Python

youssefelzedy / GateGuard-AI

Arabic Plate Recognition System

python ai yolo plate-recognition arabic-ocr arabic-ocr-model

Updated Jun 26, 2025
Python

GhaziRiyadh / book_to_word

An AI-powered OCR and document processing system designed to convert Arabic PDF books and images into high-quality, editable scientific text layouts.

react python ocr document-conversion gemini fastapi pdf-processing gpt4 arabic-ocr

Updated Apr 11, 2026
TypeScript

harissaninja / Manazir-OCR

Fork of h9-tec/Manazir-OCR — Arabic-first multi-model OCR framework. Patched for API-only install (torch/transformers moved to optional local-models extra).

python fork optical-character-recognition api-only arabic-pdf arabic-ocr ocr-framework

Updated May 25, 2026
Python

seemafarrukh22-byte / arabic-manuscript-nlp

End-to-end Arabic manuscript digitization and AI summarization pipeline for digital humanities research.

python opencv ocr transformers tesseract digital-humanities arabic-nlp nlp-pipeline huggingface arabic-ocr

Updated May 24, 2026
Jupyter Notebook
