An Optical Character Recognition (OCR) and Document Layout Analysis library written in Rust.
cargo add oar-ocr

With GPU support:

cargo add oar-ocr --features cuda

With auto-download of model files from ModelScope:

cargo add oar-ocr --features auto-download

Bare file names passed to the builders are then fetched from ModelScope into $OAR_HOME (default ~/.oar) and verified against their expected SHA-256. See docs/models.md for the exact path resolution rules.
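As a rough illustration of the default-directory behavior described above, the sketch below shows how a bare file name could map to a cache location. It is not the crate's implementation: `resolve_model_path` is a hypothetical helper, and the rule that any argument with a directory component counts as a local path is an assumption. docs/models.md remains the authority on the exact rules, and SHA-256 verification is left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of oar-ocr): decide where a model argument
/// would live on disk. `oar_home` stands in for the value of $OAR_HOME and
/// `user_home` for the user's home directory.
fn resolve_model_path(name: &str, oar_home: Option<&str>, user_home: &str) -> PathBuf {
    let path = Path::new(name);

    // Assumption: a "bare" name is a single normal path component,
    // e.g. "det.onnx" but not "models/det.onnx" or "./det.onnx".
    let is_bare = path.components().count() == 1 && path.file_name().is_some();
    if !is_bare {
        // Anything with a directory part is treated as a local path, used as given.
        return path.to_path_buf();
    }

    // Bare file name: place it under $OAR_HOME, falling back to ~/.oar.
    let cache_dir = match oar_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => Path::new(user_home).join(".oar"),
    };
    cache_dir.join(name)
}
```

In a real program the `oar_home` argument could come from `std::env::var("OAR_HOME").ok()`; passing it in explicitly keeps the function easy to test.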
use oar_ocr::prelude::*;
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the OCR pipeline
    let ocr = OAROCRBuilder::new(
        "pp-ocrv5_mobile_det.onnx",
        "pp-ocrv5_mobile_rec.onnx",
        "ppocrv5_dict.txt",
    )
    .build()?;

    // Load an image
    let image = load_image(Path::new("document.jpg"))?;

    // Run prediction
    let results = ocr.predict(vec![image])?;

    // Process results
    for text_region in &results[0].text_regions {
        if let Some((text, confidence)) = text_region.text_with_confidence() {
            println!("Text: {} ({:.2})", text, confidence);
        }
    }
    Ok(())
}

PP-OCRv6 ships in three sizes. Pass the bare file names below with the auto-download feature enabled to fetch them automatically, or pass local paths instead:
| Size | Detection | Recognition | Dictionary |
|---|---|---|---|
| tiny | pp-ocrv6_tiny_det.onnx | pp-ocrv6_tiny_rec.onnx | ppocrv6_tiny_dict.txt |
| small | pp-ocrv6_small_det.onnx | pp-ocrv6_small_rec.onnx | ppocrv6_dict.txt |
| medium | pp-ocrv6_medium_det.onnx | pp-ocrv6_medium_rec.onnx | ppocrv6_dict.txt |
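
The table can be encoded as a small lookup so that switching sizes changes one value rather than three strings. This is only a sketch: `Ppocrv6Size` and `ModelFiles` are hypothetical names introduced here and are not part of oar-ocr. The file names themselves are taken directly from the table.

```rust
/// The three PP-OCRv6 sizes from the table (hypothetical type, not part of oar-ocr).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ppocrv6Size {
    Tiny,
    Small,
    Medium,
}

/// Hypothetical bundle of the three file names one pipeline needs.
#[derive(Debug, PartialEq, Eq)]
struct ModelFiles {
    detection: String,
    recognition: String,
    dictionary: &'static str,
}

impl Ppocrv6Size {
    /// Size tag as it appears inside the model file names.
    fn tag(self) -> &'static str {
        match self {
            Ppocrv6Size::Tiny => "tiny",
            Ppocrv6Size::Small => "small",
            Ppocrv6Size::Medium => "medium",
        }
    }

    /// File names for this size, exactly as listed in the table.
    fn model_files(self) -> ModelFiles {
        let tag = self.tag();
        ModelFiles {
            detection: format!("pp-ocrv6_{tag}_det.onnx"),
            recognition: format!("pp-ocrv6_{tag}_rec.onnx"),
            // Only the tiny model has its own dictionary; small and medium share one.
            dictionary: match self {
                Ppocrv6Size::Tiny => "ppocrv6_tiny_dict.txt",
                Ppocrv6Size::Small | Ppocrv6Size::Medium => "ppocrv6_dict.txt",
            },
        }
    }
}
```

The three fields correspond, in order, to the detection, recognition, and dictionary arguments that the examples in this README pass to `OAROCRBuilder::new`.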
use oar_ocr::prelude::*;
use oar_ocr::domain::tasks::TextDetectionConfig;
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // PP-OCRv6 "tiny"; swap in small/medium (with ppocrv6_dict.txt) for higher accuracy.
    let ocr = OAROCRBuilder::new(
        "pp-ocrv6_tiny_det.onnx",
        "pp-ocrv6_tiny_rec.onnx",
        "ppocrv6_tiny_dict.txt",
    )
    // Official PP-OCRv6 detection defaults.
    .text_detection_config(TextDetectionConfig {
        score_threshold: 0.2,
        box_threshold: 0.45,
        unclip_ratio: 1.4,
        ..Default::default()
    })
    .build()?;

    let image = load_image(Path::new("document.jpg"))?;
    let results = ocr.predict(vec![image])?;

    for region in &results[0].text_regions {
        if let Some((text, confidence)) = region.text_with_confidence() {
            println!("{text} ({confidence:.2})");
        }
    }
    Ok(())
}

use oar_ocr::prelude::*;
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize structure analysis pipeline
    let structure = OARStructureBuilder::new("pp-doclayout_plus-l.onnx")
        .with_table_classification("pp-lcnet_x1_0_table_cls.onnx")
        .with_table_structure_recognition("slanet_plus.onnx", "wireless")
        .table_structure_dict_path("table_structure_dict_ch.txt")
        .with_ocr(
            "pp-ocrv5_mobile_det.onnx",
            "pp-ocrv5_mobile_rec.onnx",
            "ppocrv5_dict.txt",
        )
        .build()?;

    // Analyze document
    let result = structure.predict("document.jpg")?;

    // Output Markdown
    println!("{}", result.to_markdown());
    Ok(())
}

For advanced document understanding using Vision-Language Models (like PaddleOCR-VL, PaddleOCR-VL-1.5, PaddleOCR-VL-1.6, GLM-OCR, HunyuanOCR, and MinerU2.5), check out the oar-ocr-vl crate.
- Usage Guide - Detailed API usage, builder patterns, GPU configuration
- Pre-trained Models - Model download links and recommended configurations
The examples/ directory contains complete examples for various tasks:
# General OCR
cargo run --example ocr -- --help
# Document Structure Analysis
cargo run --example structure -- --help
# Layout Detection
cargo run --example layout_detection -- --help
# Table Structure Recognition
cargo run --example table_structure_recognition -- --help

This project builds upon the excellent work of several open-source projects:
- ort: Rust bindings for ONNX Runtime by pykeio. This crate provides the Rust interface to ONNX Runtime that powers the inference engine in this OCR library.
- PaddleOCR: Baidu's multilingual OCR toolkit based on PaddlePaddle. This project uses PaddleOCR's pre-trained models, which provide excellent accuracy and performance for text detection and recognition across multiple languages.
- Candle: A minimalist ML framework for Rust by Hugging Face. We use Candle to implement Vision-Language model inference.