Paddleocr Onnx

Paddleocr Onnx by monkt is an image-to-text model with OCR capabilities. Understand and compare its OCR features, benchmarks, and capabilities.

Comparison

| Feature | Paddleocr Onnx | Interfaze |
| --- | --- | --- |
| Input Modalities | image | image, text, audio, video, document |
| Native OCR | Yes | Yes |
| Long Document Processing | No | Yes |
| Language Support | 48 partial | 162+ |
| Native Speech-to-Text | No | Yes |
| Native Object Detection | No | Yes |
| Guardrail Controls | No | Yes |
| Context Input Size | unknown | 1M |
| Tool Calling | No | Tool calling supported + built in browser, code execution and web search |

OCR Capabilities

| Feature | Paddleocr Onnx | Interfaze |
| --- | --- | --- |
| Text Bounding Boxes | Yes | Yes |
| Confidence Scores | No | Yes |
| Dense Image Processing | No | Yes |
| Low Quality Images | No | Yes |
| Handwritten Text | No | Yes |
| Charts, Tables & Equations | No | Yes |

Scaling

| Feature | Paddleocr Onnx | Interfaze |
| --- | --- | --- |
| Scaling | Self-hosted/Provider-hosted with quantization | Unlimited |

Multilingual OCR models from PaddleOCR, converted to ONNX format for production deployment.

Use as a complete pipeline: Integrate with monkt.com for end-to-end document processing.

Source: PaddlePaddle PP-OCRv5 Collection
Format: ONNX (optimized for inference)
License: Apache 2.0


Overview

16 models covering 48+ languages:

  • 11 PP-OCRv5 models (latest, highest accuracy)
  • 5 PP-OCRv3 models (legacy, additional language support)

Quick Start

Download from HuggingFace

Install the dependencies:

pip install huggingface_hub rapidocr-onnxruntime

Download individual model files and run OCR:

from huggingface_hub import hf_hub_download
from rapidocr_onnxruntime import RapidOCR

# Download the v5 detection model and the English recognition model
det_path = hf_hub_download("monkt/paddleocr-onnx", "detection/v5/det.onnx")
rec_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/rec.onnx")
dict_path = hf_hub_download("monkt/paddleocr-onnx", "languages/english/dict.txt")

# Run OCR with the downloaded files
ocr = RapidOCR(det_model_path=det_path, rec_model_path=rec_path, rec_keys_path=dict_path)
result, elapsed = ocr("document.jpg")

Download a detection model together with a whole language group:

from huggingface_hub import snapshot_download

# PP-OCRv5: detection plus the Latin-script recognition model
snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v5/*", "languages/latin/*"])

# PP-OCRv3: detection plus the Arabic recognition model
snapshot_download("monkt/paddleocr-onnx", allow_patterns=["detection/v3/*", "languages/arabic/*"])

Or clone the full repository:

git clone https://huggingface.co/monkt/paddleocr-onnx
cd paddleocr-onnx

Basic Usage

from rapidocr_onnxruntime import RapidOCR

ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/english/rec.onnx",
    rec_keys_path="languages/english/dict.txt"
)

result, elapsed = ocr("document.jpg")
# Each entry is [box, text, score]; result is None when no text is found
for box, text, score in result or []:
    print(text)  # Extracted text
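
The raw output is easier to work with once it is turned into records and filtered by confidence. The following is a minimal sketch, assuming each entry of result has the form [box, text, score] and that result is None when nothing is detected. The helper name parse_result and the 0.5 threshold are illustrative choices, not part of rapidocr-onnxruntime.

```python
from typing import Any, Dict, List, Optional


def parse_result(result: Optional[List[Any]], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Convert raw OCR output into dicts, dropping low-confidence lines.

    Assumes each entry looks like [box, text, score]; `result` may be None.
    """
    records = []
    for box, text, score in result or []:
        score = float(score)  # coerce in case the score arrives as a string
        if score >= min_score:
            records.append({"box": box, "text": text, "score": score})
    return records


# Hand-written sample in the assumed output format (not real OCR output).
sample = [
    [[[0, 0], [50, 0], [50, 10], [0, 10]], "Invoice", 0.98],
    [[[0, 20], [50, 20], [50, 30], [0, 30]], "t0tal", "0.31"],
]
for record in parse_result(sample):
    print(record["text"], record["score"])
```

With a real pipeline, the call would be parse_result(result) on the first value returned by ocr("document.jpg").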

Available Models

PP-OCRv5 Recognition Models

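| Language Group | Path | Languages | Accuracy | Size |
| --- | --- | --- | --- | --- |
| English | languages/english/ | English | 85.25% | 7.5 MB |
| Latin | languages/latin/ | French, German, Spanish, Italian, Portuguese, + 27 more | 84.7% | 7.5 MB |
| East Slavic | languages/eslav/ | Russian, Bulgarian, Ukrainian, Belarusian | 81.6% | 7.5 MB |
| Korean | languages/korean/ | Korean | 88.0% | 13 MB |
| Chinese/Japanese | languages/chinese/ | Chinese, Japanese | - | 81 MB |
| Thai | languages/thai/ | Thai | 82.68% | 7.5 MB |
| Greek | languages/greek/ | Greek | 89.28% | 7.4 MB |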
Greeklanguages/greek/Greek89.28%7.4 MB

PP-OCRv3 Recognition Models (Legacy)

| Language Group | Path | Languages | Version | Size |
| --- | --- | --- | --- | --- |
| Devanagari | languages/hindi/ | Hindi, Marathi, Nepali, Sanskrit | v3 | 8.6 MB |
| Arabic | languages/arabic/ | Arabic, Urdu, Persian/Farsi | v3 | 8.6 MB |
| Tamil | languages/tamil/ | Tamil | v3 | 8.6 MB |
| Telugu | languages/telugu/ | Telugu | v3 | 8.6 MB |

Detection Models

| Model | Path | Version | Size |
| --- | --- | --- | --- |
| PP-OCRv5 Detection | detection/v5/det.onnx | v5 | 84 MB |
| PP-OCRv3 Detection | detection/v3/det.onnx | v3 | 2.3 MB |

Note: Use v5 detection with v5 recognition models. Use v3 detection with v3 recognition models.
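
One way to avoid mismatched versions is to derive the detection path from the language group instead of typing it by hand. This is a sketch under the assumption that the repository layout matches the paths listed in the tables; the function name rapidocr_kwargs and the two group sets are illustrative, not something the model card defines.

```python
from pathlib import PurePosixPath

# Group names taken from the recognition model tables.
V5_GROUPS = {"english", "latin", "eslav", "korean", "chinese", "thai", "greek"}
V3_GROUPS = {"hindi", "arabic", "tamil", "telugu"}


def rapidocr_kwargs(group: str, root: str = ".") -> dict:
    """Build RapidOCR keyword arguments with a version-matched detection model."""
    if group in V5_GROUPS:
        version = "v5"
    elif group in V3_GROUPS:
        version = "v3"
    else:
        raise ValueError(f"unknown language group: {group!r}")
    base = PurePosixPath(root)
    return {
        "det_model_path": str(base / "detection" / version / "det.onnx"),
        "rec_model_path": str(base / "languages" / group / "rec.onnx"),
        "rec_keys_path": str(base / "languages" / group / "dict.txt"),
    }


print(rapidocr_kwargs("arabic")["det_model_path"])
print(rapidocr_kwargs("korean")["det_model_path"])
```

The returned dict can be unpacked directly, for example RapidOCR(**rapidocr_kwargs("arabic")), when run from a local copy of the repository.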

Preprocessing Models (Optional)

| Model | Path | Purpose | Accuracy | Size |
| --- | --- | --- | --- | --- |
| Document Orientation | preprocessing/doc-orientation/ | Corrects rotated documents (0°, 90°, 180°, 270°) | 99.06% | 6.5 MB |
| Text Line Orientation | preprocessing/textline-orientation/ | Corrects upside-down text (0°, 180°) | 98.85% | 6.5 MB |
| Document Unwarping | preprocessing/doc-unwarping/ | Fixes curved/warped documents | - | 30 MB |

Language Support

PP-OCRv5 Languages (40+)

Latin Script (32 languages): English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Slovak, Croatian, Bosnian, Serbian, Slovenian, Danish, Norwegian, Swedish, Icelandic, Estonian, Lithuanian, Hungarian, Albanian, Welsh, Irish, Turkish, Indonesian, Malay, Afrikaans, Swahili, Tagalog, Uzbek, Latin

Cyrillic: Russian, Bulgarian, Ukrainian, Belarusian

East Asian: Chinese (Simplified, Traditional), Japanese (Hiragana, Katakana, Kanji), Korean

Southeast Asian: Thai

Other: Greek

PP-OCRv3 Languages (9)

South Asian: Hindi, Marathi, Nepali, Sanskrit, Tamil, Telugu

Middle Eastern: Arabic, Urdu, Persian/Farsi


Usage Examples

from rapidocr_onnxruntime import RapidOCR

# PP-OCRv5 models: pair each recognition model with detection/v5.
# Each snippet below is an alternative; create only the one you need.

# English
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/english/rec.onnx",
    rec_keys_path="languages/english/dict.txt"
)

# Latin-script languages (French, German, Spanish, ...)
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/latin/rec.onnx",
    rec_keys_path="languages/latin/dict.txt"
)

# East Slavic (Russian, Bulgarian, Ukrainian, Belarusian)
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/eslav/rec.onnx",
    rec_keys_path="languages/eslav/dict.txt"
)

# Korean
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/korean/rec.onnx",
    rec_keys_path="languages/korean/dict.txt"
)

# Chinese and Japanese
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/chinese/rec.onnx",
    rec_keys_path="languages/chinese/dict.txt"
)

# Thai
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/thai/rec.onnx",
    rec_keys_path="languages/thai/dict.txt"
)

# Greek
ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/greek/rec.onnx",
    rec_keys_path="languages/greek/dict.txt"
)

from rapidocr_onnxruntime import RapidOCR

# PP-OCRv3 models: pair each recognition model with detection/v3.

# Hindi, Marathi, Nepali, Sanskrit
ocr = RapidOCR(
    det_model_path="detection/v3/det.onnx",
    rec_model_path="languages/hindi/rec.onnx",
    rec_keys_path="languages/hindi/dict.txt"
)

# Arabic, Urdu, Persian/Farsi
ocr = RapidOCR(
    det_model_path="detection/v3/det.onnx",
    rec_model_path="languages/arabic/rec.onnx",
    rec_keys_path="languages/arabic/dict.txt"
)

# Tamil
ocr = RapidOCR(
    det_model_path="detection/v3/det.onnx",
    rec_model_path="languages/tamil/rec.onnx",
    rec_keys_path="languages/tamil/dict.txt"
)

# Telugu
ocr = RapidOCR(
    det_model_path="detection/v3/det.onnx",
    rec_model_path="languages/telugu/rec.onnx",
    rec_keys_path="languages/telugu/dict.txt"
)

Full Pipeline with Preprocessing

Preprocessing models improve accuracy on rotated or distorted documents:

from rapidocr_onnxruntime import RapidOCR


ocr = RapidOCR(
    det_model_path="detection/v5/det.onnx",
    rec_model_path="languages/english/rec.onnx",
    rec_keys_path="languages/english/dict.txt",
    # Optional preprocessing
    use_angle_cls=True,
    angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
)

result, elapsed = ocr("rotated_document.jpg")

When to use preprocessing:

  • Document Orientation (doc-orientation/): Scanned documents with unknown rotation (0°/90°/180°/270°)
  • Text Line Orientation (textline-orientation/): Upside-down text lines (0°/180°)
  • Document Unwarping (doc-unwarping/): Curved pages, warped documents, camera photos

Performance impact: 10-30% higher accuracy on distorted images, with minimal speed overhead.
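
To make the document-orientation step concrete, here is a minimal sketch of applying its prediction. It assumes the classifier's label is the clockwise angle by which the page is rotated, so the correction rotates counter-clockwise by the same amount; that convention is an assumption and should be checked against the model's config before relying on it. The image is a plain nested list so the example runs without dependencies, and the function names are illustrative.

```python
from typing import List

Image = List[List[int]]


def rotate_ccw_90(image: Image) -> Image:
    """Rotate a row-major image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def undo_rotation(image: Image, predicted_angle: int) -> Image:
    """Return the upright image given a predicted clockwise angle (0/90/180/270)."""
    if predicted_angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported angle: {predicted_angle}")
    for _ in range(predicted_angle // 90):
        image = rotate_ccw_90(image)
    return image


upright = [[1, 2], [3, 4]]
rotated_cw_90 = [[3, 1], [4, 2]]  # `upright` turned 90 degrees clockwise
print(undo_rotation(rotated_cw_90, 90) == upright)
```

In practice the same correction would be applied to the pixel array before detection and recognition run.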


Repository Structure

.
├── detection/
│   ├── v5/
│   │   ├── det.onnx          # 84 MB - PP-OCRv5 detection
│   │   └── config.json
│   └── v3/
│       ├── det.onnx          # 2.3 MB - PP-OCRv3 detection
│       └── config.json
│
├── languages/
│   ├── english/
│   │   ├── rec.onnx          # 7.5 MB
│   │   ├── dict.txt
│   │   └── config.json
│   ├── latin/                # 32 languages
│   ├── eslav/                # Russian, Bulgarian, Ukrainian, Belarusian
│   ├── korean/
│   ├── chinese/              # Chinese, Japanese
│   ├── thai/
│   ├── greek/
│   ├── hindi/                # Hindi, Marathi, Nepali, Sanskrit (v3)
│   ├── arabic/               # Arabic, Urdu, Persian (v3)
│   ├── tamil/                # Tamil (v3)
│   └── telugu/               # Telugu (v3)
│
└── preprocessing/
    ├── doc-orientation/
    ├── textline-orientation/
    └── doc-unwarping/

Model Selection

| Document Language | Model Path |
| --- | --- |
| English | languages/english/ |
| French, German, Spanish, Italian, Portuguese | languages/latin/ |
| Russian, Bulgarian, Ukrainian, Belarusian | languages/eslav/ |
| Korean | languages/korean/ |
| Chinese, Japanese | languages/chinese/ |
| Thai | languages/thai/ |
| Greek | languages/greek/ |
| Hindi, Marathi, Nepali, Sanskrit | languages/hindi/ + detection/v3/ |
| Arabic, Urdu, Persian/Farsi | languages/arabic/ + detection/v3/ |
| Tamil | languages/tamil/ + detection/v3/ |
| Telugu | languages/telugu/ + detection/v3/ |
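
The selection table can be encoded as a small lookup so that application code asks for a language rather than a directory. This is a sketch only: the mapping is transcribed from the table, and the names LANGUAGE_TO_GROUP and model_dir_for are hypothetical.

```python
# Transcribed from the Model Selection table. The Latin model covers more
# languages than the five listed there; extend the mapping as needed.
LANGUAGE_TO_GROUP = {
    "english": "english",
    "french": "latin", "german": "latin", "spanish": "latin",
    "italian": "latin", "portuguese": "latin",
    "russian": "eslav", "bulgarian": "eslav",
    "ukrainian": "eslav", "belarusian": "eslav",
    "korean": "korean",
    "chinese": "chinese", "japanese": "chinese",
    "thai": "thai",
    "greek": "greek",
    "hindi": "hindi", "marathi": "hindi", "nepali": "hindi", "sanskrit": "hindi",
    "arabic": "arabic", "urdu": "arabic", "persian": "arabic", "farsi": "arabic",
    "tamil": "tamil",
    "telugu": "telugu",
}


def model_dir_for(language: str) -> str:
    """Return the recognition model directory for a document language."""
    key = language.strip().lower()
    if key not in LANGUAGE_TO_GROUP:
        raise KeyError(f"no model listed for language: {language!r}")
    return f"languages/{LANGUAGE_TO_GROUP[key]}/"


print(model_dir_for("Urdu"))
print(model_dir_for("Japanese"))
```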

Technical Specifications

  • Framework: PaddleOCR → ONNX
  • ONNX Opset: 11
  • Precision: FP32
  • Input Format: RGB images (dynamic size)
  • Inference: CPU/GPU via onnxruntime
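
When loading a model file directly with onnxruntime, the CPU/GPU choice comes down to the execution providers passed to the session. The sketch below picks CUDA when it is available and falls back to CPU otherwise. The helper choose_providers and the PREFERRED order are illustrative; only get_available_providers and the providers argument of InferenceSession come from onnxruntime itself.

```python
from typing import List

# Priority order is an assumption for this sketch; adjust to your hardware.
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def choose_providers(available: List[str]) -> List[str]:
    """Keep the preferred providers that are available, in priority order."""
    chosen = [p for p in PREFERRED if p in available]
    return chosen or ["CPUExecutionProvider"]


try:
    import onnxruntime as ort  # optional: only needed for real inference
    available = ort.get_available_providers()
except ImportError:
    available = ["CPUExecutionProvider"]  # fallback so the sketch still runs

providers = choose_providers(available)
print(providers)
# With onnxruntime installed, a session could then be created as:
# session = ort.InferenceSession("detection/v5/det.onnx", providers=providers)
```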

Detection Model

  • Input: (batch, 3, height, width) - dynamic
  • Output: Text bounding boxes

Recognition Model

  • Input: (batch, 3, 32, width) - height fixed at 32px
  • Output: CTC logits → decoded with dictionary
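
The decoding step can be illustrated with greedy CTC decoding: take the best class at each time step, merge consecutive repeats, and drop blanks. This is a minimal sketch, assuming class 0 is the CTC blank and class i maps to entry i - 1 of the dictionary; the actual index layout of these models should be confirmed against dict.txt and the model config. The toy charset and one-hot logits are made up for the example.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits: Sequence[Sequence[float]], charset: List[str]) -> str:
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks.

    Assumes class i (for i >= 1) maps to charset[i - 1].
    """
    decoded = []
    previous = BLANK
    for step in logits:
        best = max(range(len(step)), key=step.__getitem__)
        if best != BLANK and best != previous:
            decoded.append(charset[best - 1])
        previous = best
    return "".join(decoded)


def one_hot(index: int, size: int = 5) -> List[float]:
    """Build a toy logit row with all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


charset = ["h", "e", "l", "o"]
# Class sequence: h h blank e l blank l o
steps = [one_hot(i) for i in [1, 1, 0, 2, 3, 0, 3, 4]]
print(ctc_greedy_decode(steps, charset))  # prints "hello"
```

The blank between the two "l" steps is what lets the doubled letter survive the repeat-merging rule.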

Performance

Accuracy (PP-OCRv5)

| Model | Accuracy | Dataset |
| --- | --- | --- |
| Greek | 89.28% | 2,799 images |
| Korean | 88.0% | 5,007 images |
| English | 85.25% | 6,530 images |
| Latin | 84.7% | 3,111 images |
| Thai | 82.68% | 4,261 images |
| East Slavic | 81.6% | 7,031 images |

FAQ

Q: Which version should I use?
A: Use PP-OCRv5 models for best accuracy. Use PP-OCRv3 only for the South Asian and Middle Eastern languages not available in v5.

Q: Can I mix v5 and v3 models?
A: No. Use detection/v5/det.onnx with v5 recognition models, and detection/v3/det.onnx with v3 recognition models.

Q: GPU acceleration?
A: Install onnxruntime-gpu instead of onnxruntime for 10x faster inference.

Q: Commercial use?
A: Yes. Apache 2.0 license allows commercial use.

