Trba
Trba by baudm, an image-to-text model with OCR capabilities. Understand and compare OCR features, benchmarks, and capabilities.
Comparison
| Feature | Trba | Interfaze |
|---|---|---|
| Input Modalities | image | image, text, audio, video, document |
| Native OCR | Yes | Yes |
| Long Document Processing | No | Yes |
| Language Support | unknown | 162+ |
| Native Speech-to-Text | No | Yes |
| Native Object Detection | No | Yes |
| Guardrail Controls | No | Yes |
| Context Input Size | unknown | 1M |
| Tool Calling | No | Tool calling supported + built-in browser, code execution, and web search |
OCR Capabilities
| Feature | Trba | Interfaze |
|---|---|---|
| Text Bounding Boxes | No | Yes |
| Confidence Scores | No | Yes |
| Dense Image Processing | No | Yes |
| Low Quality Images | No | Yes |
| Handwritten Text | No | Yes |
| Charts, Tables & Equations | No | Yes |
Scaling
| Feature | Trba | Interfaze |
|---|---|---|
| Scaling | Self-hosted/Provider-hosted with quantization | Unlimited |
TRBA model pre-trained on various real scene text recognition (STR) datasets at an image size of 128x32.
Disclaimer: this model card was not written by the original authors.
Model description
TODO
Intended uses & limitations
You can use the model for STR on images containing Latin characters (62 case-sensitive alphanumeric characters and 32 punctuation marks).
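The character counts above can be checked with a short sketch. It assumes the 32 punctuation marks are the ASCII punctuation set exposed by Python's `string.punctuation`; the model card does not list them, so treat that mapping as an assumption. The helper name `unsupported_chars` is hypothetical and only illustrates how one might screen ground-truth labels before evaluation.

```python
import string

# Assumption: the supported charset is the 10 digits, the 52 upper- and
# lower-case ASCII letters, and the 32 ASCII punctuation marks.
ALPHANUMERIC = string.digits + string.ascii_letters  # 62 characters
PUNCTUATION = string.punctuation                     # 32 characters
CHARSET = ALPHANUMERIC + PUNCTUATION                 # 94 characters in total


def unsupported_chars(label):
    """Return characters in `label` outside the assumed charset.

    Characters are reported once each, in order of first appearance.
    Hypothetical helper, not part of the model's own code.
    """
    seen = []
    for ch in label:
        if ch not in CHARSET and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    print(len(ALPHANUMERIC), len(PUNCTUATION), len(CHARSET))
    print(unsupported_chars("Hello,World!"))
    print(unsupported_chars("Café No. 7"))
```

Under this assumption, whitespace and accented letters fall outside the charset, so labels containing them would need to be filtered or normalized before comparing against model output.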
How to use
TODO
BibTeX entry and citation info
@InProceedings{Baek_2019_ICCV,
author = {Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
title = {What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {10},
year = {2019}
}