Get Started
Examples
Concepts
Resources
Integrations
copy markdown
Extract text and bounds with confidence scores from dense images and large documents including handwritten text, printed documents, screenshots, and other visual content.
OpenAI SDK
Vercel AI SDK
LangChain SDK
Bounding boxes mapped to the image

JSON output
object contains the extracted information defined in the schema. precontext contains the raw metadata such as bounding boxes and confidence scores.
Document: https://arxiv.org/pdf/2602.04101
OpenAI SDK
Vercel AI SDK
LangChain SDK
JSON output
The output is truncated for this example.
Running OCR as a tasks with <task>ocr</task> in the system message make it cheaper and faster with a fixed structured output that's pre-defined.
Learn more about running a task.
OpenAI SDK
Vercel AI SDK
LangChain SDK
JSON output