Extract text and objects from PDFs, scanned documents, and images with per-word bounding boxes and confidence scores. Supports 100+ languages, with layout-aware extraction and structured output.
OpenAI-compatible chat completion API — drop in your existing SDK.
Examples are available for the OpenAI SDK, Vercel AI SDK, and LangChain SDK; the OpenAI SDK version is shown here.
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const interfaze = new OpenAI({
  baseURL: "https://api.interfaze.ai/v1",
  apiKey: "<your-api-key>"
});

const InvoiceSchema = z.object({
  vendor: z.string(),
  date: z.string(),
  items: z.array(z.object({
    description: z.string(),
    amount: z.string()
  })),
  total: z.string(),
});

const response = await interfaze.chat.completions.create({
  model: "interfaze-beta",
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "Extract the invoice details" },
      { type: "image_url", image_url: { url: "https://example.com/invoice.pdf" } },
    ],
  }],
  response_format: zodResponseFormat(InvoiceSchema, "invoice"),
});
console.log(response.choices[0].message.content);

Invoices & receipts: line items, totals, tax, vendor details
IDs & passports: names, DOB, document numbers, photos
Forms & applications: field-level extraction with labels
Contracts & legal: clauses, parties, dates, signatures
Academic papers: equations, citations, figures, tables
Historical scans: degraded text, old typefaces, handwriting
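The invoice example above logs `message.content`, which in an OpenAI-compatible chat completion is a JSON string (or `null`), not a typed object. The following is a minimal sketch of turning that string into a checked value using plain TypeScript. The `Invoice` interface mirrors `InvoiceSchema` from the example; `isInvoice` and `parseInvoiceContent` are hypothetical helper names, not part of any SDK. With zod already in the project, `InvoiceSchema.parse` would do the same job.

```typescript
// Shape mirrors InvoiceSchema from the example above.
interface InvoiceItem {
  description: string;
  amount: string;
}

interface Invoice {
  vendor: string;
  date: string;
  items: InvoiceItem[];
  total: string;
}

// Hypothetical helper: runtime check that an unknown value has the Invoice shape.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) return false;
  const { vendor, date, total, items } = value as Record<string, unknown>;
  if (typeof vendor !== "string" || typeof date !== "string" || typeof total !== "string") {
    return false;
  }
  if (!Array.isArray(items)) return false;
  return items.every((item: unknown) => {
    if (typeof item !== "object" || item === null) return false;
    const { description, amount } = item as Record<string, unknown>;
    return typeof description === "string" && typeof amount === "string";
  });
}

// Hypothetical helper: returns null when content is missing, is not valid JSON,
// or does not match the Invoice shape.
function parseInvoiceContent(content: string | null): Invoice | null {
  if (content === null) return null;
  try {
    const parsed: unknown = JSON.parse(content);
    return isInvoice(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

Usage would look like `const invoice = parseInvoiceContent(response.choices[0].message.content);`, followed by a `null` check before reading `invoice.items`.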
Skip the full model and run OCR directly — faster, cheaper, and returns raw bounding boxes with confidence scores.
// System prompt triggers raw OCR mode
{ role: "system", content: "<task>ocr</task>" }

// Returns sections with bounding boxes:
{
  "extracted_text": "California\nDRIVER LICENSE\n...",
  "sections": [{
    "lines": [{
      "text": "California",
      "bounds": { "top_left": { "x": 63, "y": 89 }, ... },
      "average_confidence": 0.99,
      "words": [{ "text": "California", "confidence": 0.99 }]
    }]
  }],
  "width": 698,
  "height": 525
}
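One common use of the per-word confidence scores is routing uncertain words to manual review. The sketch below types only the fields visible in the sample response above and collects words under a threshold. The interface names and `findLowConfidenceWords` are hypothetical, and because the sample elides the rest of `bounds`, only `top_left` is modeled. The second line of the sample data is made up for illustration.

```typescript
interface Point {
  x: number;
  y: number;
}

interface OcrWord {
  text: string;
  confidence: number;
}

interface OcrLine {
  text: string;
  // Only top_left is visible in the sample; the elided fields are not modeled.
  bounds: { top_left: Point };
  average_confidence: number;
  words: OcrWord[];
}

interface OcrResult {
  extracted_text: string;
  sections: { lines: OcrLine[] }[];
  width: number;
  height: number;
}

interface FlaggedWord {
  text: string;
  confidence: number;
  lineText: string;
  lineTopLeft: Point;
}

// Hypothetical helper: collect every word whose confidence is below the threshold,
// along with the text and top-left corner of the line it belongs to.
function findLowConfidenceWords(result: OcrResult, threshold: number): FlaggedWord[] {
  const flagged: FlaggedWord[] = [];
  for (const section of result.sections) {
    for (const line of section.lines) {
      for (const word of line.words) {
        if (word.confidence < threshold) {
          flagged.push({
            text: word.text,
            confidence: word.confidence,
            lineText: line.text,
            lineTopLeft: line.bounds.top_left,
          });
        }
      }
    }
  }
  return flagged;
}

// Illustrative data: the first line comes from the sample above,
// the second line and its scores are invented for this example.
const sampleResult: OcrResult = {
  extracted_text: "California\nDRIVER LICENSE",
  sections: [{
    lines: [
      {
        text: "California",
        bounds: { top_left: { x: 63, y: 89 } },
        average_confidence: 0.99,
        words: [{ text: "California", confidence: 0.99 }],
      },
      {
        text: "DRIVER LICENSE",
        bounds: { top_left: { x: 63, y: 120 } },
        average_confidence: 0.795,
        words: [
          { text: "DRIVER", confidence: 0.97 },
          { text: "LICENSE", confidence: 0.62 },
        ],
      },
    ],
  }],
  width: 698,
  height: 525,
};

// With a 0.9 threshold, only "LICENSE" (0.62) is flagged.
const needsReview = findLowConfidenceWords(sampleResult, 0.9);
```

The threshold is a tuning choice: a higher value sends more words to review, a lower one trusts the OCR output more.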
Interfaze leads on olmOCR with 85.7% overall accuracy across tables, old scans, math, multi-column, and more.
| Model | olmOCR | OCRBench V2 |
|---|---|---|
| Interfaze | 85.7% | 70.7% |
| Gemini-3-Flash | 75.3% | 55.8% |
| Gemini-3.5-Flash | 82.3% | 63.9% |
| Claude-Sonnet-4.6 | 73.9% | 54.7% |
| GPT-5.4-Mini | 80.1% | 52.7% |
| Grok-4.3 | 81.9% | 54.7% |
| Item | Price |
|---|---|
| Input tokens | $1.50 / MTok |
| Output tokens | $3.50 / MTok |
| Caching | Included |
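To make the per-token rates concrete, here is a small sketch of the arithmetic, assuming MTok means one million tokens and that cost scales linearly with token count. `estimateCostUSD` is a hypothetical helper; it ignores caching, which the pricing above lists as included but does not quantify.

```typescript
// Rates taken from the pricing above, in USD per million tokens.
const INPUT_USD_PER_MTOK = 1.5;
const OUTPUT_USD_PER_MTOK = 3.5;

// Hypothetical helper: linear cost estimate with no caching adjustment.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MTOK;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MTOK;
  return inputCost + outputCost;
}

// 2M input tokens cost 3.00 and 0.5M output tokens cost 1.75, for 4.75 in total.
const exampleCost = estimateCostUSD(2_000_000, 500_000);
```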