Interfaze

Document intelligence like you trained your own OCR model

Extract text and objects from PDFs, scanned documents, and images with per-word bounding boxes and confidence scores. 100+ languages, layout-aware, and structured output.

What you get

  • Per-word and per-line bounding boxes with confidence scores
  • 100+ languages including mixed-language documents
  • Layout-aware: tables, multi-column, headers & footers
  • Native PDF and image support (PNG, JPEG, WebP, TIFF)
  • Structured output extraction with any JSON schema
  • Precontext metadata: raw OCR data alongside model responses
  • Old scans, handwriting, math equations, degraded documents
  • <5s latency for single-page OCR tasks

Works with any AI SDK

The API is OpenAI-compatible (chat completions), so you can drop in your existing SDK: the OpenAI SDK, the Vercel AI SDK, or LangChain. The example below uses the OpenAI SDK.

import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const interfaze = new OpenAI({
    baseURL: "https://api.interfaze.ai/v1",
    apiKey: "<your-api-key>"
});

const InvoiceSchema = z.object({
    vendor: z.string(),
    date: z.string(),
    items: z.array(z.object({
        description: z.string(),
        amount: z.string()
    })),
    total: z.string(),
});

const response = await interfaze.chat.completions.create({
    model: "interfaze-beta",
    messages: [{
        role: "user",
        content: [
            { type: "text", text: "Extract the invoice details" },
            { type: "image_url", image_url: { url: "https://example.com/invoice.pdf" } },
        ],
    }],
    response_format: zodResponseFormat(InvoiceSchema, "invoice"),
});

console.log(response.choices[0].message.content);
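The request above asks for output shaped like InvoiceSchema. As a rough sketch of what you might do with the result, the snippet below parses a JSON string of that shape and checks that the line items add up to the total. It assumes message.content arrives as a JSON string, as it does with OpenAI structured outputs; the helper names (toCents, parseInvoice, itemsMatchTotal) and the sample values are hypothetical and not part of the Interfaze API.

```typescript
// Shape mirrors InvoiceSchema from the example above.
interface Invoice {
  vendor: string;
  date: string;
  items: { description: string; amount: string }[];
  total: string;
}

// Hypothetical helper: convert a string like "$1,200.50" to integer cents.
// Assumes US-style formatting (comma thousands separator, dot decimal).
function toCents(amount: string): number {
  const cleaned = amount.replace(/[^0-9.\-]/g, "");
  const value = parseFloat(cleaned);
  if (isNaN(value)) {
    throw new Error("Unparseable amount: " + amount);
  }
  return Math.round(value * 100);
}

// Parse the JSON string (assumed to be what message.content holds).
function parseInvoice(content: string): Invoice {
  const data = JSON.parse(content) as Invoice;
  if (!Array.isArray(data.items)) {
    throw new Error("Invoice has no items array");
  }
  return data;
}

// Sanity check: do the line items sum to the stated total?
function itemsMatchTotal(invoice: Invoice): boolean {
  const sum = invoice.items.reduce((acc, item) => acc + toCents(item.amount), 0);
  return sum === toCents(invoice.total);
}

// Made-up sample standing in for response.choices[0].message.content.
const sampleContent = JSON.stringify({
  vendor: "Example Co",
  date: "2024-01-15",
  items: [
    { description: "Consulting", amount: "$1,200.50" },
    { description: "Hosting", amount: "$99.50" },
  ],
  total: "$1,300.00",
});

const invoice = parseInvoice(sampleContent);
console.log(invoice.vendor, itemsMatchTotal(invoice));
```

A mismatch does not necessarily mean the extraction is wrong, since real invoices often include tax or discounts outside the line items. Treat the check as a prompt for review rather than a hard failure.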

Use cases

Invoices & receipts

Line items, totals, tax, vendor details

IDs & passports

Names, DOB, document numbers, photos

Forms & applications

Field-level extraction with labels

Contracts & legal

Clauses, parties, dates, signatures

Academic papers

Equations, citations, figures, tables

Historical scans

Degraded text, old typefaces, handwriting

Run task mode

Skip the full model and run OCR directly. Task mode is faster and cheaper, and it returns raw bounding boxes with confidence scores.

// System prompt triggers raw OCR mode
{ role: "system", content: "<task>ocr</task>" }

// Returns sections with bounding boxes:
{
  "extracted_text": "California\nDRIVER LICENSE\n...",
  "sections": [{
    "lines": [{
      "text": "California",
      "bounds": { "top_left": { "x": 63, "y": 89 }, ... },
      "average_confidence": 0.99,
      "words": [{ "text": "California", "confidence": 0.99 }]
    }]
  }],
  "width": 698,
  "height": 525
}
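One way to use the per-word confidence scores is to flag words for manual review. The sketch below types only the fields visible in the response above (bounds is left out because its full shape is elided there) and collects every word below a threshold. The function name lowConfidenceWords, the 0.9 threshold, and the sample values are illustrative assumptions, not part of the API.

```typescript
// Types cover only the fields shown in the sample response above.
interface OcrWord {
  text: string;
  confidence: number;
}

interface OcrLine {
  text: string;
  average_confidence: number;
  words: OcrWord[];
}

interface OcrSection {
  lines: OcrLine[];
}

interface OcrResult {
  extracted_text: string;
  sections: OcrSection[];
  width: number;
  height: number;
}

// Hypothetical helper: collect every word whose confidence is below threshold.
function lowConfidenceWords(result: OcrResult, threshold: number): OcrWord[] {
  const flagged: OcrWord[] = [];
  for (const section of result.sections) {
    for (const line of section.lines) {
      for (const word of line.words) {
        if (word.confidence < threshold) {
          flagged.push(word);
        }
      }
    }
  }
  return flagged;
}

// Made-up sample in the same shape; the confidence values are illustrative.
const sampleResult: OcrResult = {
  extracted_text: "California\nDRIVER LICENSE",
  sections: [
    {
      lines: [
        {
          text: "California",
          average_confidence: 0.99,
          words: [{ text: "California", confidence: 0.99 }],
        },
        {
          text: "DRIVER LICENSE",
          average_confidence: 0.8,
          words: [
            { text: "DRIVER", confidence: 0.97 },
            { text: "LICENSE", confidence: 0.62 },
          ],
        },
      ],
    },
  ],
  width: 698,
  height: 525,
};

const needsReview = lowConfidenceWords(sampleResult, 0.9);
console.log(needsReview.map((w) => w.text)); // [ 'LICENSE' ]
```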

OCR benchmarks

Interfaze leads on olmOCR with 85.7% overall accuracy across tables, old scans, math, multi-column, and more.

Model               olmOCR    OCRBench V2
Interfaze           85.7%     70.7%
Gemini-3-Flash      75.3%     55.8%
Gemini-3.5-Flash    82.3%     63.9%
Claude-Sonnet-4.6   73.9%     54.7%
GPT-5.4-Mini        80.1%     52.7%
Grok-4.3            81.9%     54.7%

Input tokens: $1.50 / MTok

Output tokens: $3.50 / MTok

Caching: Included
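To make the per-token rates concrete, here is a small sketch that estimates the cost of a request from its token counts. The function name estimateCostUsd is hypothetical, and the calculation ignores caching and any free-tier allowance, so treat the result as an upper-bound estimate rather than a billing figure.

```typescript
// Rates from the pricing above, in USD per million tokens (MTok).
const INPUT_USD_PER_MTOK = 1.5;
const OUTPUT_USD_PER_MTOK = 3.5;

// Hypothetical helper: estimated cost in USD, ignoring caching.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1e6) * INPUT_USD_PER_MTOK;
  const outputCost = (outputTokens / 1e6) * OUTPUT_USD_PER_MTOK;
  return inputCost + outputCost;
}

// One million tokens in each direction: 1.50 + 3.50.
console.log(estimateCostUsd(1e6, 1e6)); // 5

// A smaller request, formatted to five decimal places for display.
console.log(estimateCostUsd(2000, 500).toFixed(5));
```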

Start extracting text from documents

Free tier available. No credit card required.