Chat Completion API

Base URL

https://api.interfaze.ai/v1

Authentication

All requests must be authenticated with an API key passed in the Authorization header as a Bearer token.

Authorization: Bearer <your-api-key>

Create and manage API keys from the dashboard.

Create chat completion

Creates a model response for the given chat conversation.

POST https://api.interfaze.ai/v1/chat/completions

Request headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer <your-api-key> |
| Content-Type | Yes | application/json |
| x-show-additional-info | No | Set to true to include precontext inline at the start of a streamed response. Defaults to false. |

Example request

cURL

curl https://api.interfaze.ai/v1/chat/completions \
  -H "Authorization: Bearer $INTERFAZE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "interfaze-beta",
    "messages": [
      { "role": "user", "content": "Who is the founder of Interfaze?" }
    ]
  }'
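
The same request can be sketched in Python with only the standard library. This is a minimal illustration rather than an official client: the helper names `build_chat_request` and `create_chat_completion` are hypothetical, and only the URL, headers, and body fields come from this reference.

```python
import json
import urllib.request

BASE_URL = "https://api.interfaze.ai/v1"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_chat_completion(api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_chat_request(api_key, payload)) as resp:
        return json.load(resp)


# Same body as the cURL example above.
payload = {
    "model": "interfaze-beta",
    "messages": [{"role": "user", "content": "Who is the founder of Interfaze?"}],
}
```

Calling `create_chat_completion(api_key, payload)` with a real key performs the network request. Note that `urllib` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so error bodies have to be read from the exception.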

Body parameters

Summary of fields accepted on the request body. Each parameter is fully documented below.

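| Parameter | Type | Required | Default |
| --- | --- | --- | --- |
| model | string | Yes | |
| messages | array | Yes | |
| stream | boolean | No | false |
| response_format | object | No | {type: "text"} |
| tools | array | No | |
| tool_choice | string or object | No | "auto" |
| reasoning_effort | string | No | off |
| max_tokens | integer | No | 32000 |
| temperature | number | No | 1 |
| top_p | number | No | 1 |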

model

Type: string  ·  Required

The id of the model to use. Currently the only supported value is interfaze-beta.

{ "model": "interfaze-beta" }

messages

Type: array  ·  Required

A list of messages making up the conversation so far. Each item is a message object.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string | Yes | One of system, user, assistant, or tool. |
| content | string or array | Yes | Either a plain string or an array of content parts for multimodal input. |
| name | string | No | Optional name of the participant. |
| tool_calls | array | Assistant only | Tool calls the model wants the caller to execute. See Function calling. |
| tool_call_id | string | Tool messages only | The id of the tool call this message is responding to. |

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Who is the founder of Interfaze?" }
  ]
}

Content parts

When content is an array, each item is a typed part. Mix and match to send multimodal input (text + images + files in a single message).

Text part

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "text". |
| text | string | Yes | The text content. |

{
  "type": "text",
  "text": "Extract the details from this ID"
}

Image part

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "image_url". |
| image_url | object | Yes | Wrapper object containing the image source. |
| image_url.url | string | Yes | Publicly accessible URL or base64 data URL (data:image/jpeg;base64,...). |

{
  "type": "image_url",
  "image_url": {
    "url": "https://r2public.jigsawstack.com/interfaze/examples/id.jpg"
  }
}

File part

For PDFs, audio, video, and other documents.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "file". |
| file | object | Yes | Wrapper object containing the file source. |
| file.filename | string | Yes | Filename including extension. Used for MIME inference. |
| file.file_data | string | Yes | Publicly accessible URL or base64 data URL (data:<mime>;base64,...). |

{
  "type": "file",
  "file": {
    "filename": "report.pdf",
    "file_data": "https://example.com/report.pdf"
  }
}

See Handling files for size limits and SDK-specific helpers.
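
As a rough sketch of how the three part types combine into one message, the Python helpers below build each part and encode local bytes as a base64 data URL. The helper names are hypothetical, and guessing the MIME type from the filename with `mimetypes` is an assumption made for this example, not documented server behavior.

```python
import base64
import mimetypes


def text_part(text: str) -> dict:
    return {"type": "text", "text": text}


def image_part(url: str) -> dict:
    return {"type": "image_url", "image_url": {"url": url}}


def file_part(filename: str, file_data: str) -> dict:
    return {"type": "file", "file": {"filename": filename, "file_data": file_data}}


def to_data_url(raw: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data URL; the MIME type is guessed from the filename."""
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    return f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"


# One user message mixing text, an image URL, and an inline file.
# The PDF bytes are a placeholder, not a valid document.
message = {
    "role": "user",
    "content": [
        text_part("Extract the details from this ID"),
        image_part("https://r2public.jigsawstack.com/interfaze/examples/id.jpg"),
        file_part("report.pdf", to_data_url(b"placeholder bytes", "report.pdf")),
    ],
}
```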

stream

Type: boolean  ·  Default: false

If true, partial message deltas are sent as server-sent events as they are generated. The stream terminates with data: [DONE]. See the Streaming section later in this reference for the chunk format, and the Streaming guide for SDK examples.

{ "stream": true }

response_format

Type: object  ·  Default: plain text

Constrains the model output to a specific format. Follows the OpenAI structured output specification.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | "text" (default) or "json_schema". |
| json_schema | object | When type is "json_schema" | The JSON schema configuration. |

When type is "json_schema", json_schema accepts:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Identifier for the schema (e.g. "id_schema"). |
| schema | object | Yes | A valid JSON Schema definition describing the output shape. |
| strict | boolean | No | Whether to strictly enforce the schema. Defaults to false. |

{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "id_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "first_name": { "type": "string" },
          "last_name": { "type": "string" }
        },
        "required": ["first_name", "last_name"],
        "additionalProperties": false
      }
    }
  }
}

See Structured Outputs for full examples.
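
A minimal Python sketch of building this parameter and reading the result back. It assumes the structured output arrives as a JSON-encoded string in the message content, as in the example response in this reference; the helper names are illustrative only.

```python
import json


def json_schema_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Build a response_format value of type "json_schema"."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": strict, "schema": schema},
    }


def parse_structured_content(response: dict) -> dict:
    """Decode the JSON string held in the first choice's message content."""
    return json.loads(response["choices"][0]["message"]["content"])


id_schema = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
    },
    "required": ["first_name", "last_name"],
    "additionalProperties": False,
}

request_body = {
    "model": "interfaze-beta",
    "messages": [{"role": "user", "content": "Extract the name from this ID"}],
    "response_format": json_schema_format("id_schema", id_schema),
}
```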

tools

Type: array  ·  Optional

A list of tools (functions) the model may call. Follows the OpenAI function calling schema.

Each tool object:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Always "function". |
| function | object | Yes | The function definition. |
| function.name | string | Yes | Unique function name (snake_case recommended). |
| function.description | string | No | What the function does. Helps the model decide when to call it. |
| function.parameters | object | No | JSON Schema describing the function's arguments. |

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_horoscope",
        "description": "Get today's horoscope for an astrological sign.",
        "parameters": {
          "type": "object",
          "properties": {
            "sign": {
              "type": "string",
              "description": "An astrological sign like Taurus or Aquarius"
            }
          },
          "required": ["sign"]
        }
      }
    }
  ]
}

See Function calling for the full multi-turn flow.
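
The caller's side of a tool call might look like the Python sketch below. It assumes each entry in tool_calls has the OpenAI shape (an id plus function.name and function.arguments, where arguments is a JSON string); this reference does not spell that shape out. The local `get_horoscope` implementation and the registry are hypothetical.

```python
import json


def get_horoscope(sign: str) -> str:
    # Hypothetical local implementation of the tool declared in the request.
    return f"{sign}: a good day to read API docs."


TOOL_REGISTRY = {"get_horoscope": get_horoscope}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and build the matching role="tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOL_REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # assumed JSON-encoded
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": fn(**args)}
        )
    return results
```

The returned messages would be appended to the conversation after the assistant message and sent back in a follow-up request.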

tool_choice

Type: string | object  ·  Default: "auto"

Controls which (if any) tool the model calls.

| Value | Behavior |
| --- | --- |
| "auto" | The model decides whether and which tool to call. |
| "none" | The model will not call any tool. |
| "required" | The model must call at least one tool. |
| {"type": "function", "function": {"name": "<name>"}} | Forces the model to call the specified function. |

{
  "tool_choice": { "type": "function", "function": { "name": "get_horoscope" } }
}

reasoning_effort

Type: string  ·  Default: off

Enables extended reasoning. The model spends more compute and thinking tokens before producing a final answer.

| Value | Description |
| --- | --- |
| "low" | Light reasoning pass. |
| "medium" | Moderate reasoning. |
| "high" | Deep reasoning. Recommended for math, science, and complex agents. |

When set, the response contains a reasoning field with the model's thinking trace. In streaming mode, reasoning tokens stream first inside <think>...</think> tags.

{ "reasoning_effort": "high" }

See Reasoning for details.

max_tokens

Type: integer  ·  Default: 32000

The maximum number of tokens to generate in the completion, with a hard upper bound of 32,000. The input tokens plus max_tokens cannot exceed the 1M-token context window.

{ "max_tokens": 1024 }
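
The two limits can be checked client-side before sending a request. The Python sketch below is illustrative only: it assumes "1M" means exactly 1,000,000 tokens and that the prompt token count is already known (for example from the usage field of an earlier response).

```python
MAX_COMPLETION_TOKENS = 32_000  # hard upper bound stated in this section
CONTEXT_WINDOW = 1_000_000      # assumption: "1M" read as exactly 1,000,000 tokens


def clamp_max_tokens(requested: int, prompt_tokens: int) -> int:
    """Return the largest max_tokens value that respects both limits."""
    if requested < 1:
        raise ValueError("max_tokens must be at least 1")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining < 1:
        raise ValueError("prompt already fills the context window")
    return min(requested, MAX_COMPLETION_TOKENS, remaining)
```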

temperature

Type: number  ·  Default: 1

Sampling temperature between 0 and 2. Higher values produce more random output, lower values make the output more focused and deterministic. Use 0 together with a seed for the most reproducible results.

{ "temperature": 0.2 }

top_p

Type: number  ·  Default: 1

Nucleus sampling. The model considers only the most likely tokens whose cumulative probability mass reaches top_p. For example, 0.1 means only the tokens making up the top 10% of probability mass are sampled from. Generally, adjust either temperature or top_p, not both.

{ "top_p": 0.9 }

System prompt extensions

Interfaze recognizes two special XML-style directives inside the system message that change how the model behaves.

<task> — run a single built-in task

Runs a specialized part of the model directly without invoking the full LLM. This is faster and cheaper, and it returns a fixed precontext schema. The response_format must be an open (empty) JSON schema that accepts any type.

<task>ocr</task>

| Task | Description |
| --- | --- |
| ocr | Optical character recognition on images and documents |
| object_detection | Detect objects in images |
| gui_detection | Detect GUI elements in images |
| web_search | Web search |
| scraper | Extract structured data from web pages |
| speech_to_text | Speech to text transcription |
| translate | Translation |

See Run Tasks.

<guard> — content safety guardrails

Block or flag unsafe content matching one or more safety codes.

<guard>S1, S2, S3, S10, S12_IMAGE</guard>

Supported codes: S1, S1_IMAGE, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11, S12, S12_IMAGE, S13, S14, S15_IMAGE, ALL.

See Guardrails for the full list and behavior.
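
A small Python sketch of composing a guarded system message. The code list is copied from this section; placing the directive on the first line of the system message, ahead of the prompt text, is an assumption, since this reference only says the directive goes inside the system message. The helper names are hypothetical.

```python
SUPPORTED_GUARD_CODES = {
    "S1", "S1_IMAGE", "S2", "S3", "S4", "S5", "S6", "S7", "S8", "S9",
    "S10", "S11", "S12", "S12_IMAGE", "S13", "S14", "S15_IMAGE", "ALL",
}


def guard_directive(codes: list) -> str:
    """Validate the codes locally and render the <guard> directive."""
    unknown = [code for code in codes if code not in SUPPORTED_GUARD_CODES]
    if unknown:
        raise ValueError(f"unsupported guard codes: {unknown}")
    return f"<guard>{', '.join(codes)}</guard>"


def guarded_system_message(prompt: str, codes: list) -> dict:
    # Assumption: directive first, then the regular system prompt.
    return {"role": "system", "content": f"{guard_directive(codes)}\n{prompt}"}
```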

Response object

A successful non-streaming response returns a chat completion object that follows the OpenAI shape, plus the Interfaze-specific top-level fields listed in the table (most notably precontext).

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the completion. |
| object | string | Always chat.completion. |
| model | string | The model used. |
| choices | array | A list of completion choices. Each contains an index, message, and finish_reason. |
| usage | object | Token usage: prompt_tokens, completion_tokens, total_tokens. |
| reasoning | string | Present when reasoning_effort is set. The model's thinking trace. |
| precontext | array | Raw outputs from any internal tasks the model ran (OCR, STT, web search, etc.). See precontext. |
| vcache | boolean | Whether the response was served from the model's verified cache. |

Example response

{
  "id": "interfaze-1775270750639",
  "object": "chat.completion",
  "model": "interfaze-beta",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"name\":\"Yoeven D Khemlani\"}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10512,
    "completion_tokens": 7056,
    "total_tokens": 17568
  },
  "vcache": false,
  "precontext": [
    {
      "name": "web_search",
      "result": [
        {
          "title": "Interfaze | Y Combinator",
          "url": "https://www.ycombinator.com/companies/interfaze",
          "description": "AI model built for deterministic developer tasks."
        }
      ]
    }
  ]
}
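
Reading a response like the one above could be sketched in Python as follows. The helper names are illustrative; the field names are taken from the example, trimmed here to the parts the helpers use.

```python
def first_message_content(response: dict) -> str:
    """Return the assistant text from the first choice."""
    return response["choices"][0]["message"]["content"]


def precontext_results(response: dict, task_name: str) -> list:
    """Collect the raw result of every precontext entry produced by one task."""
    return [
        entry["result"]
        for entry in response.get("precontext") or []
        if entry.get("name") == task_name
    ]


# Trimmed copy of the example response.
example = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "{\"name\":\"Yoeven D Khemlani\"}"},
            "finish_reason": "stop",
        }
    ],
    "vcache": False,
    "precontext": [
        {
            "name": "web_search",
            "result": [
                {
                    "title": "Interfaze | Y Combinator",
                    "url": "https://www.ycombinator.com/companies/interfaze",
                }
            ],
        }
    ],
}

search_hits = precontext_results(example, "web_search")
```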

Streaming

Set "stream": true in the request body to receive deltas as server-sent events. Each chunk follows the OpenAI streaming format:

data: {"id":"...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}

data: {"id":"...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}

data: [DONE]
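
A minimal Python sketch of consuming this stream, assuming each event arrives as a single `data:` line exactly as shown above. The function name is hypothetical; it accepts any iterable of decoded text lines, so it can be fed from an HTTP response or from a list in a test.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield each content fragment from a stream of SSE lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece


sample = [
    'data: {"id":"x","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}',
    "",
    'data: {"id":"x","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
full_text = "".join(iter_content_deltas(sample))  # "Hello"
```

With `urllib`, the response object can be iterated line by line and each line decoded as UTF-8 before being passed in.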

Streaming with precontext

By default, streamed responses do not include precontext. To opt in, add the header:

x-show-additional-info: true

When enabled, a single chunk containing the precontext is emitted before the main response begins. It is wrapped in XML tags inside the content delta so you can parse it from the stream:

<precontext>
{
  "name": "ocr",
  "result": { ... }
}
</precontext>
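
One way to separate the precontext from the answer is sketched below in Python. It assumes the content deltas have already been concatenated into one string and that the text between the tags is valid JSON; the function name is hypothetical.

```python
import json
import re

_PRECONTEXT_RE = re.compile(r"<precontext>(.*?)</precontext>", re.DOTALL)


def split_precontext(content: str) -> tuple:
    """Return (parsed precontext entries, remaining answer text)."""
    entries = [json.loads(raw) for raw in _PRECONTEXT_RE.findall(content)]
    remainder = _PRECONTEXT_RE.sub("", content).lstrip()
    return entries, remainder
```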

Streaming with reasoning

When reasoning_effort is set and stream is true, reasoning tokens are delivered first, wrapped in <think> tags:

<think>
Thinking step 1...
Thinking step 2...
</think>

Final answer begins here...

See Reasoning and Streaming for more.
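
Separating the thinking trace from the final answer might look like this in Python. The sketch assumes the streamed content has been accumulated into one string and that at most one leading <think> block is present; the function name is hypothetical.

```python
def split_reasoning(content: str) -> tuple:
    """Return (reasoning, answer); reasoning is empty if no <think> block leads."""
    start, end = "<think>", "</think>"
    if content.lstrip().startswith(start) and end in content:
        head, _, tail = content.partition(end)
        return head.split(start, 1)[1].strip(), tail.strip()
    return "", content.strip()
```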

Errors

Errors follow the OpenAI error shape and use standard HTTP status codes.

{
  "error": {
    "message": "Invalid API key provided.",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

| Status | Type | Meaning |
| --- | --- | --- |
| 400 | invalid_request_error | The request body is malformed or missing required fields. |
| 401 | authentication_error | The API key is missing, invalid, or revoked. |
| 402 | insufficient_quota | Your account has insufficient credits. Top up from the dashboard. |
| 413 | payload_too_large | Request body or input file exceeds size limits. See Limits. |
| 429 | rate_limit_error | You exceeded the rate limit. Retry with exponential backoff. |
| 500 | internal_error | Server-side failure. Safe to retry. |
| 503 | service_unavailable | Temporary capacity issue. Retry with backoff. |
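
The retry guidance for 429, 500, and 503 can be sketched as exponential backoff with jitter. This Python example is an illustration under stated assumptions: the base delay, cap, attempt count, and the `send` callable (which returns a status code and a decoded body) are all hypothetical choices, not values documented by the API.

```python
import random
import time

RETRYABLE_STATUSES = {429, 500, 503}  # statuses this section marks as retryable


def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep) -> tuple:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status, body = send()
    for attempt in range(max_attempts - 1):
        if not is_retryable(status):
            break
        sleep(backoff_delay(attempt))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the helper testable without real delays. Errors such as 400, 401, 402, and 413 are returned immediately because retrying them unchanged would not help.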
