Interfaze

Beta

pricing

help

docs

blog

Get Started

Introduction

Examples

Vision

OCR (Image & Document)

Object Detection

GUI Detection

Web

Scraping

Audio

Speech-to-Text (STT)

Speaker Diarization

Translation

Code Sandboxing

Guardrails

Concepts

Precontext

Run Tasks

Structured Outputs

Reasoning

Streaming

Function Calling

Handling Files

Resources

Lowering costs & improving speed

Limits

Security

Supported Languages

FAQs

Projects

Interfaze as tools

Postgres LLM

Integrations

OpenAI SDK

Vercel AI SDK

Langchain SDK

n8n Integration

API Reference

Chat Completion API

Lowering costs & improving performance

copy markdown

Using Run Tasks

Run tasks allows you to programmatically run parts of the model or built-in tools without activating the full model making it significantly faster and cheaper.

Learn more about running a task.

This is great if you're using Interfaze to extract raw outputs that you can then map to a structure you like using code.

Doing more in one request with precontext

For example, if you're planning to transcribe an audio, then translate it then classify it, you can do it in one request. That request will output a field called precontext that contains the raw data for each of the specialized tasks allowing you to get full structured transcription, translation and classification all in one request instead of three separate requests.

Learn more about precontext.

This reduces the cost if you're doing multiple tasks in one request.

Large files as URLs

Passing a file URL in the prompt is significantly faster and cheaper than passing the file as a base64 encoded string.

While there are use cases for passing files as base64, it can affect speed for larger files having the raw base64 string being passed through the network. URLs on the other hand will be fetched and read at inference time making it faster and cheaper.

For examples if you're looking to transcribe a 1 hours audio file, uploading the file as base64 would take longer than the transcription itself. Using a URL would be faster 100x faster.

Learn more about handling files.

Structured outputs

Use the structured output feature to get strong consistency and type safety in your responses. Don't pass a JSON Schema as a string in the prompt which will require to model to keep retrying to get the correct output leading to higher costs.

Structured outputs are faster and cheaper than passing a JSON Schema as a string in the prompt.

Learn more about structured outputs.

Handling Files

Limits