Interfaze

logo

Beta

pricing

docs

blog

sign in

Get Started

Introduction

Examples

Vision

Concepts

Resources

Integrations

Lowering costs & improving performance

copy markdown

Using Run Tasks

Run tasks allows you to programmatically run parts of the model or built-in tools without activating the full model making it significantly faster and cheaper.

Learn more about running a task.

This is great if you're using Interfaze to extract raw outputs that you can then map to a structure you like using code.

Doing more in one request with precontext

For example, if you're planning to transcribe an audio, then translate it then classify it, you can do it in one request. That request will output a field called precontext that contains the raw data for each of the specialized tasks allowing you to get full structured transcription, translation and classification all in one request instead of three separate requests.

Learn more about precontext.

This reduces the cost if you're doing multiple tasks in one request.

Large files as URLs

Passing a file URL in the prompt is significantly faster and cheaper than passing the file as a base64 encoded string.

While there are use cases for passing files as base64, it can affect speed for larger files having the raw base64 string being passed through the network. URLs on the other hand will be fetched and read at inference time making it faster and cheaper.

For examples if you're looking to transcribe a 1 hours audio file, uploading the file as base64 would take longer than the transcription itself. Using a URL would be faster 100x faster.

Learn more about handling files.

Structured outputs

Use the structured output feature to get strong consistency and type safety in your responses. Don't pass a JSON Schema as a string in the prompt which will require to model to keep retrying to get the correct output leading to higher costs.

Structured outputs are faster and cheaper than passing a JSON Schema as a string in the prompt.

Learn more about structured outputs.