We are introducing Interfaze-beta, the best AI trained for developer tasks, achieving outstanding results on reliability, consistency, and structured output. It outperforms SOTA models in tasks like OCR, web scraping, web search, coding, classification, and more.
It is a unified system based on a MoE (mixture-of-experts) architecture that routes to a suite of small models trained for specific tasks on custom infrastructure, giving each expert the right amount of context and control to carry out its task as effectively as possible.
Interfaze is OpenAI chat API compatible, which means it works with every AI SDK out of the box by swapping out the base URL and a few keys.
- Base URL: `https://api.interfaze.ai/v1`
- Model: `interfaze-beta`
- API key: `<INTERFAZE_API_KEY>` (get your key here)

With the OpenAI SDK on NodeJS (or any other SDK), setup is just a matter of swapping these values in.
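As a minimal sketch, here's that setup with the OpenAI Python SDK, assuming the key is stored in the `INTERFAZE_API_KEY` environment variable (the NodeJS version mirrors it):

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at Interfaze: only the base URL
# and the API key change.
client = OpenAI(
    base_url="https://api.interfaze.ai/v1",
    api_key=os.environ["INTERFAZE_API_KEY"],
)

response = client.chat.completions.create(
    model="interfaze-beta",
    messages=[{"role": "user", "content": "What file types can you process?"}],
)
print(response.choices[0].message.content)
```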
Learn more about setting up Interfaze with your own configuration & favourite SDK (temperature, structured responses, & reasoning) here.
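For instance, a structured-response call (reusing the client above) might look like the sketch below; parameter support is assumed to follow the standard OpenAI Chat Completions API, so check the docs for exactly which options `interfaze-beta` honours:

```python
response = client.chat.completions.create(
    model="interfaze-beta",
    temperature=0.2,  # lower temperature for more deterministic output
    response_format={"type": "json_object"},  # constrain the reply to valid JSON
    messages=[
        {
            "role": "user",
            "content": "Classify the sentiment of this review and reply as JSON"
            ' {"sentiment": "positive" | "negative"}: "Great product!"',
        }
    ],
)
print(response.choices[0].message.content)
```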
The MoE architecture allows for native delegation to expert models based on the task and objective of the input prompt. Each expert model is trained to handle multiple file types, from audio, images, and PDFs to text-based documents like CSV and JSON. The process is then combined with infrastructure tools, like web search or code execution, to validate the output and reduce hallucination. A more powerful reasoning/thinking model is an optional step that can be activated based on the complexity of the task. The final output is then either constrained to a defined JSON structure or returned as plain text.
Interfaze performs strongly on directed multi-turn tasks, reasoning, multimodality understanding, and perception-heavy tasks.
Interfaze's goal isn't to be the most knowledgeable scientific model, but to be the best developer-focused model, which means we're comparing against models that fall in the same bracket as Claude Sonnet 4, GPT-4.1, and GPT-5 (low reasoning), with a good balance of speed, quality, and cost.
On tasks involving multimodal inputs, Interfaze scores at the top with 90% accuracy (on ChartQA and AI2D) and takes second place on MMMU, trailing Gemini-2.5-Pro (thinking) by only 5% while outperforming other candidates like Claude-Sonnet-4-Thinking, Claude-Opus-4-Thinking, and GPT-4.1.
When it comes to math, we top the table with a score of 90% on the American Invitational Mathematics Examination 2025 (AIME 2025), outperforming GPT-4.1, GPT-5-Minimal, and the Claude family, including both thinking and non-thinking variants. On GPQA-Diamond, which requires PhD-level problem-solving, we trail Gemini 2.5 Pro by 3%.
For coding (LiveCodeBench v5), our numbers are strong compared to SoTA models: we outperform GPT-4.1, Claude Sonnet 4, Gemini-2.5-Flash, and GPT-5 Minimal.
Interfaze-beta outperforms other SoTA LLMs on perception tasks. Here are a few examples:
You can find the code generated by Interfaze here.
Throughout the following real-world use cases, we import the `interfaze_client` from `commons.py`:
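A minimal sketch of what `commons.py` might contain, assuming it simply wires up the OpenAI-compatible client from the setup above:

```python
# commons.py: shared Interfaze client used across the examples below
# (a sketch; the actual file ships with the example code linked above)
import os

from openai import OpenAI

interfaze_client = OpenAI(
    base_url="https://api.interfaze.ai/v1",
    api_key=os.environ["INTERFAZE_API_KEY"],
)
```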
Consider the following receipt from Walmart:
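A sketch of the extraction call, assuming the receipt is hosted at a placeholder URL and passed as a standard image input:

```python
from commons import interfaze_client

response = interfaze_client.chat.completions.create(
    model="interfaze-beta",
    response_format={"type": "json_object"},  # return structured JSON
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the merchant, line items with prices, "
                    "and the total from this receipt as JSON.",
                },
                {
                    "type": "image_url",
                    # placeholder URL standing in for the receipt image
                    "image_url": {"url": "https://example.com/walmart-receipt.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```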
Audio works as well. Take “Connecting the dots”, Steve Jobs' 2005 commencement address at Stanford University:
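One way the speech could be passed in is as a base64-encoded `input_audio` part, following the OpenAI multimodal message shape; whether Interfaze expects this exact shape (or, say, a URL) is an assumption, and the filename is a placeholder:

```python
import base64

from commons import interfaze_client

# Read the speech audio and encode it for the request.
with open("connecting_the_dots.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

response = interfaze_client.chat.completions.create(
    model="interfaze-beta",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the three stories in this speech."},
                {
                    "type": "input_audio",
                    "input_audio": {"data": audio_b64, "format": "mp3"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```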
Interfaze can verify complex code within an isolated code sandbox, adding a safety check to differentiate between unsafe code, edge cases, and real-world logic.
Consider the following:
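For instance, we can hand Interfaze a subtly buggy function and ask it to exercise the edge cases; the snippet and prompt below are illustrative, and the actual sandbox execution happens server-side:

```python
from commons import interfaze_client

# A subtly buggy median: it crashes on an empty list and, for even-length
# lists, returns the upper-middle element instead of averaging the middle two.
snippet = """
def median(xs):
    xs = sorted(xs)
    return xs[len(xs) // 2]
"""

response = interfaze_client.chat.completions.create(
    model="interfaze-beta",
    messages=[
        {
            "role": "user",
            "content": "Run this function against edge cases (empty list, "
            "even-length list) and report any bugs:\n" + snippet,
        }
    ],
)
print(response.choices[0].message.content)
```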
Interfaze provides configurable content safety guardrails. It allows you to automatically detect and filter potentially harmful or inappropriate content in both text and images, ensuring your applications maintain appropriate content standards.
Consider the following example, where we place guardrails to reject certain categories of requests:
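The real configuration surface lives in the guardrails documentation linked below; as a purely hypothetical sketch, assuming guardrail flags are passed through the OpenAI SDK's `extra_body` escape hatch:

```python
from commons import interfaze_client

response = interfaze_client.chat.completions.create(
    model="interfaze-beta",
    messages=[{"role": "user", "content": "..."}],
    # hypothetical guardrail options; see the guardrails docs for the
    # real parameter names and categories
    extra_body={"guardrails": {"nsfw": True, "violence": True}},
)
```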
All guardrails are documented here.
We're continuously improving Interfaze based on your feedback, and you can help shape the future of LLMs for developers. If you have any feedback, please reach out at yoeven@jigsawstack.com or join the Discord.