Quality vs Price

[Chart: mean quality vs. blended price. The green region highlights the "most attractive quadrant".]

Model summary

Mean quality across the 9 benchmarks and blended $/M tokens at a 1:1 input:output weighting, sorted by quality. Gemini-3-Flash and Interfaze sit in the most attractive quadrant on the chart above.

Model              Mean quality  Blended $/MTok
Interfaze          80.4%         $2.500
Gemini-3.5-Flash   77.1%         $5.250
Gemini-3-Flash     74.4%         $1.750
Claude-Sonnet-4.6  69.1%         $9.000
Grok-4.3           64.7%         $1.875
GPT-5.4-Mini       62.5%         $2.625

Pricing reference

Public list prices, in USD per million tokens, used to compute the blended price axis. The blended price is a 1:1 baseline that weights input and output costs equally (50% each).

Model              Input $/MTok  Output $/MTok
Interfaze          $1.50         $3.50
Gemini-3-Flash     $0.50         $3.00
Gemini-3.5-Flash   $1.50         $9.00
Claude-Sonnet-4.6  $3.00         $15.00
GPT-5.4-Mini       $0.75         $4.50
Grok-4.3           $1.25         $2.50
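
The blended figures in the model summary can be reproduced from the list prices above. The following is a minimal sketch, not the leaderboard's own code: the function name `blended_price` and the `input_weight` parameter are illustrative assumptions, and only the prices themselves come from the pricing table.

```python
# List prices in USD per million tokens, copied from the pricing table:
# model -> (input price, output price).
LIST_PRICES = {
    "Interfaze": (1.50, 3.50),
    "Gemini-3-Flash": (0.50, 3.00),
    "Gemini-3.5-Flash": (1.50, 9.00),
    "Claude-Sonnet-4.6": (3.00, 15.00),
    "GPT-5.4-Mini": (0.75, 4.50),
    "Grok-4.3": (1.25, 2.50),
}


def blended_price(input_price: float, output_price: float,
                  input_weight: float = 0.5) -> float:
    """Weighted average of input and output list prices.

    input_weight=0.5 is the 1:1 weighting used on this page; the
    parameter itself is a hypothetical knob added for illustration.
    """
    return input_weight * input_price + (1.0 - input_weight) * output_price


blended = {
    model: blended_price(input_price, output_price)
    for model, (input_price, output_price) in LIST_PRICES.items()
}

# Print models from cheapest to most expensive blended price.
for model, price in sorted(blended.items(), key=lambda item: item[1]):
    print(f"{model:<18} ${price:.3f}")
```

Changing `input_weight` (for example to 0.75 for an input-heavy workload) shows how sensitive the ranking is to the 1:1 assumption, since output prices here are between 2x and 6x the input prices.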

Methodology

Quality. The mean of each model's scores across the 9 benchmarks on this leaderboard. VoxPopuli is reported as word error rate (WER), where lower is better, so it is flipped to (1 − WER) to make higher better on every benchmark. Models without an audio modality are averaged over the other 8 benchmarks, so they are not penalized for missing VoxPopuli.
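
The averaging rule described above can be sketched as follows. This is an illustration under stated assumptions, not the leaderboard's implementation: the function name `mean_quality`, the benchmark names `bench_a` and `bench_b`, and all score values are invented for the example, and scores are assumed to be fractions in the range 0 to 1 rather than percentages.

```python
from typing import Mapping, Optional


def mean_quality(scores: Mapping[str, Optional[float]],
                 wer_benchmark: str = "VoxPopuli") -> float:
    """Average per-benchmark scores so that higher is always better.

    scores maps benchmark name -> score in [0, 1], or None when the
    model has no result for that benchmark (e.g. no audio modality).
    """
    values = []
    for name, value in scores.items():
        if value is None:
            # Missing benchmarks are skipped, not counted as zero.
            continue
        if name == wer_benchmark:
            # WER is lower-is-better, so flip it to (1 - WER).
            value = 1.0 - value
        values.append(value)
    if not values:
        raise ValueError("no benchmark scores to average")
    return sum(values) / len(values)


# Hypothetical scores, invented purely for illustration.
with_audio = {"bench_a": 0.80, "bench_b": 0.60, "VoxPopuli": 0.10}
text_only = {"bench_a": 0.80, "bench_b": 0.60, "VoxPopuli": None}

print(f"with audio: {mean_quality(with_audio):.3f}")
print(f"text only:  {mean_quality(text_only):.3f}")
```

Skipping a missing benchmark, rather than scoring it as zero, means a text-only model's mean is taken over fewer benchmarks, so its score is not directly comparable on the audio dimension.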

Price. Public list prices in USD per million tokens, blended at 50% input / 50% output. Caching, batching, volume discounts, and per-modality pricing (e.g. Gemini's separate audio rate, image token packing) are excluded because they vary per workload; using list prices keeps the comparison apples-to-apples. Effective per-task cost can differ for reasoning-heavy or multimodal workloads.