Fastest API Response Times: GPT, Claude, Gemini, Mistral Benchmarked
We benchmarked GPT-4, Claude 3.5, Gemini 1.5, and Mistral’s open models for API speed. Here’s which LLM delivers the fastest response time—and why it matters for real-world apps.
AI Benchmarks: API Response Speed
For developers building with LLMs, speed is often as critical as accuracy.
Whether you're powering:
Customer support bots
AI coding assistants
Voice interfaces
Real-time agents
…latency determines user experience.
In this benchmark, we tested four major LLM APIs for response time across typical workloads, including:
Prompt completion
Streaming speed
First-token latency
Full-output latency
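First-token and full-output latency can be measured the same way regardless of vendor: time the stream from the moment the request is sent. A minimal sketch, assuming only that the client exposes the response as an iterable of text chunks (the `fake_stream` generator below is a stand-in for a real API stream, not any provider's SDK):

```python
import time

def measure_stream_latency(stream):
    """Time a token stream: first-token latency and full-output latency.

    `stream` is any iterable yielding text chunks, e.g. the chunks of a
    streaming LLM API response.
    """
    start = time.perf_counter()
    first_token = None
    n_chunks = 0
    for _chunk in stream:
        if first_token is None:
            # Latency until the very first chunk arrives.
            first_token = time.perf_counter() - start
        n_chunks += 1
    total = time.perf_counter() - start
    return {"first_token_s": first_token, "full_output_s": total, "n_chunks": n_chunks}

# Simulated stream standing in for a real API response:
def fake_stream(n=5, delay=0.01):
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"

stats = measure_stream_latency(fake_stream())
```

Swapping `fake_stream()` for a real streaming call reproduces the first-token and full-output numbers reported below.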
Models Benchmarked
We selected top-performing models from each major provider:
| Model | Provider | Version Tested |
|---|---|---|
| GPT-4 Turbo | OpenAI | |
| Claude 3.5 Sonnet | Anthropic | |
| Gemini 1.5 Pro | Google | |
| Mixtral 8x7B | Mistral | Open-weight via Fireworks.ai |
Each model was tested with a set of prompts across:
Text generation (short and long)
Summarization
Code output
Q&A over context
All models used standard rate limits, and benchmarks were run on a fast, stable connection.
Average Latency Results
| Test Case | GPT-4 Turbo | Claude 3.5 | Gemini 1.5 | Mixtral (8x7B) |
|---|---|---|---|---|
| First Token (ms) | 350–500 ms | 300–400 ms | 400–600 ms | 250–300 ms |
| Full Output (100 tokens) | 1.6–1.9s | 1.2–1.5s | 1.7–2.0s | 0.9–1.1s |
| Full Output (500 tokens) | 4.8–5.4s | 4.1–4.8s | 5.6–6.0s | 2.3–2.8s |
| Streaming Start | ~500 ms | ~400 ms | ~600 ms | ~300 ms |
Winner in raw speed: Mixtral (Mistral 8x7B)
Best performance among proprietary models: Claude 3.5 Sonnet
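The 500-token full-output figures also imply each model's effective decode throughput. A quick calculation from the table above (ranges are worst-case to best-case tokens/sec):

```python
# Best/worst-case full-output times for 500 tokens, from the table above.
results_s = {
    "GPT-4 Turbo": (4.8, 5.4),
    "Claude 3.5": (4.1, 4.8),
    "Gemini 1.5": (5.6, 6.0),
    "Mixtral 8x7B": (2.3, 2.8),
}

# tokens/sec = 500 tokens / elapsed seconds (slowest first, fastest second)
throughput = {
    model: (round(500 / worst), round(500 / best))
    for model, (best, worst) in results_s.items()
}

for model, (lo, hi) in throughput.items():
    print(f"{model}: {lo}-{hi} tokens/sec")
```

By this measure Mixtral decodes at roughly 179–217 tokens/sec versus about 93–104 for GPT-4 Turbo, which is the gap the raw latency numbers are reflecting.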
Key Insights
Mixtral is fast and lightweight, making it ideal for latency-sensitive workloads (chatbots, agents, voice interfaces).
Claude 3.5 delivers consistently low latency and near-GPT quality output, especially on long-form completions.
GPT-4 Turbo, while slower than others, is still within acceptable bounds for non-realtime use and excels in reliability.
Gemini 1.5 had the slowest response times, particularly on longer completions. Its latency improved with smaller contexts but still trailed the field.
Why Latency Matters
Voice apps: Users expect a reply within 1s.
Live chat agents: Every second of delay kills UX.
Agentic workflows: Cumulative lag compounds across tool calls.
Realtime apps: API overhead + model lag = friction.
Even a 500ms speed gain can significantly improve retention and perceived intelligence in your product.
What You Can Learn
Don’t just pick the smartest model—match the model to the workload.
For latency-critical apps, consider Mistral or Claude over GPT.
Streaming APIs help—but first-token speed is the key to “feeling fast.”
Benchmark your own prompts—vendor latency varies based on task and traffic.
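Because vendor latency fluctuates with traffic, a single timed request tells you little. A minimal sketch of a repeat-and-summarize harness (the `lambda` is a stub standing in for your real API call):

```python
import statistics
import time

def benchmark(call_model, n_runs=10):
    """Repeat a model call and summarize latency.

    Reports median and approximate p95 rather than a single run,
    since per-request latency varies with task and traffic.
    """
    samples = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        call_model()  # your real API request goes here
        samples.append(time.perf_counter() - t0)
    samples.sort()
    p95_index = max(0, int(0.95 * len(samples)) - 1)
    return {
        "median_s": statistics.median(samples),
        "p95_s": samples[p95_index],
    }

# Stub call standing in for a real request:
report = benchmark(lambda: time.sleep(0.005), n_runs=10)
```

Run it against each candidate model with your own prompts; the median tells you typical feel, the p95 tells you the worst experience your users will actually hit.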
Marco Fazio
Editor, Latestly AI
Forbes 30 Under 30
We hope you enjoyed this Latestly AI edition.
📧 Got an AI tool for us to review or do you want to collaborate?
Send us a message and let us know!
Was this edition forwarded to you? Sign up here