DigitalbyDefault.ai

Mercury

World's first commercial diffusion LLM — 1,000+ tokens per second

4.5 (920 reviews)
Developer Tools

About

Mercury by Inception Labs is the first commercial-scale diffusion large language model, fundamentally changing how AI generates code and text. Rather than predicting tokens one at a time, Mercury generates a coarse draft and then refines it in parallel across many positions at once, reaching 1,000+ tokens per second on NVIDIA H100s, roughly five times faster than speed-optimized autoregressive models. Mercury 2, launched February 2026, adds full reasoning capabilities while maintaining sub-two-second latency. On Copilot Arena, Mercury Coder Mini ties for second place, outperforming GPT-4o Mini and Gemini Flash. Ideal for teams with high-throughput coding needs, CI pipelines, or latency-sensitive IDE integrations where response speed directly affects developer flow.
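The parallel-refinement idea can be illustrated with a toy sketch (this is an illustration of masked-diffusion-style decoding in general, not Inception Labs' actual algorithm): decoding starts from a fully masked sequence, and each refinement step commits several positions at once, so a 12-token output finishes in 3 steps instead of 12 one-token autoregressive steps.

```python
import random

random.seed(0)  # deterministic toy run
VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]

def diffusion_decode(length: int, steps: int) -> list:
    """Toy coarse-to-fine decoding: fill masked positions in parallel batches."""
    seq = ["[MASK]"] * length
    for step in range(steps):
        masked = [i for i, tok in enumerate(seq) if tok == "[MASK]"]
        # Commit a batch of positions per step; a real diffusion model would
        # keep its highest-confidence predictions, this toy picks at random.
        k = max(1, len(masked) // (steps - step))
        for i in random.sample(masked, k):
            seq[i] = VOCAB[i % len(VOCAB)]  # stand-in for a model prediction
    return seq

out = diffusion_decode(length=12, steps=3)
print(" ".join(out))  # all 12 tokens produced in only 3 parallel steps
```

The speedup comes from exactly this structure: the number of refinement steps is fixed and small, while autoregressive decoding scales linearly with output length.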

Key Features

Diffusion-based parallel token generation
1,000+ tokens per second on H100s
Reasoning-capable (Mercury 2)
5x faster than speed-optimized autoregressive LLMs
SWE-bench competitive performance
OpenAI-compatible API
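Because the API is OpenAI-compatible, an existing OpenAI client can typically be pointed at Mercury by swapping the base URL. A minimal sketch of the request shape follows; the model name "mercury-coder" and the exact endpoint are assumptions to verify against Inception Labs' API docs.

```python
import json

# Sketch of an OpenAI-style chat-completions payload for Mercury.
# The model name "mercury-coder" is an assumption; confirm the actual
# model IDs and base URL in Inception Labs' documentation.
def build_chat_request(prompt: str, model: str = "mercury-coder") -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # streaming makes the 1,000+ tok/s throughput visible
    }

payload = build_chat_request("Write a binary search in Python.")
print(json.dumps(payload, indent=2))
```

POST this JSON to the provider's `/v1/chat/completions` endpoint with an API key; the official `openai` Python client should also work by setting its `base_url` to Mercury's endpoint.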

Integrations

GitHub
VS Code
Cursor
Continue
JetBrains

Reviews

No reviews yet. Be the first to share your experience.

Free / from $19/mo
freemium plan
Category: Developer Tools
Pricing: freemium
Rating: 4.5/5
Reviews: 920
Status: Verified
