Whisper-large transcription is live. Embeddings & image generation in private beta. See live jobs →
Transcription · Available now · serving 312 jobs/sec

The transcription endpoint, dispatched in 2.1 seconds.

A drop-in for OpenAI's /v1/audio/transcriptions endpoint. Same model weights. Same response shape. A fifth of the price, ten times the concurrency.

Start with $25 free → Read the docs · OpenAI-SDK compatible
Live · /v1/audio/transcriptions · global fleet streaming
23:47:18 · job_9f2a81 · whisper-large-v3 · 42m audio · en · 2.1s
23:47:17 · job_9f2a7c · whisper-turbo · 8.2s clip · auto · 0.4s
23:47:15 · job_9f2a6e · whisper-large-v3 · 1h12m · en · diarize · 3.8s
23:47:14 · job_9f2a5d · whisper-medium · 14m · es · 1.6s
23:47:12 · job_9f2a4c · whisper-large-v3 · 6m podcast · en · 1.9s
23:47:11 · job_9f2a3a · whisper-turbo · 23s clip · auto · 0.5s
23:47:09 · job_9f2a2f · whisper-large-v3 · 9m meeting · en · diarize · 2.4s
23:47:08 · job_9f2a1d · whisper-medium · 38m · fr · 2.1s
23:47:06 · job_9f2a0c · whisper-large-v3 · 2h interview · en · 4.8s
23:47:04 · job_9f29fe · whisper-turbo · 12s · en · 0.4s
The numbers

Latency that holds up. Accuracy that matches.

Measured against OpenAI's published spec. We publish raw numbers because they're competitive on every dimension that matters.

Word error rate · LibriSpeech test-clean
Acorn 5.4% · OpenAI 5.6%
whisper-large-v3 · same weights OpenAI ships
p50 latency · 30s audio file
Acorn 2.1s · OpenAI 3.4s
measured over 7d of production traffic
Price per minute of audio
Acorn $0.0012 · OpenAI $0.0060 · 80% saved
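To see what the per-minute prices mean for a real bill, here is a minimal sketch of the arithmetic. The two prices come from this page; the monthly volume of 10,000 audio hours is a made-up figure for illustration, and the function names are hypothetical.

```python
# Per-minute prices quoted on this page (USD).
ACORN_PER_MIN = 0.0012
OPENAI_PER_MIN = 0.0060


def audio_cost(minutes: float, price_per_min: float) -> float:
    """Cost in USD of transcribing `minutes` of audio at a flat per-minute price."""
    return minutes * price_per_min


def savings_fraction(ours: float, theirs: float) -> float:
    """Fraction of the bill saved when paying `ours` instead of `theirs`."""
    return 1 - ours / theirs


# Hypothetical workload: 10,000 hours of audio per month.
monthly_minutes = 10_000 * 60

acorn_bill = audio_cost(monthly_minutes, ACORN_PER_MIN)
openai_bill = audio_cost(monthly_minutes, OPENAI_PER_MIN)

print(f"Acorn:  ${acorn_bill:,.2f}")
print(f"OpenAI: ${openai_bill:,.2f}")
print(f"Saved:  {savings_fraction(ACORN_PER_MIN, OPENAI_PER_MIN):.0%}")
```

Because both prices are flat per-minute rates, the saved fraction is the same at any volume.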
Models

Three weights. One endpoint.

Pass the model name in your request — the rest is the same JSON. No version pinning headaches; old aliases keep working.

Recommended
whisper-large-v3

The accurate one.

Same weights OpenAI ships. Best for interviews, podcasts, anything with real noise or multiple speakers. Use this unless you have a reason not to.

$0.0012 / min audio · 5.4% WER · 2.1s p50
Try it →
5× throughput
whisper-medium

The cheap one.

Half the latency, half the price, slightly higher WER. The right choice for clean recordings — meeting bots, voicemail, lecture captures with a good mic.

$0.0006 / min audio · 7.8% WER · 0.9s p50
Try it →
Lowest latency
whisper-turbo

The fast one.

Acorn-tuned MLX build for clip-length audio. Tuned for ≤30s segments where you need a transcript back immediately — voice memo apps, live chunked streams.

$0.0009 / min audio · 6.9% WER · 0.4s p50
Try it →
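Since only the model name changes between requests, model selection can live in one small function. The sketch below encodes the guidance from the three cards above as a rule of thumb: clips of 30 seconds or less go to whisper-turbo, clean recordings go to whisper-medium, and everything else goes to whisper-large-v3. The function names and the `clean_audio` flag are hypothetical and not part of any Acorn API; the thresholds are one reading of the card text, not an official routing policy.

```python
def pick_model(duration_s: float, clean_audio: bool = False) -> str:
    """Choose a model name from clip length and recording quality.

    Assumption: these rules paraphrase the model cards on this page.
    """
    if duration_s <= 30:
        # Clip-length audio where the transcript is needed immediately.
        return "whisper-turbo"
    if clean_audio:
        # Clean recordings: cheaper and faster, slightly higher WER.
        return "whisper-medium"
    # Default: noisy audio, multiple speakers, or anything uncertain.
    return "whisper-large-v3"


def build_request(duration_s: float, clean_audio: bool = False,
                  response_format: str = "verbose_json") -> dict:
    """Build the keyword arguments that vary per request; the file is added by the caller."""
    return {
        "model": pick_model(duration_s, clean_audio),
        "response_format": response_format,
    }


print(build_request(12))                      # short voice memo
print(build_request(840, clean_audio=True))   # 14-minute lecture with a good mic
print(build_request(7200))                    # 2-hour interview
```

The resulting dict can be unpacked into `client.audio.transcriptions.create(file=..., **kwargs)` alongside the open audio file.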
Migration

Two lines of diff, then you're routing to Acorn.

If your code uses the OpenAI SDK today, change the base URL and the key, and set the model to one of the names listed above, as the example does. Everything else (payload shape, error codes, retries) is identical.

Before · OpenAI
import os

from openai import OpenAI

client = OpenAI(
  api_key=os.environ["OPENAI_API_KEY"],
)

resp = client.audio.transcriptions.create(
  model="whisper-1",
  file=open("interview.mp3", "rb"),
  response_format="verbose_json",
)
After · Acorn · same SDK
import os

from openai import OpenAI

client = OpenAI(
  api_key=os.environ["ACORN_API_KEY"],
  base_url="https://api.acorncompute.com/v1",
)

resp = client.audio.transcriptions.create(
  model="whisper-large-v3",
  file=open("interview.mp3", "rb"),
  response_format="verbose_json",
)
SDKs tested on every release: openai-python ≥ 1.30 · openai-node ≥ 4.50 · openai-go ≥ 0.8 · openai-ruby ≥ 0.4
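During a migration it can help to switch providers from the environment rather than by editing code. The sketch below is one way to do that, assuming the two environment variable names and the base URL shown in the snippets above. The helper name `transcription_target` is hypothetical, and the rule "use Acorn whenever ACORN_API_KEY is set" is an illustrative choice, not documented behaviour.

```python
import os
from typing import Mapping, Tuple

# Base URL taken from the "After" snippet on this page.
ACORN_BASE_URL = "https://api.acorncompute.com/v1"


def transcription_target(env: Mapping[str, str] = os.environ) -> Tuple[dict, str]:
    """Return (OpenAI client kwargs, model name) for the active provider.

    Assumption: Acorn is preferred whenever ACORN_API_KEY is present;
    otherwise the code falls back to OpenAI with OPENAI_API_KEY.
    """
    acorn_key = env.get("ACORN_API_KEY")
    if acorn_key:
        return {"api_key": acorn_key, "base_url": ACORN_BASE_URL}, "whisper-large-v3"
    return {"api_key": env["OPENAI_API_KEY"]}, "whisper-1"


# Example with explicit dicts so nothing depends on the real environment.
print(transcription_target({"ACORN_API_KEY": "acorn-key", "OPENAI_API_KEY": "openai-key"}))
print(transcription_target({"OPENAI_API_KEY": "openai-key"}))
```

In application code the result would be used as `kwargs, model = transcription_target()` followed by `client = OpenAI(**kwargs)`, so unsetting ACORN_API_KEY rolls the change back without a deploy.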

An audio file in. A transcript back in two seconds.

$25 of credit lands in your account at sign-up — about 350 hours of whisper-large transcription. No card, no sales call.
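The "about 350 hours" figure follows directly from the per-minute price, and the same arithmetic shows how far the credit stretches on the other two models. This is a minimal sketch using only the prices listed on this page; the helper name is hypothetical.

```python
# Per-minute prices from the model cards on this page (USD).
PRICE_PER_MIN = {
    "whisper-large-v3": 0.0012,
    "whisper-medium": 0.0006,
    "whisper-turbo": 0.0009,
}

SIGNUP_CREDIT_USD = 25.0


def credit_hours(credit_usd: float, price_per_min: float) -> float:
    """Hours of audio a credit buys at a flat per-minute price."""
    return credit_usd / price_per_min / 60


for model, price in PRICE_PER_MIN.items():
    hours = credit_hours(SIGNUP_CREDIT_USD, price)
    print(f"{model}: {hours:.1f} hours")
```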