How to Use OpenAI Whisper on Mac (No Terminal Required)

OpenAI Whisper is the open-source speech recognition model that powers most modern Mac dictation apps. You don't need to touch a terminal to use it — but it helps to understand what's actually running on your Mac when you do.

What is Whisper?

Whisper is an automatic speech recognition (ASR) model that OpenAI released in late 2022 and updated through 2024 with large-v3 and turbo variants. It's open weights — meaning anyone can download the model and run it locally — and it supports 90+ languages with state-of-the-art accuracy.

Three things matter about Whisper:

It's open. Apps don't pay per-minute API fees the way they would for a hosted speech-recognition service. That's why local Whisper apps can charge $5/mo instead of $15.
It runs on your machine. With GPU acceleration on Apple Silicon, it's faster than real-time. No cloud round-trip.
It's accurate. Whisper is competitive with, or better than, commercial speech recognition (Dragon, Apple, Google) on most benchmarks, especially for technical and multilingual content.

Whisper model sizes

Whisper ships in several sizes, each with a different accuracy-vs-speed-vs-RAM tradeoff:

| Model | Size on disk | Approx. RAM | Speed (M2) | Best for |
|---|---|---|---|---|
| tiny | 75 MB | ~1 GB | Very fast | Quick voice memos, casual dictation |
| base | 142 MB | ~1.5 GB | Fast | Default for most users |
| small | 466 MB | ~2.5 GB | Moderate | Better accuracy, technical vocabulary |
| medium | 1.5 GB | ~5 GB | Slow on smaller Macs | Higher accuracy, longer phrases |
| large-v3 | 3.0 GB | ~10 GB | Slowest | Maximum accuracy, complex audio |

For most everyday use on Apple Silicon, small or medium is the sweet spot. If you're dictating short phrases for fast text input, base is plenty.
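
The table can be read as a simple rule of thumb. The sketch below is illustrative only: the function name `pick_model` and the `headroom_gb` reserve are assumptions, and the RAM figures are the approximate values from the table, not measured requirements.

```python
# Approximate RAM needs (GB) copied from the table above; treat them as rough guides.
MODEL_RAM_GB = [
    ("large-v3", 10.0),
    ("medium", 5.0),
    ("small", 2.5),
    ("base", 1.5),
    ("tiny", 1.0),
]


def pick_model(total_ram_gb: float, headroom_gb: float = 4.0) -> str:
    """Return the largest model whose approximate RAM need fits.

    headroom_gb is a hypothetical reserve for macOS and other apps.
    """
    budget = total_ram_gb - headroom_gb
    for name, needed in MODEL_RAM_GB:  # ordered largest to smallest
        if needed <= budget:
            return name
    return "tiny"  # fall back to the smallest model


if __name__ == "__main__":
    for ram in (8, 16, 24):
        print(f"{ram} GB Mac -> {pick_model(ram)}")
```

This only checks memory. A model that fits in RAM can still be too slow for live dictation, which is why small or medium is often the practical choice even on a Mac that could load large-v3.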

How to run Whisper on Mac

Option 1: Use a Mac app that wraps Whisper (recommended)

The path of least resistance. These apps download the model for you, handle GPU acceleration, and give you a hotkey to dictate into any app:

JustVoice — local Whisper with code mode, custom vocabulary, and snippets. Free tier available. Download here.
Superwhisper — local Whisper with AI modes. Multi-platform.
MacWhisper — file transcription specialist with diarisation.
VoiceInk — open-source Whisper dictation.

For the JustVoice setup specifically:

Download JustVoice.
Install it (drag to Applications).
Grant microphone and accessibility permissions when prompted.
JustVoice ships with the base model, which is what the free tier uses. Plus and Pro unlock all model sizes, including large-v3.
Hold your hotkey (default: Right Option), speak, release. The audio is transcribed locally on your Mac's GPU and the text appears at your cursor.

That's it. No Terminal, no Python, no Homebrew.

Option 2: Run Whisper from the command line (for tinkerers)

If you'd rather use the official Python implementation:

```bash
pip install -U openai-whisper
brew install ffmpeg
whisper your-audio.mp3 --model small
```

This works but isn't a great daily-driver — there's no hotkey integration, no system-wide dictation, and the Python implementation isn't as fast as the optimised Metal-accelerated builds in apps like JustVoice.
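
The same package can also be called from Python, which is handy for batch jobs. This is a minimal sketch, assuming `openai-whisper` and `ffmpeg` are installed; `format_segments` and `transcribe_file` are illustrative helper names, not part of the library.

```python
import sys


def format_segments(segments):
    """Render Whisper segments as '[start -> end] text' lines."""
    lines = []
    for seg in segments:
        lines.append(f"[{seg['start']:.1f}s -> {seg['end']:.1f}s] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path, model_name="small"):
    """Transcribe one audio file and return timestamped text."""
    import whisper  # imported lazily so format_segments works without it

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(path)         # dict with "text" and "segments"
    return format_segments(result["segments"])


if __name__ == "__main__" and len(sys.argv) > 1:
    print(transcribe_file(sys.argv[1]))
```

Saved as, say, `transcribe.py` (a hypothetical filename), it could be run with `python transcribe.py your-audio.mp3`.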

Option 3: whisper.cpp for the truly minimalist

whisper.cpp is a C++ port that runs on Apple Silicon with Metal acceleration. It's what most Mac apps actually wrap under the hood.

Same caveat: powerful, but you're still building the rest of the app — hotkey, app-targeting, text insertion, custom vocabulary — yourself.
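
To give a sense of what an app does when it wraps whisper.cpp, here is a rough sketch that converts audio to 16 kHz mono 16-bit WAV (the format whisper.cpp has traditionally required) and then calls its command-line tool. The binary path, model path, and helper names are assumptions; they depend on where you cloned the project, how you built it, and which model you downloaded.

```python
import subprocess

# Hypothetical paths: adjust to wherever you cloned and built whisper.cpp.
WHISPER_CLI = "./whisper.cpp/build/bin/whisper-cli"
MODEL_PATH = "./whisper.cpp/models/ggml-base.en.bin"


def ffmpeg_command(src, wav):
    """Build the ffmpeg call that resamples audio to 16 kHz mono 16-bit WAV."""
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1",
            "-c:a", "pcm_s16le", wav]


def whisper_cpp_command(wav, model=MODEL_PATH, cli=WHISPER_CLI):
    """Build the whisper.cpp call: -m selects the model, -f the input file."""
    return [cli, "-m", model, "-f", wav]


def transcribe(src, wav="input-16k.wav"):
    """Convert, then transcribe; returns whatever the CLI prints to stdout."""
    subprocess.run(ffmpeg_command(src, wav), check=True)
    done = subprocess.run(whisper_cpp_command(wav), check=True,
                          capture_output=True, text=True)
    return done.stdout
```

Everything else a dictation app provides (hotkey capture, text insertion, vocabulary handling) would sit on top of a call like `transcribe("your-audio.mp3")`.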

GPU acceleration on Apple Silicon

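The reason Whisper feels fast on a Mac is Metal, Apple's GPU framework. whisper.cpp-based apps use Metal directly, and PyTorch exposes it as the Metal Performance Shaders (MPS) backend. Either way, the model runs on the GPU instead of the CPU, and the speedup over CPU-only is roughly 5–10x on Apple Silicon.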

If you're using a Mac app that wraps Whisper, GPU acceleration is on by default. If you're running the Python version, it's CPU-only unless you use the MLX or PyTorch-MPS variant.

On Intel Macs, Whisper runs CPU-only and is meaningfully slower. It still works for dictation, but don't expect real-time speed.
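
"Real-time" has a concrete meaning here: the real-time factor (RTF) is processing time divided by audio length, and anything below 1.0 is faster than real time. The sketch below shows the arithmetic; the helper names and the numbers in the example are made up for illustration.

```python
import time


def real_time_factor(audio_seconds, processing_seconds):
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Made-up numbers: a 60 s clip that takes 6 s to transcribe.
    print(f"RTF = {real_time_factor(60.0, 6.0):.2f}")
```

Wrapping any transcription call in `timed(...)` and dividing by the clip length gives you the RTF for your own Mac and model size.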

Whisper accuracy: what to expect

With clean audio, a native English speaker, and a modern microphone, expect roughly:

tiny: ~10–15% word error rate (WER) on technical content
base: ~5–8% WER
small: ~3–5% WER
medium: ~2–4% WER
large-v3: ~1–3% WER

For comparison, professional human transcribers operate around 2–4% WER. The largest Whisper models are competitive with humans on clean audio.
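
Word error rate is the number of substituted, deleted, and inserted words divided by the number of words in the reference transcript. The sketch below computes it with a standard word-level edit distance; it is an illustration, and it only lowercases the text, whereas published benchmarks usually apply fuller normalization (punctuation, numbers, spelling variants).

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

In practical terms, a 3% WER means about six wrong words in a 200-word dictation.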

Whisper degrades gracefully with noise, accents, and overlapping speech — but it does degrade. A noisy coffee shop with a bad microphone will hurt accuracy regardless of model size.

Custom vocabulary: the unsung accuracy hack

Whisper has no concept of "your vocabulary" out of the box. But the apps that wrap it do — JustVoice, Superwhisper, and similar tools let you add custom terms (drug names, library names, project nouns, character names) that the app injects as prompt context. Once added, those terms transcribe correctly.

This is the single biggest accuracy improvement most users can make. Add 50 of your most-used technical or domain-specific terms and Whisper stops mangling them.
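
In the open-source Python package, prompt context is passed through the `initial_prompt` argument of `transcribe`. The sketch below shows that general idea; how JustVoice or Superwhisper inject terms internally is not described here, so treat this as an assumption about the approach, with hypothetical helper names and an arbitrary "Glossary:" prefix.

```python
def build_vocab_prompt(terms, max_terms=50):
    """Join custom terms into one comma-separated prompt string, dropping duplicates."""
    seen = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.append(term)
    return "Glossary: " + ", ".join(seen[:max_terms])


def transcribe_with_vocab(path, terms, model_name="small"):
    """Transcribe with the glossary passed as Whisper's initial_prompt."""
    import whisper  # lazy import: only needed when actually transcribing

    model = whisper.load_model(model_name)
    result = model.transcribe(path, initial_prompt=build_vocab_prompt(terms))
    return result["text"]
```

The prompt nudges the model toward those spellings rather than guaranteeing them, and Whisper only uses a limited amount of prompt text, so a short list of the terms you actually say tends to work better than a very long one.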

Should you run Whisper locally or use a cloud service?

Local Whisper:

Privacy: audio never leaves your Mac
Cost: no per-minute fees
Latency: no network round-trip
Reliability: works offline

Cloud Whisper / hosted ASR:

No model download or local compute requirements
Sometimes faster on very old hardware
Cross-platform / cross-device sync (depending on service)

For most Mac users in 2026, local wins on every axis except cross-platform sync.

Get started

The fastest path: Download JustVoice for Mac. It's free, ships with a Whisper model, and works five minutes after install.

If you'd rather research more options first, see our comparison of every major Mac dictation app.
