Why voice typing pairs well with AI agents
Talking to an AI agent rewards long, specific prompts — and long, specific prompts are exactly what nobody types. Voice dictation removes the keystroke tax on prompt length, which is why developers using Cursor and Claude Code with voice typing routinely write 200-word feature specs where they used to write 30-word fragments and then iterate. The longer prompt one-shots the change. The shorter prompt creates three rounds of clarification.

There's a secondary effect that takes a week to feel and is hard to explain to anyone who hasn't tried it: voice keeps you in flow. Switching from "thinking about code" to "typing about code" engages a different motor loop. Speaking out loud doesn't. Senior devs who pair-program describe the same effect — talking through a problem with another person clarifies it. Talking to an AI agent does the same thing, just with a faster typist on the other end.
This post walks through setup for the four AI agents that matter most in 2026: Cursor, Claude Code, ChatGPT desktop, and GitHub Copilot Chat. It assumes you have a Mac dictation app installed (we use JustVoice, but the patterns transfer to any system-wide voice typing tool).
Setup with Cursor
Cursor is the easiest AI agent to dictate into because it treats the chat panel like any other text field — your dictation app's hotkey just works. The only Cursor-specific tuning is the hotkey collision check and turning on code mode for the cases where you're dictating identifiers and not prose.

Step by step:
- Open Cursor → Settings → Keyboard Shortcuts. Search for any binding on Right Option, F5, or whatever hotkey you've assigned to your dictation app. Cursor doesn't claim Right Option by default but its Vim mode and some extensions do.
- In your dictation app, set a hotkey that's comfortable to hold. Right Option is the JustVoice default and works well one-handed.
- Open the Cursor chat panel (Cmd-L on Mac), put your cursor in the prompt field, hold the dictation hotkey, and start talking.
- Enable code mode for technical dictation. In JustVoice this is a Pro feature that detects when your foreground app is an IDE and switches the cleanup pass to preserve identifiers, fix snake_case vs camelCase, and avoid auto-capitalizing function names. You'll notice it the first time you dictate "use effect hook" and get `useEffect` instead of "Use Effect" (see the sketch after this list).
- Add a custom vocabulary entry for your project's domain terms — see the table below.
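For concreteness, here's roughly what that `useEffect` dictation should land as once code mode is on. The component name and log message are ours, made up for illustration:

```ts
import { useEffect } from "react";

// Spoken: "use effect hook that logs mounted on mount, empty dependency array".
// Code mode should preserve the identifier instead of producing "Use Effect".
export function MountLogger() {
  useEffect(() => {
    console.log("mounted");
  }, []); // empty dependency array: run once on mount

  return null;
}
```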
Setup with Claude Code (CLI)
Claude Code runs in your terminal, which means voice dictation lands in iTerm2, Ghostty, Terminal.app, or whichever terminal emulator you use — there's nothing Claude Code-specific to configure. The wrinkle is that terminals strip a lot of formatting, and your dictation app's auto-capitalization can fight with your shell.

Concrete setup:
- Make sure your terminal emulator has accessibility permissions granted to your dictation app. (System Settings → Privacy & Security → Accessibility.)
- Install Claude Code (`npm install -g @anthropic-ai/claude-code` or via Homebrew) and run `claude` in your project directory.
- At the prompt, hold your dictation hotkey and speak. The text lands at your cursor.
- Turn on IDE detection if your dictation app supports it. JustVoice's Pro tier detects iTerm2, Ghostty, Terminal, and Warp by default and applies code mode automatically.
- For long prompts, compose in a scratch file (`vim /tmp/spec.md` or your editor of choice) and paste into the terminal rather than dictating straight into the prompt. You'll catch transcription errors before Claude Code commits to them.
- For bigger specs, dictate into a PLAN.md file and have Claude Code read it.
The official Claude Code documentation has more on the CLI itself. Voice typing is just a normal text input on top.
Setup with ChatGPT desktop
The ChatGPT Mac app is the easiest case in this list — it's a normal Mac app with a normal text field, so any system-wide dictation hotkey works without configuration. No setup tax.

The one thing to know: ChatGPT's own built-in voice mode (the floating mic) is voice-to-voice, optimized for spoken conversation. That's a different workflow than voice-to-text. If you want long, dictated prompts that you can review and edit before sending, use a system-wide dictation app and ignore ChatGPT's built-in mic. If you want a free-form spoken conversation with audio output, use the built-in mic.
Quick setup:
- Install ChatGPT for Mac (`chatgpt.com/download` if you don't have it).
- Cmd-Space ChatGPT, click in the prompt field.
- Hold your dictation hotkey, speak, release. Edit if needed, hit Return to send.
- Add the GPT-specific vocabulary from the table below — "system prompt," "tool use," and so on.
Setup with GitHub Copilot Chat
GitHub Copilot Chat lives inside VS Code (and Visual Studio, JetBrains, etc.) — voice typing into Copilot Chat is identical to voice typing into VS Code's editor pane, with one caveat about focus. The Copilot Chat panel sometimes loses focus when the language server reindexes; if your dictation lands in the wrong field, click back into the chat input and try again.

Setup:
- In VS Code, install the GitHub Copilot Chat extension if it isn't already.
- Cmd-I opens inline chat; Ctrl-Cmd-I opens the chat panel. Either accepts dictated input.
- With your cursor in the chat input, hold the dictation hotkey and speak.
- Code mode matters more here than in Cursor or Claude Code because Copilot Chat is more sensitive to identifier formatting in its prompts. "Refactor the user auth module" vs "Refactor the userAuth module" can change which file Copilot opens.
Custom vocabulary for AI prompts
Whisper is good but not magic — it doesn't know your tech stack's proper nouns until you teach it. A 30-entry custom vocabulary list will eliminate roughly 80% of the corrections you make in a typical AI-coding session.

The terms below are the ones we add by default for any developer using JustVoice with AI agents:
| Term as spoken | Target output | Why it matters |
|---|---|---|
| system prompt | system prompt | Whisper sometimes hears "systems prompt" |
| scratchpad | scratchpad | Often transcribed as "scratch pad" |
| rubric | rubric | Frequently misheard as "Reuben" |
| k-shot | k-shot | Hyphenation and the "k" letter often dropped |
| n-shot | n-shot | Same as above |
| few-shot | few-shot | Often becomes "fewshot" or "few shots" |
| tool use | tool use | "Tool" sometimes becomes "to" |
| tool calling | tool calling | Same |
| context window | context window | Reliable but worth pinning |
| token budget | token budget | "Tokens" vs "token" matters |
| prompt injection | prompt injection | Often becomes "prompt inject" |
| inference | inference | Sometimes misheard as "in France" |
| LLM | LLM | Pin so it doesn't become "L L M" with spaces |
| MCP | MCP | Tends to become "MCP server" if not pinned |
| Anthropic | Anthropic | Often becomes "anthropic" lowercase |
| Cursor | Cursor | Capitalize when it's the app name |
| pnpm | pnpm | Always lowercase, hard to dictate |
| TypeScript | TypeScript | Whisper splits as "Type Script" |
| useEffect | useEffect | Hard for any STT without code mode |
| useState | useState | Same |
| Tauri | Tauri | Niche; pin it |
| Supabase | Supabase | Splits as "supa base" without pinning |
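JustVoice manages this list through its settings UI, so the snippet below is not its real config format. It's a minimal TypeScript sketch of what a vocabulary pass does conceptually: scan the raw transcript for known misfires and pin them to the target spelling.

```ts
// Hypothetical spoken-form to pinned-output map (not JustVoice's real format).
const vocabulary: Record<string, string> = {
  "systems prompt": "system prompt",
  "scratch pad": "scratchpad",
  "type script": "TypeScript",
  "supa base": "Supabase",
  "L L M": "LLM",
};

// Replace each known misfire, longest entries first so "systems prompt"
// wins before any shorter overlapping entry could match.
function applyVocabulary(transcript: string): string {
  return Object.entries(vocabulary)
    .sort(([a], [b]) => b.length - a.length)
    .reduce(
      (text, [spoken, pinned]) => text.replace(new RegExp(spoken, "gi"), pinned),
      transcript,
    );
}

console.log(applyVocabulary("Add a systems prompt that mentions type script."));
// -> "Add a system prompt that mentions TypeScript."
```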
Code dictation mode: when to enable, when to disable
Code mode tells your dictation app to skip the "fix grammar and capitalization" pass and instead preserve identifiers, hyphens, underscores, and the absence of trailing periods. Turn it on when you're dictating code or identifiers; turn it off when you're dictating prose.

JustVoice's code mode auto-enables when your foreground app is detected as an IDE (Cursor, VS Code, Xcode, JetBrains, Terminal, iTerm2, Ghostty). For most users this Just Works. The two cases where you want manual control (a sketch of the mode-picking logic follows the list):
- You're dictating a prose comment inside a code file. Code mode will under-punctuate. Hold a modifier or use a different hotkey to force prose mode.
- You're dictating code into a chat app (Slack, Discord). The IDE detection won't fire, so explicitly enable code mode if you want backticks and identifier preservation.
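We don't know how JustVoice implements the detection internally; the sketch below is just the shape of the logic, with a hand-picked macOS bundle-ID list for illustration:

```ts
type CleanupMode = "prose" | "code";

// Illustrative bundle-ID allowlist; the real detection list in any
// dictation app will differ.
const IDE_BUNDLE_IDS = new Set([
  "com.microsoft.VSCode",
  "com.googlecode.iterm2",
  "com.mitchellh.ghostty",
  "com.apple.Terminal",
]);

function pickCleanupMode(
  foregroundBundleId: string,
  manualOverride?: CleanupMode,
): CleanupMode {
  // A manual override (e.g. a modifier held with the hotkey) always wins.
  // That covers both exception cases above: prose comments in a code file,
  // and code dictated into a chat app the allowlist doesn't know about.
  if (manualOverride) return manualOverride;
  return IDE_BUNDLE_IDS.has(foregroundBundleId) ? "code" : "prose";
}
```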
Real-world workflow examples
The flagship workflow is: dictate a 150–250 word feature spec into Cursor or Claude Code, edit it for ten seconds, hit submit, and read the resulting PR.

A real example, lightly edited from one of our own sessions:
> "Update the audio capture pipeline so that when the user holds the hotkey for less than 200 milliseconds we treat that as a no-op and don't fire a transcription request. Add a configurable threshold in settings, default 200, and surface it under Settings → Hotkeys → 'Minimum hold duration'. Add a Rust unit test that confirms the no-op path. Don't change the long-press path. Update the changelog."
Typing that takes about 90 seconds. Dictating it takes 25. The Cursor / Claude Code response is essentially identical either way — but the dictation version saves 65 seconds and tends to be more complete because the friction of typing each clause selects against detail.
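For illustration, the core of what that spec asks for is a few lines of logic. A minimal sketch in TypeScript (the real change would be in Rust, per the spec), with names we made up:

```ts
const DEFAULT_MIN_HOLD_MS = 200; // the spec's configurable threshold

// Holds shorter than the threshold are treated as accidental taps:
// a no-op, with no transcription request fired.
function shouldTranscribe(
  holdDurationMs: number,
  minHoldMs: number = DEFAULT_MIN_HOLD_MS,
): boolean {
  return holdDurationMs >= minHoldMs;
}

// The no-op path the spec's unit test would pin down.
console.assert(shouldTranscribe(120) === false);
console.assert(shouldTranscribe(450) === true);
```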
A second pattern that's grown on us: dictate the test cases first. "Write a test that confirms X. Write a test that confirms Y. Write a test that confirms Z. Now implement the function that makes those tests pass." Easier to dictate than to type, and AI agents love test-first prompts.
Limits and tips
A short list of things we wish someone had told us:
- Whisper hallucinates on silence. If you hold the hotkey but don't speak, you'll occasionally get a transcribed phrase from the model's training data. Most apps (JustVoice included) gate this with a voice-activity check, but it's worth knowing (see the sketch after this list).
- Background music breaks accuracy. Whisper was trained on a lot of music-overlaid speech and is decent, but lyrics will leak into your transcription. Pause Spotify before long dictation sessions.
- Don't dictate passwords or API keys. Obvious in hindsight. Worth saying.
- Re-record, don't edit. A 5-second redo is faster than a 20-second cursor-and-keyboard cleanup. Use it.
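On the first tip: a voice-activity gate is simple in principle. Measure the energy of the captured audio and skip transcription when it's effectively silence. A minimal sketch, with a threshold we picked arbitrarily:

```ts
// Root-mean-square energy of a mono PCM buffer (samples in [-1, 1]).
function rmsEnergy(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / samples.length);
}

// Hypothetical threshold: below this RMS we assume silence and never
// call Whisper, so it has nothing to hallucinate on. Tune per microphone.
const SILENCE_RMS_THRESHOLD = 0.01;

function shouldSendToWhisper(samples: Float32Array): boolean {
  return rmsEnergy(samples) > SILENCE_RMS_THRESHOLD;
}
```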