JustVoice

Voice typing for AI agents: Claude Code, Cursor, ChatGPT, and Copilot Chat

How to set up voice dictation for AI coding agents — Cursor, Claude Code, ChatGPT desktop, GitHub Copilot Chat — with hotkeys, custom vocabulary, and code-mode tips that actually work.

Why voice typing pairs well with AI agents

Talking to an AI agent rewards long, specific prompts — and long, specific prompts are exactly what nobody types. Voice dictation removes the keystroke tax on prompt length, which is why developers using Cursor and Claude Code with voice typing routinely write 200-word feature specs where they used to write 30-word fragments and then iterate. The longer prompt one-shots the change. The shorter prompt creates three rounds of clarification.

There's a secondary effect that takes a week to feel and is hard to explain to anyone who hasn't tried it: voice keeps you in flow. Switching from "thinking about code" to "typing about code" engages a different motor loop. Speaking out loud doesn't. Senior devs who pair-program describe the same effect — talking through a problem with another person clarifies it. Talking to an AI agent does the same thing, just with a faster typist on the other end.

This post walks through setup for the four AI agents that matter most in 2026: Cursor, Claude Code, ChatGPT desktop, and GitHub Copilot Chat. It assumes you have a Mac dictation app installed (we use JustVoice, but the patterns transfer to any system-wide voice typing tool).

Setup with Cursor

Cursor is the easiest AI agent to dictate into because it treats the chat panel like any other text field — your dictation app's hotkey just works. The only Cursor-specific tuning is the hotkey collision check and turning on code mode for the cases where you're dictating identifiers and not prose.

Step by step:

  1. Open Cursor → Settings → Keyboard Shortcuts. Search for any binding on Right Option, F5, or whatever hotkey you've assigned to your dictation app. Cursor doesn't claim Right Option by default but its Vim mode and some extensions do.
  2. In your dictation app, set a hotkey that's comfortable to hold. Right Option is the JustVoice default and works well one-handed.
  3. Open the Cursor chat panel (Cmd-L on Mac), put your cursor in the prompt field, hold the dictation hotkey, and start talking.
  4. Enable code mode for technical dictation. In JustVoice this is a Pro feature that detects when your foreground app is an IDE and switches the cleanup pass to preserve identifiers, fix snake_case vs camelCase, and avoid auto-capitalizing function names. You'll notice it the first time you dictate "use effect hook" and get useEffect instead of "Use Effect."
  5. Add a custom vocabulary entry for your project's domain terms — see the table below.

A non-obvious workflow tip: dictate the spec, then read it before submitting. Cursor (and Claude Code) will commit to whatever you wrote. A 200-word voice prompt with three contradictory sentences produces a 200-line PR with three contradictory sections. Voice doesn't lower the bar for prompt clarity; it raises the throughput.

Setup with Claude Code (CLI)

Claude Code runs in your terminal, which means voice dictation lands in iTerm2, Ghostty, Terminal.app, or whichever terminal emulator you use — there's nothing Claude Code-specific to configure. The wrinkle is that terminals strip a lot of formatting, and your dictation app's auto-capitalization can fight with your shell.

Concrete setup:

  1. Make sure your terminal emulator has accessibility permissions granted to your dictation app. (System Settings → Privacy & Security → Accessibility.)
  2. Install Claude Code (npm install -g @anthropic-ai/claude-code or via Homebrew) and run claude in your project directory.
  3. At the prompt, hold your dictation hotkey and speak. The text lands at your cursor.
  4. Turn on IDE detection if your dictation app supports it. JustVoice's Pro tier detects iTerm2, Ghostty, Terminal, and Warp by default and applies code mode automatically.
  5. For long prompts, dictate into your terminal but compose in a scratch file (vim /tmp/spec.md or your editor of choice) and then paste. You'll catch transcription errors before Claude Code commits to them.

Claude Code is unusually good at "implement this spec" prompts that are 150–300 words long. That's the sweet spot for voice. Anything shorter and you may as well type; anything longer and you should probably write a PLAN.md file and have Claude Code read it.

The official Claude Code documentation has more on the CLI itself. Voice typing is just a normal text input on top.

Setup with ChatGPT desktop

The ChatGPT Mac app is another low-friction case — it's a normal Mac app with a normal text field, so any system-wide dictation hotkey works without configuration. No setup tax.

The one thing to know: ChatGPT's own built-in voice mode (the floating mic) is voice-to-voice, optimized for spoken conversation. That's a different workflow than voice-to-text. If you want long, dictated prompts that you can review and edit before sending, use a system-wide dictation app and ignore ChatGPT's built-in mic. If you want a free-form spoken conversation with audio output, use the built-in mic.

Quick setup:

  1. Install ChatGPT for Mac (chatgpt.com/download if you don't have it).
  2. Open ChatGPT (Cmd-Space, type "ChatGPT", hit Return) and click in the prompt field.
  3. Hold your dictation hotkey, speak, release. Edit if needed, hit Return to send.
  4. Add the GPT-specific vocabulary from the table below — "system prompt," "tool use," and so on.

ChatGPT's web app works the same way, except dictation is more reliable in the desktop app because it doesn't fight with browser extensions or web-based voice input.

Setup with GitHub Copilot Chat

GitHub Copilot Chat lives inside VS Code (and Visual Studio, JetBrains, etc.) — voice typing into Copilot Chat is identical to voice typing into VS Code's editor pane, with one caveat about focus. The Copilot Chat panel sometimes loses focus when the language server reindexes; if your dictation lands in the wrong field, click back into the chat input and try again.

Setup:

  1. In VS Code, install the GitHub Copilot Chat extension if it isn't already.
  2. Cmd-I opens inline chat; Ctrl-Cmd-I opens the chat panel. Either accepts dictated input.
  3. With your cursor in the chat input, hold the dictation hotkey and speak.
  4. Code mode matters more here than in Cursor or Claude Code because Copilot Chat is more sensitive to identifier formatting in its prompts. "Refactor the user auth module" vs "Refactor the userAuth module" can change which file Copilot opens.

Copilot Chat is the AI agent where voice typing pays off least dramatically — Copilot is tuned for shorter inline edits, where typing speed isn't really the bottleneck. The big wins are still in Cursor and Claude Code, where prompt length matters most.
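
The capitalization difference code mode protects you from can be sketched with a pair of hypothetical helpers (these are illustrative, not JustVoice's actual implementation): a code-mode pass collapses a spoken phrase into a camelCase identifier, while a prose-mode pass title-cases it as ordinary text.

```python
def spoken_to_camel(phrase: str) -> str:
    """Code-mode behavior: collapse a spoken phrase into a camelCase identifier."""
    words = phrase.lower().split()
    return words[0] + "".join(w.capitalize() for w in words[1:])

def spoken_to_prose(phrase: str) -> str:
    """Prose-mode behavior: capitalize each word as ordinary text."""
    return " ".join(w.capitalize() for w in phrase.split())

# The same spoken phrase lands very differently in each mode:
print(spoken_to_camel("user auth"))  # userAuth
print(spoken_to_prose("user auth"))  # User Auth
```

The first output is what you want in a Copilot prompt that names the userAuth module; the second is what a prose cleanup pass tends to produce.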

Custom vocabulary for AI prompts

Whisper is good but not magic — it doesn't know your tech stack's proper nouns until you teach it. A 30-entry custom vocabulary list will eliminate roughly 80% of the corrections you make in a typical AI-coding session.

The terms below are the ones we add by default for any developer using JustVoice with AI agents:

| Term as spoken | Target output | Why it matters |
| --- | --- | --- |
| system prompt | system prompt | Whisper sometimes hears "systems prompt" |
| scratchpad | scratchpad | Often transcribed as "scratch pad" |
| rubric | rubric | Frequently misheard as "Reuben" |
| k-shot | k-shot | Hyphenation and the "k" letter often dropped |
| n-shot | n-shot | Same as above |
| few-shot | few-shot | Often becomes "fewshot" or "few shots" |
| tool use | tool use | "Tool" sometimes becomes "to" |
| tool calling | tool calling | Same |
| context window | context window | Reliable but worth pinning |
| token budget | token budget | "Tokens" vs "token" matters |
| prompt injection | prompt injection | Often becomes "prompt inject" |
| inference | inference | Sometimes confused with "in France" |
| LLM | LLM | Pin so it doesn't become "L L M" with spaces |
| MCP | MCP | Tends to become "MCP server" if not pinned |
| Anthropic | Anthropic | Often becomes "anthropic" lowercase |
| Cursor | Cursor | Capitalize when it's the app name |
| pnpm | pnpm | Always lowercase, hard to dictate |
| TypeScript | TypeScript | Whisper splits as "Type Script" |
| useEffect | useEffect | Hard for any STT without code mode |
| useState | useState | Same |
| Tauri | Tauri | Niche; pin it |
| Supabase | Supabase | Splits as "supa base" without pinning |

For each app you use frequently — your IDE, your shell, your auth provider — add a custom vocabulary entry. The five-minute investment compounds.
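
JustVoice's internals aren't shown here, but a custom vocabulary pass is conceptually simple: a longest-match-first, case-insensitive substitution over the raw transcript. A minimal sketch, using a few entries from the table above:

```python
import re

# Common mis-transcriptions → pinned target output (entries from the table above).
VOCAB = {
    "systems prompt": "system prompt",
    "scratch pad": "scratchpad",
    "type script": "TypeScript",
    "supa base": "Supabase",
}

def apply_vocab(transcript: str, vocab: dict) -> str:
    # Longest keys first so multi-word entries win over shorter overlapping ones;
    # whole-word, case-insensitive matching avoids mangling substrings.
    for spoken in sorted(vocab, key=len, reverse=True):
        pattern = r"\b" + re.escape(spoken) + r"\b"
        transcript = re.sub(pattern, vocab[spoken], transcript, flags=re.IGNORECASE)
    return transcript

print(apply_vocab("Use the systems prompt and a scratch pad in type script", VOCAB))
# Use the system prompt and a scratchpad in TypeScript
```

A real implementation also has to decide what to do about capitalization at sentence boundaries, but the substitution pass is the part that eliminates most corrections.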

Code dictation mode: when to enable, when to disable

Code mode tells your dictation app to skip the "fix grammar and capitalization" pass and instead preserve identifiers, hyphens, underscores, and the absence of trailing periods. Turn it on when you're dictating code or identifiers; turn it off when you're dictating prose.

JustVoice's code mode auto-enables when your foreground app is detected as an IDE (Cursor, VS Code, Xcode, JetBrains, Terminal, iTerm2, Ghostty). For most users this Just Works. The two cases where you want manual control:

  • You're dictating a prose comment inside a code file. Code mode will under-punctuate. Hold a modifier or use a different hotkey to force prose mode.
  • You're dictating code into a chat app (Slack, Discord). The IDE detection won't fire, so explicitly enable code mode if you want backticks and identifier preservation.
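
The detection-plus-override behavior described above can be sketched as a lookup on the foreground app's bundle identifier, with a manual override that always wins. The bundle IDs and function names here are illustrative — JustVoice's detection list isn't public:

```python
# Illustrative macOS bundle IDs for apps that should trigger code mode.
IDE_BUNDLE_IDS = {
    "com.todesktop.230313mzl4w4u92",  # Cursor (illustrative)
    "com.microsoft.VSCode",
    "com.apple.dt.Xcode",
    "com.apple.Terminal",
    "com.googlecode.iterm2",
    "com.mitchellh.ghostty",
}

def pick_mode(foreground_bundle_id, override=None):
    """Choose the cleanup mode: an explicit override wins, then IDE detection."""
    if override in ("code", "prose"):
        return override
    return "code" if foreground_bundle_id in IDE_BUNDLE_IDS else "prose"

# Dictating into iTerm2 → code mode; into Slack → prose unless overridden.
print(pick_mode("com.googlecode.iterm2"))                       # code
print(pick_mode("com.tinyspeck.slackmacgap"))                   # prose
print(pick_mode("com.tinyspeck.slackmacgap", override="code"))  # code
```

The override parameter is what the "hold a modifier to force prose mode" and "explicitly enable code mode in Slack" cases map onto.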

Real-world workflow examples

The flagship workflow is: dictate a 150–250 word feature spec into Cursor or Claude Code, edit it for ten seconds, hit submit, and read the resulting PR.

A real example, lightly edited from one of our own sessions:

> "Update the audio capture pipeline so that when the user holds the hotkey for less than 200 milliseconds we treat that as a no-op and don't fire a transcription request. Add a configurable threshold in settings, default 200, and surface it under Settings → Hotkeys → 'Minimum hold duration'. Add a Rust unit test that confirms the no-op path. Don't change the long-press path. Update the changelog."

Typing that takes about 90 seconds. Dictating it takes 25. The Cursor / Claude Code response is essentially identical either way — but the dictation version saves 65 seconds and tends to be more complete because the friction of typing each clause selects against detail.

A second pattern that's grown on us: dictate the test cases first. "Write a test that confirms X. Write a test that confirms Y. Write a test that confirms Z. Now implement the function that makes those tests pass." Easier to dictate than to type, and AI agents love test-first prompts.

Limits and tips

A short list of things we wish someone had told us:

  • Whisper hallucinates on silence. If you hold the hotkey but don't speak, you'll occasionally get a transcribed phrase from the model's training data. Most apps (JustVoice included) gate this with a voice-activity check, but it's worth knowing.
  • Background music breaks accuracy. Whisper was trained on a lot of music-overlaid speech and is decent, but lyrics will leak into your transcription. Pause Spotify before long dictation sessions.
  • Don't dictate passwords or API keys. Obvious in hindsight. Worth saying.
  • Re-record, don't edit. A 5-second redo is faster than a 20-second cursor-and-keyboard cleanup. Use it.
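
The voice-activity gate mentioned in the first bullet can be approximated with a simple RMS-energy check — real VADs are considerably more sophisticated, so treat this as a sketch of the idea, not JustVoice's implementation:

```python
import math

def rms(samples):
    """Root-mean-square energy of a chunk of audio samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def should_transcribe(samples, threshold=0.01):
    """Skip the Whisper call entirely when the recording is effectively silent,
    so the model never gets the chance to hallucinate on empty audio."""
    return rms(samples) >= threshold

near_silence = [0.0005] * 16000                       # ~1 s of near-silence at 16 kHz
speech = [0.1 * math.sin(i / 10) for i in range(16000)]  # a quiet sustained tone

print(should_transcribe(near_silence))  # False
print(should_transcribe(speech))        # True
```

The threshold is a tuning knob: too low and silence slips through, too high and quiet speakers get dropped.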

JustVoice was built by developers who use it daily in Cursor, Claude Code, and the terminal. It runs Whisper locally on your Mac's GPU, supports BYOK AI cleanup with Anthropic and OpenAI, and includes the code mode and custom vocabulary features described above. Read the vibe-coders setup guide for a full workflow walkthrough, browse the full feature list, or see the developer use case for more on the IDE integrations. If you're coming from Wispr Flow, the comparison is here.