Dictate
Voice to text in any app
Hold ⌥ C, talk, release. Whatever you said becomes typed text in whichever app you were last working in. Slack message, Terminal command, browser address bar — anywhere a cursor blinks.
A microphone in the menu bar. whisper.cpp runs locally on the laptop. I press a key, I talk, and the words paste into whichever app I was last typing in: Terminal, Slack, Apple Mail, the address bar of this browser. The audio never leaves the machine.
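For the curious, local transcription can be sketched in a few lines of Python. This is a rough illustration, not VoiceType's actual code: the install paths, the `whisper-cli` binary name (older whisper.cpp builds call it `main`), and the helper names are all assumptions. Only the `-m`, `-f`, and `-nt` flags come from whisper.cpp itself.

```python
import subprocess
from pathlib import Path

# Assumed locations: adjust to wherever whisper.cpp was built and the model was downloaded.
WHISPER_BIN = Path("~/whisper.cpp/build/bin/whisper-cli").expanduser()
WHISPER_MODEL = Path("~/whisper.cpp/models/ggml-base.en.bin").expanduser()


def build_whisper_command(wav_path: str) -> list[str]:
    """Build the argument list for one local transcription run (hypothetical helper)."""
    return [
        str(WHISPER_BIN),
        "-m", str(WHISPER_MODEL),  # model file to load
        "-f", wav_path,            # input audio; whisper.cpp expects 16 kHz WAV
        "-nt",                     # print plain text without timestamps
    ]


def transcribe(wav_path: str) -> str:
    """Run whisper.cpp as a local subprocess and return the transcript as one line."""
    result = subprocess.run(
        build_whisper_command(wav_path),
        capture_output=True,
        text=True,
        check=True,
    )
    # Collapse the line breaks and padding whisper.cpp prints around segments.
    return " ".join(result.stdout.split())
```

Everything here is a subprocess call on the same machine, so no network request is involved at any point.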
macOS will warn you once, the first time you open it. Right-click the app → Open. That's the price of a tool that doesn't phone home.
— the whole point —
No need to be in any specific app. Mid-email, mid-terminal, mid-nothing — hold ⌥ C and speak. VoiceType writes the transcription to your clipboard and presses ⌘ V on your behalf. Anywhere a cursor blinks, your words land.
Think out loud without losing the thread. Speak however you actually speak — half-formed, code-mixed, in your own jargon — and VoiceType learns your vocabulary as you go. Every other dictation tool first asks you to find the right app, click record, then talk. VoiceType skips all of that.
— section 1 —
VoiceType writes the transcription to your clipboard and presses ⌘ V on your behalf. Anywhere ⌘ V works, VoiceType works.
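The clipboard-then-paste trick is simple enough to sketch. The Python below is an illustration under assumptions, not VoiceType's implementation: it uses the macOS `pbcopy` command and an AppleScript keystroke sent through `osascript`, and the function names are made up for the example.

```python
import subprocess

# AppleScript that asks System Events to press Cmd+V in the frontmost app.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def copy_command() -> list[str]:
    """Command that reads stdin and places it on the macOS clipboard."""
    return ["pbcopy"]


def paste_command() -> list[str]:
    """Command that simulates Cmd+V via AppleScript."""
    return ["osascript", "-e", PASTE_SCRIPT]


def paste_text(text: str) -> None:
    """Copy text to the clipboard, then paste it into whichever app has focus."""
    subprocess.run(copy_command(), input=text.encode("utf-8"), check=True)
    subprocess.run(paste_command(), check=True)
```

Two consequences of this design: whatever was on the clipboard before is overwritten, and macOS will ask for Accessibility permission before it lets a process send keystrokes.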
— section 2 —
Dictate
Hold ⌥ C, talk, release. Whatever you said becomes typed text in whichever app you were last working in. Slack message, Terminal command, browser address bar — anywhere a cursor blinks.
Steer
Configure your own voice commands. Say "help" and the settings overlay opens. Say a saved phrase name and the saved text pastes itself. The model picks the action that fits what you said.
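To make the routing idea concrete, here is a deliberately simplified sketch in Python. VoiceType lets a model choose the action; this stand-in uses plain normalized string matching instead, and the action names (`open_overlay`, `paste`) and the `route` function are hypothetical.

```python
import re


def normalize(utterance: str) -> str:
    """Lowercase and strip punctuation so 'Help.' and 'help' compare equal."""
    return re.sub(r"[^\w\s]", "", utterance).strip().lower()


def route(utterance: str, saved_phrases: dict[str, str]) -> tuple[str, str]:
    """Return an (action, payload) pair for one transcribed utterance."""
    key = normalize(utterance)
    if key == "help":
        return ("open_overlay", "")
    # Saved phrases are matched by name, ignoring case and punctuation.
    phrases = {normalize(name): text for name, text in saved_phrases.items()}
    if key in phrases:
        return ("paste", phrases[key])
    # Anything else is ordinary dictation and is pasted as spoken.
    return ("paste", utterance)
```

For example, with `{"Sign off": "Best regards,\nAlex"}` saved, saying "sign off" pastes the stored text, while any other sentence falls through to plain dictation.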
Translate
Optional. Drop a DeepL API key into settings and VoiceType translates as it pastes. Speak German into Slack, English text appears. DeepL free tier covers ~500k characters / month.
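A translation step like this could look roughly as follows in Python. It is a sketch that assumes the DeepL API Free endpoint and its JSON request format; the function names and the default target language are illustrative choices, not VoiceType settings.

```python
import json
import urllib.request

# Endpoint for DeepL API Free keys; paid plans use a different host.
DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"


def build_translate_request(text: str, api_key: str, target_lang: str = "EN-US") -> urllib.request.Request:
    """Build the POST request for one translation (hypothetical helper)."""
    body = json.dumps({"text": [text], "target_lang": target_lang}).encode("utf-8")
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=body,
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_translation(response_body: bytes) -> str:
    """Pull the translated text out of DeepL's JSON response."""
    return json.loads(response_body)["translations"][0]["text"]


def translate(text: str, api_key: str, target_lang: str = "EN-US") -> str:
    """Send the transcript to DeepL and return the translated text."""
    request = build_translate_request(text, api_key, target_lang)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_translation(response.read())
```

Note that this is the one step where text leaves the machine: the transcript goes to DeepL, which is why translation is optional and off until a key is supplied.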
— section 3 —
Vibecoding with Claude · steering an overlay open · translating German into English on the fly.
Dictate — vibecoding
A 14-second thinking-out-loud prompt to Claude Code. Half-formed reasoning, mid-sentence pivots, project-specific terms — VoiceType doesn't care. By the time you stop talking, the prompt is already in the chat box.
Steer — say "help", the overlay opens
Hold ⌥ C, say "help", release. The intent router opens the help overlay floating above your work — without leaving the keyboard or losing focus on what you were doing.
Translate — speak German, paste English
Speak in your native language. The English (or whatever target you set) lands in the chat box.
— section 4 —
I built VoiceType because Claude Code in the Terminal had no voice input, and every dictation tool I tried sent my audio to someone else's server. Now it's the program I use most, every day.
Tools like Wispr Flow charge a subscription for what should be a local feature. Whisper runs on a laptop. Nothing has to leave the machine.
When you talk to an AI agent, typing is no longer the most efficient way to communicate, because the agent can reason about what you meant. Voice input doesn't have to be precise commands; it can be loose intent. The tool just has to learn your vocabulary, your projects, your context.
— section 5 —