macOS · Apple Silicon & Intel · v0.9.1 · ad-hoc signed

hold ⌥ C, speak, the text appears.

A microphone in the menubar. Whisper.cpp runs locally on the laptop. I press a key, I talk, the words paste into whichever app I was last typing in — Terminal, Slack, Apple Mail, the address bar of this browser. The audio never leaves the machine.

macOS will warn you once. Right-click → Open (on macOS 15 and later: System Settings → Privacy & Security → Open Anyway). That's the price of a tool that doesn't phone home.

Capture an idea the moment it arrives.

No need to be in any specific app. Mid-email, mid-terminal, mid-nothing — hold ⌥ C and speak. VoiceType writes the transcription to your clipboard and presses ⌘ V on your behalf. Anywhere a cursor blinks, your words land.

Think out loud without losing the thread. Speak however you actually speak (half-formed, code-mixed, in your own jargon) and VoiceType learns your vocabulary as you go. Most dictation tools first ask you to find the right app, click record, then talk. VoiceType skips all of that.

How it works

⌥ C      Hold    Anywhere. Any app. Menubar mic turns red.
▮▮▮▮▮    Talk    whisper.cpp transcribes locally. No network.
⌘ V      Paste   Into whichever app you were just in.
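
For the curious, the "Talk" step can be pictured as one local call to the whisper.cpp command-line tool. This is a rough sketch, not VoiceType's actual code: the binary name `whisper-cli`, the model file, and the WAV path are assumptions, and the function names are made up for illustration. The flags used (`-m`, `-f`, `-l`, `-nt`) are standard whisper.cpp CLI options.

```python
import subprocess


def whisper_command(wav_path, model_path, binary="whisper-cli", language="auto"):
    """Build the argv for one whisper.cpp run (hypothetical paths, for illustration)."""
    return [
        binary,
        "-m", str(model_path),  # ggml model file stored on the local disk
        "-f", str(wav_path),    # 16 kHz WAV recorded while the hotkey was held
        "-l", language,         # spoken language, or "auto" to detect it
        "-nt",                  # no timestamps: plain text only
    ]


def transcribe(wav_path, model_path, run=subprocess.run):
    """Run whisper.cpp locally and return the transcript it prints to stdout."""
    result = run(
        whisper_command(wav_path, model_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Nothing in this sketch opens a network connection: the model file and the recording both stay on disk, and the transcript comes back over stdout.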

VoiceType writes the transcription to your clipboard and presses ⌘ V on your behalf. Anywhere ⌘ V works, VoiceType works.
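
The clipboard-then-paste trick can be sketched with two tools that ship with macOS: `pbcopy` fills the clipboard, and an AppleScript one-liner asks System Events to press ⌘ V. This is an illustration under assumptions, not the app's real implementation; `paste_commands` and `paste_text` are hypothetical names. Sending the keystroke this way requires the Accessibility permission.

```python
import subprocess

# AppleScript that presses Cmd+V in whichever app currently has focus.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def paste_commands(text):
    """Return (argv, stdin) pairs: first fill the clipboard, then press Cmd+V."""
    return [
        (["pbcopy"], text),                      # pbcopy reads the text from stdin
        (["osascript", "-e", PASTE_SCRIPT], None),
    ]


def paste_text(text, run=subprocess.run):
    """Copy text to the clipboard and paste it into the frontmost app."""
    for argv, stdin in paste_commands(text):
        run(argv, input=stdin, text=True, check=True)
```

One consequence of this design: whatever was on the clipboard before gets overwritten by the transcript.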

Three things VoiceType does

Dictate

Voice to text in any app

Hold ⌥ C, talk, release. Whatever you said becomes typed text in whichever app you were last working in. Slack message, Terminal command, browser address bar — anywhere a cursor blinks.

Steer

Voice as a command, not just text

Configure your own voice commands. Say "help" and the help overlay opens. Say the name of a saved phrase and its text pastes itself. The model picks the action that fits what you said.
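
To make the routing idea concrete, here is a deliberately simple stand-in. VoiceType uses a model to pick the action; this sketch swaps in exact matching on the normalized transcript, so it only shows the shape of the decision. `Action`, `normalize`, and `route` are hypothetical names, and the saved phrases are invented examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str       # "open_help", "paste_saved", or "dictate"
    text: str = ""  # text to paste, if any


def normalize(spoken):
    """Lowercase, trim edge punctuation, and collapse whitespace."""
    return " ".join(spoken.lower().strip(" .,!?").split())


def route(transcript, saved_phrases):
    """Pick an action for a transcript (exact matching, standing in for the model)."""
    key = normalize(transcript)
    if key == "help":
        return Action("open_help")
    saved = {normalize(name): text for name, text in saved_phrases.items()}
    if key in saved:
        return Action("paste_saved", saved[key])
    # Anything else is ordinary dictation: paste what was said.
    return Action("dictate", transcript.strip())
```

With `saved_phrases = {"Email signature": "Best, Sam"}`, saying "Help." routes to the overlay, saying "email signature" pastes the saved text, and everything else is pasted as dictated.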

Translate

Speak any language, paste English

Optional. Drop a DeepL API key into settings and VoiceType translates as it pastes. Speak German into Slack, English text appears. DeepL free tier covers ~500k characters / month.
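
A sketch of what the translate step might look like against the DeepL API free endpoint. The endpoint, the `DeepL-Auth-Key` header, and the `translations[0].text` response field come from DeepL's public API; the function names and the `EN-US` default are assumptions, and VoiceType's own code may differ. In this sketch only the transcribed text is sent, not the audio.

```python
import json
import urllib.request

DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"


def build_request(text, api_key, target_lang="EN-US"):
    """Build (but do not send) a DeepL translate request for one transcript."""
    body = json.dumps({"text": [text], "target_lang": target_lang}).encode("utf-8")
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=body,
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_translation(response_body):
    """Pull the translated text out of DeepL's JSON response."""
    return json.loads(response_body)["translations"][0]["text"]


def translate(text, api_key, target_lang="EN-US"):
    """Send the transcript to DeepL and return the translation."""
    request = build_request(text, api_key, target_lang)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_translation(response.read())
```

Every character sent counts against the monthly quota, so translating only the final transcript (rather than partial results) keeps usage low.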

Each feature, doing the thing

Vibecoding with Claude · steering an overlay open · translating German into English on the fly.

Dictate — vibecoding

A 14-second thinking-out-loud prompt to Claude Code. Half-formed reasoning, mid-sentence pivots, project-specific terms — VoiceType doesn't care. By the time you stop talking, the prompt is already in the chat box.

Steer — say "help", the overlay opens

Hold ⌥ C, say "help", release. The intent router opens the help overlay floating above your work — without leaving the keyboard or losing focus on what you were doing.

Translate — speak German, paste English

Speak in your native language. The English text (or text in whichever target language you set) lands in the chat box.

Mockups are HTML stand-ins. Real screenshots from your laptop replace these on the live site.

Why this exists

I built VoiceType because Claude Code in the Terminal had no voice input, and every dictation tool I tried sent my audio to someone else's server. Now it's the program I use most, every day.

Tools like Wispr Flow charge a subscription for what should be a local feature. Whisper runs on a laptop. Nothing has to leave the machine.

When you talk to an AI agent, typing is no longer the most efficient way to communicate, because the agent can reason about what you meant. Voice doesn't have to be precise commands; it can be loose intent. The tool just has to learn your vocabulary, your projects, your context.

Install in 30 seconds

  1. Download .dmg: ~250 MB
  2. Drag to Applications: DMG opens, drag the icon
  3. Right-click → Open: one-time Gatekeeper bypass
  4. Grant 2 permissions: Microphone + Accessibility
  5. Hold ⌥ C: start talking
↓ Download VoiceType

v0.9.1 · macOS 13+ · 250 MB · ad-hoc signed · github.com/polistician/voicetype