I’ve tried several, including this one, and I’ve settled on VoiceInk (local, one-time payment), and with Parakeet V3 it’s stunningly fast (near-instant) and accurate enough to talk to LLMs/code-agents, in the sense that the slight drop in accuracy relative to Whisper Turbo3 is immaterial since they can “read between the lines” anyway.
My regular cycle is to talk informally to the CLI agent and ask it to “say back to me what you understood”, and it almost always produces a nice clean and clear version. This simultaneously works as confirmation of its understanding and also as a sort of spec which likely helps keep the agent on track.
UPDATE - just tried handy with Parakeet v3, and it works really well too, so I'll use this instead of VoiceInk for a few days. I just also discovered that turning on the "debug" UI with Cmd-shift-D shows additional options like post processing and appending trailing space.
I wish one of these models was fine tuned for programming.
I want to be able to say things like "cd ~/projects" or "git push --force".
You can try VSCode Speech to Text extension that works decently well in Github Copilot chat as part of Microsoft accessibility suite.
I'll bet you could take a relatively tiny model and get it to translate the transcribed "git force push" or "git push dash dash force" into "git push --force".
Likewise "cd home slash projects" into "cd ~/projects".
Maybe with some fine tuning, maybe without.