Hacker News

Hi HN,

I was frustrated by the lag in Electron-based Whisper wrappers. Most of them feel disconnected from the typing experience because of the 2-5s delay.

I built VoiceFlow to solve this. It’s a native Rust core that targets 0.3s-0.6s latency. The goal is to make voice-to-text feel as instant as typing.

Key features:

- Global hotkey [Ctrl+Space] to type into any app (Slack, VS Code, etc.)

- Native Rust implementation for performance and a low memory footprint

- AI-based post-processing for punctuation and style

- Privacy-focused: the microphone is only active while the hotkey is held down
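
For anyone curious what "mic only active during the keypress" can look like in code, here is a minimal sketch in Rust. It is an illustration only, not VoiceFlow's actual implementation: `HotkeyEvent`, `MicAction`, and `PushToTalk` are hypothetical names, and a real app would feed this from an OS-level hotkey hook and an audio capture layer.

```rust
/// Events a global hotkey listener might deliver (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HotkeyEvent {
    Pressed,
    Released,
}

/// What the audio layer should do in response to an event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MicAction {
    StartCapture,
    StopAndTranscribe,
    Ignore,
}

/// Tracks whether the microphone is open. It only opens between a
/// key press and the matching release.
#[derive(Debug, Default)]
pub struct PushToTalk {
    recording: bool,
}

impl PushToTalk {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn is_recording(&self) -> bool {
        self.recording
    }

    /// Advance the state machine by one hotkey event.
    pub fn handle(&mut self, event: HotkeyEvent) -> MicAction {
        match (event, self.recording) {
            (HotkeyEvent::Pressed, false) => {
                self.recording = true;
                MicAction::StartCapture
            }
            (HotkeyEvent::Released, true) => {
                self.recording = false;
                MicAction::StopAndTranscribe
            }
            // Key auto-repeat while held, or a stray release: do nothing.
            _ => MicAction::Ignore,
        }
    }
}
```

The `Ignore` arm matters in practice: many platforms send repeated "pressed" events while a key is held, and those should not restart the capture.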

VoiceFlow is currently in private beta, and I'm looking for feedback, especially on latency and UX.

I'll be around to answer any technical questions!

Leftium ・ 7 hours ago

Seems like this one is Windows-only (even though it's Tauri?)

And it's not local (uses a cloud-based transcription API)

It also doesn't seem to be realtime streaming. For the most connected typing experience, try showing results within a second of the first word being spoken (not after the utterance is complete)
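
To put rough numbers on that difference, here is a back-of-the-envelope sketch in Rust. The function names and the timings are made up for illustration and say nothing about how VoiceFlow or any particular transcription API behaves; the only assumption is a fixed processing latency after the audio for a word (or the whole utterance) is available.

```rust
/// Time (ms from the start of speech) at which text first appears when
/// transcription only runs after the whole utterance has ended.
/// `word_end_ms` holds the end time of each spoken word, in order.
pub fn first_text_batch(word_end_ms: &[u64], latency_ms: u64) -> Option<u64> {
    word_end_ms.last().map(|end| end + latency_ms)
}

/// Same measurement when partial results are emitted as each word
/// completes, so the first word can be shown before the speaker finishes.
pub fn first_text_streaming(word_end_ms: &[u64], latency_ms: u64) -> Option<u64> {
    word_end_ms.first().map(|end| end + latency_ms)
}
```

With word end times of `[400, 900, 1500, 2200, 3000]` ms and 500 ms of processing latency, the batch version first shows text at 3500 ms, while the streaming version shows the first word at 900 ms. The longer the utterance, the wider that gap gets, even if the per-request latency is identical.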

This HN comment captures why realtime streaming is important: https://hw.leftium.com/#/item/47149479

I've also been prototyping realtime streaming transcription with multimodal input: https://rift-transcription.vercel.app

Show HN: VoiceFlow – Sub-second (0.3s-0.6s) voice-to-text built in Rust