This is amazing, I’ve been having this problem with live STT (mainly for voice assistants). I’m curious whether your model + Whisper tiny would outperform Whisper small or even medium. I’ve been having issues where even faster-whisper small takes too long.
Also bummed that a purely non-thinking Qwen3-1.7B hasn’t been released. Otherwise, I’m curious about “how low can you go”.
What hardware are you running? Parakeet runs on NVIDIA and Mac, and it’s way faster than Whisper. I’ve also had issues training Qwen3 (and even Qwen2.5, but I think I was masking stop tokens wrong). I’ve had success with Gemma 3 though, and they have some really small models (270M and 1B). Maybe 270M for just transcript cleaning? I wonder if the 1B model can handle the transcript analysis…
I’m running on a Jetson Orin Nano. Do you know if there is a Parakeet + Wyoming repo?
Unfortunately I have zero experience with the Jetson family, and Parakeet itself is a pain to get running IMO. I took the easy option and used the ONNX version.
Try the inkvoice app, for example. It can run Parakeet with a single click.