Simon Willison Emerging AI Innovations 15h ago

microsoft/VibeVoice

Significance: 2/5

Microsoft has released VibeVoice, an open-source, Whisper-style audio model for speech-to-text with built-in speaker diarization. The model is available under an MIT license and can run efficiently on Macs using MLX-based conversions.

Why it matters: Open-sourcing high-fidelity diarization models lowers the barrier for developers building sophisticated, localized voice-to-text applications.
Read the original at Simon Willison

Entities mentioned

Microsoft

Tags

#microsoft #speech-to-text #open-source #audio-model #asr
