Microsoft AI Releases VibeVoice-Realtime: A Lightweight Real-Time Text-to-Speech Model Supporting Streaming Text Input and Robust Long-Form Speech Generation
Microsoft has released VibeVoice-Realtime-0.5B, a real-time text-to-speech model that works with streaming text input and long-form speech output, aimed at agent-style applications and live data narration. The model can start producing audible speech in about 300 ms, which is critical when a language model is still generating…
