Java Text to Speech Sound Visualization

An Automated Method to Correct Artifacts in Neural Text-to-Speech Models

Abstract: Recent advances in deep learning technology have enabled high-quality speech synthesis, and text-to-speech models are widely used in a variety of applications. However, even state-of-the-art ...

How Large Scale Speech Models Will Impact Voice AI

A duplex speech-to-speech model changes the premise: The intelligence layer consumes audio and produces audio directly. The model can attend to what was said and how it was said—content and delivery ...

People

Jane Fonda Reacts to Sound Flub During Life Achievement Speech at SAG Awards 2025: 'I Can Conjure Voices!'

At the 31st SAG Awards, Fonda became the 60th recipient of the Life Achievement honor Esther Kang is a writer at PEOPLE. She has been working at PEOPLE since 2023 and has previously worked for ...

GitHub

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Small and fast: only 123M parameters. High-quality voice cloning: state-of-the-art performance in speaker similarity, intelligibility, and naturalness. Multi-lingual: support Chinese and English.

Some results have been hidden because they may be inaccessible to you

Show inaccessible results