Stuff South Africa

Microsoft’s VALL-E AI imitates voice, tone, and emotions using a brief clip of someone talking

Microsoft has unveiled its VALL-E voice AI tool.

Voice imitation isn’t new, but it’s definitely becoming more convincing. Now, it’s becoming downright frightening. Microsoft recently unveiled VALL-E, an AI tool designed to replicate people’s voices. It doesn’t just sound like you, though. It learns to do so in a heartbeat (or three).

It takes a mere three seconds of audio for Microsoft’s VALL-E to capture and imitate a speaker’s voice. That’s shorter than the time it takes you to read this sentence. VALL-E’s ability to capture a speaker’s tone and emotion makes it a potential game-changer. It could soon be considerably harder to tell a speaker’s real speech from an AI-generated imitation.

Don’t fall off the VALL-E

The voice-matching AI was trained on 60 000 hours of English speech data, learning from the recorded voices how to generate speech. Three-second voice clips serve as the prompts: given one, it produces new speech in that speaker’s voice.


Examples of Microsoft VALL-E’s work were shared on GitHub. Some sound authentic. Others… still have a robotic tone to them. With a larger and more varied set of training voices, the technology seems set to open a new dimension in vocal imitation.

The rapid development of AI continues to raise ethical issues. What do you do when someone captures a mere three seconds of your voice and uses it to say something you’d never say? It’s possible that you’ll be cancelled for words you never spoke.


Voice actors also stand to lose out. Imagine listening to Morgan Freeman narrate your favorite novel, only to discover it’s just an AI. It’s a double-edged sword. Anyone would be able to secure a Morgan Freeman narration on any subject, at any time. However, his actual voice would lose value, and voice actors who can be easily (vocally) cloned stand to lose money and employment opportunities.

VALL-E is not yet available to the general public. There’s no need to worry (for now) about someone putting words in your mouth. But… it’s only a matter of time, it seems.

Source: Windows Central
