Stuff South Africa

Koe Recast is an AI tool that turns anyone’s voice into something else entirely


A technology previously available to Batman and people who kidnap ambassadors’ daughters in action movies, Koe Recast is bringing spookily accurate voice changing to more users. Yes, you can already get apps and physical devices to change your voice for you. But Koe Recast lets you do it in real time, without anything getting in the way.

“Bring the money. Come alone”

Or, at least, it will. The AI-powered technology is still in development, though you can try it out today if you want to. The app will, with a few caveats, convert up to twenty seconds of vocal audio into something a little more… exotic.

If you’re still not convinced, then check out the company’s demo featuring Facebook founder and head Mark Zuckerberg. Koe Recast turns Zuckerberg’s voice into that of a narrator, a woman, or a young anime protagonist. It does so while preserving cadence and tone, which is fairly remarkable.

But it isn’t new. Several companies already offer something like this. The difference is that Koe Recast is being developed by a single person. Texas-based developer Asara Near is working to bring this tech to platforms like Discord, Zoom, or Twitch by way of a desktop app.


Read More: Celebrity deepfakes are all over TikTok. Here’s how you can spot them


Near, speaking to Ars Technica, explained that the app is able to “…modify the parts of audio that correspond to a speaker’s personal style or timbre while preserving the parts of the audio that correspond to the spoken content such as prosody and words.”

“This allows us to change the style of someone’s voice to any other style, including their perceived gender, age, ethnicity, and so on.”
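The idea Near describes, separating the "style" of a voice (timbre, perceived identity) from its "content" (the words and prosody), is the core of most modern voice-conversion systems. As a rough illustration only, here is a toy encoder-decoder sketch in Python with NumPy. The dimensions, encoders, and random weights are all hypothetical; this shows the data flow of content/style separation, not Koe Recast's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a mel-spectrogram of 100 frames x 80 bands,
# with made-up content and style embedding widths.
N_FRAMES, N_MELS = 100, 80
CONTENT_DIM, STYLE_DIM = 32, 16

# Random, untrained weights -- purely to illustrate the shapes involved.
W_content = rng.normal(size=(N_MELS, CONTENT_DIM))
W_style = rng.normal(size=(N_MELS, STYLE_DIM))
W_dec = rng.normal(size=(CONTENT_DIM + STYLE_DIM, N_MELS))

def encode_content(spec):
    # One vector per time step: keeps what is said and how it is paced.
    return np.tanh(spec @ W_content)             # (frames, CONTENT_DIM)

def encode_style(spec):
    # Pooled over time into a single vector: the speaker's overall timbre.
    return np.tanh(spec @ W_style).mean(axis=0)  # (STYLE_DIM,)

def decode(content, style):
    # Recombine per-frame content with a (possibly different) style vector.
    style_tiled = np.tile(style, (content.shape[0], 1))
    return np.concatenate([content, style_tiled], axis=1) @ W_dec

# Stand-ins for real audio features of two different speakers.
source = rng.normal(size=(N_FRAMES, N_MELS))
target = rng.normal(size=(N_FRAMES, N_MELS))

# Conversion: the source speaker's content, rendered in the target's style.
converted = decode(encode_content(source), encode_style(target))
print(converted.shape)  # (100, 80) -- same length as the input, new "voice"
```

Because the style vector is pooled over time while the content encoding is kept per-frame, swapping in a different style vector changes the voice without disturbing the frame-by-frame content, which is the property Near describes.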

Near believes that more people will use Recast for good rather than evil. That might be a stretch. Even Stuff‘s first thought upon hearing about the tech was of the various ways it could be used for nefarious purposes. Accurate vocal modification leads almost inevitably to accurate vocal deepfakes. Trusting audio and video may become even harder than it already is.

Source: Ars Technica
