3 Realistic Deepfake Voice Apps That Change How You Sound
Given enough voice samples, plenty of deepfake voice apps can make you sound convincingly like another person, and they can do it pretty quickly. About one hour of speech is enough for Artificial Intelligence (AI) to form an accurate picture of a speech pattern.
It just goes to show how easy it is these days to fake who you are. Voice phishing and related digital criminal activities are on the rise, and while deepfake voice apps are generally used for entertainment purposes, it doesn’t take much for a person to turn rogue and use the software with bad intentions in mind.
Voice changers and voice changing apps have been popular for a while and can be found in both the Google Play store for Android and Apple’s App Store for iOS. But adding a layer of machine learning algorithms and AI tech is something new. It opens up a range of new possibilities in the realm of deepfake audio technology, mostly in the accessibility department. Now anyone can be a voice manipulator, just by downloading a simple application on their smartphone.
Let’s take a look at what deepfake audio is, and explore which deepfake voice apps are worth downloading at this moment in time. The number of deepfake voice changers out there might still be low, but a few gems can already be found. It’s expected that this number will only rise in the future.
What Is Deepfake Audio?
Deepfakes are synthetic media in which a person is replaced with someone else’s likeness. Deepfake audio applies this concept to audio recordings, in particular human voices.
Deepfake audio is created using AI and machine learning algorithms, which sets it apart from the more traditional audio manipulation “by hand” (using audio editing software and / or hardware). The distinction lies in the creation method, which is automated by a software program or other type of digital application.
Deepfake audio can be used for all types of purposes, ranging from entertainment to criminal activities. Most deepfake audio apps are intended for entertainment, but they could also be used with harmful intent. This makes the publication of such software programs and apps somewhat controversial.
Usually, software programs or apps that create deepfake audio recordings will utilize what is known as ‘voice cloning’ or a ‘voice double’. Below you will find a brief explanation of the term, so you can get a better understanding of what this type of software or app is capable of.
What Is Voice Cloning?
Voice cloning or voice doubling takes audio recordings of an individual voice and uses them as source material for creating deepfake audio recordings. With just several hours of source material (audio recordings of an individual voice), deepfake audio software is capable of cloning the voice so it can be used to create new deepfake audio recordings.
The origin of voice cloning is found in software like WaveNet, which was created in 2016 by DeepMind, the AI lab owned by Google. It was a revolution for the more traditional Text-To-Speech (TTS) systems. The key difference between the earlier and the later TTS systems was the use of concatenative TTS versus parametric TTS:
- Concatenative TTS: The ‘early generation’ of TTS. A large database of short speech fragments would be recorded from an individual voice source. These are recombined later to form full sentences. It had many downsides, which were generally related to how little intonation and emotion could be put into the TTS voice.
- Parametric TTS: The ‘second generation’ of TTS. All the information needed to form TTS sentences would be stored in the parameters of the model. Different model inputs would result in different types of speech characterizations. This made it easier to put emotions and intonation into the TTS, allowing ‘fake voice recordings’ to be a lot more realistic as a result.
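To make the difference between the two approaches a bit more concrete, here is a toy Python sketch. It is only an illustration under heavy assumptions: simple sine tones stand in for recorded speech, and every name in it (`FRAGMENT_DB`, `BASE_PITCH_HZ`, `concatenative_tts`, `parametric_tts`) is made up for this example. Real TTS systems, WaveNet included, are far more complex.

```python
import math

SAMPLE_RATE = 8000  # samples per second (illustrative value)


def tone(freq_hz, duration_ms, amplitude=1.0):
    """Generate a sine tone as a list of float samples."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


# Concatenative idea: a fixed database of pre-recorded fragments.
# Tones stand in for real recordings here (an assumption of this sketch).
FRAGMENT_DB = {
    "hel": tone(220, 50),
    "lo": tone(330, 50),
}


def concatenative_tts(units):
    """Glue stored fragments together; the sound itself cannot be changed."""
    out = []
    for unit in units:
        out.extend(FRAGMENT_DB[unit])
    return out


# Parametric idea: the sound is generated from parameters,
# so changing the inputs changes how the result sounds.
BASE_PITCH_HZ = {"hel": 220, "lo": 330}  # hypothetical per-unit settings


def parametric_tts(units, pitch_scale=1.0, energy=1.0):
    """Generate each unit from parameters instead of stored audio."""
    out = []
    for unit in units:
        out.extend(tone(BASE_PITCH_HZ[unit] * pitch_scale, 50, energy))
    return out


flat = concatenative_tts(["hel", "lo"])
excited = parametric_tts(["hel", "lo"], pitch_scale=1.5, energy=0.5)
print(len(flat), len(excited))  # 800 800
```

The concatenative function can only replay what is in its database, while the parametric one produces a different result whenever `pitch_scale` or `energy` changes. That flexibility is roughly what made it easier to add emotion and intonation.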
Different companies around the globe have created different variations of TTS software, each solving a piece of the puzzle to make things sound a lot more realistic.
Voice cloning has become a lot more impressive over time, and deepfake audio software has carried that steady improvement into a new era. Where the first generation of TTS software would be the ‘invention of the wheel’, the deepfake audio apps of today would get closer to a ‘Mars rocket invention’. It’s a giant leap forward, which is also why it is now so easy to place this seemingly complicated software into a relatively simple smartphone app for the wider public.
While still in the early stages, there already are some apps published on Android and iOS platforms, which allow anyone with a smartphone to create their own voice deepfakes. It is now easier than ever to start manipulating what you can make someone say. Let’s explore some of the best deepfake voice apps currently out there.
Please note, this is a new market that is emerging rapidly and also one that is starting to get relatively tightly regulated, as there are many dangers and downsides to deepfake audio software as well. It is possible these apps won’t be available forever, so do make sure to download them on your phone while you still can!
Best Deepfake Voice Apps
We have searched the internet and our smartphones (both on Apple’s iOS app store and Android’s Google Play store) for the best deepfake voice apps that can currently be downloaded by the wider public. This resulted in the selection of three deepfake voice apps that utilize AI and machine learning algorithms at their core. We’ll start with the most notable one, which is the much-debated and controversial app called Lyrebird AI.
1. Lyrebird AI
It’s one of the most common names in the world of deepfake audio apps: Lyrebird AI. The application uses Artificial Intelligence to enable creative expression, or so they say themselves. The core goal of the app is clearly related to its entertainment value and the creative aspect of the tool.
The Montreal-based start-up has been hard at work improving the tool since 2017, with the aim to “create the most realistic artificial voices in the world”. A mission statement that certainly resonated with a lot of people, judging from its popularity in the short timeframe the tool has existed.
Lyrebird aims to capture the very DNA of the voice using a machine learning model. It captures every factor that makes a voice unique and uses it to improve the final result. And for that, it needs samples. A surprisingly small number of samples, actually. Just a few minutes of audio is enough to create a comprehensive image of a voice. That’s all the algorithms need to disassemble the voice and uncover its DNA, as the creators themselves call it.
The core difference from traditional TTS? The software learns without supervision or human intervention. That’s what makes it a true ‘deepfake tool’. The model itself will decide which aspects of the sounds it should focus on, and which to ignore. And there are no initial instructions, only a set of general rules that make up the algorithm.
A very impressive piece of technology that we’ll be hearing a lot more of in the (near) future. But it’s not the only deepfake audio app out there. Take for example Overdub, which is another one of those emerging tools for audio deepfakes from the same creators.
2. Overdub
Descript has also created Overdub, which is a companion tool to Lyrebird in a sense. Overdub is an app that lets you correct your voice recordings by typing. The tool is powered by the aforementioned Lyrebird AI.
While currently still in development, the beta version of Overdub is already available for testing. Anyone is able to try it out for themselves on the official Overdub page on Descript’s website.
The tool will be of massive use to people who work in the podcast industry, for example. Overdub makes it possible to quickly overwrite small errors in big audio files without much trouble. If you misspoke a tiny little bit and you don’t feel like editing the entire file, Overdub can scrub out the small error in the blink of an eye.
All the creator needs to do is input the source audio file and write out what was intended to be said. The voice of the source material will then be used to overwrite the error in the audio, which completely removes the mistake without ever having to redo (parts of) the original recording. It will save many hours of work for the entire media industry, not just podcasting.
The YouTube sphere alone would massively benefit from a simple tool like this. We are very excited for the final version of Overdub, and are confident that Descript is capable of delivering a high-quality final product.
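For the curious, the basic idea of swapping a small mistake for a corrected snippet can be sketched in a few lines of Python. This is not how Overdub works internally (the article's sources do not describe that); it is only a minimal illustration of the splice step, assuming the corrected audio has already been generated by some voice model. The function name `replace_segment` and the numbers used as "audio" are invented for this example.

```python
def replace_segment(samples, start, end, replacement):
    """Return a new list where samples[start:end] is swapped for replacement.

    The original list is left untouched, so the edit can be undone.
    """
    if not 0 <= start <= end <= len(samples):
        raise ValueError("segment out of range")
    return samples[:start] + list(replacement) + samples[end:]


# Plain numbers stand in for audio samples here.
recording = [1, 2, 3, 4, 5, 6]
fixed = replace_segment(recording, 2, 4, [9, 9, 9])
print(fixed)  # [1, 2, 9, 9, 9, 5, 6]
```

Note that the replacement does not need to be the same length as the mistake, which matches the fact that a corrected word is rarely exactly as long as the misspoken one.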
3. VoiceApp
The Apple fans out there will also appreciate an app called VoiceApp, which is another deepfake audio app that is available to iOS users. It’s an AI-powered voice changer that allows users to take on any voice they desire, using the magic of machine learning algorithms.
While intended purely for entertainment purposes, one can quickly see how such a tool could be used for a wide range of applications. The quality of the voices is almost too good to be true, and (for now) it’s a great way to prank your friends or family.
VoiceApp uses AI to recognize your voice and create a computer-generated impression of it that is hard to distinguish from the real deal. The app is mainly made to impersonate celebrities and famous people, so remember that the use is limited. It’s great fun though, and a good showcase of what such tools are capable of for the Apple-loving audience.
Getting Started With The Best Deepfake Voice App
The developments in the world of deepfake voice applications are very interesting. The speed with which this industry is expanding and creating highly realistic impressions of voices is almost too good to be true. Especially thanks to innovative companies like Descript, the technology is quickly evolving into a serious tool for the entertainment industry.
But we do need to keep in mind that potential dangers (e.g. voice phishing schemes from criminals over the internet or telephone) could be looming around the corner. It’s exactly why we should enjoy the ‘cowboy period’ of this industry, where anything still goes without strict restrictions from governments and other regulating bodies.
For now, tools like Lyrebird AI and Overdub, but also smaller entertaining apps like VoiceApp for iOS devices, can certainly be seen as a help to the world of digital media. The future of deepfake audio is just around the corner. And these are the very first software tools and apps that are presented to the wider public. In the upcoming years, a big increase in such applications is to be expected. For now, enjoy Lyrebird AI and see what it can do for you in a creative way!