How many iterations does a realistic deepfake image need?
Deepfakes are taking the world by storm, and they are getting more realistic by the day. The technology might be a little intimidating to beginners, and unfamiliar even to people with a little more technical expertise.
This article aims to help those interested in creating deepfake images learn about the process in a bit more detail. The focus will be on the number of iterations a deepfake requires before it can be regarded as ‘realistic’, i.e. as good as real and impossible for the average human being to distinguish from the real thing.
Deepfakes make use of machine learning and Artificial Intelligence (AI) technology, primarily driven by (a set of) algorithms. Generally, these algorithms and how the software uses them will decide not only how long it takes to generate a deepfake image, but also how many iterations are required. That number will often run into the many thousands.
When Is An Image Called A Deepfake?
Distinguishing between a regular image and a deepfake image is important. A photoshopped image is not a deepfake, but can look equally realistic. The main difference will be in the method of creation.
Deepfake image creation is automated by AI and machine learning algorithms, while man-made photoshops will use software with the input of a human. They require skill and expertise and aren’t auto-generated. Deepfakes are different. They are created entirely by machines, even when humans give the initial input image.
The AI software will use the processing power of a computer to create many iterations of the same image, to end up at a certain final result.
For example, if you’d like to morph the face of Hollywood actor Sylvester Stallone into that of his colleague Tom Cruise, you’d need both the starting point and the image it needs to morph into. The process in between will be the ‘deepfake creation’, i.e. the algorithms at work to alter one image to look like something else entirely. Take the face of Sylvester Stallone from image one, and place it into the image of Tom Cruise using open-source deepfake software like DeepFaceLab or Faceswap.
Iterating From Zero To Deepfake Image
In order to get a good impression of what it takes to go from nothing and create a realistic deepfake, whether that’s a video or image (the process is similar), let’s pick several iteration numbers and see what happens.
The iteration numbers are based on the previously mentioned DeepFaceLab software. A relatively strong PC build (good quality graphics card and processor) is recommended when you’re starting out with these types of software packages. Any configuration will work, but processing times will differ. Compare it to video editing: once the edit is complete, rendering takes a shorter or longer amount of time depending on the hardware you’re working with.
We will move from as few as 500 iterations all the way up to 250,000 iterations. To train a deepfake for that many iterations, a good PC build would need at least one day, perhaps even several, for the image or video to be created. Again, this all depends on the hardware you’re working with. Chances are these processing estimates will become outdated rather quickly as deepfake software gets more efficient over time, so please be aware of that.
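To get a feel for how iteration counts translate into waiting time, here is a minimal sketch of the arithmetic. The seconds-per-iteration figure is a hypothetical placeholder, not a measurement from DeepFaceLab or any other tool; replace it with whatever your own hardware reports during training.

```python
def estimate_hours(iterations: int, seconds_per_iteration: float) -> float:
    """Rough training time in hours for a given number of iterations."""
    return iterations * seconds_per_iteration / 3600


# Hypothetical speed: 0.35 seconds per iteration (an assumption, not a benchmark).
ASSUMED_SECONDS_PER_ITERATION = 0.35

for count in (500, 25_000, 150_000, 250_000):
    hours = estimate_hours(count, ASSUMED_SECONDS_PER_ITERATION)
    print(f"{count:>7} iterations: about {hours:.1f} hours")
```

At that assumed speed, 250,000 iterations comes out at roughly a day of computing, and a slower graphics card would push it to several days.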
A first test was performed at 500 iterations, with about 24 hours of computing time. A few more tests were done at 2,000 and 5,000 iterations for the same piece of media content.
As expected, the final result with only 500 image iterations turned out to be the worst performing deepfake of all the results. Manual adaptations were required in order to make the deepfake actually look realistic to the human eye. Therefore, we can safely conclude that 500 iterations is insufficient for a good final result.
Moving up in the ranks, let’s attempt to create a deepfake with about 3,000 iterations. It’s another piece of media content (instead of former American President Bush, this one features the infamous green screen ‘meme’ video of Shia LaBeouf shouting ‘Just Do It’).
It’s easier to judge for yourself and see what 3,000 iterations can look like. Check out the YouTube video below to see the final results of this particular deepfake:
Let’s just say: it’s not impressive, but it’s not completely terrible either. It’s clear that it’s not real, so more training would be needed to create much better output. We can assume that more iterations and more computing power are needed to make things look a lot more realistic.
In most cases, this already shows the core principle: more iterations and more computing power allow the creator to end up with better final results.
Higher numbers, we need much higher numbers! Let’s throw a slightly more serious number against the source images. What can 25,000 iterations do for the final results? Will things finally get serious?
As Jarrod Overson found in his Medium article, 25k iterations ‘looked promising’ using the DeepFaceLab software. From the looks of it, however, it was still very clear that images had been layered on top of each other. It’s a reminder of how difficult it can actually be to make a ‘fake image’ look real. It’s a really, really complex procedure.
To get serious, still more work needed to be done. Getting much better final results required seriously ramping up the computing power the AI needs to deliver something that will actually fool the human eye.
Visually much better results were achieved with a ‘serious’ ramping up of the number of iterations. With a whopping 150,000 iterations, the deepfake content became much smoother overall and harder to distinguish from the real thing.
Check it out for yourself in the YouTube video below that showcases the differences between 25,000 and 150,000 iterations (the original clip is also showcased):
However, if you’d like to create a deepfake image or video that is actually impressive, you’ll need to go into overdrive with the number of iterations. Using a baseline of 250,000 iterations, the results can safely be deemed ‘near-realistic’. The creators themselves (Corridor Crew) called it ‘The best deepfake on the internet’, as can be seen in the YouTube video below:
While the hair and lighting were a bit of a giveaway, the mannerisms were really well done. With some alterations to algorithms and AI tech, this problem could potentially be solved quickly. Tom Cruise’s doppelganger will no longer have to be ‘almost him’ but will actually become indistinguishable from the real actor. It will save Hollywood a whole lot of money on expensive world-famous actors, for sure.
If we can take any lessons from this 250,000 iterations attempt, it’s that creating an actually realistic deepfake is still excessively difficult. And for now, that’s a good thing. Even some highly technical bright minds with expert software still aren’t able to create a video or image with AI tech that can practically be regarded as the real thing, while actually still being a deepfake.
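The observations from the runs described above can be collected into a small lookup. This is only a sketch for keeping the numbers straight: the descriptions paraphrase this article, the runs used different clips, and the step-wise mapping (an arbitrary count gets the verdict of the nearest smaller tested count) is an assumption for illustration, not a measured relationship.

```python
from bisect import bisect_right

# Iteration counts and outcomes as described in this article.
OBSERVATIONS = [
    (500, "worst result; manual fixes needed"),
    (3_000, "clearly fake, but not terrible"),
    (25_000, "promising, still visibly fake"),
    (150_000, "much smoother, harder to tell apart"),
    (250_000, "near-realistic"),
]


def expected_quality(iterations: int) -> str:
    """Return the outcome of the largest tested run not exceeding `iterations`."""
    thresholds = [count for count, _ in OBSERVATIONS]
    index = bisect_right(thresholds, iterations) - 1
    if index < 0:
        return "below the smallest run tested"
    return OBSERVATIONS[index][1]


print(expected_quality(100_000))
```

Treat the output as a reminder of what was reported at each tested count, not as a prediction for your own source material.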
Start With At Least 40,000 Iterations
So if you’re sitting at home contemplating the effort you’d need to put into your own deepfake creation, we’d like to give you some advice. If you want a deepfake image that looks even remotely good, start with at least 40,000 iterations. The deepfake image resulting from that process will still be distinguishable from the real thing, but it will already start looking the way you’d like it to.
As of the moment of writing, however, it’s still extremely difficult to create a realistic deepfake with just the tools that are available. Manual changes will still need to be made in order to get to where you’d like to be. And technically, that’s not the definition of a true deepfake image.
At least, not the type of deepfakes we’d be looking for. The ones that could potentially disrupt the world, overthrow political regimes, or destroy the careers of famous people around the globe. That dystopian future is still far away. Remember that the technology is here, however, and that it will only take a few alterations to algorithms to get to the next level.
More Iterations Equals More Realistic Deepfake Results
To sum up the key takeaways from this article, it will take many, many iterations to get to a realistic-looking deepfake image or video. Currently, even with the most sophisticated and high-tech software and the brightest minds, a deepfake image will still sometimes be detectable due to small cues that are still in there.
Once the bugs and tiny mishaps have been successfully scrubbed out, however, these things could become very dangerous to our society very quickly. Luckily, a massive amount of computing power is still needed to create an actually good deepfake. Mass production of these things is still far away, if only because of the technical expertise that’s still needed to set everything up correctly.
If you’re creating a deepfake for entertainment purposes, go for 40,000 deepfake image iterations or more. That’s the bare minimum. If you want something to look good, go beyond 150,000 or even 250,000 iterations to get something that looks great to a larger audience. Best of luck with your own deepfake project!