Deepfake detection: why detection software fails and the real defense is certifying at the source

AI images stopped being a laboratory curiosity a while ago. They show up in feeds, in chats, on the front pages of newspapers, and a lot of the time you cannot tell them apart from a real photograph. The obvious fix seemed simple: hand the problem to software that, given a piece of content, tells us whether it is real or fake. Then came an independent NewsGuard report, published on May 11, 2026, with an uncomfortable finding. The leading deepfake detection tools get it wrong, and they do it often. They call authentic content manipulated, and manipulated content authentic. So if we cannot even trust the detectors, how are we supposed to tell the real from the fake?

The short answer is that we are fighting the wrong battle. Downstream detection, meaning the analysis of content that already exists in order to judge whether it is genuine, is a race lost from the start. The defense that actually holds runs the other way: certify the authenticity of every piece of information at the source, at the moment it is captured, with legal value. The paradigm has shifted. We no longer live in a world of "true unless proven false," but in one of "false unless declared true at the origin."

Deepfake detection software is unreliable: what the numbers say

The leading deepfake detection tools fail in a systematic way, and the figures leave little room for interpretation. NewsGuard tested five detection tools against a sample of 45 real images, split into three groups: 15 authentic, 15 lightly edited, and 15 heavily manipulated. The most damning result concerns false positives. On average, 13.33% of the authentic images were flagged as AI-generated, and the worst tool reached 40% false positives. Put plainly, between one real photograph in seven and one in eight risks being branded as fake by a detector. This was not a one-off, and it was not down to a single faulty product. The same behavior showed up across every tool tested, which points to a structural limit of the method rather than a calibration error in one piece of software.
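
The percentages are easier to read as counts. The sketch below is only arithmetic on the figures quoted above, and it assumes each rate is computed over the 15 authentic images in the sample; how the report itself aggregates the results is not stated here, and the function name is illustrative.

```python
# Figures quoted from the NewsGuard report; the per-15 basis is an assumption.
AUTHENTIC_IMAGES = 15
AVERAGE_FALSE_POSITIVE_PCT = 13.33
WORST_FALSE_POSITIVE_PCT = 40.0


def flagged_count(rate_percent, total=AUTHENTIC_IMAGES):
    """Convert a false-positive percentage into a number of images."""
    return rate_percent / 100 * total


average_flagged = flagged_count(AVERAGE_FALSE_POSITIVE_PCT)  # close to 2 of 15
worst_flagged = flagged_count(WORST_FALSE_POSITIVE_PCT)      # close to 6 of 15

# "One photograph in N" reading of the average rate.
one_in_n = 100 / AVERAGE_FALSE_POSITIVE_PCT

print(round(average_flagged), round(worst_flagged), round(one_in_n, 1))
```

Read this way, the average tool mislabels about two authentic photographs out of fifteen, and the worst one about six.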

False positives: the authentic declared fake

The most insidious damage from downstream detection is not that it lets a fake slip through. It is that it condemns the real. The NewsGuard sample drew largely on war and conflict contexts, where the authenticity of a single photograph can move public opinion and shape political decisions. One of the tested tools flagged an authentic video as fake with a probability of 96.9%, a near-total confidence that happened to be completely wrong. When a detector "certifies" with that kind of apparent certainty that a real document is artificial, it hands anyone a ready-made alibi to deny the evidence. Footage of a violation becomes, with a single click, "a fake, the software says so." A tool built to expose manipulation ends up covering for it. Once the real can be made to look fake, the burden of doubt lands on genuine material.

False negatives and the absence of consistent criteria

The second problem is a basic inconsistency between systems. NewsGuard found that the five tools disagreed with each other on 35 of the 45 images, which means they landed on the same verdict less than one time in four. The same content, run through different detectors, can come back with opposite labels. There is no shared criterion and no objective threshold. Meanwhile AI images keep getting more realistic and deepfake video keeps getting cleaner, the boundaries blur, and false negatives, meaning manipulations declared authentic, climb in step with the quality of the generators. This is the built-in limit of any deepfake detection: it measures a probability, not a certainty. As Wired Italia and other outlets covering the report pointed out, the issue is not one faulty product. It is the method of after-the-fact detection itself.
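
A small sketch shows how a probability without a shared threshold turns into opposite labels. The detector names, scores, and cut-offs are invented for illustration and do not come from the NewsGuard report; only the 35-of-45 disagreement figure is taken from the text above.

```python
# Hypothetical detectors: each outputs a "probability of being AI-generated"
# for the same image and applies its own cut-off. All values are made up.
detectors = {
    "detector_a": {"score": 0.62, "threshold": 0.50},
    "detector_b": {"score": 0.62, "threshold": 0.70},
    "detector_c": {"score": 0.48, "threshold": 0.40},
}


def verdict(score, threshold):
    """Turn a probability into a label using a detector-specific cut-off."""
    return "ai-generated" if score >= threshold else "authentic"


verdicts = {
    name: verdict(d["score"], d["threshold"]) for name, d in detectors.items()
}
unanimous = len(set(verdicts.values())) == 1

# Agreement rate implied by the report: disagreement on 35 of 45 images.
agreement_rate = (45 - 35) / 45

print(verdicts, unanimous, round(agreement_rate, 3))
```

Detectors A and B produce the same score and still disagree, because the label depends on where each one draws the line. Ten unanimous images out of 45 is about 22%, the "less than one time in four" mentioned above.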

Why downstream detection is a race lost from the start

Detecting deepfakes is set up to lose because it chases a technology that moves faster than it does. Detection works by analyzing content that already exists, hunting for artifacts: odd pixels, shadows that do not line up, movement that looks slightly off. The catch is that every signal a detector learns to spot becomes, for whoever generates the deepfake, a flaw to patch in the next version. The race is unbalanced by design. The defense reacts; the attack is already a move ahead. Each new generative model shows up able to beat the detectors of the previous one, and the gap never closes, it just slides forward. That is why a system built on after-the-fact analysis ages faster than the threat it is supposed to counter.

Detection chases, generation runs faster

Anyone producing a deepfake today can train their model against the most widely used detectors and strip out the giveaways one at a time. Take face swap: it used to be recognizable from blurred edges or reflections in the eyes that did not match, and now it is far cleaner. A study from the University of Edinburgh showed how fragile the fingerprints behind detection really are, since what looks distinctive today vanishes with the generator's next update. That is exactly the picture the NewsGuard report paints, with detectors that err, contradict each other, and lose ground every time the models improve. For anyone who has to defend the authenticity of a piece of evidence, the conclusion is blunt: chasing the fake after the fact does not scale. TrueScreen comes at the problem from the other end, acquiring content with forensic methodology and certifying it with legal value at the moment of capture, so there is nothing to verify after the fact. These limits of deepfake detection are what force the change of logic.

The paradigm has shifted: from "true unless proven false" to "false unless declared true"

For decades we treated a photograph or a video as true until proven otherwise, for the simple reason that faking them took rare skills and rare tools. Generative AI tore up that assumption. Today any piece of content could be synthetic, so the burden moves: the question is no longer whether you can prove something is false, but whether you can prove something is true. This is the digital trust paradigm flipping its sign. The presumption of authenticity that used to back up photos, videos, and documents falls away, and with it the idea that looking at content is enough to trust it. Anyone who wants to be believed now has to bring proof that their material is genuine, built before doubt gets a chance to take hold.

That flip creates a perverse side effect, the one known as the liar's dividend. Once everyone knows deepfakes exist and that even the detectors get it wrong, anyone caught doing something compromising can just deny it: "that is an AI-generated fake." The unreliability of deepfake detection is not a technical footnote here, it is the fuel. The more the detectors contradict each other, the easier it gets to dispute even authentic evidence. The liar's dividend eats away at trust not because fakes are perfect, but because doubt itself has become a weapon. The way out is not to prove after the fact that a piece of content is genuine. It is to declare and certify its truth in the instant it comes into being.

What it means to certify authenticity at the source

TrueScreen, the Data Authenticity Platform, inverts the logic of detection: instead of analyzing content to decide whether it is fake, it certifies its authenticity at the very moment of capture, producing a forensic report with legal value. Certifying at the source means acting at the origin, not after the fact. The content is captured with forensic methodology, reduced to a unique digital fingerprint (hash), and sealed with a qualified timestamp and electronic seal issued by qualified third-party QTSPs, integrated via eIDAS-compliant APIs. From that point on, any alteration is detectable, and authenticity stops depending on a detector's opinion and starts resting on objective, verifiable proof. It is the opposite and complementary approach to detection: you do not ask "is this fake?", you establish "this is authentic, and I can prove it."
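
The "unique digital fingerprint" step can be illustrated with a standard cryptographic hash. This is a minimal sketch, not TrueScreen's implementation: the choice of SHA-256 is an assumption (the text does not name the algorithm), the byte strings are stand-ins for a real file, and the qualified timestamp and seal are left out entirely.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


# Stand-in for the raw bytes of a captured photo.
original = b"raw bytes of a captured photo"
sealed_hash = fingerprint(original)

# Change a single character and recompute the fingerprint.
tampered = original.replace(b"photo", b"phot0")

print(fingerprint(original) == sealed_hash)  # unchanged content still matches
print(fingerprint(tampered) == sealed_hash)  # altered content no longer matches
```

The check does not estimate how likely the content is to be fake. It either reproduces the recorded fingerprint or it does not, which is the sense in which any alteration is detectable.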

Acquisition and certification with legal value in a single act

The strength of the method is that it folds acquisition and certification into one act. Using the TrueScreen app or the Web Portal, the user captures a photo, a video, or a web page, and in the same instant the system locks in its integrity: it computes the hash, applies the qualified timestamp and the electronic seal through an integrated QTSP, and produces a report with probative value. This is not about stamping a seal on pre-existing data whose origin nobody knows. It is about certifying the content from the moment it is born. That is the real difference from trying to certify a photo with legal value only after it has already been created and passed around. The methodology rests on recognized standards: eIDAS for the qualified timestamp and electronic seal through a QTSP, and ISO/IEC 27037 for the handling of digital evidence.
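
What "one act" looks like can be sketched as a single function that hashes the content and records the capture time together. Everything here is hypothetical: the function names, the record fields, and the use of SHA-256 are assumptions, and the local clock is only a stand-in for the qualified timestamp and electronic seal that a QTSP would issue.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest_of(fields: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record fields."""
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def capture_and_fingerprint(content: bytes, media_type: str) -> dict:
    """Hash the content and record the capture time in one step."""
    record = {
        "media_type": media_type,
        "sha256": hashlib.sha256(content).hexdigest(),
        # Local clock only: NOT a qualified timestamp.
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
    # Digest of the whole record: the value a third party would timestamp and seal.
    record["record_digest"] = _digest_of(record)
    return record


def matches(record: dict, content: bytes) -> bool:
    """Check both the content and the record's own fields against the digests."""
    fields = {k: v for k, v in record.items() if k != "record_digest"}
    return (
        hashlib.sha256(content).hexdigest() == record["sha256"]
        and _digest_of(fields) == record["record_digest"]
    )


photo = b"raw bytes of a captured photo"  # stand-in for a real capture
record = capture_and_fingerprint(photo, "image/jpeg")
print(matches(record, photo))
```

On its own this record proves little, because anyone who alters the file could recompute both digests. That gap is the reason the timestamp and seal come from a qualified third party rather than from the device that made the capture.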

Immutability at the source instead of after-the-fact analysis

The practical difference between the two paradigms is sharp, and a table makes it immediate.

| Aspect | Downstream detection | Certification at the source |
| --- | --- | --- |
| When it acts | After creation and distribution | At the instant of capture |
| Question it answers | "Is this content fake?" | "Is this content authentic and provable?" |
| Basis of the verdict | Statistical analysis, probability of error | Hash, qualified timestamp, QTSP seal |
| Reliability | Variable (up to 40% false positives) | Verifiable and objective |
| Keeps pace with generators? | No, it chases every new model | Yes, independent of the faking technique |
| Value in court | Weak and contestable | Proof with legal value |

A detector keeps failing more as generative models improve. A proof certified at the source stays valid no matter how sophisticated deepfakes get. You do not reach immutability by hunting for flaws in the fake. You reach it by fixing the truth of the authentic at the exact point where it enters the digital world. It is the same logic as digital provenance: knowing where a piece of content comes from, and being able to prove it.

What changes for those who produce and publish information

For newsrooms, public bodies, and businesses, certifying at the source means producing digital evidence that is hard to contest, starting from the origin, which removes at the root the risk of seeing your own work waved away as "probably fake." The change is concrete, not theoretical. A journalist who records an event with the TrueScreen app walks away with a file whose integrity and capture time can be verified independently, which makes its authenticity far harder to challenge in court. The liar's dividend loses its grip, because the proof is already certified the moment it exists and leaves little opening for anyone wanting to claim it was generated by AI.

The same holds anywhere authenticity carries direct legal consequences. Deepfakes can already amount to a crime in several jurisdictions: in Italy, for example, Law 132/2025 introduced specific offenses for AI-generated content, on top of established charges like defamation, identity theft, and unlawful data processing. At the European level, the AI Act puts transparency and labelling obligations on synthetic content. In that climate, anyone who documents a fact, handles a claim, or publishes an investigation can no longer lean on the good faith of whoever happens to be watching. The same defense matters against deepfakes used in corporate fraud, where one fake video of the CEO can be enough to wave through a wire transfer. Certifying at the source turns each acquisition into a defensible act: the truth no longer has to be chased afterward, it gets fixed and sealed while it happens.

FAQ: deepfake detection and certification at the source

Is deepfake detection software reliable?
No. According to the NewsGuard report of May 11, 2026, the five deepfake detection tools tested flagged on average 13.33% of authentic images as AI-generated, with the worst tool reaching 40%, and they contradicted each other on 35 of 45 images. Downstream detection fails systematically because it analyzes content after the fact, chasing generative models that improve faster than the detectors do.

Are there deepfakes impossible to detect?
Yes. There are already deepfake videos and AI images that no detection tool identifies reliably, because the people producing them can train their models against the detectors themselves and remove the giveaways. This is why spotting a deepfake by eye or with software is increasingly impractical. The defense that holds is not detection, but certification of authenticity at the source: instead of looking for the fake, you prove the real.

What is the best way to fight deepfakes?
The most effective approach is not to detect fakes, but to certify the authenticity of genuine content at the moment it is created. While downstream detection remains a lost chase, certifying at the source fixes the integrity of a photo or a video with a hash, a qualified timestamp, and an electronic seal, making the proof verifiable and giving it legal value regardless of how sophisticated deepfakes become.

What does it mean to certify content at the source?
Certifying content at the source means acquiring it with forensic methodology and fixing its integrity in the same instant of capture, before it can be altered or distributed. The content is reduced to a unique hash and sealed with a qualified timestamp and an electronic seal through qualified QTSPs. Unlike detection, which analyzes content that already exists, certification at the source declares its authenticity at the origin, in an immutable way.

Does certification at the source have legal value?
Yes. Certification at the source produces a report with probative value grounded in recognized standards: a qualified timestamp and electronic seal issued by qualified QTSPs under the eIDAS regulation, and the handling of digital evidence in line with ISO/IEC 27037. Actual admissibility always depends on the jurisdiction and on the context of the case.

Are deepfakes a crime?
They can be. Many jurisdictions are introducing specific rules for AI-generated content: in Italy, for instance, Law 132/2025 added dedicated offenses, which sit alongside established charges such as defamation, identity theft, and unlawful processing of personal data, while the EU AI Act imposes transparency and labelling duties. The exact qualification depends on the actual use of the deepfake and the harm produced. Having proof certified at the source strengthens the position of anyone who suffers or has to document these offenses.

Certify your content at the source, with legal value

Stop chasing the fake. With TrueScreen you capture photos, video and web pages and certify their authenticity at the moment of capture, with a report that carries legal value.
