Deepfake detection: why detection software fails and the real defense is certifying at the source
AI images stopped being a laboratory curiosity a while ago. They show up in feeds, in chats, on the front pages of newspapers, and a lot of the time you cannot tell them apart from a real photograph. The obvious fix seemed simple: hand the problem to software that, given a piece of content, tells us whether it is real or fake. Then came an independent NewsGuard report, published on May 11, 2026, with an uncomfortable finding. The leading deepfake detection tools get it wrong, and they do it often. They call authentic content manipulated, and manipulated content authentic. So if we cannot even trust the detectors, how are we supposed to tell the real from the fake?
The short answer is that we are fighting the wrong battle. Downstream detection, meaning the analysis of content that already exists in order to judge whether it is genuine, is a race lost from the start. The defense that actually holds runs the other way: certify the authenticity of every piece of information at the source, at the moment it is captured, with legal value. The paradigm has shifted. We no longer live in a world of "true unless proven false," but in one of "false unless declared true at the origin."
Deepfake detection software is unreliable: what the numbers say
The leading deepfake detection tools fail in a systematic way, and the figures leave little room for interpretation. NewsGuard tested five detection tools against a sample of 45 real images, split into three groups: 15 authentic, 15 lightly edited, and 15 heavily manipulated. The most damning result concerns false positives. On average, 13.33% of the authentic images were flagged as AI-generated, and the worst tool reached 40% false positives. Put plainly, roughly two real photographs in every fifteen risk being branded as fake by a detector. This was not a one-off, and it was not down to a single faulty product. The same behavior showed up across every tool tested, which points to a structural limit of the method rather than a calibration error in one piece of software.
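The percentages above are easier to picture as image counts. The sketch below is a minimal back-of-the-envelope calculation, assuming the rates are computed over the 15 authentic images only (the figures quoted here do not spell out the denominator). The function names are illustrative and not part of any tool.

```python
AUTHENTIC_IMAGES = 15  # size of the authentic group in the sample


def flagged_count(rate_percent: float, total: int = AUTHENTIC_IMAGES) -> float:
    """Convert a false-positive rate (in percent) into a number of images."""
    return rate_percent / 100 * total


def one_in_n(rate_percent: float) -> float:
    """Express a rate as 'one image in N'."""
    return 100 / rate_percent


average_flagged = flagged_count(13.33)  # close to 2 of the 15 authentic images
worst_flagged = flagged_count(40)       # close to 6 of the 15 authentic images
average_odds = one_in_n(13.33)          # about one image in 7.5

print(f"average tool: ~{round(average_flagged)} of {AUTHENTIC_IMAGES} authentic images flagged")
print(f"worst tool:   ~{round(worst_flagged)} of {AUTHENTIC_IMAGES} authentic images flagged")
print(f"average rate: about one in {average_odds:.1f}")
```

Under that assumption, the average corresponds to about two authentic images wrongly flagged per tool, and the worst case to about six.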
False positives: the authentic declared fake
The most insidious damage from downstream detection is not that it lets a fake slip through. It is that it condemns the real. The NewsGuard sample drew largely on war and conflict contexts, where the authenticity of a single photograph can move public opinion and shape political decisions. One of the tested tools flagged an authentic video as fake with a probability of 96.9%, a near-total confidence that happened to be completely wrong. When a detector "certifies" with that kind of apparent certainty that a real document is artificial, it hands anyone a ready-made alibi to deny the evidence. Footage of a violation becomes, with a single click, "a fake, the software says so." A tool built to expose manipulation ends up covering for it. It is the same fragility you see when the real looks fake and the burden of doubt lands on genuine material.
False negatives and the absence of consistent criteria
The second problem is a basic inconsistency between systems. NewsGuard found that the five tools disagreed with each other on 35 of the 45 images, which means they landed on the same verdict less than one time in four. The same content, run through different detectors, can come back with opposite labels. There is no shared criterion and no objective threshold. Meanwhile AI images keep getting more realistic and deepfake video keeps getting cleaner. The boundaries blur, and false negatives (manipulations declared authentic) climb in step with the quality of the generators. This is the built-in limit of any deepfake detector: it outputs a probability, not a certainty. As Wired Italia and other outlets covering the report pointed out, the issue is not one faulty product. It is the method of after-the-fact detection itself.
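The "less than one time in four" figure follows directly from the counts: 10 unanimous images out of 45. The sketch below reproduces that arithmetic and shows one simple way to count unanimous verdicts. The `toy_verdicts` data is invented purely for illustration and is not taken from the report, and the `unanimous` helper is a hypothetical name.

```python
TOTAL_IMAGES = 45
DISAGREED = 35

# Share of images on which all five tools returned the same verdict.
agreement_rate = (TOTAL_IMAGES - DISAGREED) / TOTAL_IMAGES


def unanimous(verdicts_per_image) -> int:
    """Count images where every tool returned the same label."""
    return sum(1 for verdicts in verdicts_per_image if len(set(verdicts)) == 1)


# Hypothetical verdicts from five tools on three images (illustration only).
toy_verdicts = [
    ("real", "real", "real", "real", "real"),  # all tools agree
    ("real", "fake", "real", "fake", "real"),  # split verdict
    ("fake", "fake", "fake", "fake", "real"),  # one dissenting tool
]

print(f"agreement rate in the report: {agreement_rate:.1%}")
print(f"unanimous images in the toy data: {unanimous(toy_verdicts)} of {len(toy_verdicts)}")
```

Note that a single dissenting tool is enough to break unanimity, which is why the agreement rate drops quickly as more detectors are compared.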
Why downstream detection is a race lost from the start
Detecting deepfakes is set up to lose because it chases a technology that moves faster than it does. Detection works by analyzing content that already exists, hunting for artifacts: odd pixels, shadows that do not line up, movement that looks slightly off. The catch is that every signal a detector learns to spot becomes, for whoever generates the deepfake, a flaw to patch in the next version. The race is unbalanced by design. The defense reacts; the attack is already a move ahead. Each new generative model shows up able to beat the detectors of the previous one, and the gap never closes, it just slides forward. That is why a system built on after-the-fact analysis ages faster than the threat it is supposed to counter.
Detection chases, generation runs faster
Anyone producing a deepfake today can train their model against the most widely used detectors and strip out the giveaways one at a time. Take face swap: it used to be recognizable from blurred edges or reflections in the eyes that did not match, and now it is far cleaner. A study from the University of Edinburgh showed how fragile the fingerprints behind detection really are, since what looks distinctive today vanishes with the generator's next update. That is exactly the picture the NewsGuard report paints, with detectors that err, contradict each other, and lose ground every time the models improve. For anyone who has to defend the authenticity of a piece of evidence, the conclusion is blunt: chasing the fake after the fact does not scale. TrueScreen comes at the problem from the other end, acquiring content with forensic methodology and certifying it with legal value at the moment of capture, so authenticity does not have to be guessed after the fact. These limits of deepfake detection are what force the change of logic.
The paradigm has shifted: from "true unless proven false" to "false unless declared true"
For decades we treated a photograph or a video as true until proven otherwise, for the simple reason that faking them took rare skills and rare tools. Generative AI tore up that assumption. Today any piece of content could be synthetic, so the burden moves: the question is no longer whether you can prove something is false, but whether you can prove something is true. This is the digital trust paradigm flipping its sign. The presumption of authenticity that used to back up photos, videos, and documents falls away, and with it the idea that looking at content is enough to trust it. Anyone who wants to be believed now has to bring proof that their material is genuine, built before doubt gets a chance to take hold.
That flip creates a perverse side effect, the one known as the liar's dividend. Once everyone knows deepfakes exist and that even the detectors get it wrong, anyone caught doing something compromising can just deny it: "that is an AI-generated fake." The unreliability of deepfake detection is not a technical footnote here, it is the fuel. The more the detectors contradict each other, the easier it gets to dispute even authentic evidence. The liar's dividend eats away at trust not because fakes are perfect, but because doubt itself has become a weapon. The way out is not to prove after the fact that a piece of content is genuine. It is to declare and certify its truth in the instant it comes into being.
What it means to certify authenticity at the source
TrueScreen, the Data Authenticity Platform, inverts the logic of detection: instead of analyzing content to decide whether it is fake, it certifies its authenticity at the very moment of capture, producing a forensic report with legal value. Certifying at the source means acting at the origin, not after the fact. The content is captured with forensic methodology, reduced to a unique digital fingerprint (hash), and sealed with a qualified timestamp and electronic seal issued by third-party Qualified Trust Service Providers (QTSPs), integrated via eIDAS-compliant APIs. From that point on, any alteration is detectable, and authenticity stops depending on a detector's opinion and starts resting on objective, verifiable proof. It is the opposite and complementary approach to detection: you do not ask "is this fake?", you establish "this is authentic, and I can prove it."
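Why a fingerprint makes any alteration detectable can be shown in a few lines. This is a minimal sketch, not TrueScreen's implementation: it assumes SHA-256 as the hash function (the text only says "hash"), and the byte string stands in for a real captured file.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


# Placeholder for the raw bytes of a captured photo.
original = b"raw bytes of a captured photo"
digest_at_capture = fingerprint(original)

# Changing a single byte yields a completely different digest.
tampered = original.replace(b"photo", b"phot0")

print("original matches:", fingerprint(original) == digest_at_capture)
print("tampered matches:", fingerprint(tampered) == digest_at_capture)
```

The check is a plain equality test with no probability attached, which is the contrast with a detector's score. The hash alone proves only that the bytes are unchanged; the timestamp and seal are what tie it to a moment and a trusted issuer.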
Acquisition and certification with legal value in a single act
The strength of the method is that it folds acquisition and certification into one act. Using the TrueScreen app or the Web Portal, the user captures a photo, a video, or a web page, and in the same instant the system locks in its integrity: it computes the hash, applies the qualified timestamp and the electronic seal through an integrated QTSP, and produces a report with probative value. This is not about stamping a seal on pre-existing data whose origin nobody knows. It is about certifying the content from the moment it is born. That is the real difference from trying to certify a photo with legal value only after it has already been created and passed around. The methodology rests on recognized standards: eIDAS for the qualified timestamp and electronic seal through a QTSP, and ISO/IEC 27037 for the handling of digital evidence.
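The idea of folding acquisition and certification into one act can be sketched as a single function that hashes the content, records the time, and seals both together. Everything below is a simplified stand-in and an assumption for illustration: a real qualified timestamp and electronic seal come from a QTSP using certificate-backed signatures, whereas this sketch uses the local clock and an HMAC with a demo key. The names `certify_at_capture`, `verify`, and `DEMO_KEY` are hypothetical.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Demo-only secret. A real seal is a QTSP signature, not a shared key.
DEMO_KEY = b"demo-only-key"


def certify_at_capture(content: bytes, key: bytes = DEMO_KEY) -> dict:
    """Hash the content, record the capture time, and seal both in one step."""
    record = {
        "sha256": hashlib.sha256(content).hexdigest(),
        # Local clock as a stand-in for a qualified timestamp.
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["seal"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify(content: bytes, record: dict, key: bytes = DEMO_KEY) -> bool:
    """Check that neither the content nor the sealed metadata has changed."""
    unsigned = {k: v for k, v in record.items() if k != "seal"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    seal_ok = hmac.compare_digest(expected, record["seal"])
    hash_ok = hashlib.sha256(content).hexdigest() == record["sha256"]
    return seal_ok and hash_ok


photo = b"raw bytes of a captured photo"
record = certify_at_capture(photo)
print("untouched content verifies:", verify(photo, record))
print("edited content verifies:   ", verify(photo + b"!", record))
```

Because the seal covers the hash and the timestamp together, editing the content or backdating the record both cause verification to fail.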
Immutability at the source instead of after-the-fact analysis
The practical difference between the two paradigms is sharp, and a table makes it immediate.
| Aspect | Downstream detection | Certification at the source |
|---|---|---|
| When it acts | After creation and distribution | At the instant of capture |
| Question it answers | "Is this content fake?" | "Is this content authentic and provable?" |
| Basis of the verdict | Statistical analysis, probability of error | Hash, qualified timestamp, QTSP seal |
| Reliability | Variable (up to 40% false positives) | Verifiable and objective |
| Keeps pace with generators? | No, it chases every new model | Yes, independent of the faking technique |
| Value in court | Weak and contestable | Proof with legal value |
A detector fails more often as generative models improve. A proof certified at the source stays valid no matter how sophisticated deepfakes get. You do not reach immutability by hunting for flaws in the fake. You reach it by fixing the truth of the authentic at the exact point where it enters the digital world. It is the same logic as digital provenance: knowing where a piece of content comes from, and being able to prove it.
What changes for those who produce and publish information
For newsrooms, public bodies, and businesses, certifying at the source means producing digital evidence that cannot be contested, starting from the origin, which removes at the root the risk of seeing your own work waved away as "probably fake." The change is concrete, not theoretical. A journalist who records an event with the TrueScreen app walks away with a file whose authenticity cannot be challenged in court. The liar's dividend disappears, because the proof is already certified the moment it exists and leaves no opening for anyone wanting to claim it was generated by AI.
The same holds anywhere authenticity carries direct legal consequences. Deepfakes can already amount to a crime in several jurisdictions: in Italy, for example, Law 132/2025 introduced specific offenses for AI-generated content, on top of established charges like defamation, identity theft, and unlawful data processing. At the European level, the AI Act puts transparency and labelling obligations on synthetic content. In that climate, anyone who documents a fact, handles a claim, or publishes an investigation can no longer lean on the good faith of whoever happens to be watching. The same defense matters against deepfakes used in corporate fraud, where one fake video of the CEO can be enough to wave through a wire transfer. Certifying at the source turns each acquisition into a defensible act: the truth no longer has to be chased afterward, it gets fixed and sealed while it happens.

