Deepfake Detection vs Source Certification: Why Spotting Fakes Is Not Enough

Legal teams, insurance adjusters, newsrooms and corporate security functions make decisions every day based on photos, videos and screenshots. Since generative AI made synthetic content indistinguishable from the real thing at a glance, the market's first reflex has been deepfake detection: software that analyzes a piece of content and returns the probability that it is fake. The idea is reassuring. The numbers are not.

Deepfake-Eval-2024, the first benchmark built on deepfakes that actually circulated online, measured a drop in detector AUC of up to 50% compared with the academic test sets vendors usually quote. And even when a detector gets it right, its output is still a probability score: useful as a lead, fragile as evidence.

For anyone who must defend content in court, settle a claim or publish an investigation, the right question is not "how do I spot the fakes?" but "how do I guarantee the originals?". The structural answer is source certification: capturing content in a controlled environment, computing a cryptographic hash the moment the data is created, binding it to a qualified electronic seal and a qualified timestamp, and documenting the entire process in a forensic report with a verifiable chain of custody. This article compares the two approaches, with the published numbers on the table.

What is deepfake detection and how does it work?

Deepfake detection is the family of techniques that analyze an existing piece of digital content to estimate the probability that it was generated or manipulated by artificial intelligence. Neural networks trained on large datasets of authentic and synthetic material learn to recognize the typical artifacts of generation: lighting inconsistencies, frequency anomalies in the image, unnatural facial micro-movements, blending boundaries around a swapped face.

The output of that analysis is never a verdict. It is a confidence score: "this video is synthetic with 87% probability". Keep that sentence in mind, because it is the heart of the legal problem.

Synthetic media also goes well beyond face swaps: cloned voices, images generated from scratch, fabricated documents and chat conversations. Each content family requires different detectors, trained on different data, with different error rates. A tool that performs well on face-swap videos tells you nothing about voice clones or AI-generated invoices.

Alongside detection sits a second family of approaches built on content provenance. The C2PA standard and the Content Credentials promoted by the Content Authenticity Initiative do not analyze pixels: they attach cryptographically signed credentials that record where a file comes from and how it was edited. The idea is close to digital provenance, and it already stands on firmer ground than detection because it moves the problem from recognizing fakes to documenting origins. Source certification, covered later in this article, takes that same logic all the way to evidentiary value.

How accurate is deepfake detection, really?

Not accurate enough for the content that matters. Detection systems post excellent scores on academic benchmarks, then lose much of that accuracy on the deepfakes that actually circulate: produced with the newest generators, compressed by social platforms, cropped and re-shared.

According to Deepfake-Eval-2024, the first benchmark assembled from deepfakes that circulated online in 2024, the AUC of state-of-the-art open-source detectors drops by 50% on video, 48% on audio and 45% on images compared with earlier academic benchmarks. Its authors, Chandra and colleagues, tested the systems on material collected from social networks and news sites, which is exactly what a professional receives in practice. The best commercial video detector reached roughly 78% accuracy. At that rate, more than one piece of content in five is misclassified, in contexts where a single error means paying a fraudulent claim or discarding genuine evidence. Laboratory accuracies above 95% describe a world that no longer exists: the deepfakes in older benchmarks belong to generations of technology already superseded by what circulates today. Evaluating a detector on its lab numbers means measuring a scenario the buyer will never encounter.
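The "one in five" figure follows directly from the accuracy number. A minimal sketch of the arithmetic, assuming the roughly 78% accuracy applies uniformly across a batch; the batch size of 1,000 items and the function name are hypothetical, chosen only for illustration:

```python
def expected_errors(accuracy: float, n_items: int) -> int:
    """Expected number of misclassified items at a given accuracy."""
    return round((1.0 - accuracy) * n_items)


best_video_accuracy = 0.78  # figure cited above for the best commercial video detector
batch_size = 1000           # hypothetical workload, not from the benchmark

errors = expected_errors(best_video_accuracy, batch_size)
error_rate = errors / batch_size

print(f"{errors} of {batch_size} items misclassified ({error_rate:.0%})")
print("more than one in five:", error_rate > 1 / 5)
```

The sketch does not distinguish between a fake passed as real and a genuine item flagged as fake; in a claims or litigation setting the two errors carry different costs.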

Lab benchmarks vs the real world

The gap between laboratory and deployment is the norm in this sector, not an anomaly. An industry analysis published by Brightside AI estimates that commercial deepfake detection tools lose between 45% and 50% of their accuracy when moving from the lab to real-world use. The causes are well known: social media compression, low resolutions, uncontrolled lighting, and content produced by generators the model never saw during training.

The human eye is no fallback either. The systematic review by Diel and colleagues (2024), available on ScienceDirect, measured human accuracy at 68.46% on authentic media and 53.16% on deepfakes. On the fakes, people perform barely above a coin flip.

Why detectors don't generalize to new generators

A detector learns the artifacts of the generators it was trained on, not "fakeness" in the abstract. Cross-dataset generalization studies show this clearly: models that reach an AUC of 0.98 on FaceForensics++ fall to around 0.65 when tested on Celeb-DF, a dataset built with different techniques (see for example arXiv:2204.04285).
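To read those AUC figures: AUC is the probability that a randomly chosen fake receives a higher score than a randomly chosen authentic item, so 1.0 is perfect separation and 0.5 is a coin flip. The sketch below computes it by direct pairwise comparison. The scores are invented toy values, not data from FaceForensics++ or Celeb-DF, and the function name is hypothetical.

```python
def auc(scores_fake, scores_real):
    """Fraction of (fake, real) pairs where the fake scores higher; ties count half."""
    wins = 0.0
    for f in scores_fake:
        for r in scores_real:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(scores_fake) * len(scores_real))


# Toy scores: generators seen in training, fakes clearly separated from reals.
seen_fakes = [0.90, 0.80, 0.95, 0.70]
reals = [0.10, 0.30, 0.40, 0.50]

# Toy scores: an unseen generator, fake scores now overlap the real ones.
unseen_fakes = [0.45, 0.20, 0.60, 0.35]

print("AUC on seen generators:  ", auc(seen_fakes, reals))
print("AUC on unseen generators:", auc(unseen_fakes, reals))
```

The detector and the authentic items are identical in both runs; only the fakes changed. That is the shape of the cross-dataset result described above.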

For anyone buying a detection system, this means something precise: the accuracy declared today says nothing about the accuracy six months from now, when the content to analyze will come from generative models that do not yet exist.

The AI-vs-AI arms race detection cannot win

Unlike source certification, deepfake detection competes on the same technological ground as the thing it tries to expose, and it starts at a structural disadvantage. Every published detector becomes study material for the next generator: generative models are trained specifically to defeat recognition systems. Adversarial research has quantified this fragility. A study presented at CVPR 2023 showed that perturbations imperceptible to the human eye can fool state-of-the-art detectors with success rates reaching 100%, and that these adversarial examples transfer from one detector to another, so evading one deepfake detection system often means evading several at once. Anyone with an interest in passing off a fake has documented techniques for doing so. A verification system the adversary can defeat by design is not a verification system: it is a temporary obstacle, due to fall as soon as the adversary updates their tools.
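The mechanism behind such attacks can be illustrated on a toy model. The sketch below applies the fast gradient sign method (FGSM) to a four-feature logistic "detector". It is not the method of the CVPR 2023 study, and the weights, bias and input are hypothetical values chosen only to show the principle: a small bounded shift per feature, aimed in the right direction, moves the score across the decision threshold.

```python
import math


def detector_score(x, weights, bias):
    """Toy detector: logistic score in [0, 1]; higher means 'more likely synthetic'."""
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_evade(x, weights, epsilon):
    """Shift each feature by at most epsilon in the direction that lowers the score.

    For a logistic model the gradient of the score with respect to x has the
    same sign as the weights, so no automatic differentiation is needed here.
    """
    return [xi - epsilon * sign(w) for xi, w in zip(x, weights)]


# Hypothetical weights and input, for illustration only.
weights = [2.0, -1.5, 1.0, 0.5]
bias = -0.5
fake_sample = [0.6, 0.2, 0.5, 0.4]

before = detector_score(fake_sample, weights, bias)
adversarial = fgsm_evade(fake_sample, weights, epsilon=0.25)
after = detector_score(adversarial, weights, bias)

print(f"score before: {before:.3f}, after: {after:.3f}")
```

With only four features the shift has to be large to flip the result. In an image with hundreds of thousands of pixel values, far smaller per-pixel shifts can add up to the same effect, which is why such perturbations can stay below what the eye notices.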

There is also a side effect that hits even when the fake gets caught. Chesney and Citron named it the "liar's dividend" in their California Law Review essay: as the public learns that anything can be faked, denying authentic content becomes easier too. "It's a deepfake" turns into a universal defense. We dedicated a separate analysis to how the liar's dividend erodes digital trust; the short version is that detection alone does not solve this problem. It feeds it, because every public error a detector makes lends credibility to the next denial.

What deepfake detection cannot see

A cheapfake is a manipulation produced with conventional tools and no generative model at all: a video slowed down to make a speaker sound impaired, a crop that removes context, an authentic clip attributed to a false place or date. To a deepfake detector, a cheapfake is invisible by definition: there are no synthetic artifacts to find, because every single frame is genuine. Humans identify deepfakes only 53.16% of the time (Diel et al. 2024), but with cheapfakes the problem is not even perceptual. It is documentary. The question is not "were these pixels generated?" but "is this content what it claims to be?". No analysis of the content itself can answer that, because the answer is not in the pixels: it lives in the history of the file, in who produced it, where, and at what moment.

And synthetic generation is only one of the ways content gets manipulated. Content authenticity plays out across several more dimensions, all of them invisible to a detector:

  • Cheapfakes and shallowfakes: cuts, slowdowns, recontextualization. No artifacts to detect.
  • GPS spoofing: free apps simulate false coordinates; the photo is real, the location is not.
  • EXIF metadata editing: date, time and capture device can be rewritten in seconds with freely available software.
  • Targeted manual edits: a retouched figure on a photographed document, a substituted name in a chat conversation.
  • Screen recapture: photographing or filming manipulated content produces a file that is "original" in every technical respect, because a real camera generated it.

Each of these classes passes a deepfake detection check untouched. A verification program built only on detecting synthetic content leaves uncovered precisely the manipulations that are cheapest to produce, which are also the most frequent.

Probability is not proof: why a confidence score fails in court

A confidence score is a statistical estimate, not digital evidence. "Synthetic with 87% probability" says nothing about who captured the content, when, on which device, or whether the file changed after acquisition. In US proceedings, authentication under Federal Rule of Evidence 901 (FRE 901) requires the proponent of an exhibit to produce evidence sufficient to support a finding that the item is what the proponent claims it is. A probability emitted by a deepfake detection model with documented error rates, demonstrated adversarial vulnerabilities and no chain of custody over the file it analyzed hands opposing counsel its own rebuttal: they will cite the accuracy literature, the evasion studies, and the absence of any record of the file's history. Deepfake-Eval-2024, for instance, puts the best commercial video detector at roughly 78% accuracy, a figure any expert witness can quote back. The score may be high; the foundation under it is not.

The eIDAS Regulation, Regulation (EU) No 910/2014, shows how different the position of certified content is. Article 35 grants the qualified electronic seal a presumption of integrity of the data and correctness of the origin; Article 41 grants the qualified timestamp a presumption of accuracy of date and time. Presumption means the burden of proof is reversed: you do not have to demonstrate that the content is intact, the contesting party has to demonstrate that it is not. A probabilistic estimate plays in a different league entirely. However sophisticated, it remains a technical opinion the court can weigh as it sees fit.

Timing matters too. Detection intervenes when the content is already in circulation, often after multiple rounds of compression and re-sharing that degrade exactly the signals detectors rely on. Certification intervenes before the problem exists.

Source certification: a preventive approach to content authenticity

Source certification is the opposite paradigm to detection: instead of hunting for fakes after the fact, it guarantees authentic content from the start. The content is captured in a controlled, verified environment; a cryptographic hash is computed at the very moment the data is created; the hash is bound to a qualified electronic seal and a qualified timestamp under eIDAS; the whole process is documented in a forensic report that reconstructs the chain of custody. TrueScreen, the Data Authenticity Platform, applies this process at the moment of capture. The conceptual difference from deepfake detection is sharp: nothing here is estimated. Data integrity stops being a model's opinion and becomes a mathematical property, verifiable by anyone, today and in ten years, whatever generative models exist by then. Certification becomes a property of the capture process itself, not a judgment applied afterwards.

Certified acquisition and device integrity checks

Certified acquisition is the first link in the chain: the content is not uploaded after the fact, it is created inside a controlled process. The TrueScreen app and web portal capture photos, videos, audio and screen recordings while running integrity checks on the device, so that anomalous conditions in the capture environment are detected and documented. It is the difference between receiving a file of unknown origin and generating data that already carries its own guarantees. For online content, the Forensic Browser is built for forensic web capture of pages and dynamic content, while the Chrome extension lets you certify a screenshot directly while browsing.

Cryptographic hashing and immutability at the source

At acquisition, the platform computes the content's cryptographic fingerprint: a hash that changes if even one bit of the file changes. This is the same principle behind the forensic copy in digital forensics: crystallize the data at a precise instant, so that any later manipulation becomes provable by comparison. With one decisive difference in timing: a traditional forensic copy works on data that has already existed for some time, while source certification freezes the data at the instant of its birth. There is no window in which the content existed without protection.
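The "one bit changes everything" property is easy to check. A minimal sketch using SHA-256 from Python's standard library; the article does not say which hash algorithm the platform uses, so SHA-256 is an assumption here, and the byte string stands in for a captured file:

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the bytes of a freshly captured photo or video (hypothetical).
original = b"bytes of the content at the instant of capture"

# Simulate the smallest possible tampering: flip a single bit of the first byte.
tampered = bytearray(original)
tampered[0] ^= 0x01
tampered = bytes(tampered)

hash_at_capture = fingerprint(original)
hash_after_edit = fingerprint(tampered)

print("at capture:", hash_at_capture)
print("after edit:", hash_after_edit)
print("still intact:", hash_at_capture == hash_after_edit)
```

Verification later is the same operation in reverse: recompute the hash of the file in hand and compare it with the one recorded at capture. No model, threshold or training data is involved.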

Qualified electronic seal and qualified timestamp (eIDAS)

The hash alone proves integrity, not date or origin. That is why it is bound to a qualified electronic seal and a qualified timestamp, which give the data an opposable date and the legal presumptions of eIDAS Articles 35 and 41. One clarification on roles matters here: TrueScreen is not a QTSP and does not issue qualified certificates. The qualified electronic seal and the qualified timestamp are applied by third-party qualified trust service providers, which TrueScreen integrates into the certification process via API. The platform's value lies in the complete forensic methodology: controlled acquisition, integrity verification and certification are stages of a single process, not a seal applied after the fact to an arbitrary file.
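Why binding matters can be shown with a local stand-in. In the sketch below, an HMAC computed with a demo key plays the role of the seal, purely to show the data flow: the hash and the time are signed together, so neither can be changed without invalidating the record. This is an illustration only. A real qualified seal and qualified timestamp are issued by a QTSP using certificate-based signatures, carry legal effect that an HMAC does not, and the record layout and names here are hypothetical.

```python
import hashlib
import hmac
import json


def _payload(digest: str, timestamp_utc: str) -> bytes:
    """Serialize hash and time together so they are protected as one unit."""
    return json.dumps({"sha256": digest, "timestamp_utc": timestamp_utc},
                      sort_keys=True).encode("utf-8")


def build_record(content: bytes, timestamp_utc: str, demo_key: bytes) -> dict:
    """Bind the content hash to a time value with a stand-in 'seal' (HMAC)."""
    digest = hashlib.sha256(content).hexdigest()
    tag = hmac.new(demo_key, _payload(digest, timestamp_utc), hashlib.sha256).hexdigest()
    return {"sha256": digest, "timestamp_utc": timestamp_utc, "tag": tag}


def verify_record(content: bytes, record: dict, demo_key: bytes) -> bool:
    """Recompute the hash from the content and check it against the sealed record."""
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(demo_key, _payload(digest, record["timestamp_utc"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


demo_key = b"demo-key-not-a-qualified-seal"   # hypothetical, illustration only
content = b"captured content bytes"           # hypothetical
record = build_record(content, "2025-01-15T10:30:00Z", demo_key)

backdated = dict(record, timestamp_utc="2024-06-01T00:00:00Z")

print("original verifies: ", verify_record(content, record, demo_key))
print("edited content:    ", verify_record(content + b"!", record, demo_key))
print("backdated record:  ", verify_record(content, backdated, demo_key))
```

In the real process the trust comes from the third party: the time and the seal are applied by a provider that neither side of a dispute controls.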

Forensic reporting and chain of custody

Every certification produces a forensic report documenting what was acquired, when, from which device, and with what integrity-check results. This is the chain of custody that digital evidence preservation standards such as ISO/IEC 27037 call for, and that no after-the-fact analysis can reconstruct: the documented path of the data from creation to courtroom. For counsel or a technical expert, it replaces "an algorithm says so" with a file that can be verified point by point.
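One common way to make a custody record tamper-evident is to chain its entries by hash, so that each entry commits to everything before it. The sketch below shows the idea in a few lines. It is not TrueScreen's report format and not a requirement of ISO/IEC 27037; the field names and events are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the entry that precedes it."""
    body = json.dumps({"prev_hash": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event, linking it to the previous entry by hash."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    log.append({"prev_hash": prev_hash, "event": event,
                "entry_hash": _entry_hash(prev_hash, event)})


def verify_log(log: list) -> bool:
    """Check that every entry is unmodified and correctly linked to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if _entry_hash(entry["prev_hash"], entry["event"]) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


custody_log = []
append_event(custody_log, {"step": "capture", "device": "phone-A", "time": "10:30:00Z"})
append_event(custody_log, {"step": "integrity-check", "result": "passed"})
append_event(custody_log, {"step": "handover", "to": "counsel"})

print("log intact:", verify_log(custody_log))

custody_log[0]["event"]["device"] = "phone-B"  # rewrite history after the fact
print("after edit:", verify_log(custody_log))
```

Editing any earlier entry breaks every check from that point on, which is what lets a reviewer verify the record point by point instead of taking it on trust.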

Detection vs certification: a side-by-side comparison

| Criterion | Deepfake detection | Source certification |
| --- | --- | --- |
| Approach | Reactive: analyzes content that already exists | Preventive: guarantees the data at creation |
| Output | Probability score | Certified evidence with hash, seal and timestamp |
| Durability over time | Degrades with every new generator (AUC down by up to 50%) | Independent of generative AI progress |
| Cheapfakes, GPS spoofing, EXIF edits | Not detected | Irrelevant: the data is born certified |
| Adversarial attacks | Vulnerable (success rates up to 100%, CVPR 2023) | Not applicable: integrity is verifiable via hash |
| Standing as legal evidence | Technical opinion, freely contestable | Presumptions of integrity and accurate time (eIDAS art. 35 and 41) |
| Burden of proof | Stays with whoever submits the content | Shifts to whoever contests it |
| Chain of custody | Absent | Documented in the forensic report |

When to use what

The practical rule is simple: detection has a role when the content already exists and you cannot control its origin; certification covers everything you produce or acquire yourself that could end up in litigation, a claim file or a publication.

For third-party content already in circulation, received from sources you do not control, a deepfake detection system can serve as triage: it helps decide which items deserve deeper technical analysis. Treat it for what it is, a preliminary filter with documented error rates, never evidence to submit.
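Used this way, a detector score drives a routing decision, not a conclusion. A minimal sketch of such a triage rule; the thresholds, queue names and function name are hypothetical and would need tuning against the error rates of the specific tool:

```python
def triage(score: float, low: float = 0.2, high: float = 0.8) -> str:
    """Route an item by detector score (0 = no synthetic signal, 1 = strong signal)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= high:
        return "priority forensic analysis"
    if score <= low:
        # A low score is not a clearance: cheapfakes and metadata edits score low too.
        return "standard checks (not cleared)"
    return "manual review"


for item_score in (0.93, 0.55, 0.07):
    print(f"{item_score:.2f} -> {triage(item_score)}")
```

Every branch ends in a further check. None of them produces a statement about authenticity that could be submitted as evidence.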

For everything else, the cost-benefit calculation lands on the same side in every sector. Take a law firm that needs to produce a video received from a client in court. With after-the-fact analysis alone, the firm gets a score such as "87% probably authentic", which opposing counsel will dismantle by quoting the error-rate literature. If the video is instead acquired through a certifying app, it arrives sealed and timestamped: the other side carries the burden of proving alteration. We describe this scenario in our use case on digital evidence for law firms. Organizations use TrueScreen to capture a screenshot, photo or video in a controlled environment and produce a forensic report designed to be submitted as evidence.

The same pattern holds for insurers receiving photographic claim documentation, for newsrooms that must defend the authenticity of their investigative material, and for security teams collecting evidence of internal misconduct. Where a deepfake detector returns a probability score, TrueScreen returns certified, tamper-evident evidence with a verifiable chain of custody.

Authenticity is not guessed, it is certified

The numbers reviewed here all point the same way: the reliability of deepfake detection declines as generator quality grows, and entire classes of manipulation sit outside its reach altogether. Anyone managing content with legal or reputational weight needs guarantees that do not degrade with each new generative model.

If you want to see how source certification works on photos, videos, screenshots and web content, request a demo of TrueScreen: it takes a few minutes to certify your first content and understand the difference between a probability and a piece of evidence.

FAQ: deepfake detection vs source certification

Is deepfake detection accurate?

Only partially. On laboratory benchmarks, detectors often exceed 95% accuracy, but on real-world deepfakes from 2024 the AUC drops by up to 50% and the best commercial video detector stops around 78% (Deepfake-Eval-2024, arXiv:2503.02857). It can work as a preliminary filter for third-party content, not as a verdict on authenticity.

Can deepfakes always be detected?

No. Detectors miss deepfakes from generators they were not trained on, with AUC falling from 0.98 to around 0.65 in cross-dataset tests, and they can be evaded through adversarial perturbations invisible to the eye. Entire manipulation classes, such as cheapfakes and metadata edits, contain no synthetic artifacts to detect at all.

How do you prove a video is authentic in court?

By proving integrity and provenance: who captured it, when, with which device, and that the file was not altered afterwards. A cryptographic hash computed at acquisition, a qualified timestamp and a qualified electronic seal trigger the eIDAS presumptions of integrity and accurate time (Articles 35 and 41), shifting the burden of proof to the contesting party. In US proceedings, authentication follows FRE 901: the proponent must support a finding that the video is what they claim it is.

What is the difference between deepfake detection and source certification?

Detection is reactive and probabilistic: it analyzes existing content and estimates how plausible it is that the content is synthetic, with error rates that grow with each new generation of models. Certification is preventive and deterministic: it guarantees the integrity of the data from the moment of capture with a hash, a seal and a timestamp, and remains verifiable over time regardless of how generative AI evolves.

What is a cheapfake?

A cheapfake is content manipulated with conventional techniques: cropping, slowing down, recontextualizing or editing metadata, with no generative AI involved. Because every frame is authentic, deepfake detectors cannot flag it. Cheapfakes are cheaper to produce than deepfakes and at least as common in fraud and disinformation.

Does C2PA prove content is authentic?

C2PA attaches signed Content Credentials that document a file's origin and edit history. That is a provenance statement, not a legal guarantee: credentials can be stripped when a file is re-encoded or re-shared, and adoption is still partial. Source certification complements provenance by adding the eIDAS qualified seal and timestamp, with their evidentiary presumptions.

Certify your content at the source

Capture photos, videos, screenshots and web content with a cryptographic hash, a qualified electronic seal and a qualified timestamp: legally valid evidence, not probabilities.
