AI fingerprints can be removed and forged: the Edinburgh study that challenges deepfake detection

AI fingerprints were supposed to solve the attribution problem. Every generative model leaves statistical traces in the content it produces: pixel patterns, frequency artifacts, spectral signatures unique to each architecture. If those traces hold up, any image can be traced back to its source. Detection tools, regulatory frameworks, and billions in investment were built on that premise.

In March 2026, researchers at the University of Edinburgh put that premise through the largest independent evaluation ever conducted on AI fingerprinting: 12 image generators, 14 fingerprinting methods. The results are damning. AI fingerprints can be removed in over 80% of cases with full-access attacks. They can be forged in roughly half the systems analyzed. None of the manipulations left any visible trace. The vulnerability exposed here is not a software bug. It is a structural weakness of post-hoc detection itself.

So if we cannot reliably identify synthetic content after creation, how do we protect the integrity of digital information? Not with better detectors. The path forward runs from detection to digital provenance: certifying authenticity at the source rather than chasing forgery after the fact.

Deepfakes in 2026: a crisis outpacing every defense

AI-manipulated content has crossed a volume threshold that makes any reactive approach inadequate. This is not linear growth. It is exponential, and every defensive system built on post-hoc analysis falls further behind with each passing quarter.

+900% growth in manipulated content per year

Deepfake content grew by 900% annually between 2023 and 2025, from 500,000 instances to over 8 million, according to Keepnet Labs (2025). Biometric bypass attempts using deepfakes rose 704% in 2023 alone. By 2024, attackers were making one attempt every five minutes. The growth cuts across finance, insurance, human resources, and the public sector. 80% of organizations still have no specific protocols against this threat. Gartner projects that by 2026, 30% of enterprises will no longer consider standalone identity verification solutions reliable.

These numbers reframe the conversation entirely. Deepfakes are not an emerging threat. They are a current one. Whether existing defenses can keep pace is the open question, and the Edinburgh study supplies part of the answer.

$40 billion in AI fraud by 2027: the economic toll

Deloitte estimates that generative AI-related fraud in the United States will reach $40 billion by 2027. Deepfake-related losses in 2025 hit $1.1 billion, up from $360 million the year before: over 200% growth in twelve months. Per incident, the average loss in 2024 was approximately $500,000 (Keepnet Labs data).

The deepfake detection market is growing at a CAGR between 28% and 42%. Clearly, demand exists. But those investments rest on an assumption that looks increasingly fragile: that detecting synthetic content after creation is a viable long-term strategy.

The University of Edinburgh study: why AI fingerprints are not secure

The March 2026 Edinburgh research is the most extensive evaluation ever conducted on AI fingerprinting reliability. Its findings break the core assumption behind most detection systems in use today: that AI-generated content carries indelible, forensically reliable traces of its origin.

AI fingerprinting is the set of techniques that analyze statistical traces left by generative models in synthetic content: pixel-level patterns, frequency-domain artifacts, spectral signatures tied to each generator’s architecture. According to the University of Edinburgh research (March 2026), these fingerprints can be removed with a success rate above 80% when the attacker has full system access, and above 50% even with simple attacks. In roughly half of the systems analyzed, fingerprints can be forged to falsely attribute an image to an entirely different generative model. None of the 14 techniques examined guarantees both high accuracy and robustness across all scenarios.

What is AI fingerprinting and why it was trusted

Every AI image generator leaves involuntary traces in what it produces. These come from the neural network architecture, the training data, and the sampling methods. In theory, if every model leaves a unique signature, identifying that signature reveals the content’s origin.
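
As a rough illustration of the idea (and not the Edinburgh methodology), one common way to surface these traces is to average the frequency spectra of high-pass residuals across many images from the same generator. The Python sketch below is a simplified assumption of how such a signature might be estimated; the file paths, image size, and blur radius are placeholders.

```python
# Illustrative sketch only: estimating a crude frequency-domain "fingerprint"
# by averaging the spectra of high-pass residuals from images attributed to
# one generator. Paths, image size, and blur radius are placeholder values.
import numpy as np
from PIL import Image, ImageFilter

def highpass_residual(path: str, size: int = 256) -> np.ndarray:
    """Image minus a blurred copy: keeps the high-frequency noise where
    generator-specific artifacts tend to concentrate."""
    img = Image.open(path).convert("L").resize((size, size))
    blurred = img.filter(ImageFilter.GaussianBlur(radius=1))
    return np.asarray(img, dtype=np.float64) - np.asarray(blurred, dtype=np.float64)

def average_spectrum(paths: list[str]) -> np.ndarray:
    """Mean log-magnitude FFT of the residuals; recurring peaks act as a
    statistical signature of the generator that produced the images."""
    spectra = [np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(highpass_residual(p)))))
               for p in paths]
    return np.mean(spectra, axis=0)

# A detector would compare a test image's residual spectrum against stored
# per-generator averages (for example via cosine similarity) to attribute it.
```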

Major investments, regulatory integrations, and policy expectations all followed from this logic. The EU AI Act (Regulation 2024/1689) mandates labeling obligations for AI-generated content, presupposing that such labels are technically enforceable and tamper-resistant. The Edinburgh study calls that presupposition into question.

12 generators, 14 methods: the largest evaluation ever conducted

The researchers, backed by the Edinburgh International Data Facility, the Data-Driven Innovation Programme, and the Generative AI Laboratory, tested 14 fingerprinting methods against images from 12 different generators. The sample breadth makes the results representative of the state of the art. This was not a narrow lab exercise. It mirrors the conditions under which these tools actually get deployed.

Scale matters here. Previous evaluations typically covered 2 to 4 generators and a handful of detection methods. By testing across the full spectrum of commercially available and open-source generators, the Edinburgh team produced findings that cannot be written off as edge cases.

80% removal success, forgery in half the systems

According to the University of Edinburgh research (March 2026), AI fingerprints embedded in deepfake images can be removed with a success rate exceeding 80% when attackers have full system access, and above 50% even through simple, low-resource techniques such as JPEG compression or image resizing. The study tested 14 AI fingerprinting methods across 12 generative models in the largest independent evaluation ever conducted on AI generated image detection. In roughly half the systems analyzed, fingerprints could not only be removed but forged to falsely attribute content to a generator that never produced it, potentially framing legitimate companies for harmful images their systems never created. Every manipulation was imperceptible to the human eye. The peer-reviewed findings, presented at IEEE SaTML 2026, establish that fingerprint-based detection alone cannot be considered forensically reliable under adversarial conditions.
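
To make those "simple, low-resource techniques" concrete, the sketch below applies the two transforms the study names, JPEG re-encoding and resizing, so a fingerprint detector can be re-run on the perturbed copy and compared against the original. It is a minimal robustness probe under assumed parameters (file names, JPEG quality, scale factor), not a reproduction of the paper's attacks.

```python
# Minimal robustness probe: the two low-resource transforms the study cites.
# File names, JPEG quality, and the scale factor are illustrative assumptions.
import io
from PIL import Image

def jpeg_reencode(img: Image.Image, quality: int = 75) -> Image.Image:
    """Round-trip the image through JPEG compression at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def resize_roundtrip(img: Image.Image, scale: float = 0.5) -> Image.Image:
    """Downscale and then upscale back to the original dimensions."""
    w, h = img.size
    small = img.resize((int(w * scale), int(h * scale)), Image.Resampling.BICUBIC)
    return small.resize((w, h), Image.Resampling.BICUBIC)

original = Image.open("generated_sample.png").convert("RGB")
perturbed = resize_roundtrip(jpeg_reencode(original))
perturbed.save("generated_sample_perturbed.png")
# Re-running any fingerprint detector on both files shows how much of its
# attribution confidence survives these visually harmless edits.
```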

The forgery capability is the most consequential finding for compliance and AI forensics professionals. Attackers can plant false fingerprints, making authentic images appear AI-generated or attributing synthetic content to the wrong model. No visual artifacts, no quality degradation, no detectable trace. For organizations relying on AI fingerprinting for content authentication or regulatory compliance, this represents a systemic vulnerability rather than a fixable software bug.

AI watermarking vs AI fingerprinting: why the distinction matters

The Edinburgh researchers recommend combining fingerprinting with AI watermarking as a partial defense, but the two techniques address fundamentally different problems. AI fingerprinting analyzes involuntary statistical traces left by a generative model: patterns the creator did not intentionally embed. AI watermarking, by contrast, deliberately injects a known signal into content at the point of generation, a detectable pattern designed to verify origin after distribution.
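
To make the distinction tangible, here is a didactic spread-spectrum toy in Python: a pseudo-random pattern chosen at generation time is added to the pixels and later detected by correlating against that known key. It is an illustrative assumption only, not any production watermarking scheme and not one of the systems evaluated in the research.

```python
# Didactic spread-spectrum toy, not a production watermarking scheme: a
# pseudo-random key pattern is added to the pixels at generation time and
# later detected by correlating against that known key.
import numpy as np

rng = np.random.default_rng(seed=42)                    # stands in for the secret key
image = rng.uniform(0, 255, size=(256, 256))            # stand-in for a generated image
key = rng.choice([-1.0, 1.0], size=image.shape)         # the watermark pattern

def embed(img: np.ndarray, pattern: np.ndarray, strength: float = 3.0) -> np.ndarray:
    """Add a faint copy of the known pattern to the pixel values."""
    return np.clip(img + strength * pattern, 0, 255)

def detect(img: np.ndarray, pattern: np.ndarray) -> float:
    """Normalized correlation with the key; values near zero mean 'absent'."""
    centered = img - img.mean()
    return float((centered * pattern).sum()
                 / (np.linalg.norm(centered) * np.linalg.norm(pattern) + 1e-9))

marked = embed(image, key)
print("watermarked:", detect(marked, key))               # noticeably above the baseline
print("clean:      ", detect(image, key))                # near zero
# Re-encoding, cropping, or rescaling `marked` before calling detect() shows
# how quickly this naive signal weakens, which is the robustness problem
# discussed below.
```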

The critical question is robustness. Watermarks embedded by AI generators face the same adversarial pressure as fingerprints: they can be stripped through image processing, cropping, screenshot capture, or re-encoding. Research has demonstrated that current AI watermarking techniques remain vulnerable to removal attacks, particularly when the embedded signal must survive social media compression and re-sharing cycles. The Edinburgh study’s core conclusion extends to the broader content authentication landscape: no single post-hoc technique, whether fingerprinting or watermarking, guarantees both high accuracy and robustness against determined adversaries. This is precisely why source-level certification, which does not depend on embedded signals surviving manipulation, represents a fundamentally more reliable approach to digital provenance.

The real numbers of deepfake detection: why chasing the fake does not work

The Edinburgh vulnerabilities are not an isolated finding. They sit within a broader reality where every post-hoc detection method loses effectiveness once it leaves controlled lab conditions. The gap between claimed performance and field performance is the number that should concern decision makers most.

Human detection: only 0.1% correctly identify deepfakes

Only 0.1% of people in testing correctly identified all deepfakes shown to them (iProov, 2025). A meta-analysis across 56 studies found average human accuracy of 24.5% on deepfake video, which is worse than flipping a coin. 68% of the deepfakes analyzed were rated “nearly indistinguishable” from real content. In enterprise settings where legal, insurance, and compliance decisions hinge on content authenticity, relying on human perception is functionally equivalent to having no control.

AI tools in the real world: 45-50% effectiveness drop

AI-based deepfake detection tools perform radically differently in the field than in the lab. The estimated drop is 45-50% (Keepnet Labs data). Current detectors miss 35% of new-generation deepfakes. Put differently: roughly one in three pieces of synthetic content passes undetected.

More powerful models will not fix this. Every improvement in detection gets matched, often within weeks, by an equivalent advance in generation techniques. That is not a flaw in any particular tool. It is how the entire approach works.

The arms race between generation and detection

The deepfake detection market grows at a CAGR between 28% and 42%. But investment growth does not fix the structural limitations of detection. The cycle runs on repeat: a new detector posts strong lab results, generators update, performance drops, back to the beginning.

For a CISO or compliance officer, this cycle means one thing: no guarantee of reliability over time. AI watermarks, spectral signatures, neural classifiers: all chasing a target that keeps moving. Edinburgh showed the target is even harder to pin down than anyone thought.

| Metric | Lab performance | Real-world performance | Source |
| --- | --- | --- | --- |
| AI fingerprint robustness | High accuracy claimed | Removed in >80% of cases; forged in ~50% of systems | Edinburgh 2026 |
| Human detection | Controlled test conditions | 0.1% full accuracy; 24.5% average on video | iProov / Keepnet 2025 |
| AI detector effectiveness | High on training data | 45-50% effectiveness drop; 35% of new-gen deepfakes missed | Keepnet 2025 |
| Deepfake volume | N/A | 8M+ instances in 2025; +900% annual growth | Keepnet 2025 |

From detection to digital provenance: certifying the authentic instead of recognizing the fake

The answer is not a better detector. It is a fundamentally different approach. Rather than analyzing content after creation to figure out if it is fake, digital provenance certifies authenticity at the source, at the moment of capture. The outcome is binary: content is certified or it is not. No false positives, no error margin tied to generator quality or training data.

Digital provenance is the complete traceability of origin, history, and transformations of a digital asset, from creation through any later use as evidence, documentation, or reference. Gartner’s 2026 projections place digital provenance among the top technology trends set to reshape enterprise data management. Where detection operates under growing uncertainty, source-level certification yields a deterministic result. The content is verifiable, or it is not. There is no probabilistic middle ground.

The paradigm shift: from post-hoc analysis to source-level assurance

Detection logic assumes content is authentic until proven otherwise, then spends resources looking for proof of manipulation. When 8 million deepfakes are produced annually and AI fingerprints can be erased in over 80% of cases, that starting assumption collapses.

Forensic-grade certification at the source, as implemented by TrueScreen, represents the evolution of digital provenance: rather than analyzing whether content is fake, it ensures that authentic content is verifiable from its origin. The logic inverts completely. Everything is potentially unreliable unless it was certified at the moment of creation.

For sectors where data integrity carries legal and operational weight (insurance, compliance, legal, human resources, public administration), this is not a technology preference. It is an operational necessity driven by the data.

| Dimension | Detection (post-hoc) | Provenance (source-level) |
| --- | --- | --- |
| When it operates | After content is created | At the moment of capture |
| Core question | "Is this fake?" | "Is this certified authentic?" |
| Output type | Probabilistic (confidence score) | Deterministic (certified or not) |
| Robustness over time | Degrades as generators improve | Independent of generator evolution |
| False positive risk | High (Edinburgh: forgery in ~50% of systems) | None (binary outcome) |
| Legal standing | Limited (probabilistic) | Strong (signature, timestamp, chain of custody) |
| Regulatory alignment | Assumed by AI Act labeling | Aligned with eIDAS, GDPR, ISO 27037 |

How forensic-grade content certification works

TrueScreen, the Data Authenticity Platform, captures and certifies digital content at the source with legal validity in 194 countries. The process has two inseparable components: forensic-grade data capture at the point of origin (photo, video, document, screenshot) and immediate application of a digital seal, qualified timestamp, and digital signature. The result is content with a verifiable chain of custody, certified file hash, and metadata (GPS coordinates, timestamp, device information) locked in at the moment of acquisition. Authenticity is not something you look for after the fact. It is built into the content from the start.
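
To ground the concept only (this is not TrueScreen's implementation, which relies on qualified timestamps and seals under eIDAS), here is a minimal Python sketch of a source-level provenance record: hash the captured file, bind it to metadata and a UTC timestamp, and sign the whole record. Key handling, the metadata fields, and file names are illustrative assumptions.

```python
# Conceptual sketch only, not TrueScreen's implementation: hash the captured
# file, bind it to metadata and a UTC timestamp, and sign the whole record.
# Key handling, metadata fields, and file names are illustrative assumptions.
import hashlib
import json
from datetime import datetime, timezone
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def certify(path: str, device_info: dict, private_key: Ed25519PrivateKey) -> dict:
    """Build a signed provenance record at the moment of capture."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    record = {
        "file_sha256": digest,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "device": device_info,                            # e.g. GPS, model, OS version
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = private_key.sign(payload).hex()
    return record

key = Ed25519PrivateKey.generate()                        # in practice, a protected device key
cert = certify("capture.jpg", {"gps": "55.9533,-3.1883", "model": "demo-device"}, key)
print(json.dumps(cert, indent=2))
```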

Think about what this looks like in practice. A compliance department receives a whistleblower video. With a detection approach, the team would need to analyze the video to rule out deepfake manipulation, working against a 35% miss rate on new-generation content and a 0.1% rate of flawless human detection. With content authenticity at the source, that same video gets captured through TrueScreen at the moment of recording. Certified timestamp, GPS, file hash, digital signature. The content is permanently verifiable. No post-hoc analysis needed.

Organizations in insurance, legal, and compliance use TrueScreen to ensure the authenticity of field-collected data, eliminating the need for post-hoc verification. Enterprise-grade provenance is not an additional layer on top of existing workflows. It becomes the foundation.

Legal validity, immutability, and chain of custody

Source-level certification carries weight in the legal domain, and this is an aspect many organizations underestimate. The eIDAS regulation sets the European framework for electronic signatures, digital seals, and timestamps with cross-border validity. The EU AI Act (Regulation 2024/1689) introduces transparency obligations for AI-generated content. But the Edinburgh results make a specific point: those obligations only work when backed by source-level certification infrastructure. A detection system that can be circumvented does not meet the bar.

Under the US Federal Rules of Evidence, admissibility of digital evidence depends on demonstrable chain of custody and data integrity. ISO/IEC 27037 provides international standards for handling and preserving digital evidence. Forensic-grade certification aligns with these frameworks by design. Every certified piece of content carries timestamp, hash, digital signature, and metadata that form an unbroken chain from capture to courtroom.
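
As a companion to the hypothetical certification sketch above, and under the same assumptions, the snippet below shows the verification side: anyone holding the public key can recompute the file hash and check the signature over the rest of the record, and any alteration at either step breaks the chain.

```python
# Companion to the certification sketch above, under the same assumptions:
# recompute the file hash, then verify the signature over the rest of the
# record. A mismatch at either step breaks the chain of custody.
import hashlib
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify(path: str, record: dict, public_key: Ed25519PublicKey) -> bool:
    with open(path, "rb") as f:
        if hashlib.sha256(f.read()).hexdigest() != record["file_sha256"]:
            return False                                  # the content was modified
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(record["signature"]), payload)
        return True
    except InvalidSignature:
        return False                                      # the record was tampered with
```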

GDPR’s data integrity requirements add another layer. When personal data is collected as evidence or documentation, its authenticity must be provable. Source-level certification satisfies that requirement by construction. Detection-based approaches leave the authenticity question open to probabilistic interpretation, which is a weaker position in any legal proceeding.

FAQ: AI Fingerprints and Deepfake Detection

Can AI fingerprints be removed from deepfakes?

Yes. According to a 2026 University of Edinburgh study, AI fingerprints can be removed from deepfake images with a success rate above 80% using full-access attacks, and above 50% even with simple techniques like JPEG compression or resizing. The study tested 14 fingerprinting methods across 12 AI generators, and all manipulations were imperceptible to the human eye.

What is the main weakness of deepfake detection?

Deepfake detection is inherently probabilistic and degrades over time as generative models improve. In real-world conditions, AI-based detection tools lose 45-50% of their effectiveness compared to lab performance. Current detectors miss 35% of new-generation deepfakes. Human detection fares even worse: only 0.1% of people correctly identify all deepfakes in controlled testing. These structural limitations make post-hoc detection unreliable as a standalone defense.

Deepfake detection vs digital provenance: what is the difference?

Deepfake detection operates after content is created, analyzing it for signs of manipulation. The output is a probabilistic confidence score that degrades as generators improve. Digital provenance works at the source: it certifies content at the moment of capture with a digital seal, qualified timestamp, and digital signature. Source-level certification, as implemented by TrueScreen, the Data Authenticity Platform, produces a deterministic result (certified or not) that is legally verifiable and independent of how advanced generative AI becomes.

How does AI fingerprinting work?

AI fingerprinting analyzes statistical traces left by generative models in the content they produce: pixel-level patterns, frequency-domain artifacts, and spectral signatures tied to each generator’s architecture. In theory, these traces identify which model created a given image. However, the Edinburgh study tested 14 fingerprinting methods against 12 generators and found that none guarantees both high accuracy and robustness against adversarial manipulation.

How many deepfakes are created each year?

Deepfake content grew from approximately 500,000 instances in 2023 to over 8 million in 2025, an annual growth rate of 900%. Deloitte estimates that generative AI-related fraud will reach $40 billion by 2027. Deepfake-related financial losses hit $1.1 billion in 2025 alone, with the average per-incident loss at approximately $500,000 in 2024.

Certify the authenticity of your digital content

TrueScreen captures, verifies, and certifies digital data at the source with legal validity. Stop chasing the fake: certify the real.