# Document fraud & KYC glossary

> Plain-English definitions of the terms that come up in document fraud detection, AI KYC, and forensic verification. Canonical URL: https://tampercheck.ai/glossary

---

## Document fraud detection

**Document fraud detection is** the process of analysing a document file - PDF, image, or scan - for evidence of editing, forgery, or AI generation.

Document fraud detection combines structural analysis (PDF object streams, font tables, metadata), pixel-level forensics (Error Level Analysis, noise residue mapping), and AI-generation detection to identify documents that have been tampered with or fabricated. Modern detection systems run 100+ checks per document and return a verdict in under a minute.

## Tamper check

**Tamper check is** a forensic analysis that determines whether a document has been edited, forged, or AI-generated.

A complete tamper check inspects the file's internal structure, the pixel-level properties of any images, and the logical consistency of the content (arithmetic, dates, field relationships). Visual inspection alone catches less than 20% of sophisticated edits; a forensic tamper check catches 90%+.

## AI KYC

**AI KYC is** the use of artificial intelligence to automate Know Your Customer verification - reading documents, matching faces, and detecting forgery.

Modern AI KYC stacks combine OCR, face matching, liveness detection, and document forensics. The biggest gap in most AI KYC pipelines is document forensics, which is where deepfake and AI-generated IDs slip through. Forensic layers like TamperCheck sit alongside traditional KYC providers to close this gap.

## Deepfake document

**Deepfake document is** a document - typically an ID, payslip, or bank statement - fabricated by a generative AI model rather than edited from a real source.

Deepfake documents pass visual inspection because every field is plausibly formatted and the layout matches real templates. They're detected forensically: generative AI models leave spectral signatures, characteristic noise patterns, and compression artifacts that real scanners and printers don't produce.

## Error Level Analysis (ELA)

**Error Level Analysis (ELA) is** a forensic technique that detects edited regions in an image by recompressing it and mapping the differences.

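Each time a JPEG is saved, lossy compression introduces error, and the amount of new error shrinks with every resave. Regions that were pasted in or edited after the last save have a different compression history from the rest of the image, so they change by a different amount on recompression and typically show up as brighter areas in the ELA map.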
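
The recompress-and-diff step can be sketched in a few lines. This is a minimal illustration, assuming the Pillow imaging library is installed; the function name `error_level_map` and the default quality of 90 are choices made for this example, not part of any particular product.

```python
from io import BytesIO

from PIL import Image, ImageChops


def error_level_map(image, quality=90):
    """Return the per-pixel difference between an image and a JPEG resave of it.

    `quality` is an assumed recompression level; in practice it is tuned.
    """
    original = image.convert("RGB")

    # Resave the image as JPEG into memory at a known quality level.
    buffer = BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    recompressed = Image.open(buffer).convert("RGB")

    # Brighter pixels in the result changed more under recompression.
    return ImageChops.difference(original, recompressed)
```

A caller would pass in something like `Image.open("statement.jpg")` (a hypothetical file name) and then inspect or brighten the returned map. Real ELA tooling usually adds scaling and thresholding on top of this raw difference.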

## MRZ (Machine Readable Zone)

**MRZ (Machine Readable Zone) is** the two- or three-line string of OCR-readable characters at the bottom of a passport or ID, encoding the document's key fields with checksums.

MRZ lines follow the ICAO Doc 9303 format and include the holder's name, document number, nationality, date of birth, sex, expiry date, and check digits. Forensic verification confirms that the MRZ characters are valid, the check digits match, and the encoded fields agree with the visible portions of the document.
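
The check-digit calculation is simple enough to show directly. The sketch below follows the ICAO 9303 scheme: digits keep their value, letters A-Z map to 10-35, the filler `<` counts as 0, and the values are weighted 7, 3, 1 repeating before taking the sum modulo 10. The function names are chosen for this example.

```python
def mrz_char_value(char):
    """Map one MRZ character to its numeric value."""
    if "0" <= char <= "9":
        return int(char)
    if "A" <= char <= "Z":
        return ord(char) - ord("A") + 10
    if char == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {char!r}")


def mrz_check_digit(field):
    """Compute the check digit for one MRZ field (weights 7, 3, 1 repeating)."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(char) * weights[index % 3]
        for index, char in enumerate(field)
    )
    return total % 10


# Sample fields: document number, date of birth (YYMMDD), expiry date (YYMMDD).
print(mrz_check_digit("L898902C3"))  # 6
print(mrz_check_digit("740812"))     # 2
print(mrz_check_digit("120415"))     # 9
```

A verifier would compare each computed digit with the digit printed after the corresponding field in the MRZ, then compare the decoded fields with the visible text on the document.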

## Synthetic identity fraud

**Synthetic identity fraud is** fraud committed using a fabricated identity that doesn't correspond to any real person - typically built from a mix of real and fake personal data.

Synthetic identities are common in credit fraud: a fraudster combines a real Social Security number (SSN) with a fabricated name, address, and date of birth to build a credit history. Document fraud detection plays a role by catching the forged supporting documents submitted with synthetic-identity applications.

## Document tampering

**Document tampering is** any unauthorised modification of a document - editing fields, inserting transactions, swapping photos, or altering metadata.

Document tampering ranges from crude edits (Photoshop on a JPG) to sophisticated structural edits (modifying PDF objects directly) to fully synthetic generation. Tampering detection identifies edits across all of these layers.

## Automated document verification

**Automated document verification is** software-driven verification of document authenticity, identity matching, and data extraction, with no human reviewer in the happy path.

Automated verification combines OCR, data validation, forensic analysis, and policy logic. Verdicts return in under a minute, and human reviewers only see flagged cases.

## Forensic document analysis

**Forensic document analysis is** the application of scientific techniques to determine the authenticity, origin, and editing history of a document.

Modern digital forensic analysis inspects PDF internals, font metrics, pixel-level signals, and metadata, and increasingly uses ML models trained on examples of known forgeries.

## Liveness detection

**Liveness detection is** a check that confirms the person submitting a selfie or video is physically present, not a photo, video replay, or deepfake.

Liveness detection is adjacent to face matching - it's about the person, not the document. Document forensics is the parallel check on the documents themselves. A complete KYC stack uses both.

## PDF producer metadata

**PDF producer metadata is** the set of metadata fields embedded in most PDFs, such as `Producer` and `Creator`, that identify the software that created or last modified the file.

Genuine documents from a known issuer have consistent producer metadata. Tampered documents often show traces of editing tools that the issuer would not use - one of the strongest single tamper signals.
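
As a rough illustration, the producer check can be sketched with the standard library alone. This assumes the `/Producer` entry is stored as an unencrypted, uncompressed literal string without nested parentheses, which is not true of every PDF; a production system would use a full PDF parser. The function names and the issuer names in the usage lines are hypothetical.

```python
import re

# Matches "/Producer (literal string)"; backslash escapes are tolerated,
# nested parentheses and hex or UTF-16 strings are not handled.
_PRODUCER_PATTERN = re.compile(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def extract_producer(pdf_bytes):
    """Return the raw Producer string from PDF bytes, or None if not found."""
    match = _PRODUCER_PATTERN.search(pdf_bytes)
    if match is None:
        return None
    # Escape sequences are left as-is; this sketch does not decode them.
    return match.group(1).decode("latin-1")


def producer_is_expected(producer, expected_prefixes):
    """True if the producer starts with one of the issuer's known tool names."""
    if producer is None:
        return False
    return any(producer.startswith(prefix) for prefix in expected_prefixes)


sample = b"1 0 obj\n<< /Producer (Acme Statement Engine 4.2) >>\nendobj\n"
producer = extract_producer(sample)
print(producer_is_expected(producer, ["Acme Statement Engine"]))  # True
```

An unexpected or missing producer is a signal to weigh alongside other findings rather than proof of tampering on its own, since metadata can be absent or rewritten by benign tools.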

## Running balance arithmetic

**Running balance arithmetic is** the check that a bank statement's running balance - the cumulative total after each transaction - reconciles across every row.

Forensic verification of a bank statement includes confirming that every transaction's running balance equals the previous balance plus or minus the transaction amount. A single edited number breaks the chain.
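
The reconciliation described above can be sketched as a single pass over the rows. This is a minimal example, assuming each row has already been extracted as a signed amount (negative for debits) and a stated balance; the function name and the row layout are choices made for this illustration.

```python
from decimal import Decimal


def find_balance_breaks(opening_balance, rows):
    """Return the indices of rows whose stated balance does not reconcile.

    `rows` is a sequence of (signed_amount, stated_balance) pairs given as
    strings, so that Decimal avoids floating-point rounding errors.
    """
    breaks = []
    previous = Decimal(opening_balance)
    for index, (amount, stated_balance) in enumerate(rows):
        expected = previous + Decimal(amount)
        stated = Decimal(stated_balance)
        if expected != stated:
            breaks.append(index)
        # Continue from the stated balance so one edit is not reported
        # again on every later row.
        previous = stated
    return breaks


rows = [
    ("-200.00", "800.00"),
    ("50.00", "850.00"),
    ("-100.00", "760.00"),  # should be 750.00
    ("-10.00", "750.00"),
]
print(find_balance_breaks("1000.00", rows))  # [2]
```

Because each row is checked against the previous stated balance, an edited amount breaks one row, while an edited balance typically breaks both that row and the one after it.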

## JPEG ghost

**JPEG ghost is** a pixel-level forensic technique that reveals regions of an image that were saved at a different JPEG quality than the rest.

JPEG ghost analysis recompresses the image at various quality levels and highlights the regions whose compression history doesn't match the rest of the image.

## Photo zone substitution

**Photo zone substitution is** an identity document attack where the original portrait has been replaced with a different person's photo.

Photo zone substitution is detected via boundary contrast analysis, compression mismatch in the portrait region, and sharpness gradient comparison.

## Document verification API

**Document verification API is** a REST API that accepts a document upload and returns a structured verdict - authentic, tampered, or inconclusive - typically with findings and a risk score.

This is the integration shape most B2B teams use to add forensic checks to their workflows: one POST per document, one structured response per verdict.

## Risk score

**Risk score is** a calibrated 0–100 numeric output of a document verification system, where lower means more likely authentic and higher means more likely tampered.

Risk scores let workflows make policy decisions: auto-approve below a threshold, auto-flag above one, and route the ambiguous middle to a human reviewer.
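
A threshold policy of this kind might look like the sketch below. The thresholds of 30 and 70 and the outcome labels are hypothetical values chosen for illustration; real cut-offs depend on the provider's calibration and the team's risk appetite.

```python
def route_by_risk(score, approve_below=30, flag_above=70):
    """Map a 0-100 risk score to a workflow decision.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    if score < approve_below:
        return "auto_approve"
    if score > flag_above:
        return "auto_flag"
    # Everything between the two thresholds goes to a human reviewer.
    return "human_review"


print(route_by_risk(12))  # auto_approve
print(route_by_risk(50))  # human_review
print(route_by_risk(85))  # auto_flag
```

Keeping the two thresholds as parameters lets a team widen or narrow the human-review band without changing the routing logic.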

## Zero document storage

**Zero document storage is** a processing model in which documents are analysed and immediately discarded, with no copy retained on the verification provider's infrastructure.

Zero storage removes the verification provider from the data retention map entirely - increasingly required by KYC, lending, and insurance compliance regimes.

## OCR (Optical Character Recognition)

**OCR (Optical Character Recognition) is** the process of extracting text from an image or PDF.

OCR is necessary but not sufficient for document verification - it tells you what the document says, not whether the document is authentic.

## Document deepfake

**Document deepfake is** a fully synthetic document produced by a generative AI model from a text prompt or template.

Document deepfakes differ from edited documents because there's no original to compare against. Detection relies on spectral signatures and noise patterns characteristic of generative models.

## Income verification

**Income verification is** the process of confirming an applicant's stated income, typically using payslips, bank statements, or tax documents.

Income verification is the primary fraud surface in personal lending, mortgages, and rental applications. Forensic document analysis is the modern complement to traditional income verification.

## KYC (Know Your Customer)

**KYC (Know Your Customer) is** the regulatory and operational process of verifying the identity of a customer at onboarding and during the customer lifecycle.

KYC requirements vary by jurisdiction but typically include identity verification, address verification, and ongoing monitoring for sanctions and politically exposed person (PEP) status.

## AML (Anti-Money Laundering)

**AML (Anti-Money Laundering) is** the regulatory framework requiring financial institutions to detect, prevent, and report money laundering.

Document fraud detection supports AML compliance by catching the falsified documents that money launderers submit.

## Liveness vs document forensics

**Liveness vs document forensics is** the distinction between two complementary checks: liveness confirms the person is real; document forensics confirms the document is real. They solve different halves of the same problem.

A complete identity verification stack runs liveness on the selfie or video and forensic checks on the submitted documents.

## BNPL fraud

**BNPL fraud is** fraud committed against Buy Now, Pay Later providers using synthetic identities or forged income documents.

BNPL is a high-volume, fast-decision credit product, which makes it a prime target. Most BNPL fraud loss traces to forged or AI-generated payslips, bank statements, and identity documents.
