Unveiling Perceptual Artifacts: A Fine-Grained Benchmark for Interpretable AI-Generated Image Detection
Current AI-Generated Image (AIGI) detection approaches predominantly rely on binary classification to distinguish real from synthetic images, often lacking interpretable or convincing evidence to substantiate their decisions. This limitation stems from existing AIGI detection benchmarks, which, despite featuring a broad collection of synthetic images, remain restricted in their coverage of artifact diversity and lack detailed, localized annotations. To bridge this gap, we introduce a fine-grained benchmark towards eXplainable AI-Generated image Detection, named X-AIGD, which provides pixel-level, categorized annotations of perceptual artifacts, spanning low-level distortions, high-level semantics, and cognitive-level counterfactuals. These comprehensive annotations facilitate fine-grained interpretability evaluation and deeper insight into model decision-making processes. Our extensive investigation using X-AIGD provides several key insights: (1) Existing AIGI detectors demonstrate negligible reliance on perceptual artifacts, even at the most basic distortion level. (2) While AIGI detectors can be trained to identify specific artifacts, they still substantially base their judgment on uninterpretable features. (3) Explicitly aligning model attention with artifact regions can increase the interpretability and generalization of detectors. The data and code are available at: https://github.com/Coxy7/X-AIGD.
💡 Research Summary
The paper addresses a critical gap in AI‑Generated Image (AIGI) detection: the lack of interpretability. While most existing detectors treat the problem as a binary classification task and achieve high accuracy on specific datasets, they provide no insight into why an image is judged real or fake. This opacity stems largely from the benchmarks themselves, which, although large, contain no fine‑grained, localized annotations of the visual cues that humans use to spot synthetic artifacts.
To remedy this, the authors introduce X‑AIGD, a fine‑grained benchmark designed explicitly for explainable AIGI detection. X‑AIGD supplies pixel‑level masks for perceptual artifacts and categorizes them into a three‑level taxonomy:
- Low‑level distortions – edge mis‑alignments, unnatural textures, color inconsistencies.
- High‑level semantics – structural or compositional errors that break object integrity or logical scene arrangement.
- Cognitive‑level counterfactuals – violations of commonsense or physical laws (e.g., impossible object relationships).
Within these three levels, seven concrete artifact categories cover the broad spectrum of cues a human observer might notice.
Dataset construction: The authors start from 4,000 real photographs drawn from four public datasets. Using 13 state‑of‑the‑art text‑to‑image generators (including FLUX.1‑dev, Stable Diffusion 3.5, PixArt‑α, and several community‑fine‑tuned models), they generate 52,000 synthetic counterparts, ensuring semantic alignment with the real images. From each generator, 200 images are sampled for the test split, and a subset of five generators supplies 200 images each for training. After three rounds of annotation by 12 trained labelers, 3,035 fake images receive high‑quality pixel masks and category labels; a confidence‑score study reports an average of 0.86, indicating strong inter‑annotator agreement.
Task definition: X‑AIGD defines two tightly coupled subtasks:
- Authenticity Judgment (AJ) – binary classification of real vs. fake, evaluated with balanced accuracy, precision, recall, and F1.
- Perceptual Artifact Detection (PAD) – for images predicted as fake, the model must output the regions and categories of the artifacts. Evaluation uses Intersection‑over‑Union (IoU), pixel‑level precision/recall/F1, and a weaker instance‑level localization metric to accommodate diverse model architectures.
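The PAD metrics above are standard mask-overlap scores. A minimal sketch of how the pixel-level IoU, precision, recall, and F1 might be computed for binary artifact masks (function name and edge-case handling are illustrative assumptions, not taken from the benchmark's evaluation code):

```python
import numpy as np

def pixel_metrics(pred_mask, gt_mask):
    """Pixel-level IoU, precision, recall, and F1 for binary artifact masks.

    pred_mask, gt_mask: arrays of identical shape (H, W), nonzero = artifact.
    """
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # artifact pixels correctly found
    fp = np.logical_and(pred, ~gt).sum()   # predicted artifact, actually clean
    fn = np.logical_and(~pred, gt).sum()   # missed artifact pixels
    union = tp + fp + fn
    iou = tp / union if union else 1.0     # both masks empty -> perfect score
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"iou": iou, "precision": precision, "recall": recall, "f1": f1}
```

In practice these scores would be averaged over all images predicted as fake, per artifact category.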
Experiments:
- Baseline analysis – Popular end‑to‑end detectors (CNNSpot, DRCT‑ConvB, DRCT‑CLIP, etc.) are examined for attention overlap with the annotated artifact masks. Results show negligible reliance on low‑level distortions; performance drops sharply as image fidelity (measured by NIQE, MDFS, and the authors' Perceptual Artifact Ratio) improves, indicating that detectors exploit dataset‑specific statistical cues rather than human‑perceptible artifacts.
- Multi‑task learning – Adding PAD as an auxiliary task (via transfer learning or joint training) modestly improves detection of certain low‑level artifacts but yields only marginal gains (<1% absolute) in overall authenticity accuracy. The models still base most of their decisions on uninterpretable high‑dimensional features, confirming the difficulty of aligning learned representations with explicit artifact cues.
- Attention alignment – The authors introduce an attention‑alignment loss that penalizes divergence between model attention maps and ground‑truth artifact regions, forcing the network to focus on annotated cues. In cross‑generator evaluations, aligned models achieve a 4.2% absolute increase in balanced accuracy and a 7% boost in IoU for artifact localization, without sacrificing overall classification performance. However, overly strong alignment harms the learning of complementary features, revealing a trade‑off that must be balanced.
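The multi-task setup can be illustrated as a joint objective that adds a weighted segmentation (PAD) term to the classification (AJ) loss. This is a generic sketch with an assumed `seg_weight` hyperparameter and binary cross-entropy for both terms; the paper's exact formulation may differ:

```python
import numpy as np

def joint_loss(cls_logit, cls_label, seg_logits, seg_mask, seg_weight=0.5):
    """Joint objective: authenticity judgment (AJ) + artifact detection (PAD).

    cls_logit: scalar real-vs-fake logit; cls_label: 0 (real) or 1 (fake).
    seg_logits: (H, W) per-pixel artifact logits; seg_mask: (H, W) GT mask.
    seg_weight is an illustrative hyperparameter balancing the two tasks.
    """
    def bce(logit, target):
        # numerically stable binary cross-entropy on raw logits
        return (np.maximum(logit, 0) - logit * target
                + np.log1p(np.exp(-np.abs(logit))))
    cls_term = bce(cls_logit, cls_label)        # AJ: image-level loss
    seg_term = bce(seg_logits, seg_mask).mean() # PAD: averaged over pixels
    return cls_term + seg_weight * seg_term
```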
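The attention-alignment regularizer can likewise be sketched as a divergence between the normalized attention map and the normalized ground-truth artifact mask. KL divergence is one plausible choice here, used purely for illustration; the paper's actual loss may be defined differently:

```python
import numpy as np

def attention_alignment_loss(attn, artifact_mask, eps=1e-8):
    """Penalize divergence between an attention map and the GT artifact mask.

    attn: (H, W) non-negative attention map; artifact_mask: (H, W) binary mask.
    Both are normalized to spatial distributions, then compared with
    KL(mask || attn) -- an assumed formulation, not the paper's exact loss.
    """
    p = artifact_mask.astype(float)
    p = p / (p.sum() + eps)                 # target distribution from the mask
    q = attn.astype(float)
    q = q / (q.sum() + eps)                 # model's attention distribution
    # zero-probability target pixels contribute nothing to the sum
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))
```

Scaling this term by a coefficient and adding it to the detection loss would implement the tunable alignment strength the authors find necessary.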
Key insights:
- Existing AIGI detectors underutilize perceptual artifacts, limiting both interpretability and robustness.
- Simply training detectors to recognize artifacts does not automatically translate into more explainable or more accurate judgments.
- Explicitly guiding model attention toward artifact regions can simultaneously improve interpretability and generalization, but the alignment strength must be carefully tuned.
Contributions:
- A novel benchmark (X‑AIGD) with paired real/fake images, pixel‑level masks, and a hierarchical artifact taxonomy.
- A thorough empirical study showing the gap between current detectors and human‑interpretable cues.
- Demonstration that attention‑alignment is a promising avenue for building explainable, robust AIGI detectors.
The dataset, annotation tools, and code are publicly released (https://github.com/Coxy7/X‑AIGD), inviting the community to explore artifact‑aware detection, develop better explanation mechanisms, and ultimately create more trustworthy AI‑generated content detection systems.