📝 Original Info
- Title: A Comprehensive Dataset for Human vs. AI Generated Image Detection
- ArXiv ID: 2601.00553
- Date: 2026-01-02
- Authors: Rajarshi Roy, Nasrin Imanpour, Ashhar Aziz, Shashwat Bajpai, Gurpreet Singh, Shwetangshu Biswas, Kapil Wanaskar, Parth Patwa, Subhankar Ghosh, Shreyas Dixit, Nilesh Ranjan Pal, Vipula Rawte, Ritvik Garimella, Gaytri Jena, Vasu Sharma, Vinija Jain, Aman Chadha, Aishwarya Naresh Reganti, Amitava Das
📝 Abstract
Multimodal generative AI systems like Stable Diffusion, DALL-E, and MidJourney have fundamentally changed how synthetic images are created. These tools drive innovation but also enable the spread of misleading content, false information, and manipulated media. As generated images become harder to distinguish from photographs, detecting them has become an urgent priority. To address this challenge, we release MS COCOAI, a novel dataset for AI-generated image detection consisting of 96,000 real and synthetic data points, built using the MS COCO dataset. To generate synthetic images, we use five generators: Stable Diffusion 3, Stable Diffusion 2.1, SDXL, DALL-E 3, and MidJourney v6. Based on the dataset, we propose two tasks: (1) classifying images as real or generated, and (2) identifying which model produced a given synthetic image. The dataset is available at https://huggingface.co/datasets/Rajarshi-Roy-research/Defactify_Image_Dataset.
📄 Full Content
A Comprehensive Dataset for Human vs. AI Generated
Image Detection
Rajarshi Roy1, Nasrin Imanpour2, Ashhar Aziz3, Shashwat Bajpai4, Gurpreet Singh5,
Shwetangshu Biswas6, Kapil Wanaskar7, Parth Patwa8, Subhankar Ghosh9, Shreyas Dixit10,
Nilesh Ranjan Pal1, Vipula Rawte2, Ritvik Garimella2, Gaytri Jena11, Vasu Sharma12,
Vinija Jain12, Aman Chadha13, Aishwarya Naresh Reganti13 and Amitava Das14
1Kalyani Govt. Engg. College, 2AI Institute USC, 3IIIT Delhi, 4BITS Pilani Hyderabad, 5IIIT Guwahati, 6NIT Silchar, 7San José
State Univ., 8UCLA, 9Washington State Univ., 10VIIT, 11GITA, 12Meta AI, 13Amazon AI, 14BITS Pilani Goa
Abstract
Multimodal generative AI systems like Stable Diffusion, DALL-E, and MidJourney have fundamentally changed how synthetic
images are created. These tools drive innovation but also enable the spread of misleading content, false information, and
manipulated media. As generated images become harder to distinguish from photographs, detecting them has become an
urgent priority. To address this challenge, we release MS COCOAI, a novel dataset for AI-generated image detection consisting
of 96,000 real and synthetic data points, built using the MS COCO dataset. To generate synthetic images, we use five generators:
Stable Diffusion 3, Stable Diffusion 2.1, SDXL, DALL-E 3, and MidJourney v6. Based on the dataset, we propose two tasks: (1)
classifying images as real or generated, and (2) identifying which model produced a given synthetic image. The dataset is
available at https://huggingface.co/datasets/Rajarshi-Roy-research/Defactify_Image_Dataset.
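The two tasks proposed in the abstract can be framed as simple label mappings over the six image sources. Below is a minimal sketch; the source names mirror the paper, but the integer label encodings are illustrative assumptions, not the dataset's actual schema.

```python
# Sketch of the two detection tasks over the six image sources.
# Source names follow the paper; label encodings are assumptions.

SOURCES = [
    "Real (MS COCO)",
    "Stable Diffusion 2.1",
    "Stable Diffusion XL",
    "Stable Diffusion 3",
    "DALL-E 3",
    "MidJourney v6",
]

def task1_label(source: str) -> int:
    """Task 1: binary classification, real (0) vs. AI-generated (1)."""
    return 0 if source == "Real (MS COCO)" else 1

def task2_label(source: str) -> int:
    """Task 2: attribute a synthetic image to one of the five generators (0-4)."""
    generators = SOURCES[1:]  # the five generative models
    return generators.index(source)

print(task1_label("Real (MS COCO)"))  # -> 0
print(task2_label("DALL-E 3"))        # -> 3
```

Note that Task 2 is defined only over synthetic images, so real MS COCO images would be filtered out before attribution.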
Keywords
AI-Generated Images, Detection Techniques, Synthetic Media, Generative AI, Multimodal AI
Figure 1: Six images generated from the same caption, "Two men riding mopeds, one with a woman and boy riding along." Sources: Real (MS COCO), Stable Diffusion 2.1, Stable Diffusion XL, Stable Diffusion 3, DALL-E 3, and MidJourney v6. Each model produces visually distinct outputs, highlighting the challenge of AI-generated image detection.
Defactify 4.0: Multimodal Fact-Checking and AI-Generated Image Detection, March 2025, Philadelphia, Pennsylvania, USA
Contact: royrajarshi0123@gmail.com (R. Roy)
© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1. Introduction
Generative AI technologies such as Stable Diffusion [1], DALL-E [2], and MidJourney [3] have transformed the production of synthetic visual content. These tools, powered by advanced neural architectures, enable diverse applications in fields ranging from advertising and entertainment to design, with prompt quality playing a crucial role in generation outcomes [4]. However, the same innovations that
facilitate creative expression also present significant risks when misused. For example, the propagation
of misleading or harmful content can disrupt public discourse and undermine trust [5].
Recent high-profile incidents have demonstrated the societal impact of AI-generated images, from
fabricated depictions that trigger public panic to politically charged visuals intended to sway opinion
[6]. The rapid advancement of image generation models has further blurred the line between synthetic
and authentic imagery, challenging traditional detection methods and complicating efforts to combat
misinformation.
In light of these challenges, there is an urgent need for robust datasets that support the development
and evaluation of effective detection techniques. In this paper, we introduce a dataset specifically curated
for the detection and analysis of AI-generated images. Our dataset aggregates a diverse collection
of images produced by multiple generative models alongside authentic real-world images, and it is
enriched with detailed annotations—including the source model, creation timestamp, and relevant
contextual metadata. Figure 1 provides a sample of our dataset.
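For concreteness, a single annotated datapoint of the kind described above might look as follows. Every field name here is a hypothetical illustration of the annotations the paper lists (source model, creation timestamp, contextual metadata), not the released schema; consult the HuggingFace dataset card for the actual layout.

```python
# Hypothetical record layout for one datapoint. All field names are
# illustrative assumptions, not the dataset's actual schema.

datapoint = {
    "caption": "Two men riding mopeds, one with a woman and boy riding along.",
    "image_path": "images/coco_000001.jpg",  # hypothetical path
    "source_model": "Stable Diffusion 3",    # or "Real (MS COCO)"
    "created_at": "2025-03-01T00:00:00Z",    # creation timestamp
    "is_generated": True,                    # Task 1 target
}

# The Task 1 target follows directly from the source annotation:
assert datapoint["is_generated"] == (datapoint["source_model"] != "Real (MS COCO)")
```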
By providing a large-scale, representative benchmark, our dataset aims to advance research in
synthetic media detection and foster the development of scalable countermeasures against AI-enabled
disinformation. Building upon the foundations laid by initiatives such as the Defactify workshop series
[7], this work bridges the gap between academic inquiry and practical implementation, offering a
valuable resource for researchers, policymakers, and industry stakeholders committed to safeguarding
the integrity of digital information ecosystems. This work complements parallel efforts addressing
AI-generated text detection [8].
2. Related Work
The rapid growth of generative models has led to highly realistic AI-generated images, making it harder
to tell them apart from real images. This section reviews existing datasets and detection methods.
2.1. AI-Generated Image Datasets
Several datasets have been introduced for AI-generated image detection:
• WildFake: Hong et al. [9] collected fake images from various open-source platforms, covering
diverse categories from GANs and diffusion models.