Real-IAD Variety: Pushing Industrial Anomaly Detection Dataset to a Modern Era

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, refer to the original arXiv paper.

Industrial Anomaly Detection (IAD) is a cornerstone for ensuring operational safety, maintaining product quality, and optimizing manufacturing efficiency. However, the advancement of IAD algorithms is severely hindered by the limitations of existing public benchmarks. Current datasets often suffer from restricted category diversity and insufficient scale, leading to performance saturation and poor model transferability in complex, real-world scenarios. To bridge this gap, we introduce Real-IAD Variety, the largest and most diverse IAD benchmark. It comprises 198,950 high-resolution images across 160 distinct object categories. The dataset ensures unprecedented diversity by covering 28 industries, 24 material types, 22 color variations, and 27 defect types. Our extensive experimental analysis highlights the substantial challenges posed by this benchmark: state-of-the-art multi-class unsupervised anomaly detection methods suffer significant performance degradation (ranging from 10% to 20%) when scaled from 30 to 160 categories. Conversely, we demonstrate that zero-shot and few-shot IAD models exhibit remarkable robustness to category scale-up, maintaining consistent performance and significantly enhancing generalization across diverse industrial contexts. This unprecedented scale positions Real-IAD Variety as an essential resource for training and evaluating next-generation foundation IAD models.

💡 Research Summary

The paper addresses a critical bottleneck in Industrial Anomaly Detection (IAD): the limited diversity and scale of existing public benchmarks, which hampers the development of unified, multi‑class, and zero‑/few‑shot models. To overcome this, the authors introduce Real‑IAD Variety, a new benchmark that dramatically expands the scope of IAD data. The dataset contains 198,950 high‑resolution images covering 160 distinct object categories drawn from 28 industrial domains, 24 material types, 22 colour variations, and 27 defect types. Each object is captured from five viewpoints (one top‑down and four oblique side cameras), enabling realistic multi‑view analysis where certain defects are only visible from specific angles.
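For a rough sense of scale, the headline counts above can be turned into averages. The sketch below is only arithmetic on the reported totals. The even split across categories and the assumption that every sample contributes exactly five views are simplifications made here, not figures from the paper.

```python
# Reported totals from the summary above.
TOTAL_IMAGES = 198_950
NUM_CATEGORIES = 160
VIEWS_PER_SAMPLE = 5  # one top-down plus four oblique side views

# Average images per category; says nothing about the real per-category split.
images_per_category = TOTAL_IMAGES / NUM_CATEGORIES

# Number of five-view captures, ASSUMING no sample is missing a view.
samples_if_all_views = TOTAL_IMAGES // VIEWS_PER_SAMPLE

print(f"{images_per_category:.1f} images per category on average")
print(f"{samples_if_all_views} five-view captures if no view is missing")
```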

Data collection follows a three‑stage pipeline. First, a dedicated team spent over 11,000 hours preparing a diverse set of materials and artificially introducing four defect variations per material, covering 27 defect types in total (e.g., scratches, deformations, missing parts, contamination). Second, a custom acquisition rig was built with a 5,328 × 3,040‑pixel top‑down sensor and four 4,096 × 3,000‑pixel peripheral sensors, all illuminated by an RGBW multispectral light source, achieving sub‑millimetre lateral accuracy and <0.3 µm Z‑repeatability. Third, expert annotators produced pixel‑level masks, which were iteratively refined through algorithmic cross‑validation until annotation variance fell below a predefined error threshold.

The benchmark supports three evaluation settings: (1) Multi‑class Unsupervised Anomaly Detection (MU‑AD), where a single model must learn normality across all 160 categories; (2) Multi‑view Anomaly Detection (MV‑AD), which fuses information from the five viewpoints; and (3) Zero‑Shot/Few‑Shot Anomaly Detection (ZSAD/FSAD), which leverages large‑scale vision‑language models (VLMs) such as CLIP. Extensive experiments reveal that state‑of‑the‑art MU‑AD methods (PatchCore, CFA, DRAEM, etc.) suffer a 10‑20% drop in AUROC when scaling from 30 to 160 categories, highlighting a fundamental scalability limitation. In contrast, VLM‑based ZSAD/FSAD approaches, enhanced with dual‑class textual prompts, regional patch‑wise windows, and learnable hybrid prompts, maintain stable performance regardless of category count and surpass MU‑AD methods on the full benchmark.
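The idea of dual‑class textual prompts can be illustrated with a minimal sketch: an image embedding is compared against one "normal" and one "abnormal" text embedding, and a softmax over the two similarities gives an anomaly probability. This is a generic CLIP‑style formulation, not the authors' implementation. The function names, the toy 2‑D embeddings, and the temperature of 0.07 are assumptions chosen for illustration; in practice the embeddings would come from a VLM's image and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def anomaly_score(image_emb, normal_emb, abnormal_emb, temperature=0.07):
    """Softmax over similarities to the two prompts; returns P(abnormal).

    `temperature` is an assumed value, not one taken from the paper.
    """
    s_normal = cosine(image_emb, normal_emb) / temperature
    s_abnormal = cosine(image_emb, abnormal_emb) / temperature
    # Subtract the max before exponentiating for numerical stability.
    m = max(s_normal, s_abnormal)
    e_normal = math.exp(s_normal - m)
    e_abnormal = math.exp(s_abnormal - m)
    return e_abnormal / (e_normal + e_abnormal)

# Toy embeddings (hypothetical): orthogonal "normal" and "abnormal" prompts.
normal_prompt = [1.0, 0.0]
abnormal_prompt = [0.0, 1.0]

ambiguous = anomaly_score([1.0, 1.0], normal_prompt, abnormal_prompt)
defective = anomaly_score([0.0, 1.0], normal_prompt, abnormal_prompt)
clean = anomaly_score([1.0, 0.0], normal_prompt, abnormal_prompt)
```

Applying the same scoring to patch embeddings instead of a single image embedding would give a coarse anomaly map, which is one way to read the patch‑wise windows mentioned above.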

Multi‑view experiments demonstrate that integrating side‑view data improves AUROC by 5% on average, confirming the practical value of multi‑angle inspection in manufacturing lines. The dataset’s rich metadata (industry, material, colour, defect type) also enables fine‑grained analysis of model robustness to specific attributes. Evaluation metrics include image‑level AUROC/AUPR, pixel‑level IoU, and view‑fusion efficiency, providing a comprehensive protocol for future research.
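To see why extra views can raise AUROC, consider the following sketch. It fuses per‑view anomaly scores by taking the maximum, so a defect visible from a single angle still flags the whole sample, and it computes AUROC as the fraction of (anomalous, normal) pairs ranked correctly. The max‑fusion rule is an assumption made here, since the summary does not specify the fusion method, and the scores are toy values, not results from the paper.

```python
def auroc(labels, scores):
    """AUROC as the probability that an anomalous sample outscores a normal one.

    Ties count as half a win. labels: 1 = anomalous, 0 = normal.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def fuse_views(view_scores):
    """Sample-level score: the most suspicious view decides (assumed rule)."""
    return max(view_scores)

# Toy data: (scores for [top, side1, side2, side3, side4], label).
samples = [
    ([0.10, 0.20, 0.10, 0.15, 0.10], 0),
    ([0.20, 0.10, 0.90, 0.10, 0.20], 1),  # defect visible only from side2
    ([0.30, 0.25, 0.20, 0.30, 0.20], 0),
    ([0.20, 0.10, 0.10, 0.80, 0.10], 1),  # defect visible only from side3
]
labels = [label for _, label in samples]

top_only_auroc = auroc(labels, [views[0] for views, _ in samples])
fused_auroc = auroc(labels, [fuse_views(views) for views, _ in samples])
```

In this toy case the top view alone cannot separate the classes, while the fused score ranks both defective samples above both normal ones.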

By offering unprecedented category diversity, industrial coverage, and high‑quality annotations, Real‑IAD Variety positions itself as a foundational resource for training and benchmarking next‑generation foundation IAD models. The authors release the dataset, code, and evaluation scripts publicly, fostering reproducible research and accelerating progress toward robust, scalable anomaly detection systems applicable across real‑world manufacturing environments.
