About an Automating Annotation Method for Robot Markers
Factory automation has become increasingly important due to labor shortages, leading to the introduction of autonomous mobile robots for tasks such as material transportation. Markers are commonly used for robot self-localization and object identification. In the RoboCup Logistics League (RCLL), ArUco markers are employed both for robot localization and for identifying processing modules. Conventional recognition relies on OpenCV-based image processing, which detects the black-and-white marker pattern directly; however, such methods often fail under noise, motion blur, defocus, or varying illumination. Deep-learning-based recognition is more robust under these conditions but requires large amounts of annotated data. Annotation must typically be done manually, since the type and position of arbitrary objects cannot be determined automatically, making dataset preparation a major bottleneck. ArUco markers are an exception: the ArUco library's built-in detection module already provides both marker ID and position, which makes automatic annotation possible. This paper proposes an automated annotation method for training deep-learning models on ArUco marker images. By leveraging the detection results of the ArUco module, the proposed approach eliminates manual labeling. A YOLO-based model is trained on the automatically annotated dataset, and its performance is evaluated under various conditions. Experimental results show that the proposed method outperforms conventional image-processing techniques, particularly on images affected by blur or defocus. Automatic annotation also reduces human effort and ensures consistent labeling quality. Future work will investigate the relationship between confidence thresholds and recognition performance.
💡 Research Summary
The paper addresses the problem of robust marker detection for autonomous mobile robots used in factory logistics, focusing on ArUco markers that are widely employed in the RoboCup Logistics League (RCLL) for robot self‑localization and module identification. Traditional OpenCV‑based pipelines detect the black‑and‑white pattern of the marker directly, but they degrade severely under motion blur, defocus, illumination changes, noise, or partial occlusion—conditions that are common in real‑world production environments. While deep‑learning‑based object detectors such as YOLO can handle these degradations, they require large, accurately annotated datasets, and manual annotation of thousands of images is a major bottleneck.
The authors propose an automated annotation pipeline that leverages the built‑in detection capabilities of the ArUco library. The pipeline works as follows: (1) capture images with a standard RGB camera; (2) run the OpenCV‑ArUco module to extract each marker’s corner coordinates and its unique ID; (3) convert the corner coordinates into a bounding‑box in image space; (4) map the ID to a class label; and (5) store the resulting annotation in the YOLO format. Because the ArUco module already provides both class (ID) and pose information, no human intervention is required to generate the training, validation, and test labels.
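Steps (3)–(5) of this pipeline reduce to a small geometric conversion. The sketch below shows it in plain Python; the detection call in the comment is OpenCV's ArUco API, but the function name `corners_to_yolo` and the example corner coordinates are illustrative assumptions, not the authors' code.

```python
# Sketch of pipeline steps (3)-(5): turn one ArUco detection into a YOLO label.
# Step (2) would come from OpenCV's ArUco module, e.g.
#   corners, ids, _ = cv2.aruco.detectMarkers(image, dictionary)
# Here we assume a marker's four (x, y) pixel corners are already available.

def corners_to_yolo(corners, marker_id, img_w, img_h):
    """Convert four (x, y) pixel corners into one YOLO label line:
    '<class> <x_center> <y_center> <width> <height>', all normalized to [0, 1].
    The marker ID is used directly as the class index."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{marker_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# Example: a marker seen at a slight angle in a 640x480 frame.
label = corners_to_yolo([(100, 120), (180, 110), (190, 200), (95, 195)], 7, 640, 480)
print(label)  # -> "7 0.222656 0.322917 0.148438 0.187500"
```

Taking the axis-aligned bounding box of the four corners (rather than the rotated quadrilateral) matches what the YOLO format can express; one such line per detected marker is written to the image's label file.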
The experimental setup uses a Logitech HD Pro Webcam C920 and 28 different 5×5‑cell ArUco markers. For each marker, roughly 130 images are collected for training, 25 for validation, and 200 for testing. The dataset is deliberately diversified: motion blur, a range of viewing angles and distances, and extensive photometric augmentations (brightness ±100 %, contrast ±50 %, hue ±2.8 %, saturation ±70 %, scale ±50 %) are applied to mimic realistic factory conditions. A YOLO model (the authors used a recent YOLOv5/v7 architecture) is trained on the automatically generated annotations with identical hyper‑parameters across all experiments.
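The brightness and contrast jitter above can be sketched in plain Python. The ±100% brightness and ±50% contrast ranges are the ones listed in the summary, but the transform itself (contrast scaling around mid-gray, then a brightness offset, then clamping) and the flat grayscale-list representation are assumptions of this sketch; hue, saturation, and scale jitter are omitted.

```python
import random

def jitter_pixel(p, contrast, brightness):
    """Scale contrast around mid-gray (128), add a brightness offset,
    and clamp to the valid 8-bit range [0, 255]."""
    v = (p - 128) * contrast + 128 + brightness
    return max(0, min(255, int(round(v))))

def augment(image, rng=random):
    """Apply one random brightness/contrast jitter to a grayscale image
    given as a flat list of pixel values 0-255. Ranges follow the paper:
    contrast +/-50%, brightness +/-100% of the 8-bit range."""
    contrast = 1.0 + rng.uniform(-0.5, 0.5)
    brightness = rng.uniform(-1.0, 1.0) * 255
    return [jitter_pixel(p, contrast, brightness) for p in image]

# Example: jitter a tiny 5-pixel "image" with a fixed seed for reproducibility.
aug = augment([0, 64, 128, 192, 255], random.Random(42))
```

A key point of such photometric augmentation is that the bounding boxes are unchanged, so the automatically generated YOLO labels can be reused verbatim for every augmented copy.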
Performance is evaluated by varying the detection confidence threshold from 0.3 to 0.8 in steps of 0.1. Two metrics are reported: (i) recognition rate – the proportion of test images where the marker is correctly detected and its ID correctly identified; and (ii) mis‑identification rate – the proportion of detections that either assign a wrong ID or falsely label a non‑marker as a marker. Results show a clear trade‑off: lower thresholds increase recall but also raise false positives, while higher thresholds reduce false positives at the cost of missed detections. The optimal range (0.4–0.7) yields a recognition rate improvement of roughly 12 percentage points over the conventional OpenCV method, and a mis‑identification reduction of about 8 percentage points. Notably, under severe blur or defocus the YOLO model maintains >85 % accuracy, whereas the traditional method drops below 60 %.
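The threshold sweep and the two metrics can be illustrated with a toy scorer. The `(confidence, predicted_id, true_id)` record layout and the exact denominators below are assumptions of this sketch, not the paper's evaluation code.

```python
def sweep_metrics(detections, n_images, thresholds):
    """Score a detector at several confidence thresholds.

    detections: one (confidence, predicted_id, true_id) tuple per detection;
    true_id is None when the detector fired on a non-marker region.
    Returns (threshold, recognition_rate, misidentification_rate) rows.
    """
    rows = []
    for t in thresholds:
        kept = [d for d in detections if d[0] >= t]
        correct = sum(1 for _, pred, true in kept
                      if true is not None and pred == true)
        wrong = len(kept) - correct  # wrong-ID or false-positive detections
        recognition = correct / n_images            # over all test images
        misid = wrong / len(kept) if kept else 0.0  # over detections kept
        rows.append((t, recognition, misid))
    return rows

# Toy example: 5 test images, 4 detections (one wrong ID, one false positive).
rows = sweep_metrics(
    [(0.9, 1, 1), (0.5, 2, 3), (0.35, 4, None), (0.75, 5, 5)],
    n_images=5,
    thresholds=[0.3, 0.6],
)
# Raising the threshold from 0.3 to 0.6 discards both bad detections here,
# cutting the mis-identification rate from 0.5 to 0.0 at equal recognition.
```

This also makes the reported trade-off concrete: missed detections lower only the recognition rate, while kept-but-wrong detections raise the mis-identification rate, so the two metrics respond to the threshold in opposite directions.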
Beyond accuracy, the automated annotation process dramatically cuts human effort, saving more than 95 % of the time normally spent on manual labeling and eliminating inter‑annotator variability. The authors acknowledge that when a marker is heavily occluded or damaged, the ArUco detector itself fails, leaving the pipeline without a label. Future work is suggested to integrate auxiliary sensors (e.g., depth cameras) or to develop semi‑automatic correction tools for such failure cases, as well as to study the relationship between confidence thresholds and overall system performance.
In summary, the paper demonstrates that ArUco markers can serve as self‑annotating data sources, enabling large‑scale, high‑quality training sets for deep‑learning‑based detection without manual effort. This approach not only improves robustness to challenging visual conditions but also reduces the cost and time associated with dataset creation, making deep‑learning‑driven marker detection a practical solution for modern automated manufacturing and logistics environments.