Identification of 2D colloidal assemblies in images: a threshold processing method versus machine learning
This paper is devoted to the problem of identifying colloidal assemblies, using the example of two-dimensional coatings (monolayer assemblies). Colloidal systems are used in various fields of science and technology, for example, in photonics applications and functional coatings. The physical properties of such systems depend on the morphology of the colloidal assemblies, so effective identification of particle assemblies is of practical interest. The following classification is considered here: isolated particles, dimers, chains, and clusters. We have studied and compared two identification methods: image threshold analysis using the OpenCV library and machine learning using the YOLOv8 model as an example. The features and current results of training a neural-network model on a dataset specially prepared for this work are described, and a comparative evaluation of both methods is given. The best result was shown by the machine-learning method (97% accuracy), whereas the threshold-processing method achieved an accuracy of about 67%. The developed algorithms and software modules may be useful to scientists and engineers working in the field of materials science.
💡 Research Summary
The paper addresses the problem of automatically identifying and classifying particle assemblies in two‑dimensional colloidal monolayer images. Four structural categories are defined: isolated particles, dimers, linear chains, and clusters (including closed loops). The authors compare two fundamentally different approaches: a classical image‑processing pipeline built with the OpenCV library and a modern deep‑learning object‑detection model (YOLOv8).
In the threshold‑based pipeline, the authors first convert each raw image to a binary mask. After testing several binarization strategies, they select Otsu’s global threshold because it requires no manual parameter tuning and performed best on their dataset. To separate touching particles, they optionally apply a watershed segmentation step. Detected contours are filtered by area, and each particle’s circularity (4πA/P²) is computed. The threshold type (binary or inverse binary) that yields the highest average circularity is automatically chosen for each image, eliminating the need for per‑image manual adjustment.
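The circularity criterion used to select the threshold type can be illustrated with a short sketch. In the actual pipeline the area and perimeter would come from OpenCV's `cv2.contourArea` and `cv2.arcLength` on each detected contour; the minimal pure-Python version below only demonstrates the metric itself (4πA/P² equals 1.0 for a perfect circle and drops for less circular shapes):

```python
import math

def circularity(area, perimeter):
    """Circularity metric 4*pi*A / P**2: 1.0 for a perfect circle,
    smaller for elongated or irregular contours."""
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * area / perimeter ** 2

# Perfect circle of radius r: A = pi*r^2, P = 2*pi*r  ->  1.0
r = 5.0
print(circularity(math.pi * r ** 2, 2.0 * math.pi * r))  # 1.0

# Square of side s: A = s^2, P = 4s  ->  pi/4 ~ 0.785
s = 4.0
print(circularity(s * s, 4.0 * s))
```

In the pipeline described above, this score would be averaged over all contours produced by each threshold type (binary vs. inverse binary), and the type with the higher mean circularity would be kept.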
Particle positions and radii are extracted from the binary mask. A dynamic neighbor radius Rcut = r₁ + r₂ + ε (with ε = (r₁ + r₂)/10, i.e., 2R + ε for equal radii R) is defined to account for slight size variations. For every particle pair, the Euclidean distance between centers is compared with Rcut; if the distance is smaller, the particles are considered neighbors. Because the test images contain fewer than a thousand particles, the authors implement a naïve O(N²) all‑pairs search, noting that a spatial grid could reduce complexity for larger datasets.
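The naïve all-pairs neighbor search can be sketched as follows. This is an illustrative reconstruction, not the authors' code; it interprets the cutoff as the sum of the two radii plus the tolerance ε = (r₁ + r₂)/10, which reduces to 2R + ε for equal radii:

```python
import math

def find_neighbors(particles):
    """Naive O(N^2) all-pairs neighbor search.

    particles: list of (x, y, r) tuples (center coordinates and radius).
    Particles i and j are neighbors if the distance between their centers
    is below Rcut = r_i + r_j + eps, with eps = (r_i + r_j) / 10.
    Returns an adjacency list: neighbors[i] is the set of indices near i.
    """
    n = len(particles)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        x1, y1, r1 = particles[i]
        for j in range(i + 1, n):
            x2, y2, r2 = particles[j]
            eps = (r1 + r2) / 10.0
            rcut = r1 + r2 + eps
            if math.hypot(x2 - x1, y2 - y1) < rcut:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors
```

For a few hundred particles per image, the quadratic cost is negligible; as the text notes, binning particles into a grid of cell size ≈ Rcut would bring the search down to roughly linear time for larger systems.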
The neighbor graph is then traversed using a depth‑first search (DFS) algorithm to identify connected components. The number of neighbors per particle within each component determines its class: zero neighbors → isolated, exactly one mutual neighbor → dimer, two neighbors for interior particles and one for the two ends → chain, and any other configuration → cluster (including loops). The pipeline is fully deterministic, runs on a CPU, and processes an image in a few seconds. However, the authors observe that illumination gradients, noise, and imperfect watershed separation cause missed or merged particles, leading to an overall classification accuracy of about 67%.
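The DFS traversal and degree-based classification rules can be sketched as below (an illustrative reconstruction under the classification rules stated above, using an iterative DFS over the adjacency list; not the authors' implementation):

```python
def classify_components(neighbors):
    """Group particles into connected components via iterative DFS and
    classify each component by its per-particle neighbor counts:
      size 1                         -> isolated
      size 2                         -> dimer
      exactly two degree-1 ends and
      all other degrees equal to 2   -> chain
      anything else                  -> cluster (includes closed loops)
    neighbors: adjacency list, neighbors[i] is a set of indices.
    Returns a list of (component_indices, label) pairs.
    """
    n = len(neighbors)
    seen = [False] * n
    results = []
    for start in range(n):
        if seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:                      # iterative DFS
            v = stack.pop()
            comp.append(v)
            for w in neighbors[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        degrees = [len(neighbors[v]) for v in comp]
        if len(comp) == 1:
            label = "isolated"
        elif len(comp) == 2:
            label = "dimer"
        elif degrees.count(1) == 2 and all(d in (1, 2) for d in degrees):
            label = "chain"
        else:
            label = "cluster"  # loops have no degree-1 ends, so they land here
        results.append((comp, label))
    return results
```

Note that a closed loop is correctly routed to "cluster": every particle in a loop has two neighbors, so the chain condition (exactly two degree-1 endpoints) fails.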
For the machine‑learning approach, the authors construct a bespoke dataset of roughly 2,000 annotated images, each labeled with one of the four classes. They adopt YOLOv8, a state‑of‑the‑art single‑stage detector, and fine‑tune it using transfer learning. Data augmentation (random rotations, scaling, brightness changes) and a cosine learning‑rate schedule are employed to improve generalization. Training proceeds for 100 epochs on a GPU, after which the model achieves a mean average precision of 97 % on a held‑out test set. The deep‑learning model is robust to variations in lighting and particle overlap because it learns global contextual cues rather than relying on explicit segmentation. Inference requires a GPU and takes on the order of 100–200 ms per image, which is slower than the CPU‑based threshold method but still suitable for many laboratory workflows.
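A training setup of the kind described could be expressed with a standard Ultralytics dataset configuration; the file below is a hypothetical sketch (the paths, class order, and model size are assumptions, not taken from the paper):

```yaml
# data.yaml — hypothetical dataset description for YOLOv8 fine-tuning
path: colloids_dataset        # dataset root (assumed layout)
train: images/train
val: images/val
names:
  0: isolated
  1: dimer
  2: chain
  3: cluster
```

Training with transfer learning and a cosine learning-rate schedule would then be launched with the Ultralytics CLI, e.g. `yolo detect train data=data.yaml model=yolov8n.pt epochs=100 cos_lr=True`, where the pretrained checkpoint serves as the starting point and built-in augmentations (rotation, scaling, brightness) are configured via the corresponding hyperparameters.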
The comparative analysis highlights a clear trade‑off: the classical pipeline is fast, lightweight, and easy to implement but suffers from low robustness and modest accuracy; the YOLOv8 solution demands more computational resources and a labeled training set but delivers near‑human performance across diverse imaging conditions. The authors suggest a hybrid strategy—using fast thresholding for coarse pre‑screening and invoking the deep‑learning detector for detailed analysis—as a practical compromise. They also propose future extensions such as integrating watershed‑derived masks as additional channels for the neural network, employing graph‑neural‑network classifiers on the neighbor graph, and scaling the pipeline to multi‑layer or polydisperse particle systems.
In conclusion, the study provides a thorough, side‑by‑side evaluation of traditional image processing versus modern deep learning for colloidal assembly identification. The results demonstrate that while classical methods remain valuable for rapid, low‑cost screening, deep‑learning‑based object detection currently offers the most reliable and accurate solution for quantitative colloid science and related material‑engineering applications.