A robust morphological classification method for galaxies using dual-encoding contrastive learning and multi-clustering voting on JWST/NIRCam images

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

The two-step galaxy morphology classification framework USmorph successfully combines unsupervised machine learning (UML) with supervised machine learning (SML) methods. To enhance the UML step, we employed a dual-encoder architecture (ConvNeXt and ViT) to effectively encode images, contrastive learning to accurately extract features, and principal component analysis to efficiently reduce dimensionality. Based on this improved framework, a sample of 46,176 galaxies at $0<z<4.2$, selected in the COSMOS-Web field, is classified into five types using the JWST near-infrared images: 33% spherical (SPH), 25% early-type disk (ETD), 25% late-type disk (LTD), 7% irregular (IRR), and 10% unclassified (UNC) galaxies. We also performed parametric (Sérsic index, $n$, and effective radius, $r_{\rm e}$) and nonparametric measurements (Gini coefficient, $G$, the second-order moment of light, $M_{20}$, concentration, $C$, multiplicity, $\Psi$, and three other parameters from the MID statistics) for massive galaxies ($M_*>10^9 M_\odot$) to verify the validity of our galaxy morphological classification system. The analysis of morphological parameters is consistent with our classification system: SPH and ETD galaxies with higher $n$, $G$, and $C$ tend to be more bulge-dominated and more compact compared with other types of galaxies. This demonstrates the reliability of this classification system, which will be useful for a forthcoming large-sky survey from the Chinese Space Station Telescope.

💡 Research Summary

This paper presents an advanced two‑step galaxy morphology classification framework, USmorph, that integrates unsupervised machine learning (UML) with supervised machine learning (SML) to exploit the unprecedented depth and resolution of JWST/NIRCam near‑infrared imaging. The authors enhance the UML component by introducing a dual‑encoder architecture—ConvNeXt (a convolutional network) and Vision Transformer (ViT)—to capture complementary local‑texture and global‑structure information. Features from both encoders are refined through contrastive learning, which maximizes similarity between augmented views of the same galaxy while pushing apart representations of different galaxies, thereby producing a robust, high‑dimensional embedding space. Principal component analysis (PCA) subsequently reduces the dimensionality to a tractable size without sacrificing discriminative power.
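The contrastive objective described above can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss with cosine similarity and a temperature parameter; the summary does not state which loss the authors use, so the function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(views_a, views_b, temperature=0.5):
    """Mean InfoNCE-style loss over a batch (assumed form, not the paper's).

    views_a[i] and views_b[i] are embeddings of two augmented views of
    galaxy i (the positive pair); views_b[j] with j != i act as negatives.
    """
    losses = []
    for i, anchor in enumerate(views_a):
        logits = [cosine(anchor, other) / temperature for other in views_b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings (hypothetical values): matched views point the same way.
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
# Mismatched case: each second view resembles the other galaxy instead.
swapped_b = [aligned_b[1], aligned_b[0]]

loss_aligned = info_nce_loss(aligned_a, aligned_b)
loss_swapped = info_nce_loss(aligned_a, swapped_b)
```

The loss is lower when the two views of the same galaxy are close and views of different galaxies are far apart, which is the behavior the training is meant to encourage.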

Prior to feature extraction, the raw images undergo two preprocessing steps: a convolutional auto‑encoder (CAE) denoises the data, and an adaptive polar coordinate transformation (APCT) enforces rotational invariance, addressing known CNN sensitivities to galaxy orientation. The cleaned, rotation‑invariant images are fed into the dual‑encoder pipeline, producing embeddings that are clustered using three independent algorithms (K‑means, DBSCAN, HDBSCAN). A bagging‑style voting scheme aggregates the cluster assignments, mitigating the biases of any single algorithm and yielding a consensus label for the majority of objects. Galaxies that remain unassigned after voting are marked as “unclassified” (UNC).
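The voting step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes every clustering run yields the same number of clusters with no noise label, aligns cluster IDs by brute-force permutation against a reference run, and uses a hypothetical `min_votes` threshold to decide when a label counts as consensus.

```python
from collections import Counter
from itertools import permutations

def align_labels(reference, labels, n_clusters):
    """Relabel `labels` so its cluster IDs best overlap with `reference`.

    Brute-forces every permutation of cluster IDs, which is only
    practical for a small number of clusters.
    """
    best_mapping, best_overlap = None, -1
    for perm in permutations(range(n_clusters)):
        overlap = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if overlap > best_overlap:
            best_mapping, best_overlap = perm, overlap
    return [best_mapping[l] for l in labels]

def vote(label_sets, min_votes, unclassified="UNC"):
    """Keep a label only if at least `min_votes` clusterings agree on it."""
    consensus = []
    for votes in zip(*label_sets):
        label, count = Counter(votes).most_common(1)[0]
        consensus.append(label if count >= min_votes else unclassified)
    return consensus

# Hypothetical cluster assignments for six galaxies from three algorithms.
run_a = [0, 0, 1, 1, 2, 2]
run_b = [1, 1, 2, 2, 0, 0]  # same partition as run_a, different IDs
run_c = [0, 1, 1, 2, 2, 0]  # disagrees on several objects

aligned = [run_a] + [align_labels(run_a, run, 3) for run in (run_b, run_c)]
consensus = vote(aligned, min_votes=3)
```

Requiring unanimous agreement (`min_votes=3`) trades sample size for purity: the objects on which the runs disagree become "UNC", while a simple majority (`min_votes=2`) would label all six.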

The second stage employs a supervised GoogLeNet classifier trained on the consensus‑labeled set. This model predicts labels for the UNC subset and refines the overall classification, achieving a final accuracy of ~92 % on the full sample. The authors apply the full pipeline to 46,176 galaxies at 0 < z < 4.2 selected in the COSMOS‑Web field. The galaxies are divided into five morphological categories: spherical (SPH, 33 %), early‑type disk (ETD, 25 %), late‑type disk (LTD, 25 %), irregular (IRR, 7 %), and unclassified (UNC, 10 %).

To validate the physical relevance of the machine‑generated classes, the authors compute both parametric (Sérsic index n, effective radius rₑ) and non‑parametric (Gini coefficient G, second‑order moment of light M₂₀, concentration C, multiplicity Ψ, and three other parameters from the MID statistics) morphological metrics for massive galaxies (M* > 10⁹ M⊙). The results show that SPH and ETD galaxies exhibit higher n, G, and C values, indicating more centrally concentrated, bulge‑dominated structures, while LTD and IRR systems display lower concentrations and larger effective radii, consistent with disk‑dominated or disturbed morphologies. The UNC group tends to have lower signal‑to‑noise or contamination from bright stars, suggesting that further improvements in preprocessing could reduce this fraction.
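As an illustration of one of the non-parametric metrics, the Gini coefficient G can be computed from a galaxy's pixel fluxes using the standard definition on sorted absolute values. The sketch below is not the authors' measurement code: it assumes the pixels belonging to the galaxy have already been selected, and the function name and toy flux values are hypothetical.

```python
def gini(fluxes):
    """Gini coefficient of a galaxy's pixel fluxes.

    G = sum_i (2i - n - 1) |X_i| / (mean(|X|) * n * (n - 1)),
    with the |X_i| sorted in increasing order and i running from 1 to n.
    """
    values = sorted(abs(f) for f in fluxes)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("all pixel fluxes are zero")
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))

# Toy pixel lists (hypothetical values).
uniform = [1.0, 1.0, 1.0, 1.0]       # light spread evenly
concentrated = [0.0, 0.0, 0.0, 5.0]  # all light in one pixel

g_uniform = gini(uniform)
g_concentrated = gini(concentrated)
```

G is 0 when the light is spread evenly over the pixels and 1 when a single pixel holds all of it, so the higher G reported for SPH and ETD galaxies corresponds to light concentrated in fewer pixels.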

Quantitatively, the dual‑encoder plus contrastive learning improves clustering purity by ~5 % compared with a single‑encoder baseline, and the voting ensemble adds another ~3 % gain. The supervised stage further refines the labeling, especially for boundary cases, leading to the reported overall accuracy. The methodology is scalable and well‑suited for upcoming massive surveys such as the Chinese Space Station Telescope (CSST), where billions of galaxies will require automated, reliable morphological classification.

In conclusion, the paper demonstrates that combining dual‑encoder contrastive feature learning with multi‑model clustering and a supervised refinement step yields a robust, high‑throughput pipeline for galaxy morphology classification. It validates the approach with extensive physical parameter analysis and positions the framework as a ready‑to‑deploy tool for next‑generation wide‑field surveys such as CSST, while also outlining future extensions involving domain adaptation and multimodal (spectral‑plus‑imaging) learning.
