Convolutional Neural Networks for classifying galaxy mergers: Can faint tidal features aid in classifying mergers?
Identifying mergers from observational data has been a crucial aspect of studying galaxy evolution and formation. Tidal features, typically fainter than 26 ${\rm mag\,arcsec^{-2}}$, exhibit a diverse range of appearances depending on the merger characteristics and are expected to be investigated in greater detail with the Rubin Observatory Large Synoptic Survey Telescope (LSST), which will reveal the low surface brightness universe with unprecedented precision. Our goal is to assess the feasibility of developing a convolutional neural network (CNN) that can distinguish between mergers and non-mergers based on LSST-like deep images. To this end, we used Illustris TNG50, one of the highest-resolution cosmological hydrodynamic simulations to date, allowing us to generate LSST-like mock images with a depth $\sim$ 29 ${\rm mag\,arcsec^{-2}}$ for low-redshift ($z=0.16$) galaxies, with labeling based on their merger status as ground truth. We focused on 151 Milky Way-like galaxies in field environments, comprising 81 non-mergers and 70 mergers. After applying data augmentation and hyperparameter tuning, a CNN model was developed with an accuracy of 65–67%. Through additional image processing, the model was further optimized, achieving an accuracy of 67–70% when trained on images containing only faint features. This represents an improvement of $\sim$ 5% compared to training on images with bright features only. This suggests that faint tidal features can serve as effective indicators for distinguishing between mergers and non-mergers. The future direction for further improvement based on this study is also discussed.
💡 Research Summary
This paper investigates whether faint tidal features, detectable only in very deep imaging such as that expected from the Rubin Observatory LSST, can improve the automated classification of galaxy mergers using convolutional neural networks (CNNs). The authors employ the state‑of‑the‑art Illustris TNG50 cosmological hydrodynamic simulation, which provides sufficient spatial resolution to resolve low‑surface‑brightness structures down to the LSST depth of ~29 mag arcsec⁻². From the simulation they select 151 Milky‑Way‑mass central galaxies at z ≈ 0.16, of which 70 have experienced a merger with a stellar‑mass ratio > 1:10 within the past 2 Gyr (including both major and minor events) and 81 are classified as non‑mergers.
Mock LSST‑like images are generated in the K‑band (chosen to minimize dust effects while still tracing stellar mass) by projecting star particles within 20 effective radii onto a 600 × 600 pixel grid (0.2″ per pixel). The images are convolved with a Gaussian kernel to mimic 0.7″ seeing and are optionally injected with background noise. Four distinct preprocessing schemes are applied: (1) the full image (bright core + faint outskirts), (2) masking bright regions to retain only faint tidal features, (3) masking faint regions to keep only bright structures, and (4) an inverted version of (2) where faint features are given higher normalized pixel values.
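The seeing convolution and the bright/faint masking described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' pipeline: the function names, the photometric `zero_point` of 30.0, and the `mu_cut` of 26 mag arcsec⁻² are assumptions chosen for the example (the summary does not state the threshold the authors used), and the image is assumed to hold non‑negative flux per pixel. Only the 0.2″ pixel scale and the 0.7″ seeing come from the text.

```python
import numpy as np

PIXEL_SCALE = 0.2  # arcsec per pixel, from the summary
SEEING_FWHM = 0.7  # arcsec, from the summary


def fwhm_to_sigma_pix(fwhm_arcsec, pixel_scale):
    """Convert a Gaussian FWHM in arcsec to a standard deviation in pixels."""
    return fwhm_arcsec / pixel_scale / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gaussian_kernel_1d(sigma_pix):
    """Normalized 1D Gaussian kernel truncated at about 4 sigma."""
    half_width = int(np.ceil(4.0 * sigma_pix))
    x = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    return kernel / kernel.sum()


def convolve_seeing(image, sigma_pix):
    """Blur a 2D image with a separable Gaussian (rows, then columns)."""
    kernel = gaussian_kernel_1d(sigma_pix)
    blur = lambda line: np.convolve(line, kernel, mode="same")
    rows_blurred = np.apply_along_axis(blur, 1, image)
    return np.apply_along_axis(blur, 0, rows_blurred)


def surface_brightness(image, zero_point, pixel_scale):
    """Flux per pixel -> mag/arcsec^2; empty pixels map to +inf (faintest)."""
    with np.errstate(divide="ignore"):
        return zero_point - 2.5 * np.log10(image / pixel_scale**2)


def split_bright_faint(image, mu, mu_cut):
    """Return (bright_only, faint_only) images, zeroing the masked pixels."""
    is_faint = mu > mu_cut  # a larger magnitude means a fainter pixel
    faint_only = np.where(is_faint, image, 0.0)
    bright_only = np.where(is_faint, 0.0, image)
    return bright_only, faint_only


# Usage: blur a single point source, then split it at a hypothetical cut.
sigma = fwhm_to_sigma_pix(SEEING_FWHM, PIXEL_SCALE)
point = np.zeros((41, 41))
point[20, 20] = 1.0
blurred = convolve_seeing(point, sigma)
mu = surface_brightness(blurred, zero_point=30.0, pixel_scale=PIXEL_SCALE)
bright, faint = split_bright_faint(blurred, mu, mu_cut=26.0)
```

With 0.2″ pixels, a 0.7″ FWHM corresponds to a Gaussian standard deviation of about 1.49 pixels. The two masked images are complementary, so their sum recovers the blurred input.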
Because the raw sample is small, extensive data augmentation is performed. Each galaxy is rendered from multiple projection angles: a modest set of three orthogonal views and a larger set of 28 orientations generated via HEALPix, each also rotated by 90°. The larger set yields 4,228 images, consistent with 151 galaxies × 28 views. The authors acknowledge that images of the same galaxy can appear in training, validation, and test splits, which risks information leaking between splits; they address the over‑fitting concern by repeating the random split 1,000 times and observing stable performance.
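One common alternative to the random image-level split is to split by galaxy, so that every view of a given galaxy lands in exactly one subset. The sketch below illustrates that idea; it is not the procedure the authors used, and the function names and the 70/15/15 fractions are assumptions made for the example. It also checks that 151 galaxies with 28 views each reproduce the 4,228 images quoted above.

```python
import random

N_GALAXIES = 151  # from the summary
N_VIEWS = 28      # views per galaxy implied by 4,228 / 151


def build_index(n_galaxies, n_views):
    """List every (galaxy_id, view_id) pair in the augmented set."""
    return [(g, v) for g in range(n_galaxies) for v in range(n_views)]


def split_by_galaxy(index, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign whole galaxies to train/val/test so no galaxy spans two splits.

    The fractions are hypothetical; the remainder after train and val
    goes to the test split.
    """
    galaxy_ids = sorted({g for g, _ in index})
    rng = random.Random(seed)
    rng.shuffle(galaxy_ids)

    n_train = int(fractions[0] * len(galaxy_ids))
    n_val = int(fractions[1] * len(galaxy_ids))
    train_ids = set(galaxy_ids[:n_train])
    val_ids = set(galaxy_ids[n_train:n_train + n_val])

    splits = {"train": [], "val": [], "test": []}
    for g, v in index:
        if g in train_ids:
            splits["train"].append((g, v))
        elif g in val_ids:
            splits["val"].append((g, v))
        else:
            splits["test"].append((g, v))
    return splits


index = build_index(N_GALAXIES, N_VIEWS)
splits = split_by_galaxy(index)
galaxies_in = {name: {g for g, _ in items} for name, items in splits.items()}
```

Repeating this with different seeds would play the same role as the authors' repeated random splits, while keeping test galaxies unseen during training.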
The CNN architecture follows a conventional stack of convolutional, pooling, and fully‑connected layers. Hyperparameters (learning rate, batch size, number of layers, etc.) are tuned using grid and Bayesian optimization. When trained on the full images, the model achieves an accuracy of 65–67 %. Training on images that emphasize only the faint tidal features (masked bright core, inverted normalization) raises the accuracy to 67–70 %, an improvement of roughly 5 % over the bright‑only case. This demonstrates that low‑surface‑brightness structures contain discriminative information that the network can exploit.
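To put the quoted accuracies in context, it can help to compare them with the trivial classifier that always predicts the larger class. The sketch below computes that baseline from the 81 non‑mergers and 70 mergers in the sample. It assumes the evaluation set keeps the same class ratio as the full sample, which the summary does not state, and the helper names are illustrative.

```python
N_NON_MERGERS = 81  # from the summary
N_MERGERS = 70      # from the summary


def majority_baseline(class_counts):
    """Accuracy of always predicting the most frequent class."""
    return max(class_counts) / sum(class_counts)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


baseline = majority_baseline([N_NON_MERGERS, N_MERGERS])

# Cross-check with explicit labels: 0 = non-merger, 1 = merger.
y_true = [0] * N_NON_MERGERS + [1] * N_MERGERS
always_non_merger = [0] * len(y_true)
baseline_check = accuracy(y_true, always_non_merger)

print(f"majority-class baseline: {baseline:.1%}")
```

Under that assumption the baseline is about 53.6%, so the reported 65–70% range sits roughly 11 to 16 percentage points above it.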
The study highlights several important implications. First, LSST’s unprecedented depth will likely enable merger detection pipelines to benefit from faint tidal debris that is invisible in shallower surveys such as SDSS. Second, the modest overall accuracy reflects limitations of the small, mass‑restricted sample and the use of a single photometric band; expanding to a broader mass range, including satellite galaxies, and incorporating multi‑band color information should boost performance. Third, the authors propose future work involving (a) larger, more diverse training sets, (b) realistic dust attenuation and multi‑wavelength imaging, and (c) validation against actual LSST data to quantify simulation‑observation mismatches.
In summary, the paper provides a proof‑of‑concept that faint tidal features, as captured in deep LSST‑like imaging, can serve as effective cues for CNN‑based merger classification. It establishes a methodological framework and points toward concrete steps needed to translate this feasibility study into a robust, large‑scale merger identification tool for forthcoming survey data.