A Dual-TransUNet Deep Learning Framework for Multi-Source Precipitation Merging and Improving Seasonal and Extreme Estimates


Multi-source precipitation products (MSPs) from satellite retrievals and reanalysis are widely used for hydroclimatic monitoring, yet spatially heterogeneous biases and limited skill for extremes still constrain their hydrologic utility. Here we develop a dual-stage TransUNet-based multi-source precipitation merging framework (DDL-MSPMF) that integrates six MSPs with four ERA5 near-surface physical predictors. A first-stage classifier estimates daily precipitation occurrence probability, and a second-stage regressor fuses the classifier outputs together with all predictors to estimate daily precipitation amount at 0.25 degree resolution over China for 2001-2020. Benchmarking against multiple deep learning and hybrid baselines shows that the TransUNet - TransUNet configuration yields the best seasonal performance (R = 0.75; RMSE = 2.70 mm/day) and improves robustness relative to a single-regressor setting. For heavy precipitation (>25 mm/day), DDL-MSPMF increases equitable threat scores across most regions of eastern China and better reproduces the spatial pattern of the July 2021 Zhengzhou rainstorm, indicating enhanced extreme-event detection beyond seasonal-mean corrections. Independent evaluation over the Qinghai-Tibet Plateau using TPHiPr further supports its applicability in data-scarce regions. SHAP analysis highlights the importance of precipitation occurrence probabilities and surface pressure, providing physically interpretable diagnostics. The proposed framework offers a scalable and explainable approach for precipitation fusion and extreme-event assessment.

💡 Research Summary

The paper introduces a dual‑stage deep learning framework, DDL‑MSPMF, for merging multi‑source precipitation products (MSPs) from satellite retrievals and reanalysis, with the aim of improving both seasonal and extreme precipitation estimates over China. Six MSPs (CMORPH, PERSIANN, GPM (TRMM‑based), GSMAP, MSWEP, and ERA5‑derived precipitation) are combined with four ERA5 near‑surface physical predictors (2 m temperature, 2 m dew point, surface pressure, and 0‑7 cm soil moisture). The first stage is a TransUNet‑based binary classifier that predicts the daily probability of precipitation occurrence for each 0.25° grid cell. The second stage is a TransUNet regressor that takes the classifier’s probability output together with all six MSPs and the four physical variables and estimates the daily precipitation amount at the same spatial resolution.
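The data flow between the two stages can be sketched as follows. This is a minimal illustration of the channel stacking only, not the authors' implementation: `toy_classifier` and `toy_regressor` are hypothetical stand-ins for the two TransUNet networks, and the grid size, random inputs, and channel ordering are assumptions made for the example.

```python
import numpy as np

N_MSP = 6    # six precipitation products (from the summary)
N_PHYS = 4   # four ERA5 near-surface predictors (from the summary)


def toy_classifier(x):
    """Hypothetical stand-in for the stage-1 TransUNet classifier.

    x: array of shape (channels, H, W); returns a probability map (H, W).
    """
    mean_precip = x[:N_MSP].mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(mean_precip - 1.0)))  # logistic squashing


def toy_regressor(x):
    """Hypothetical stand-in for the stage-2 TransUNet regressor.

    x: array of shape (channels, H, W) whose last channel is the
    stage-1 occurrence probability; returns precipitation amount (H, W).
    """
    prob = x[-1]
    mean_precip = x[:N_MSP].mean(axis=0)
    return np.maximum(prob * mean_precip, 0.0)


def two_stage_merge(msps, phys):
    """Stack predictors, run stage 1, append its output, run stage 2."""
    predictors = np.concatenate([msps, phys], axis=0)              # (10, H, W)
    prob = toy_classifier(predictors)                              # (H, W)
    stage2_in = np.concatenate([predictors, prob[None]], axis=0)   # (11, H, W)
    amount = toy_regressor(stage2_in)                              # (H, W)
    return prob, amount, stage2_in.shape


# Synthetic inputs on an assumed 8 x 8 grid, for illustration only.
rng = np.random.default_rng(0)
msps = rng.gamma(shape=0.5, scale=4.0, size=(N_MSP, 8, 8))
phys = rng.normal(size=(N_PHYS, 8, 8))
prob, amount, stage2_shape = two_stage_merge(msps, phys)
print(stage2_shape)
```

The point of the sketch is that the regressor sees one more input channel than the classifier: the ten shared predictors plus the occurrence probability produced in stage 1.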

Training uses data from 2001–2018, while 2019–2020 serve as independent validation and test periods. The authors systematically benchmark eleven model configurations, including XGBoost, CNN‑Transformer, UNet, pure Transformer, LSTM, and hybrid variants, against the proposed dual‑TransUNet architecture. Evaluation metrics cover (1) event detection (accuracy, F1‑score), (2) seasonal mean performance (Pearson correlation R, RMSE), and (3) extreme‑event skill, measured by the Equitable Threat Score (ETS) for heavy precipitation (>25 mm day⁻¹).
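For readers unfamiliar with these scores, the sketch below computes Pearson R, RMSE, and ETS with NumPy. It uses the standard contingency-table definition of ETS, (hits − hits_random) / (hits + misses + false alarms − hits_random); whether the paper aggregates the scores per grid cell, per region, or per season is not stated in this summary, so the flat arrays and sample values here are assumptions for illustration.

```python
import numpy as np


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(obs.ravel(), pred.ravel())[0, 1])


def rmse(obs, pred):
    """Root-mean-square error, in the units of the inputs (mm/day here)."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def ets(obs, pred, threshold=25.0):
    """Equitable Threat Score for exceedance of `threshold`."""
    o = obs > threshold
    p = pred > threshold
    hits = np.sum(o & p)
    misses = np.sum(o & ~p)
    false_alarms = np.sum(~o & p)
    # Hits expected by chance, given the observed and predicted frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / o.size
    denom = hits + misses + false_alarms - hits_random
    return float((hits - hits_random) / denom) if denom > 0 else float("nan")


# Made-up daily totals (mm/day) for eight samples.
obs = np.array([0.0, 30.0, 40.0, 5.0, 26.0, 0.0, 0.0, 10.0])
pred = np.array([0.0, 28.0, 20.0, 30.0, 27.0, 0.0, 0.0, 8.0])

score = ets(obs, pred)  # 2 hits, 1 miss, 1 false alarm out of 8 samples
print(round(score, 3), round(rmse(obs, pred), 3), round(pearson_r(obs, pred), 3))
```

ETS is 1 for a perfect forecast and 0 for a forecast with no skill beyond chance, which is why it is preferred over raw hit rates for rare events such as heavy rain.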

Results show that the TransUNet‑TransUNet pair achieves the highest seasonal skill (R = 0.75, RMSE = 2.70 mm day⁻¹) and markedly improves heavy‑rain detection, raising ETS by an average of 0.12 across eastern China. The framework also reproduces the spatial pattern of the July 2021 Zhengzhou rainstorm more faithfully, demonstrating its capacity to capture extreme events beyond simple mean‑bias correction. SHAP (SHapley Additive exPlanations) analysis reveals that the classifier’s precipitation‑occurrence probability and surface pressure are the most influential features, suggesting that the model leverages both data‑driven event likelihood and physically meaningful atmospheric drivers.
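SHAP attributions are based on Shapley values, which the following sketch computes exactly for a three-feature toy function by enumerating all feature coalitions. It does not use the `shap` library and is not the paper's model: `toy_model`, its coefficients, the feature ordering, and the all-zero baseline are hypothetical, chosen only to show how an interaction between occurrence probability and a second feature is shared between them.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                z = baseline.copy()
                for j in subset:
                    z[j] = x[j]
                without_i = f(z)
                z[i] = x[i]
                with_i = f(z)
                phi[i] += weight * (with_i - without_i)
    return phi


def toy_model(z):
    """Hypothetical regressor: probability gates a pressure-dependent amount."""
    prob, pressure_anom, temp_anom = z
    return prob * (2.0 + 0.5 * pressure_anom) + 0.1 * temp_anom


x = np.array([0.9, 1.0, 2.0])   # assumed feature values for one grid cell
baseline = np.zeros(3)          # assumed reference input
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, which is the property that makes Shapley-based rankings comparable across features. Exact enumeration scales exponentially with the number of features, so practical tools rely on approximations.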

An independent assessment on the Qinghai‑Tibet Plateau using the TPHiPr high‑resolution precipitation dataset yields R = 0.68 and RMSE = 3.12 mm day⁻¹, indicating that the framework generalizes well to data‑scarce, high‑altitude regions.

The study’s key contributions are: (1) a dual‑stage architecture that first isolates the binary precipitation event and then quantifies its magnitude, thereby reducing error propagation and enhancing extreme‑event sensitivity; (2) the integration of multi‑source precipitation fields with physically relevant predictors, which improves spatial consistency and physical interpretability; (3) a thorough comparative analysis against a wide suite of statistical and deep‑learning baselines, establishing the superiority of the TransUNet‑TransUNet configuration; and (4) the use of SHAP to provide transparent diagnostics of feature importance.

Future work suggested includes extending the framework to global scales, developing lightweight variants for real‑time operational use, incorporating additional atmospheric and hydrological variables (e.g., wind fields, convective indices), and evaluating long‑term climate‑change scenarios to test robustness under shifting precipitation regimes.
