Integrating Weather Foundation Model and Satellite to Enable Fine-Grained Solar Irradiance Forecasting

Ziqing Ma∗ (maziqing.mzq@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Kai Ying∗ (yingkai.ying@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China; College of Computer Science, Zhejiang University of Technology, Hangzhou, Zhejiang, China
Xinyue Gu∗ (guxinyue.gxy@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Tian Zhou∗ (tian.zt@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Tianyu Zhu∗ (yunrui.zty@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Haifan Zhang (zhanghaifan.zhf@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Peisong Niu (niupeisong.nps@alibaba-inc.com), DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China
Wang Zheng (zhengwang@zjut.edu.cn), College of Computer Science, Zhejiang University of Technology, Hangzhou, Zhejiang, China
Cong Bai (congbai@zjut.edu.cn), College of Computer Science, Zhejiang University of Technology, Hangzhou, Zhejiang, China
Liang Sun (liang.sun@alibaba-inc.com), DAMO Academy, Alibaba Group, Bellevue, USA

Abstract

Accurate day-ahead solar irradiance forecasting is essential for integrating solar energy into the power grid. However, it remains challenging due to the pronounced diurnal cycle and inherently complex cloud dynamics. Current methods either lack fine-scale resolution (e.g., numerical weather prediction, weather foundation models) or degrade at longer lead times (e.g., satellite extrapolation). We propose Baguan-solar, a two-stage multimodal framework that fuses forecasts from Baguan, a global weather foundation model, with high-resolution geostationary satellite imagery to produce 24-hour irradiance forecasts at kilometer scale.
Its decoupled two-stage design first forecasts day-night continuous intermediates (e.g., cloud cover) and then infers irradiance, while its modality fusion jointly preserves fine-scale cloud structures from satellite and large-scale constraints from Baguan forecasts. Evaluated over East Asia using CLDAS as ground truth, Baguan-solar outperforms strong baselines (including ECMWF IFS, vanilla Baguan, and SolarSeer), reducing RMSE by 16.08% and better resolving cloud-induced transients. An operational deployment of Baguan-solar has supported solar power forecasting in an eastern province in China since July 2025. Our code is accessible at https://github.com/DAMO-DI-ML/Baguan-solar.git.

∗ Authors contributed equally to this work.

Conference acronym ’XX, June 03–05, 2018, Woodstock, NY. © 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM.

CCS Concepts
• Computing methodologies → Supervised learning by regression.

Keywords
Solar irradiance forecasting, Weather foundation models, Multimodal fusion, Satellite imagery, Swin Transformer.

ACM Reference Format:
Ziqing Ma, Kai Ying, Xinyue Gu, Tian Zhou, Tianyu Zhu, Haifan Zhang, Peisong Niu, Wang Zheng, Cong Bai, and Liang Sun. 2018. Integrating Weather Foundation Model and Satellite to Enable Fine-Grained Solar Irradiance Forecasting. In Proceedings of Make sure to enter the correct conference title from your rights confirmation email (Conference acronym ’XX). ACM, New York, NY, USA, 12 pages. https://doi.org/XXXXXXX.XXXXXXX

1 Introduction

The inherent variability of solar irradiance, driven largely by cloud dynamics, presents a major challenge to integrating solar power into the electricity grid, impacting both its stability and operational efficiency [31]. Accurate day-ahead (24 h) forecasting of surface solar irradiance, typically quantified as Global Horizontal Irradiance (GHI), is therefore essential to enable the large-scale, reliable, and cost-effective integration of solar energy into modern power systems [10].

[Figure 1 placeholder. (a) Weather foundation model Baguan: observations are assimilated into a compute-intensive initial field; Baguan outputs variables such as q_925, z_925, t_925, tcc, lcc, tcwv, ghi, and fdir, yielding GHI at 0.25° that is coarse with limited detail. (b) Baguan-solar: Himawari-8/9 satellite Bands 03, 07, 10, and 14 (0.05°, low latency, rich cloud-morphology information) are fused with Baguan global forecasts (0.25°, flexible, rich environment-forcing information; pressure-level variables U, V, T, Q, Z at 7 levels and single-level cloud, vapor, and radiation variables) in a decoupled two-stage design (fine-grained TCDC, then GHI inference), yielding fine-grained GHI at 0.05° supervised by CLDAS.]
Figure 1: From coarse Baguan forecasts to fine-grained GHI: the Baguan-solar overall framework.

Despite its importance, day-ahead GHI forecasting remains difficult for two fundamental reasons.
First, GHI exhibits a pronounced diurnal cycle: it is strictly zero at night and increases rapidly after sunrise, introducing strong discontinuities around day–night transitions [31]. Second, accurate forecasts must simultaneously capture fine-scale cloud morphology (which drives sharp local irradiance fluctuations) and remain skillful at longer lead times, where cloud motion errors accumulate and cloud evolution involves formation and dissipation rather than pure advection [21].

Currently, mainstream operational and research solutions for GHI forecasting can be broadly grouped into three categories: (i) physics-based numerical weather prediction (NWP), (ii) data-driven weather foundation models (WFMs), and (iii) satellite-based extrapolation methods. NWP models solve the governing physical equations of atmospheric dynamics and radiation, offering strong physical consistency and reliable long-horizon predictability. However, they are computationally expensive and often struggle to resolve the fine-scale, rapidly evolving cloud processes that dominate local irradiance variability [1, 19]. They also suffer from the so-called “spin-up” problem, which often makes their predictions inaccurate in the first few forecast hours [30].

Recent advances in WFMs present a promising alternative to conventional NWP systems. WFMs such as Pangu-Weather [3], FuXi [7], GraphCast [17], and Baguan [24] have demonstrated forecasting skill surpassing NWP models at medium ranges, while drastically reducing computational expense. These models learn to predict the evolution of global atmospheric fields, typically at 0.25°. However, current WFMs remain suboptimal for solar energy applications due to two key limitations.
First, they are primarily trained to optimize conventional meteorological variables (e.g., wind, temperature, humidity), and many lack outputs for the irradiance or cloud-specific parameters that are essential for solar forecasting, as summarized in Appendix Table 4. Second, their native spatial resolution is too coarse to adequately resolve mesoscale cloud structures. Furthermore, increasing the spatial resolution leads to a dramatic growth in computational cost, making high-resolution global WFMs impractical in real-world deployment [22].

In parallel, satellite-extrapolation-based methods have made compelling progress in high-resolution solar nowcasting [1, 4, 21]. Geostationary platforms, such as Himawari-8/9 [2], provide multispectral imagery at kilometer-scale resolution and minute-scale cadence, offering a rich description of cloud morphology, cloud-top temperature, and moisture structure. Models like SolarSeer [1] leverage such satellite imagery to forecast GHI over large domains (e.g., CONUS) at 5 km resolution, achieving substantial speed-ups over NWP while narrowing the error gap. At longer lead times (12–24 h), however, satellite-only forecasting reduces to pure extrapolation: it ignores atmospheric dynamics in different layers, and performance degrades.

These observations highlight a critical gap: existing approaches either (i) provide physically rich but coarse and computationally expensive forecasts (NWP, WFMs), or (ii) offer high spatial resolution and low latency but rely solely on satellite extrapolation, which limits longer-horizon skill. Seamlessly integrating WFMs with satellite observations for solar irradiance forecasting remains a recognized challenge in the field.

To bridge this gap, we propose Baguan-solar, a two-stage, multimodal framework that integrates WFM forecasts with satellite imagery for fine-grained solar irradiance forecasting. The overall framework is illustrated in Figure 1.
To handle the pronounced diurnal cycle, in which GHI changes abruptly around sunrise and sunset, Baguan-solar employs a decoupled two-stage design. The first stage targets intermediate variables such as total cloud cover and future satellite imagery, ensuring consistent modeling across day and night. These outputs then serve as inputs to the second stage, which infers GHI by combining them with clear-sky GHI and meteorological context. To jointly preserve fine-scale cloud morphology while maintaining day-ahead skill, Baguan-solar further adopts a modality-fusion design that leverages high-resolution satellite observations to capture local cloud structures and uses Baguan forecasts [24] to provide large-scale dynamical and thermodynamical constraints that guide longer-horizon cloud evolution.

Baguan-solar is comprehensively evaluated over East Asia using the China Meteorological Administration’s Land Data Assimilation System (CLDAS) [27] as ground truth. CLDAS provides a spatially and temporally consistent analysis by assimilating dense surface observations and satellite retrievals, making it a superior reference for GHI over East Asia compared with reanalysis datasets such as ERA5. Evaluated on CLDAS, Baguan-solar outperforms a range of baselines, including vanilla Baguan [24], the ECMWF Integrated Forecasting System (IFS), and SolarSeer [1]. Under our operational setting, Baguan-solar reduces GHI RMSE by approximately 16.08% relative to the strongest baseline, while substantially improving the representation of rapid irradiance transients associated with passing clouds.
In summary, the contributions of this work are fourfold:
(1) We propose Baguan-solar, a multimodal fusion framework that integrates Baguan forecasts with geostationary satellite imagery to produce fine-grained day-ahead (24 h) GHI forecasts.
(2) We design a decoupled two-stage Swin Transformer that first forecasts day–night continuous cloud-related intermediates and then infers GHI.
(3) We conduct comprehensive experiments over East Asia using CLDAS as ground truth, demonstrating that Baguan-solar consistently outperforms strong baselines (including Baguan, ECMWF IFS, and satellite-based methods).
(4) We demonstrate operational deployment of Baguan-solar in an online forecasting system, where it runs hourly using the latest Baguan forecasts and satellite data as input to support real-world solar power forecasting.

2 Related Work

2.1 WFMs for Irradiance Forecasting

WFMs have progressed rapidly in recent years. Existing WFMs can be broadly categorized into two architectural paradigms: graph-based and transformer-based models. Graph-based approaches [17, 18] naturally accommodate Earth’s geometry and enable flexible spatial discretization. In contrast, transformer-based approaches [3, 6, 7, 23] typically tokenize gridded meteorological fields and leverage attention mechanisms for spatiotemporal modeling. Beyond architectural choices, Baguan [24] adopts a pre-training–fine-tuning pipeline to mitigate overfitting under limited real-world data. However, most existing WFMs [3, 6, 7, 16–18, 23] do not natively support solar irradiance forecasting. To the best of our knowledge, only Baguan and FuXi-2.0 [32] provide irradiance-related outputs. Moreover, although WFMs like Baguan produce surface irradiance fields, their predictions are not specifically optimized for solar-energy applications and remain restricted to a coarse 0.25° spatial resolution.

2.2 From WFMs to Downstream Irradiance Products

Recent studies demonstrate the potential of building downstream irradiance forecasting applications on top of WFMs. Huang et al. propose FuXi-RTM [13], which couples FuXi [7] with a fixed radiative transfer model to enforce radiative-transfer consistency during training. Similarly, NVIDIA Earth-2 [5] integrates the FourCastNet SFNO forecasting model with dedicated radiation diagnostic modules to generate global multi-day solar irradiance forecasts. In industry, GraphCast [17] forecasts have also been used as multi-variable meteorological inputs for power-market applications [8, 29]. However, these advances remain limited by coarse resolution, as they focus on model-side adaptations without leveraging satellite observations to enhance fine-grained forecasting.

2.3 Satellite-based Irradiance Forecasting

Complementary to WFM-based irradiance products, satellite imagery provides high-frequency, high-resolution observations of cloud evolution and has become an effective auxiliary modality for day-ahead solar irradiance forecasting. Boussif et al. propose CrossViViT [4], which improves site-level irradiance prediction by incorporating geostationary satellite imagery. Extending to multi-site settings, Schubnel et al. develop SolarCrossFormer [26], which couples satellite patches with station networks via graph-based cross-attention. For solar power forecasting, Ma et al. present FusionSF [21], a tri-modal framework that integrates NWP outputs and satellite images, using vector quantization to align heterogeneous modalities. However, most multimodal methods remain limited to site-specific forecasts and lack scalable, gridded outputs for broader applications. To bridge this gap, Bai et al.
propose SolarSeer [1], an end-to-end model that uses historical satellite observations to forecast cloud cover and irradiance at 5 km resolution, offering faster inference and lower RMSE than HRRR. However, its reliance on satellite observations alone limits accuracy at longer lead times.

3 Multimodal Datasets

This section describes the multimodal datasets used in our study, as summarized in Table 1, including geostationary satellite observations (Himawari), regional analysis fields from CLDAS, global reanalysis data (ERA5), and global WFM forecasts. Although these datasets have different spatial coverages, all data are cropped to their common spatial intersection for subsequent analyses.

3.1 Satellite Observations

We utilize multi-spectral imagery from the Himawari-8/9 geostationary satellites, operated by the Japan Meteorological Agency (JMA) [2]. These satellites provide full-disk observations over the Asia-Pacific region at 10-minute intervals, with spatial resolutions of 0.5–2 km depending on the channel. The visible, near-infrared, and thermal infrared bands capture critical information on cloud optical properties, aerosol loading, and atmospheric moisture. Given its low latency (<30 minutes), this data stream serves as a timely observational constraint for short-term solar irradiance prediction. Himawari-8/9 observations from the Advanced Himawari Imager (AHI) include 16 spectral bands. The complete set of AHI bands, together with their central wavelengths and typical applications, is summarized in Table 5. Following SolarSeer [1], we focus on four AHI bands: B03, B07, B10, and B14, with central wavelengths of 0.64, 3.9, 7.3, and 11.2 μm, respectively.

Table 1: Summary of datasets used in our study. The reported spatial resolution refers to the effective resolution used in our experiments after interpolating the original products.

| Dataset | Spatial Resolution | Coverage | Channels | Usage |
|---|---|---|---|---|
| Himawari | 0.05° | Asia-Pacific | Bands at 0.64, 3.9, 7.3, and 11.2 μm | Input during training and inference |
| CLDAS | 0.05° | East Asia | SSRD, TCDC | Training target |
| ERA5 | 0.25° | Global | U, V, T, Q, Z, TCC, SSRD, etc. | Input during training |
| Baguan forecast | 0.25° | Global | U, V, T, Q, Z, TCC, SSRD, etc. | Input during inference |

3.2 CLDAS Analysis Fields

CLDAS [27] provides hourly, near-real-time land surface analysis over East Asia (0°–65°N, 60°–160°E) at an effective resolution of 0.01°. For consistency with our model grid, we interpolate the CLDAS fields to 0.05° resolution. CLDAS integrates surface observations from over 30,000 automatic weather stations, FengYun satellite retrievals, radar-based precipitation estimates, and background fields from CMA’s numerical models through a statistical blending framework. We use CLDAS-derived GHI, computed from downward surface shortwave radiation (SSRD), as the target variable for model training and evaluation, where

$$\mathrm{GHI}\,(\mathrm{W\,m^{-2}}) = \mathrm{SSRD}\,(\mathrm{J\,m^{-2}}) / 3600.$$

In addition, we include TCDC (total cloud cover) from CLDAS as an auxiliary predictor to provide complementary information on cloudiness conditions.

3.3 ERA5 Dataset

ERA5 [11], produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), is a widely used global atmospheric reanalysis that provides comprehensive hourly estimates of a wide range of meteorological variables at a resolution of 0.25°. During the training stage, ERA5 supplies the large-scale meteorological context essential for learning the spatiotemporal dynamics of irradiance-relevant variables. In the inference stage, however, the ERA5 fields are replaced by real-time forecasts generated by Baguan [24], enabling fully operational forecasting.
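
The SSRD-to-GHI conversion stated in Section 3.2 can be illustrated with a minimal sketch. It assumes SSRD is an energy accumulated over exactly one hour, as the divisor 3600 implies; the function name `ssrd_to_ghi` and the sample values are hypothetical and not taken from the authors' code.

```python
SECONDS_PER_HOUR = 3600.0  # accumulation window implied by the 3600 divisor


def ssrd_to_ghi(ssrd_j_per_m2):
    """Convert hourly accumulated SSRD (J m^-2) to hourly mean GHI (W m^-2)."""
    return [value / SECONDS_PER_HOUR for value in ssrd_j_per_m2]


# Hypothetical hourly accumulations for three grid cells (night, partly cloudy, clear).
ssrd = [0.0, 1.8e6, 3.06e6]
ghi = ssrd_to_ghi(ssrd)
print(ghi)
```

If a product accumulates SSRD over a different window, the divisor would have to match that window length in seconds.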
Additionally, while ERA5 can serve as a reference (“ground truth”) for GHI, it is inherently less accurate and coarser than higher-fidelity observational products such as CLDAS.

3.4 Baguan Global Forecasts

We incorporate operational forecasts from Baguan [24], a state-of-the-art data-driven global weather prediction system trained on the ERA5 dataset. Baguan provides 0.25°-resolution forecasts of key atmospheric variables, including GHI, at hourly lead times up to 14 days. Within our framework, during inference, Baguan supplies the large-scale meteorological context and a coarse predictive signal for GHI, which we further refine using high-resolution satellite observations. In addition, Baguan serves as a WFM baseline, as it directly produces GHI forecasts at 0.25° resolution.

4 Baguan-solar Framework

As illustrated in Figure 2, Baguan-solar contains two stages: (i) cloud evolution modeling and (ii) irradiance inference modeling. Stage 1 explicitly forecasts the future 24 h cloud cover and satellite images by fusing historical 6 h multi-spectral geostationary satellite observations and 30 h Baguan weather forecasts (spanning both the past 6 h and the subsequent 24 h). Stage 2 then infers the future 24 h GHI forecast by combining Stage 1 cloud-aware outputs with clear-sky GHI and radiation-relevant Baguan variables. Specifically, clear-sky GHI is estimated by the Ineichen–Perez model and is a deterministic function of longitude, latitude, and time (see Appendix A.4). Both stages are implemented with Swin Transformer [20] backbones to capture multi-scale spatial structures efficiently while preserving high-resolution outputs via patch embedding and patch recovery.

4.1 Stage 1: Cloud Evolution Modeling

Accurate GHI forecasting critically depends on predicting cloud evolution, since clouds dominate radiative attenuation and introduce strong spatiotemporal nonlinearity. Thus, Stage 1 is formulated as an explicit cloud-field forecasting task.
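
To make the input and output windows from the framework overview concrete, the sketch below lists the hourly timestamps that each tensor would cover for one initialization time. It assumes hourly steps and that the 6 h history spans t−5 through t; the helper `build_windows` and its constants are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

HIST_HOURS = 6   # satellite history: t-5 .. t
LEAD_HOURS = 24  # forecast horizon: t+1 .. t+24


def build_windows(init_time):
    """Return hourly timestamps for satellite inputs, Baguan inputs, and targets."""
    def hours(offsets):
        return [init_time + timedelta(hours=h) for h in offsets]

    sat_times = hours(range(-(HIST_HOURS - 1), 1))              # 6 frames
    bg_times = hours(range(-(HIST_HOURS - 1), LEAD_HOURS + 1))  # 30 frames
    target_times = hours(range(1, LEAD_HOURS + 1))              # 24 frames
    return sat_times, bg_times, target_times


init = datetime(2025, 7, 30, 0)
sat_times, bg_times, target_times = build_windows(init)
print(len(sat_times), len(bg_times), len(target_times))
```

Under these assumptions the Baguan window starts with the satellite history and ends with the last target hour, which is what lets the forecast fields guide cloud evolution over the whole horizon.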
Specifically, it leverages two complementary inputs: (i) historical satellite observations $\mathbf{X}^{\mathrm{sat}}_{t-5:t} \in \mathbb{R}^{6 \times 4 \times H_{\mathrm{sat}} \times W_{\mathrm{sat}}}$, where the four spectral channels are defined in Section 3.1; and (ii) Baguan weather forecasts $\mathbf{X}^{\mathrm{bg}}_{t-5:t+24} \in \mathbb{R}^{30 \times C_1 \times H_{\mathrm{bg}} \times W_{\mathrm{bg}}}$. We select a total of $C_1 = 39$ Baguan channels, including moisture, cloud state, thermodynamic, and dynamical conditions (see Appendix Table 6). All Baguan variables are interpolated to the satellite grid.

Architecturally, Stage 1 uses two Swin Transformer encoders to disentangle cloud morphology and atmospheric forcing. We first project satellite observations and Baguan forecast fields into patch tokens via modality-specific patch embeddings $\phi_{\mathrm{sat}}(\cdot)$ and $\phi_{\mathrm{bg}}(\cdot)$, and then feed them into a Cloud-Morphology Encoder and a Cloud-Environment Encoder, respectively. Each encoder consists of 8 stacked Swin Transformer blocks with residual connections, layer normalization, and multi-head attention. We use a patch size of $8 \times 8$, a window size of 16, an embedding dimension of 256, and 2 attention heads in each transformer layer. The Cloud-Morphology Encoder extracts multi-scale cloud textures and boundary cues from satellite imagery to produce $\mathbf{Z}_{\mathrm{sat}}$, while the Cloud-Environment Encoder encodes the interpolated Baguan variables to capture dynamical and thermodynamical conditions, yielding $\mathbf{Z}_{\mathrm{bg}}$. We then apply cross-attention to inject environmental guidance into the satellite representation, and concatenate the enhanced satellite tokens with the Baguan tokens to form a fused representation $\mathbf{Z}_{\mathrm{fused}}$:

$$\mathbf{Z}_{\mathrm{sat}} = \mathrm{Enc}_{\mathrm{sat}}(\phi_{\mathrm{sat}}(\mathbf{X}^{\mathrm{sat}})), \quad \mathbf{Z}_{\mathrm{bg}} = \mathrm{Enc}_{\mathrm{bg}}(\phi_{\mathrm{bg}}(\mathbf{X}^{\mathrm{bg}})), \tag{1}$$

$$\mathbf{Z}_{\mathrm{fused}} = \mathrm{cat}\big(\mathbf{Z}_{\mathrm{sat}} + \mathrm{Attn}(\mathbf{Z}_{\mathrm{sat}} W_Q, \mathbf{Z}_{\mathrm{bg}} W_K, \mathbf{Z}_{\mathrm{bg}} W_V),\ \mathbf{Z}_{\mathrm{bg}}\big). \tag{2}$$

[Figure 2 placeholder. Stage 1, Cloud Evolution Modeling: satellite observations (0.05°, C=4, bands at 0.64, 3.9, 7.3, and 11.2 μm, H=W=512, T−5 to T0) pass through patch embedding and the Cloud-Morphology Encoder to give Z_sat; the Baguan forecast (0.25°, C=39, H=W=103, T−5 to T+24; U, V, T, Q, Z, TCC, LCC, TCW, TCWV) is interpolated, patch-embedded, and passed through the Cloud-Environment Encoder (Swin Transformer blocks) to give Z_bg; multi-modality fusion projects Q from Z_sat and K, V from Z_bg, applies softmax attention, adds the result to Z_sat, and concatenates with Z_bg; a decoder with a Sat-Head (satellite prediction, C=4) and a Cloud-Head (TCDC prediction, C=1) outputs 512×512 fields for T0 to T+24. Stage 2, Irradiance Inference Modeling: clear-sky GHI (0.05°, C=1), the interpolated Baguan forecast (0.25°, C=11, H=W=103; Q, TCW, TCWV, FDIR, SSRD), and the Stage 1 outputs are concatenated, patch-embedded, and processed by an irradiance block of Swin Transformer blocks and a GHI-Head to give the solar irradiance forecast (GHI, 0.05°, C=1, H=W=512, T0 to T+24).]
Figure 2: Baguan-solar model architecture. Baguan-solar uses a two-stage Swin Transformer framework that fuses Himawari satellite observations with Baguan forecasts to first predict cloud-related intermediates (satellite fields and TCDC) and then infer 24 h high-resolution GHI.

Finally, $\mathbf{Z}_{\mathrm{fused}}$ is processed by a shared Swin Transformer decoder followed by two task-specific heads to jointly generate the day-ahead total cloud cover forecast $\hat{\mathbf{Y}}^{\mathrm{TCDC}}_{t+1:t+T_{\mathrm{out}}}$ and future satellite predictions $\hat{\mathbf{Y}}^{\mathrm{sat}}_{t+1:t+T_{\mathrm{out}}}$:

$$[\hat{\mathbf{Y}}^{\mathrm{TCDC}}_{t+1:t+T_{\mathrm{out}}},\ \hat{\mathbf{Y}}^{\mathrm{sat}}_{t+1:t+T_{\mathrm{out}}}] = \mathrm{Dec}(\mathbf{Z}_{\mathrm{fused}}). \tag{3}$$

4.2 Stage 2: Irradiance Inference Modeling

Stage 2 infers irradiance by integrating the cloud-aware intermediate outputs from Stage 1 with additional meteorological and physical priors.
Specifically, we construct the Stage 2 input by concatenating the Stage 1 predictions $\hat{\mathbf{Y}}^{\mathrm{TCDC}}_{t+1:t+24}$ and $\hat{\mathbf{Y}}^{\mathrm{sat}}_{t+1:t+24}$, together with the clear-sky GHI $\mathbf{X}^{\mathrm{clear\text{-}sky}}_{t+1:t+24}$ and Baguan forecast variables $\mathbf{X}^{\mathrm{bg}}_{t+1:t+24} \in \mathbb{R}^{T_{\mathrm{out}} \times C_2 \times H_s \times W_s}$ with $C_2 = 11$ radiation-relevant channels. The concatenated tensor is patch-embedded and processed by a Swin Transformer backbone with 8 stacked Swin blocks, patch size $P = 8$, and hidden dimension $D = 256$. A Solar Irradiance head then performs patch recovery to restore the original spatial resolution and outputs multi-step GHI forecasts:

$$\hat{\mathbf{Y}}^{\mathrm{ghi}} = \mathrm{Irr}\big(\mathrm{cat}(\mathbf{X}^{\mathrm{clear\text{-}sky}}, \mathbf{X}^{\mathrm{bg}}, \hat{\mathbf{Y}}^{\mathrm{TCDC}}, \hat{\mathbf{Y}}^{\mathrm{sat}})\big). \tag{4}$$

We train the two-stage model in an end-to-end manner with a weighted multi-task objective over three prediction targets. We use mean squared error (MSE) as the loss function for all tasks:

$$\mathcal{L} = \lambda_{\mathrm{sat}} \mathcal{L}_{\mathrm{sat}} + \lambda_{\mathrm{TCDC}} \mathcal{L}_{\mathrm{TCDC}} + \lambda_{\mathrm{ghi}} \mathcal{L}_{\mathrm{ghi}}, \tag{5}$$

where $\lambda_{\mathrm{sat}} = 1$, $\lambda_{\mathrm{TCDC}} = 0.5$, and $\lambda_{\mathrm{ghi}} = 1$ in all experiments.

4.3 Implementation & Evaluation

The multimodal dataset from 2022–2024 is used for model development and is split into training and validation subsets at a 0.9:0.1 ratio. Data from 2025 are reserved exclusively for testing to provide an independent evaluation. All inputs are cropped to a 512 × 512 pixel domain covering East Asia. Baguan-solar is trained on 8 NVIDIA A100 GPUs with a batch size of 4. Training uses the scheduler-free optimizer [9], which removes the need for an explicit learning-rate schedule while maintaining stable convergence. The complete set of model hyperparameters is provided in Appendix A.3.

To assess the quality of GHI forecasts, we adopt the root mean squared error (RMSE) as the primary metric, consistent with prior studies [1, 24, 25].
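
Returning briefly to the training objective in Eq. (5), the following sketch evaluates the weighted sum of three MSE terms on tiny flat sequences. The weights are the ones stated in the text; the helper names and the sample predictions are hypothetical, and a real implementation would operate on batched tensors in a deep-learning framework rather than Python lists.

```python
# Loss weights as stated for Eq. (5).
LAMBDA_SAT, LAMBDA_TCDC, LAMBDA_GHI = 1.0, 0.5, 1.0


def mse(pred, target):
    """Mean squared error between two equally long flat sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(sat, tcdc, ghi):
    """Weighted multi-task loss; each argument is a (prediction, target) pair."""
    return (LAMBDA_SAT * mse(*sat)
            + LAMBDA_TCDC * mse(*tcdc)
            + LAMBDA_GHI * mse(*ghi))


# Hypothetical two-element predictions and targets for each task.
sat = ([1.0, 2.0], [1.0, 4.0])            # MSE = 2.0
tcdc = ([0.5, 0.5], [0.5, 1.5])           # MSE = 0.5
ghi = ([100.0, 200.0], [110.0, 190.0])    # MSE = 100.0
loss = total_loss(sat, tcdc, ghi)
print(loss)  # 1.0*2.0 + 0.5*0.5 + 1.0*100.0 = 102.25
```

The example also shows why the terms' scales matter: with unnormalized inputs the GHI term would dominate, so the per-task weights are only meaningful together with whatever normalization is applied to each target.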
RMSE is computed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \tag{6}$$

where $n$ denotes the number of samples in the test set, and $y_i$ and $\hat{y}_i$ are the observed and predicted GHI values. Smaller RMSE values correspond to more accurate forecasts.

5 Experiments

5.1 Benchmarking on CLDAS

5.1.1 Baselines and Experimental Setup. We evaluate Baguan-solar against the following established benchmarks:

Operational Weather Models:
• Baguan [24]: An operational weather foundation model designed for renewable energy, providing irradiance-relevant parameters at 0.25° resolution.
• EC IFS: The ECMWF (EC) Integrated Forecasting System (IFS) is a high-resolution (0.1°) NWP system that directly outputs surface solar radiation downward (SSRD), serving as a robust benchmark.

Satellite-based Models:
• SolarSeer [1]: A state-of-the-art, satellite-based nowcasting model, retrained on our dataset for a region-fair evaluation.
• Two-stage U-Net: Our two-stage U-Net baseline, built following SolarSeer’s two-stage design.
• Two-stage Swin: Our two-stage Swin Transformer baseline, also built following SolarSeer’s two-stage design.

Statistical Baselines:
• Mean: Predicts the historical average GHI for each hour from the training period (2022–2024).
• Clear-sky: Estimates the theoretical GHI under cloud-free conditions based on spatiotemporal coordinates.

Both weather model forecasts are bilinearly interpolated to 0.05° for a consistent comparison. All models are evaluated on a common test set comprising data from the year 2025.

Table 2: Benchmark comparison (RMSE, W m$^{-2}$) for solar irradiance (GHI) forecasting over East Asia in 2025, evaluated for forecasts initialized at 00:00 and 12:00 UTC. Results are reported at lead times of 1 h, 2 h, 3 h, 6 h, 12 h, and 24 h, as well as the average RMSE over 1–24 h (Avg.).

| Model | Type | Inputs | Avg. | 1 h | 2 h | 3 h | 6 h | 12 h | 24 h |
|---|---|---|---|---|---|---|---|---|---|
| Mean | Statistical | None | 83.89 | – | – | – | – | – | – |
| Clear-sky | Statistical | None | 113.54 | – | – | – | – | – | – |
| Baguan [24] | Weather foundation model | Gridded initial field | 58.17 | 49.53 | 58.57 | 65.50 | 77.98 | 37.86 | 37.13 |
| EC IFS | NWP | Gridded initial field | 54.46 | 47.48 | 59.86 | 70.65 | 72.23 | 37.59 | 36.50 |
| SolarSeer [1] | Extrapolation-based | Satellite | 53.09 | 36.07 | 51.40 | 64.40 | 68.75 | 33.14 | 35.47 |
| Two-stage U-Net | Extrapolation-based | Satellite | 59.10 | 47.03 | 60.32 | 72.15 | 74.79 | 39.10 | 41.78 |
| Two-stage Swin | Extrapolation-based | Satellite | 49.89 | 32.74 | 46.77 | 59.23 | 65.33 | 31.46 | 33.51 |
| Baguan-solar | Multimodal | ERA5 & Satellite | 41.21 | 29.99 | 43.52 | 55.04 | 57.66 | 24.80 | 24.20 |
| Baguan-solar (oper.) | Multimodal | Baguan forecasts & Satellite | 41.87 | 30.31 | 43.64 | 55.34 | 57.98 | 25.18 | 25.04 |

5.1.2 Overall Performance Comparison. We evaluate Baguan-solar against a comprehensive set of baselines. As summarized in Table 2, the operational Baguan-solar achieves the best performance, reducing the average RMSE by 16.08% compared to the strongest baseline, the Two-stage Swin, and by 28.02% relative to the Baguan forecasts. The extrapolation-based methods, especially SolarSeer and Two-stage Swin, show clear advantages over operational weather models. This performance gap stems primarily from their training on high-resolution (0.05°) CLDAS data. The finer spatial resolution allows these models to better capture local GHI variability, leading to systematic gains over coarser-resolution approaches such as Baguan at 0.25°.

We further evaluate two variants of Baguan-solar. The idealized variant uses ERA5 reanalysis as input, which assumes perfect knowledge of future atmospheric conditions and is not operationally feasible. The operational variant instead uses Baguan forecasts as input, thereby emulating real-time deployment through reforecast experiments.
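
The relative reductions quoted above follow directly from the average RMSE column of Table 2. The sketch below recomputes them and includes a plain implementation of the RMSE in Eq. (6); the function names are illustrative, and only the three table values are taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired observations and predictions (Eq. 6)."""
    n = len(y_true)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n)


def relative_reduction(baseline, model):
    """Percentage by which `model` lowers the error of `baseline`."""
    return 100.0 * (baseline - model) / baseline


# Average RMSE (W m^-2) over 1-24 h, copied from Table 2.
avg_rmse = {"Baguan": 58.17, "Two-stage Swin": 49.89, "Baguan-solar (oper.)": 41.87}

vs_swin = relative_reduction(avg_rmse["Two-stage Swin"], avg_rmse["Baguan-solar (oper.)"])
vs_baguan = relative_reduction(avg_rmse["Baguan"], avg_rmse["Baguan-solar (oper.)"])
print(round(vs_swin, 2), round(vs_baguan, 2))  # 16.08 28.02
```

Note that both percentages are normalized by the baseline error, so they describe how much of the baseline's RMSE is removed rather than how much larger the baseline is than Baguan-solar.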
The results show that the idealized version slightly outperforms the operational version, but the gap is small. This suggests that Baguan’s short-term weather forecasts closely approximate actual atmospheric conditions and introduce only limited degradation in downstream GHI prediction.

5.1.3 Lead-time-dependent Forecast Skill. Figure 3 shows how the RMSE of GHI forecasts varies with lead time from 1 to 24 h, comparing Baguan-solar with EC IFS, Baguan [24], and SolarSeer [1].

Figure 3: RMSE of GHI forecasts as a function of lead time (1–24 h) for four initialization times (UTC 00:00, 06:00, 12:00, and 18:00), evaluated over a 512 × 512 gridded domain.

Across all initialization times (UTC 00:00, 06:00, 12:00, and 18:00), Baguan-solar consistently achieves the lowest errors at every lead time. In addition, a pronounced diurnal cycle is observed in the error curves: errors increase during local daytime, peaking around midday when GHI magnitude is highest, and decrease toward nighttime. During night hours, when GHI is effectively zero, RMSE approaches zero for all methods, reflecting the negligible forecasting uncertainty under no-sun conditions. We also observe that SolarSeer, as an extrapolation-based model, degrades markedly with increasing lead time, consistent with the accumulation of cloud-motion errors and the lack of large-scale dynamical constraints.

5.2 Ablation Studies

In this section, we compare variants of Baguan-solar to quantify the modality contributions and stage-wise decoupling. As shown in Table 3, our ablation studies reveal several key insights.

[Figure 4 placeholder. Columns: CLDAS (GT), EC IFS, Baguan, SolarSeer, Baguan-solar. (a) Case 1: 2025.07.30 UTC 00:00, lead times 1 h, 3 h, 6 h, 9 h, 10–22 h, and 24 h. (b) Case 2: 2025.12.08 UTC 12:00, lead times 1–10 h, 12 h, 15 h, 18 h, 21 h, and 22–24 h. Color scale: solar irradiance (W m$^{-2}$).]
Figure 4: Qualitative comparison of GHI forecast fields for two representative cases initialized at UTC 00:00 and 12:00.

Table 3: Ablation studies on modality contributions and stage-wise decoupling in Baguan-solar. "Only S1" means forecasting GHI and TCDC in a single stage, and then using clear-sky GHI for post-processing to mask out the night. Results (RMSE, W m$^{-2}$) are reported at lead times of 1 h, 2 h, 3 h, 6 h, 12 h, and 24 h, as well as the average RMSE over 1–24 h (Avg.).

| Exp. (setting) | Avg. | 1 h | 2 h | 3 h | 6 h | 12 h | 24 h |
|---|---|---|---|---|---|---|---|
| Baguan-solar (S1+S2) | 41.87 | 30.31 | 43.64 | 55.34 | 57.98 | 25.18 | 25.04 |
| w/o Baguan | 49.89 | 32.74 | 46.77 | 59.23 | 65.33 | 31.46 | 33.51 |
| w/o satellite | 42.66 | 37.35 | 49.54 | 59.36 | 59.03 | 25.05 | 24.60 |
| Baguan-solar (Only S1) | 45.50 | 32.34 | 44.86 | 56.38 | 59.73 | 35.86 | 28.92 |
| w/o TCDC | 48.30 | 35.21 | 47.30 | 58.37 | 61.87 | 41.00 | 32.62 |

Removing Baguan forecasts significantly increases the average RMSE by 19.15%. The disparity persists across all lead times and is largest at longer lead times, with RMSE increasing by 24.94% at 12 h and 33.82% at 24 h, where satellite-only extrapolation struggles to capture cloud formation or dissipation. By contrast, Baguan forecasts provide thermodynamic and dynamical conditions that better constrain the evolution of cloud fields, leading to accurate GHI forecasts at longer lead times. Removing satellite input increases the average RMSE by only 1.88%, while the short-term error increases substantially, by 23.22% at 1 h and 13.5% at 2 h.
The short-lead degradation without satellite input underscores the role of satellite imagery in capturing fine-scale cloud morphology and boundary motion. In addition, simplifying the two-stage framework to a single stage increases the average RMSE by 8.67%, and further removing TCDC supervision increases it by 15.36% (both relative to the full model). These results indicate that decoupling cloud evolution provides a physically grounded intermediate constraint that makes GHI variations attributable to forecast cloud occurrence and motion, thereby improving physical consistency.

5.3 Qualitative Results
We present two representative cases for qualitative evaluation. The first case (2025.07.30) features an organized vortex over East Asia, forming a distinct pattern of a low-GHI core surrounded by higher GHI. The second case (2025.12.08), a typical winter day, features a distinct low-GHI belt over Northeast Asia. Across both cases, EC IFS follows the overall structure reasonably well but tends to under-suppress the low-GHI core, with a narrow range and sharp boundaries. Baguan exhibits a systematic bright bias in both cases. SolarSeer is consistently over-smoothed, blurring cloud-band boundaries and gradients. In particular, it fails to retain the vortex-related signature at 24 h in the first case. In contrast, Baguan-solar provides the most balanced reconstruction in both morphology and amplitude. It captures fine-scale structures and transitions in short-term forecasts. Although its performance degrades with increasing lead time, it still retains reasonable intensity and spatial extent compared to the other methods.

5.4 Modality Importance Analysis
To justify our two-stage design and the inclusion of multimodal data, we use Integrated Gradients (IG) [28] to quantify how much each input modality, satellite versus ERA5 (or Baguan forecasts), contributes to RMSE reduction across lead times 1–24 h.
The analysis is performed over 2400 samples and 24 different initialization hours, ensuring robustness to diurnal cycles and variability across initialization times. The feature importance in Figure 5 reveals clear temporal dynamics: the satellite input contributes significantly (10–31%) at short lead times (1–6 h), but its contribution drops rapidly after 6 h and falls below 5% by 24 h. In contrast, ERA5’s contribution starts high (68.1%) and steadily increases to over 95% at 24 h.

Figure 5: Importance of ERA5 vs. satellite inputs across lead times.

This pattern justifies two key aspects of our design. First, it highlights the need for including ERA5 (or Baguan forecasts) during training. Although high-resolution satellite data can extrapolate well and shine in the nowcasting of GHI, meteorological fields provide the large-scale dynamical and thermodynamical information that is essential for skillful day-ahead forecasts. This finding is also consistent with the ablation study in Table 3 and the GHI forecast results in Figure 3. Second, it validates the rationale for an architecture that decouples the forecasting problem into two physically grounded subtasks. Stage 1 is designed to handle the time-varying subtask by learning to dynamically weigh satellite and meteorological inputs according to the forecast horizon, avoiding the degeneration of pure extrapolation. This specialization allows Stage 1 to model difficult cloud evolution under shifting modality dominance as a standalone task. Stage 2 then focuses on the more stable transformation from predicted clouds to GHI, relying only on physical priors such as clear-sky GHI. By separating these concerns, our architecture reduces overall learning difficulty and improves forecast skill across all lead times.
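As an illustration of the attribution procedure, the sketch below approximates Integrated Gradients [28] with a midpoint Riemann sum and aggregates the attributions into per-modality shares. Everything in it is assumed for illustration: the toy `model`, its analytic gradient, the all-zero baseline, the split into three "satellite" and two "meteorological" features, and the use of absolute attribution mass as the importance measure. It does not reproduce the attribution pipeline applied to Baguan-solar.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, n_features)
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    return (x - baseline) * avg_grad


# Hypothetical toy model: 3 satellite features followed by 2 meteorological ones.
W = np.array([0.8, -0.5, 0.3, 1.2, -0.7])


def model(x):
    return np.tanh(W @ x)


def model_grad(x):
    # Analytic gradient of tanh(W . x) with respect to x.
    return (1.0 - np.tanh(W @ x) ** 2) * W


def modality_shares(attr, groups):
    """Share of total absolute attribution assigned to each feature group."""
    totals = {name: np.abs(attr[idx]).sum() for name, idx in groups.items()}
    norm = sum(totals.values())
    return {name: value / norm for name, value in totals.items()}


x = np.array([0.4, 0.1, -0.2, 0.5, 0.3])
baseline = np.zeros_like(x)
attr = integrated_gradients(model_grad, x, baseline)
shares = modality_shares(
    attr, {"satellite": slice(0, 3), "meteorological": slice(3, 5)}
)
print(shares)
```

A useful sanity check is the completeness property of IG: the attributions should sum to the difference between the model output at the input and at the baseline. Repeating the computation per lead time would give curves of the kind summarized in Figure 5.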
6 Deployment

6.1 Operational Deployment in East China
Since July 2025, Baguan-solar has been deployed online to support operational solar power forecasting in an eastern province in China, which has the highest solar power capacity of 918.4 GW among all provinces. It is co-deployed with the Baguan weather forecasting system and shares a similar operational pipeline: Baguan is executed four times per day (UTC 00:00, 06:00, 12:00, and 18:00) and produces weather forecasts with lead times up to 14 days. Each Baguan run takes 0.5 h of inference on two GPUs. In parallel, imagery from four AHI bands (B03, B07, B10, and B14) of the Himawari-8/9 geostationary satellites is collected every 10 minutes. Building on the latest available Baguan outputs and satellite data, Baguan-solar runs at a higher frequency (hourly) to provide 24 h high-resolution GHI forecasts, offering a faster but less dynamically constrained view of the evolving atmosphere. The weather forecasts, including the GHI forecasts, are provided to downstream applications, such as solar power forecasting models and electric load forecasting models [33].

6.2 Operational Verification with Pyranometer Sites
We collect GHI measurements from 246 sites equipped with pyranometers in this province and use these in-situ observations to verify Baguan-solar forecasts. Although the previous section uses the CLDAS reanalysis data as the reference due to its fine spatial resolution, pyranometers at these sites provide a more direct and accurate measure of surface GHI and therefore constitute a stricter benchmark for operational validation.

Figure 6: (Top) RMSE of GHI forecasts as a function of lead time (1–24 h) for two initialization times (UTC 00:00 and 12:00), averaged over 246 sites. (Bottom) Visualization of GHI forecasts for one photovoltaic site over a one-week period.
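Site-level verification of a gridded product requires sampling the field at each station and comparing it with the measurement. The sketch below shows one common way to do this, bilinear interpolation on a regular latitude/longitude grid followed by RMSE. The interpolation scheme, the function names, the synthetic planar field, and the three station coordinates are all assumptions made for illustration; the source does not state which interpolation method was used.

```python
import numpy as np


def bilinear_sample(field, grid_lat, grid_lon, site_lat, site_lon):
    """Bilinearly interpolate a 2D (lat x lon) field to site coordinates.

    Assumes grid_lat and grid_lon are ascending and uniformly spaced.
    """
    site_lat = np.asarray(site_lat, dtype=float)
    site_lon = np.asarray(site_lon, dtype=float)
    # Fractional grid indices of each site.
    fi = (site_lat - grid_lat[0]) / (grid_lat[1] - grid_lat[0])
    fj = (site_lon - grid_lon[0]) / (grid_lon[1] - grid_lon[0])
    # Lower-left corner of the enclosing cell, clipped to stay inside the grid.
    i0 = np.clip(np.floor(fi).astype(int), 0, len(grid_lat) - 2)
    j0 = np.clip(np.floor(fj).astype(int), 0, len(grid_lon) - 2)
    wi, wj = fi - i0, fj - j0
    south = (1 - wj) * field[i0, j0] + wj * field[i0, j0 + 1]
    north = (1 - wj) * field[i0 + 1, j0] + wj * field[i0 + 1, j0 + 1]
    return (1 - wi) * south + wi * north


def rmse(pred, obs):
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


# Hypothetical 0.05-degree grid and three made-up station locations.
grid_lat = 30.0 + 0.05 * np.arange(21)
grid_lon = 120.0 + 0.05 * np.arange(21)
lat2d, lon2d = np.meshgrid(grid_lat, grid_lon, indexing="ij")
forecast = 2.0 * lat2d + 3.0 * lon2d      # synthetic planar field, not real GHI
site_lat = np.array([30.12, 30.50, 30.98])
site_lon = np.array([120.33, 120.07, 120.61])

at_sites = bilinear_sample(forecast, grid_lat, grid_lon, site_lat, site_lon)
observed = 2.0 * site_lat + 3.0 * site_lon + np.array([5.0, -5.0, 5.0])
site_rmse = rmse(at_sites, observed)
```

The planar test field is chosen because bilinear interpolation reproduces it exactly, which makes the sampling step easy to check independently of the error metric.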
To further assess which gridded product better matches the observations, we compute the averaged RMSE between the site measurements and the ERA5/CLDAS fields interpolated to the station locations. ERA5 yields a higher RMSE (77.85) than CLDAS (66.69), indicating that CLDAS provides a more reliable gridded reference over our study region; this also supports our choice of CLDAS as the ground truth for model training and evaluation.

Figure 6 (top) illustrates the RMSE of GHI forecasts for Baguan-solar and the baselines. Baguan-solar consistently achieves the lowest average errors for both initialization times. SolarSeer, as an extrapolation-based model, degrades markedly with increasing lead time, with the most pronounced deterioration in the 12–24 h range for the UTC 12:00 initialization. EC IFS and Baguan exhibit similar overall performance; however, Baguan tends to perform worse in the afternoon. In the one-week case study at the photovoltaic site in July 2025 (Figure 6 (bottom)), Baguan-solar shows closer agreement with the site observations than the baseline forecasts, capturing the peak irradiance and the overall temporal variability more accurately.

7 Conclusion and Future Work
We presented Baguan-solar, a two-stage model that fuses weather foundation model forecasts and satellite imagery to deliver accurate, fine-grained (0.05°) day-ahead solar irradiance predictions. In evaluations, our model surpasses strong baselines (EC IFS, Baguan, and SolarSeer) not only in overall accuracy but also in tracking rapid irradiance changes caused by clouds. It has been in operational use in China since 2025.

Future work includes extending the system globally using multi-satellite data, incorporating uncertainty quantification via ensemble methods, and exploring end-to-end co-training with upstream weather foundation models for improved physical consistency.
References

[1] Mingliang Bai, Zuliang Fang, Shengyu Tao, Siqi Xiang, Jiang Bian, Yanfei Xiang, Pengcheng Zhao, Weixin Jin, Jonathan A. Weyn, Haiyu Dong, Bin Zhang, Hongyu Sun, Kit Thambiratnam, Qi Zhang, Hongbin Sun, Xuan Zhang, and Qiuwei Wu. 2025. Ultrafast 24-h solar irradiance forecasts outperform numerical weather predictions across the USA. Cell Reports Physical Science 6, 12 (2025), 102996. https://doi.org/10.1016/j.xcrp.2025.102996
[2] Kotaro Bessho, Kenji Date, Masahiro Hayashi, Akio Ikeda, Takahito Imai, Hidekazu Inoue, Yukihiro Kumagai, Takuya Miyakawa, Hidehiko Murata, Tomoo Ohno, Arata Okuyama, Ryo Oyama, Yukio Sasaki, Yoshio Shimazu, Kazuki Shimoji, Yasuhiko Sumida, Masuo Suzuki, Hidetaka Taniguchi, Hiroaki Tsuchiyama, Daisaku Uesawa, Hironobu Yokota, and Ryo Yoshida. 2016. An Introduction to Himawari-8/9 - Japan’s New-Generation Geostationary Meteorological Satellites. Journal of the Meteorological Society of Japan. Ser. II 94, 2 (2016), 151–183. https://doi.org/10.2151/jmsj.2016-009
[3] Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, and Qi Tian. 2023. Accurate medium-range global weather forecasting with 3D neural networks. Nature 619, 7970 (2023), 533–538.
[4] Oussama Boussif, Ghait Boukachab, Dan Assouline, Stefano Massaroli, Tianle Yuan, Loubna Benabbou, and Yoshua Bengio. 2023. Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine (Eds.). http://papers.nips.cc/paper_files/paper/2023/hash/070a57c5ef1e58cc90201b11d369b3c2-Abstract-Conference.html
[5] Alberto Carpentieri, Jussi Leinonen, Jeff Adie, Boris Bonev, Doris Folini, and Farah Hariri. 2024. Data-driven Surface Solar Irradiance Estimation using Neural Operators at Global Scale. arXiv preprint arXiv:2411.08843 (2024).
[6] Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, et al. 2023. Fengwu: Pushing the skillful global medium-range weather forecast beyond 10 days lead. arXiv preprint arXiv:2304.02948 (2023).
[7] Lei Chen, Xiaohui Zhong, Feng Zhang, Yuan Cheng, Yinghui Xu, Yuan Qi, and Hao Li. 2023. FuXi: a cascade machine learning forecasting system for 15-day global weather forecast. npj Climate and Atmospheric Science 6, 1 (2023), 190. https://doi.org/10.1038/s41612-023-00512-1
[8] Cale Colony and Razan Andigani. 2024. Solarcast-ML: Per Node GraphCast Extension for Solar Energy Production. arXiv preprint arXiv:2406.13559 (2024).
[9] Aaron Defazio, Xingyu Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, and Ashok Cutkosky. 2024. The Road Less Scheduled. arXiv:2405.15682 [cs.LG]
[10] Michael Emmanuel, Kate Doubleday, Burcin Cakir, Marija Marković, and Bri-Mathias Hodge. 2020. A review of power system planning and operational models for flexibility assessment in high solar energy penetration scenarios. Solar Energy 210 (2020), 169–180. https://doi.org/10.1016/j.solener.2020.07.017 Special Issue on Grid Integration.
[11] Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, et al. 2020. The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 730 (2020), 1999–2049.
[12] William F Holmgren, Clifford W Hansen, and Mark A Mikofski. 2018. pvlib python: A python package for modeling solar energy systems. Journal of Open Source Software 3, 29 (2018), 884.
[13] Qiusheng Huang, Xiaohui Zhong, Xu Fan, Lei Chen, and Hao Li. 2025. FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling. CoRR abs/2503.19940 (2025). https://doi.org/10.48550/ARXIV.2503.19940
[14] Pierre Ineichen and Richard Perez. 2002. A new airmass independent formulation for the Linke turbidity coefficient. Solar Energy 73, 3 (2002), 151–157.
[15] Japan Meteorological Agency and Meteorological Satellite Center. [n. d.]. Utilization of Meteorological Satellite Data in Cloud Analysis. MSC Technical Note (Special Issue). https://www.data.jma.go.jp/mscweb/technotes/UtilizationMetSatData.pdf Accessed: 2026-01-27.
[16] Thorsten Kurth, Shashank Subramanian, Peter Harrington, Jaideep Pathak, Morteza Mardani, David Hall, Andrea Miele, Karthik Kashinath, and Anima Anandkumar. 2023. FourCastNet: Accelerating Global High-Resolution Weather Forecasting Using Adaptive Fourier Neural Operators. In Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2023, Davos, Switzerland, June 26-28, 2023, Axel Huebl, Cristina Silvano, and Timothy Robinson (Eds.). ACM, 13:1–13:11. https://doi.org/10.1145/3592979.3593412
[17] Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, and Peter Battaglia. 2023. Learning skillful medium-range global weather forecasting. Science 382, 6677 (2023), 1416–1421. https://doi.org/10.1126/science.adi2336
[18] Simon Lang, Mihai Alexe, Matthew Chantry, Jesper Dramsch, Florian Pinault, Baudouin Raoult, Mariana CA Clare, Christian Lessig, Michael Maier-Gerber, Linus Magnusson, et al. 2024. AIFS – ECMWF’s data-driven forecasting system. arXiv preprint arXiv:2406.01465 (2024).
[19] Francisco JL Lima, Fernando R Martins, Enio B Pereira, Elke Lorenz, and Detlev Heinemann. 2016. Forecast for surface solar irradiance at the Brazilian Northeastern region using NWP model and artificial neural networks. Renewable Energy 87 (2016), 807–818.
[20] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021. IEEE, 9992–10002. https://doi.org/10.1109/ICCV48922.2021.00986
[21] Ziqing Ma, Wenwei Wang, Tian Zhou, Chao Chen, Bingqing Peng, Liang Sun, and Rong Jin. 2024. FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024, Barcelona, Spain, August 25-29, 2024, Ricardo Baeza-Yates and Francesco Bonchi (Eds.). ACM, 5532–5543. https://doi.org/10.1145/3637528.3671509
[22] S Karthik Mukkavilli, Daniel Salles Civitarese, Johannes Schmude, Johannes Jakubik, Anne Jones, Nam Nguyen, Christopher Phillips, Sujit Roy, Shraddha Singh, Campbell Watson, et al. 2023. AI foundation models for weather and climate: Applications, design, and implementation. arXiv preprint arXiv:2309.10808 (2023).
[23] Tung Nguyen, Johannes Brandstetter, Ashish Kapoor, Jayesh K. Gupta, and Aditya Grover. 2023. ClimaX: A foundation model for weather and climate. In International Conference on Machine Learning. https://api.semanticscholar.org/CorpusID:256231457
[24] Peisong Niu, Ziqing Ma, Tian Zhou, Weiqi Chen, Lefei Shen, Rong Jin, and Liang Sun. 2025. Utilizing Strategic Pre-training to Reduce Overfitting: Baguan - A Pre-trained Weather Forecasting Model. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, KDD 2025, Toronto ON, Canada, August 3-7, 2025, Luiza Antonie, Jian Pei, Xiaohui Yu, Flavio Chierichetti, Hady W. Lauw, Yizhou Sun, and Srinivasan Parthasarathy (Eds.). ACM, 2186–2197. https://doi.org/10.1145/3711896.3737178
[25] Stephan Rasp, Stephan Hoyer, Alexander Merose, Ian Langmore, Peter Battaglia, Tyler Russel, Alvaro Sanchez-Gonzalez, Vivian Yang, Rob Carver, Shreya Agrawal, Matthew Chantry, Zied Ben Bouallegue, Peter Dueben, Carla Bromberg, Jared Sisk, Luke Barrington, Aaron Bell, and Fei Sha. 2024. WeatherBench 2: A benchmark for the next generation of data-driven global weather models. arXiv:2308.15560 [physics.ao-ph]
[26] Baptiste Schubnel, Jelena Simeunović, Corentin Tissier, Pierre-Jean Alet, and Rafael E Carrillo. 2025. SolarCrossFormer: Improving day-ahead Solar Irradiance Forecasting by Integrating Satellite Imagery and Ground Sensors. IEEE Transactions on Sustainable Energy (2025).
[27] Chunxiang Shi, Lipeng Jiang, Tao Zhang, Dongbin Zhang, Bin Xu, Xiao Liang, and Chen Zhu. 2013. China Land Data Assimilation System (CLDAS) Research and Operation. In 6th WMO Symposium on Data Assimilation. https://www.cscamm.umd.edu/programs/das6/program/Posters/abs/Fp13-Shi_Chunxiang.pdf Accessed: 2026-01-27.
[28] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 70), Doina Precup and Yee Whye Teh (Eds.). PMLR, 3319–3328. https://proceedings.mlr.press/v70/sundararajan17a.html
[29] Jie Wang, Wenping Qin, Ruipeng Lu, Wenbo Zhang, Haixiao Zhu, and Anting Zhao. 2024. Short-term PV power forecasting system Based on GraphCast Weather Forecasting Model. In 2024 IEEE 8th Conference on Energy Internet and Energy System Integration (EI2). IEEE, 5331–5335.
[30] James L Warner, Jon Petch, Chris J Short, and Caroline Bain. 2023. Assessing the impact of a NWP warm-start system on model spin-up over tropical Africa. Quarterly Journal of the Royal Meteorological Society 149, 751 (2023), 621–636.
[31] Dazhi Yang, Wenting Wang, Christian A. Gueymard, Tao Hong, Jan Kleissl, Jing Huang, Marc J. Perez, Richard Perez, Jamie M. Bright, Xiang’ao Xia, Dennis van der Meer, and Ian Marius Peters. 2022. A review of solar forecasting, its dependence on atmospheric sciences and implications for grid integration: Towards carbon neutrality. Renewable and Sustainable Energy Reviews 161 (2022), 112348. https://doi.org/10.1016/j.rser.2022.112348
[32] Xiaohui Zhong, Lei Chen, Xu Fan, Wenxu Qian, Jun Liu, and Hao Li. 2024. FuXi-2.0: Advancing machine learning weather forecasting model for practical applications. CoRR abs/2409.07188 (2024). https://doi.org/10.48550/ARXIV.2409.07188
[33] Zhaoyang Zhu, Weiqi Chen, Rui Xia, Tian Zhou, Peisong Niu, Bingqing Peng, Wenwei Wang, Hengbo Liu, Ziqing Ma, Qingsong Wen, et al. 2023. eForecaster: Unifying electricity forecasting with robust, flexible, and explainable machine learning algorithms. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 15630–15638.

A Appendix

A.1 Limitations of weather foundation models
Table 4 highlights a key gap in current weather foundation models (WFMs): despite their strong performance in general meteorological forecasting, large-scale AI models that explicitly predict cloud cover or solar irradiance remain largely unavailable. Baguan-solar is designed to bridge this gap by extending WFMs toward cloud-aware, day-ahead solar irradiance forecasting.
A.2 Himawari-8/9 Satellite Data
Table 5 summarizes the 16 spectral bands of the Himawari-8/9 Advanced Himawari Imager (AHI), including their center wavelengths, band types (visible, near-infrared, and infrared), and typical meteorological applications. These channels provide complementary information on cloud amount and texture in the visible range, cloud phase and microphysics in the near-infrared, and cloud-top temperature/height as well as water-vapor structure in the infrared.

As described by JMA/MSC (2024) [15], Band 3 (0.64 μm) measures reflected visible solar radiation and supports true-color composites as well as daytime identification of low clouds and fog. Band 7 (3.9 μm) senses emitted terrestrial radiation and includes a substantial reflected solar component during daytime; at night, it supports hotspot detection and fog/low-cloud identification via the Band 7–Band 13 brightness temperature difference. Band 10 (7.3 μm) is a water vapor channel primarily sensitive to mid-tropospheric moisture and can also respond to volcanic SO2. Band 14 (11.2 μm) is a longwave infrared window channel used for cloud imaging and cloud-top characterization, and it can support surface temperature applications under clear-sky conditions.

A.3 Hyperparameter details
Baguan-solar uses a two-stage Swin Transformer design. The following list summarizes the hyperparameters used for Baguan-solar.

    image_size: [512, 512]
    patch_size: [8, 8]
    window_size: 16
    embed_dim: 256
    num_heads: [2]
    patch_norm: True
    drop_path_rate: 0.1
    mlp_ratio: 4
    qkv_bias: True
    EnvEncoderSwinNet: in_chans: 1170, out_chans: 256, depths: [8]
    SateEncoderSwinNet: in_chans: 24, out_chans: 256, depths: [8]
    MultiDecoderSwinNet_Stage1: in_chans: 512, out_chans: 120, depths: [2]
    MultiDecoderSwinNet_Stage2: in_chans: 17, out_chans: 1, depths: [8]

Listing 1: Hyperparameters of Baguan-solar.

A.4 Fast Vectorized Clear-Sky GHI Computation
Clear-sky global horizontal irradiance (GHI) provides an upper bound of surface irradiance under cloud-free conditions and is widely used as a physics-based prior for solar forecasting. In this work, we adopt the Ineichen–Perez clear-sky model [14] (as implemented in pvlib [12]) but re-implement the critical steps using a lightweight, vectorized NumPy routine to enable high-throughput gridded computation.

Our forecasting pipeline requires clear-sky GHI on a dense spatial grid of size 512 × 512 for each time stamp. The pvlib library is primarily designed for site-based (per-location) clear-sky computations, necessitating an outer loop over grid cells for grid-wide evaluation. A direct call to pvlib.clearsky.ineichen introduces substantial per-point overhead, taking roughly 4 minutes per time step on a 512 × 512 grid with the standard pvlib pipeline.

To eliminate this bottleneck, we implement a streamlined clear-sky routine that (i) computes the solar zenith angle using a compact approximation inspired by the NREL Solar Position Algorithm (SPA), and (ii) rewrites the pvlib.clearsky.ineichen computation with fully vectorized NumPy broadcasting, allowing latitude, longitude, and elevation to be provided as 2D arrays and yielding clear-sky GHI over the entire raster in a single pass.
On a 512 × 512 grid, this reduces the wall-clock time from 4 minutes to 1 second per time step (a ∼240× speedup), while preserving the original physical assumptions and keeping the numerical discrepancy below 1%.

Given the solar zenith angle $z$ (degrees; determined by longitude, latitude, and time), site elevation $h$ (meters), Linke turbidity $T_L$ (dimensionless), day of year $\mathrm{DOY}$, and air mass $\mathrm{AM}$, the Ineichen–Perez clear-sky GHI $I_{\mathrm{clear}}$ is computed as:

\[
I_{\mathrm{clear}} = c_{g1}\, I_0 \cos(z)\, \exp\!\left(-c_{g2}\,\mathrm{AM}\,\left[f_{h1} + f_{h2}\,(T_L - 1)\right]\right) \exp\!\left(0.01\,\mathrm{AM}^{1.8}\right), \tag{7}
\]

where $I_0$ is the extraterrestrial irradiance (top-of-atmosphere normal irradiance), approximated by:

\[
I_0 = 1367.7 \times \left(1 + 0.033 \times \cos\!\left(\frac{2\pi}{365} \times \mathrm{DOY}\right)\right). \tag{8}
\]

The elevation-dependent coefficients are:

\[
c_{g1} = 5.09 \times 10^{-5}\, h + 0.868, \tag{9}
\]
\[
c_{g2} = 3.92 \times 10^{-5}\, h + 0.0387, \tag{10}
\]
\[
f_{h1} = \exp(-h / 8000), \tag{11}
\]
\[
f_{h2} = \exp(-h / 1250). \tag{12}
\]

Air mass $\mathrm{AM}$ is computed from $z$ using a Kasten–Young–type approximation:

\[
\mathrm{AM} = \frac{1}{\cos(z)}. \tag{13}
\]

Table 4: Comparison of representative weather foundation models and solar irradiance forecasting methods, including their inputs/outputs, resolution, region, and whether irradiance-related variables are explicitly modeled.
Model | Input Variable | Output Variable | Spatial Resolution | Region | Accuracy | Irradiance-related Variable
Graphcast [17] | U10, V10, T2M, MSLP, TP, U, V, Q, Z, T, W | U10, V10, T2M, MSLP, TP, U, V, Q, Z, T, W | 0.25° | Global | beats EC IFS | ×
Pangu-Weather [3] | U10, V10, T2M, MSLP, U, V, Q, Z, T | U10, V10, T2M, MSLP, U, V, Q, Z, T | 0.25° | Global | beats EC IFS | ×
Fengwu [6] | U10, V10, T2M, MSLP, U, V, Q, Z, T | U10, V10, T2M, MSLP, U, V, Q, Z, T | 0.25° | Global | beats EC IFS | ×
Fuxi [7] | U10, V10, T2M, MSLP, TP, U, V, Q, Z, T | U10, V10, T2M, MSLP, TP, U, V, Q, Z, T | 0.25° | Global | beats EC IFS | ×
Baguan | U10, U100, V10, V100, T2M, MSLP, TP, TCC, LCC, FDIR, SSRD, TCW, TCWV, TP, SP, U, V, Q, Z, T | U10, U100, V10, V100, T2M, MSLP, TP, TCC, LCC, TCW, TCWV, TP, SP, FDIR, SSRD, U, V, Q, Z, T | 0.25° | Global | beats EC IFS | ✓
SolarSeer [1] | Satellite | SSRD | 0.05° | the CONUS | beats HRRR | ✓
Baguan-solar | Satellite & Baguan forecasts | SSRD | 0.05° | China | beats all baselines | ✓

Table 5: Himawari-8/9 (AHI) spectral bands and typical applications.

Band | Center wavelength (μm) | Type | Typical applications
B01 | 0.47 | Visible | Aerosol/land–ocean contrast; thin cloud (daytime)
B02 | 0.51 | Visible | Green band; true-color composition (daytime)
B03 | 0.64 | Visible | Cloud/scene detail; cloud amount (daytime)
B04 | 0.86 | NIR | Vegetation/reflectance; cloud phase aid
B05 | 1.6 | NIR | Cloud phase (ice vs. water); snow–cloud separation; hotspot aid
B06 | 2.3 | NIR | Cloud microphysics (particle size); hotspot aid
B07 | 3.9 | IR (SWIR) | Night fog/low cloud; fires/hotspots; cloud-top temperature support
B08 | 6.2 | IR (WV) | Upper-tropospheric water vapor; jet/upper-level dynamics
B09 | 6.9 | IR (WV) | Mid-level water vapor; moisture structure
B10 | 7.3 | IR (WV) | Lower-level water vapor; dry intrusion/convection environment
B11 | 8.6 | IR | Cloud phase/microphysics; ash/SO2 discrimination aid
B12 | 9.6 | IR (O3) | Ozone absorption; stratospheric influence; deep convection top features
B13 | 10.4 | IR window | Primary cloud-top brightness temperature; cloud-top height proxy
B14 | 11.2 | IR window | Split-window combinations for fog/dust/ash; microphysics
B15 | 12.4 | IR window | Split-window for fog/low cloud, dust; SST/LST retrieval support
B16 | 13.3 | IR (CO2) | CO2 slicing for cloud-top height; thin cirrus detection

In our experiments, the proposed vectorized implementation makes clear-sky priors computationally practical at scale, enabling their use during both training and inference for long-horizon forecasting.

Table 6: Variables from ERA5 and Baguan used for training and inference in Baguan-solar.

Type | Variable name | Abbrev. | Stage1 input | Stage2 input | Levels
Single | low cloud cover | LCC | ✓ |  | -
Single | total cloud cover | TCC | ✓ |  | -
Single | total column water | TCW | ✓ | ✓ | -
Single | total column water vapour | TCWV | ✓ | ✓ | -
Single | total sky direct solar radiation at surface | FDIR | ✓ | ✓ | -
Single | surface solar radiation downwards | SSRD | ✓ | ✓ | -
Atmospheric | U wind component | U | ✓ |  | 50, 250, 500, 600, 700, 850, 925
Atmospheric | V wind component | V | ✓ |  | 50, 250, 500, 600, 700, 850, 925
Atmospheric | Temperature | T | ✓ |  | 50, 250, 500, 600, 700, 850, 925
Atmospheric | Specific humidity | Q | ✓ | ✓ | 50, 250, 500, 600, 700, 850, 925
Atmospheric | Geopotential | Z | ✓ |  | 50, 250, 500, 600, 700, 850, 925
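The clear-sky formulation in Eqs. (7)–(13) of Appendix A.4 maps directly onto NumPy broadcasting. The sketch below is one possible vectorized transcription under stated assumptions, not the released implementation: the function name `clear_sky_ghi`, the constant `AM_MAX` (an illustrative cap on air mass that keeps the secant form of Eq. (13) finite near the horizon), the nighttime masking, and the synthetic input rasters are all introduced here. The solar zenith angle is taken as given, since the compact SPA-inspired approximation is not specified in the text.

```python
import numpy as np

AM_MAX = 38.0  # assumed cap on air mass near the horizon (not from the source)


def clear_sky_ghi(zenith_deg, elevation_m, linke_turbidity, day_of_year):
    """Vectorized Ineichen-Perez clear-sky GHI following Eqs. (7)-(13).

    All inputs broadcast against each other, so 2D rasters of zenith angle,
    elevation, and turbidity are evaluated in a single pass.
    """
    z = np.radians(np.asarray(zenith_deg, dtype=float))
    h = np.asarray(elevation_m, dtype=float)
    tl = np.asarray(linke_turbidity, dtype=float)

    cos_z = np.cos(z)
    daylight = cos_z > 0.0

    # Eq. (13): AM = 1 / cos(z), with cos(z) floored so AM never exceeds AM_MAX.
    am = 1.0 / np.maximum(cos_z, 1.0 / AM_MAX)

    # Eq. (8): extraterrestrial normal irradiance.
    i0 = 1367.7 * (1.0 + 0.033 * np.cos(2.0 * np.pi / 365.0 * day_of_year))

    # Eqs. (9)-(12): elevation-dependent coefficients.
    cg1 = 5.09e-5 * h + 0.868
    cg2 = 3.92e-5 * h + 0.0387
    fh1 = np.exp(-h / 8000.0)
    fh2 = np.exp(-h / 1250.0)

    # Eq. (7), including the exp(0.01 * AM**1.8) factor.
    ghi = (cg1 * i0 * cos_z
           * np.exp(-cg2 * am * (fh1 + fh2 * (tl - 1.0)))
           * np.exp(0.01 * am ** 1.8))
    # Sun below the horizon: clear-sky GHI is set to zero.
    return np.where(daylight, ghi, 0.0)


# Hypothetical 512 x 512 raster: zenith varies across columns, elevation across rows.
zenith = np.broadcast_to(np.linspace(0.0, 120.0, 512), (512, 512))
elevation = np.broadcast_to(np.linspace(0.0, 3000.0, 512)[:, None], (512, 512))
ghi = clear_sky_ghi(zenith, elevation, linke_turbidity=3.0, day_of_year=180)
```

Because every operation is element-wise, the same function accepts a scalar site, a 1D list of stations, or a full raster without modification, which is the property that removes the per-grid-cell loop.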
