DAG: A Dual Correlation Network for Time Series Forecasting with Exogenous Variables
Time series forecasting is essential in various domains. Compared to relying solely on endogenous variables (i.e., target variables), considering exogenous variables (i.e., covariates) provides additional predictive information and often leads to more accurate predictions. However, existing methods for time series forecasting with exogenous variables (TSF-X) have two shortcomings: 1) they do not leverage future exogenous variables, and 2) they fail to fully account for the correlation between endogenous and exogenous variables. In this study, to better leverage exogenous variables, especially future exogenous variables, we propose DAG, which utilizes a Dual correlAtion network along both the temporal and channel dimensions for time series forecasting with exoGenous variables. Specifically, we propose two core components: the Temporal Correlation Module and the Channel Correlation Module. Both modules consist of a correlation discovery submodule and a correlation injection submodule. The discovery submodules capture the correlation effects of historical exogenous variables on future exogenous variables (temporal) and on historical endogenous variables (channel), respectively. The injection submodules inject the discovered correlations into the processes of forecasting future endogenous variables from historical endogenous variables and from future exogenous variables, respectively.
💡 Research Summary
The paper addresses a gap in time‑series forecasting with exogenous variables (TSF‑X): most existing approaches either ignore future exogenous covariates or treat them as simple concatenated inputs, without explicitly modeling the relationships among historical endogenous, historical exogenous, and future exogenous variables. To overcome these limitations, the authors propose DAG (Dual Correlation Network), an architecture that captures correlation structures along two dimensions, temporal and channel, each through dedicated discovery and injection submodules.
Temporal Correlation Module
The temporal module first quantifies how past exogenous variables (X_exo) influence future exogenous variables (Y_exo) using Granger causality, producing a correlation matrix A_T that encodes time‑wise dependencies across exogenous channels. This matrix is then injected into a Transformer‑based forecasting backbone: the standard self‑attention queries, keys, and values are modulated by A_T, so that the historical‑to‑future dependency pattern discovered on the exogenous series guides the prediction of future endogenous values (Y_endo) from the historical endogenous series (X_endo). The design relies on the structural similarity between the “historical exogenous → future exogenous” and “historical endogenous → future endogenous” dynamics.
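The summary does not say how the Granger statistic is computed or how it becomes A_T, so the following is only a minimal sketch of a pairwise Granger-style F statistic in NumPy. It compares an autoregression of one channel on its own lags against one that also includes lags of another channel. The helper names (`lagged`, `granger_f`, `granger_matrix`), the lag order, and the synthetic data are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def lagged(series, max_lag):
    """Stack lags 1..max_lag of a 1-D series as columns, aligned with series[max_lag:]."""
    n = len(series)
    return np.column_stack([series[max_lag - k : n - k] for k in range(1, max_lag + 1)])

def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef = np.linalg.lstsq(design, target, rcond=None)[0]
    return float(np.sum((target - design @ coef) ** 2))

def granger_f(cause, effect, max_lag=2):
    """F statistic testing whether lags of `cause` improve the prediction of `effect`."""
    y = effect[max_lag:]
    ones = np.ones((len(y), 1))
    restricted = np.hstack([ones, lagged(effect, max_lag)])        # own lags only
    unrestricted = np.hstack([restricted, lagged(cause, max_lag)])  # plus lags of cause
    rss_r, rss_u = rss(restricted, y), rss(unrestricted, y)
    dof = len(y) - unrestricted.shape[1]
    return ((rss_r - rss_u) / max_lag) / (rss_u / dof)

def granger_matrix(exo, max_lag=2):
    """exo has shape (time, channels); entry [i, j] scores channel i -> channel j."""
    c = exo.shape[1]
    scores = np.zeros((c, c))
    for i in range(c):
        for j in range(c):
            if i != j:
                scores[i, j] = granger_f(exo[:, i], exo[:, j], max_lag)
    return scores

# Synthetic example (assumption): channel 1 follows channel 0 with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=499)
scores = granger_matrix(np.column_stack([x, y]))
print(scores)
```

In this toy setup the score for channel 0 → channel 1 is far larger than the reverse direction. A real pipeline would still need a rule for normalizing or thresholding such scores before using them as a correlation matrix, which the summary leaves unspecified.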
Channel Correlation Module
In parallel, the channel module discovers inter‑channel relationships by computing Pearson correlations between historical exogenous and historical endogenous variables, yielding a channel‑wise matrix A_C. This matrix is injected into a second Transformer block that processes future exogenous inputs. By doing so, the model transfers the learned interaction patterns from the past to the future, effectively guiding how each future exogenous channel should affect each target endogenous channel.
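As a rough illustration of the discovery step described above, the sketch below computes Pearson correlations between every historical exogenous channel and every historical endogenous channel. The function name `channel_correlation`, the array layout (time along the first axis), and the toy data are assumptions; how the resulting matrix is injected into the Transformer block is not detailed in the summary and is not shown here.

```python
import numpy as np

def channel_correlation(x_exo, x_endo):
    """Pearson correlation between each exogenous and each endogenous channel.

    x_exo:  array of shape (time, n_exo)
    x_endo: array of shape (time, n_endo)
    returns an array of shape (n_exo, n_endo) with values in [-1, 1]
    """
    exo = x_exo - x_exo.mean(axis=0)      # center each channel
    endo = x_endo - x_endo.mean(axis=0)
    cov = exo.T @ endo                    # unnormalized cross-covariance
    norm = np.outer(np.linalg.norm(exo, axis=0), np.linalg.norm(endo, axis=0))
    return cov / np.maximum(norm, 1e-12)  # guard against constant channels

# Toy data (assumption): two exogenous and two endogenous channels.
t = np.arange(10.0)
x_exo = np.column_stack([t, t ** 2])
x_endo = np.column_stack([2.0 * t + 1.0, -t])
a_c = channel_correlation(x_exo, x_endo)
print(a_c.shape)  # (2, 2)
```

Here the first exogenous channel is perfectly positively correlated with the first endogenous channel and perfectly negatively correlated with the second, so the first row of `a_c` is approximately `[1, -1]`.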
Loss Function and Training
DAG optimizes a composite loss: (i) a standard forecasting loss (e.g., MSE) on Y_endo, (ii) a temporal‑correlation loss that penalizes divergence between the learned A_T and the estimated Granger‑based dependencies, and (iii) a channel‑correlation loss that aligns A_C with the observed Pearson correlations. Hyper‑parameters λ₁ and λ₂ balance the three components, enabling end‑to‑end learning in which correlation discovery and forecasting reinforce each other.
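One plausible reading of this objective is a weighted sum, forecast loss + λ₁ · temporal loss + λ₂ · channel loss. The sketch below follows that reading. The summary does not give the form of the two correlation losses, so using MSE for them, attaching λ₁ and λ₂ to those two terms, and the name `composite_loss` are all assumptions.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def composite_loss(y_pred, y_true, a_t, a_t_ref, a_c, a_c_ref, lam1=0.1, lam2=0.1):
    """forecast + lam1 * temporal + lam2 * channel (all terms assumed to be MSE)."""
    forecast = mse(y_pred, y_true)   # (i) forecasting loss on Y_endo
    temporal = mse(a_t, a_t_ref)     # (ii) learned A_T vs. Granger-based reference
    channel = mse(a_c, a_c_ref)      # (iii) learned A_C vs. Pearson reference
    return forecast + lam1 * temporal + lam2 * channel

# Hand-checkable toy values (assumption): 2.0 + 0.5 * 1.0 + 0.25 * 4.0 = 3.5
loss = composite_loss(
    y_pred=[1.0, 2.0], y_true=[1.0, 4.0],
    a_t=[[0.0, 0.0]], a_t_ref=[[1.0, 1.0]],
    a_c=[[2.0]], a_c_ref=[[0.0]],
    lam1=0.5, lam2=0.25,
)
print(loss)  # 3.5
```

Setting `lam1` and `lam2` to zero recovers a plain forecasting objective, which is a convenient sanity check when tuning the weights.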
Experimental Evaluation
The authors evaluate DAG on a suite of public benchmarks (ETTh, ECL, etc.) and on newly released TSF‑X datasets covering traffic, retail, and energy domains. Baselines include state‑of‑the‑art models such as PatchTST, DUET, TiDE, TFT, TimeXer, and CrossLinear. Across all datasets, DAG consistently outperforms baselines, achieving relative improvements of 3–7% on MSE/MAE/SMAPE. Ablation studies demonstrate that both temporal and channel modules contribute positively, with the full dual‑correlation configuration delivering the highest accuracy.
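For readers unfamiliar with the reported metrics, the sketch below shows how MSE, MAE, SMAPE, and a relative improvement are commonly computed. SMAPE has several variants; the one used here, and all numbers in the example, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def smape(y_true, y_pred):
    """Symmetric MAPE in percent; assumes |y_true| + |y_pred| is never zero."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    denom = np.abs(y_true) + np.abs(y_pred)
    return float(100.0 * np.mean(2.0 * np.abs(y_true - y_pred) / denom))

def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error

# Toy values (not from the paper).
y_true = np.array([100.0, 200.0])
y_pred = np.array([110.0, 180.0])
print(mse(y_true, y_pred), mae(y_true, y_pred))  # 250.0 15.0
print(relative_improvement(0.400, 0.380))         # about 0.05, i.e. 5%
```

Under this definition, a model with an error of 0.380 against a baseline of 0.400 shows a 5% relative improvement, which falls in the middle of the 3–7% range quoted above.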
Strengths and Limitations
Key strengths of DAG are: (1) full exploitation of future exogenous information, (2) explicit modeling of correlations along both the temporal and channel dimensions, and (3) a unified loss that jointly optimizes correlation learning and prediction. However, the approach incurs additional computational overhead from the Granger causality tests and Pearson correlation calculations, which may become burdensome for very high‑dimensional covariate sets. Moreover, Pearson correlation and standard Granger tests are linear measures and may miss complex nonlinear interactions, suggesting room for graph‑based or kernelized extensions.
Conclusion and Future Work
DAG introduces a principled framework for TSF‑X that bridges the gap between exogenous covariate availability and effective utilization. By discovering and injecting dual‑dimensional correlations, it achieves superior forecasting performance across diverse real‑world scenarios. Future research directions include (i) incorporating nonlinear correlation estimators or graph neural networks to capture richer dependencies, (ii) developing memory‑efficient representations for large‑scale exogenous streams, and (iii) extending the model to online or streaming settings where future covariates arrive incrementally.