Quasi-average predictions and regression to the trend: an application to the M6 financial forecasting competition


The efficient market hypothesis holds that all available information is already reflected in asset prices, which limits the possibility of consistently achieving above-average returns by trading on publicly available data. We analyzed low-dispersion prediction methods and their application to the M6 financial forecasting competition. Predictive averages and regression to the trend offer slight but potentially consistent advantages over the reference indexes. We put these results in the context of high-variability approaches, which, if not accompanied by high information content, are bound to underperform the benchmark index as they are prone to overfit the past. In general, predicting expected values under high-uncertainty conditions, such as those assumed by the efficient market hypothesis, is more effective on average than trying to predict actual values.


💡 Research Summary

The paper investigates whether low‑dispersion, average‑based forecasting methods can achieve meaningful outperformance in a highly competitive financial forecasting setting, specifically the M6 competition organized by the Makridakis Institute. Starting from the Efficient Market Hypothesis (EMH), the authors review the persistent underperformance of actively managed U.S. equity funds, noting that only a small fraction of funds beat their benchmarks over multi‑year horizons and that this fraction is even smaller than what random chance alone would produce. This observation motivates the exploration of simple, robust prediction techniques that do not rely on sophisticated, high‑variance models prone to over‑fitting.

The methodology consists of two complementary components. First, a “quasi‑average” forecasting model is built by combining two sources of information: (i) the cross‑sectional average return of each asset class (stocks versus ETFs) and (ii) the temporal average of each individual asset over recent windows (5, 10, and 40 weeks). The authors weight these three temporal averages (0.2, 0.2, 0.6 respectively) and add the class‑level average to obtain a probability distribution over the five quintiles required by the competition. Because the arithmetic mean minimizes mean‑squared error, this approach is theoretically optimal when the information content of the data is limited. The Ranked Probability Score (RPS) is used as the primary accuracy metric; a naïve uniform benchmark yields an RPS of 0.16, whereas the authors’ best submission achieves 0.1573, placing them among the top performers (first in the first quarter, eighth overall).
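These two ingredients can be illustrated with a short sketch. The 5/10/40‑week windows, the 0.2/0.2/0.6 weights, the class‑level average, and the M6‑style RPS over five quintile categories follow the description above; the function names are illustrative, and the step that converts the blended score into the required quintile probabilities is not shown. The last line reproduces the 0.16 expected RPS of the naïve uniform forecast.

```python
import numpy as np

def quasi_average_score(weekly_returns, class_mean, weights=(0.2, 0.2, 0.6)):
    """Blend an asset's temporal averages over 5-, 10-, and 40-week windows
    (most recent observation last) with the class-level average return."""
    w5, w10, w40 = weights
    temporal = (w5  * np.mean(weekly_returns[-5:]) +
                w10 * np.mean(weekly_returns[-10:]) +
                w40 * np.mean(weekly_returns[-40:]))
    return temporal + class_mean

def rps(prob, outcome, n_cats=5):
    """Ranked Probability Score for one asset, as used in M6: mean squared
    distance between the cumulative forecast and the cumulative one-hot outcome."""
    cum_p = np.cumsum(prob)
    cum_o = np.cumsum(np.eye(n_cats)[outcome])
    return np.mean((cum_p - cum_o) ** 2)

# Averaging over the five equally likely quintile outcomes gives 0.16,
# the naive uniform benchmark quoted above.
uniform = np.full(5, 0.2)
print(np.mean([rps(uniform, k) for k in range(5)]))  # ~0.16
```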

Second, the investment decision rule—dubbed “regression to the trend”—selects assets that have performed well over a long horizon (120 trading days) but excludes those that have shown exceptionally strong short‑term performance (top 15% over the last 40 days). This filter is intended to remove transient spikes that are likely to revert toward the mean, thereby reducing portfolio volatility. The baseline portfolio is long the filtered stocks and short a basket of ETFs, with a 2/3 long, 1/3 short allocation. A later “compensated” variant adds a systematic short position in ETFs to further dampen volatility. Performance is evaluated using an information ratio (IR) that sets the benchmark return to zero, analogous to a Sharpe ratio without a risk‑free rate. The regression‑to‑trend strategy yields an IR of 1.30, far exceeding the benchmark IR of 0.45, while the compensated version pushes the IR to about 1.45, albeit with returns close to the benchmark.
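A minimal sketch of this decision rule, assuming daily prices held in pandas DataFrames with one column per ticker: the 120‑day ranking horizon, the exclusion of the top 15% of 40‑day returns, the 2/3 long stocks / 1/3 short ETFs split, and the zero‑benchmark information ratio follow the summary above, while the number of retained stocks and the annualization factor are illustrative placeholders rather than values from the paper.

```python
import numpy as np
import pandas as pd

def regression_to_trend_portfolio(stock_prices: pd.DataFrame,
                                  etf_prices: pd.DataFrame,
                                  long_window=120, short_window=40,
                                  exclude_top=0.15, n_long=20):
    """Go long stocks with strong long-horizon performance, excluding
    short-term spikes likely to revert; short a basket of ETFs."""
    ret_long = stock_prices.iloc[-1] / stock_prices.iloc[-long_window] - 1
    ret_short = stock_prices.iloc[-1] / stock_prices.iloc[-short_window] - 1

    # Drop the top 15% of short-term performers (transient over-reactions).
    spike = ret_short >= ret_short.quantile(1 - exclude_top)
    longs = ret_long[~spike].nlargest(n_long).index

    weights = pd.Series(0.0, index=list(stock_prices.columns) + list(etf_prices.columns))
    weights[longs] = (2 / 3) / len(longs)                          # long stock leg
    weights[etf_prices.columns] = -(1 / 3) / etf_prices.shape[1]   # short ETF leg
    return weights

def information_ratio(portfolio_returns, periods_per_year=252):
    """IR with the benchmark return set to zero (a Sharpe ratio without a
    risk-free rate), annualized here for daily returns."""
    r = np.asarray(portfolio_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)
```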

The results demonstrate that, in a high‑variability environment, predicting expected values (i.e., averages) is more reliable than attempting to forecast precise outcomes. The average‑based forecasts consistently beat the naïve benchmark, and the trend‑regression portfolio achieves superior risk‑adjusted returns by filtering out short‑term overreactions. The authors discuss why further gains may be limited: top‑ranking teams in the competition differ only marginally in RPS, suggesting a ceiling for improvement using simple averaging. Nonetheless, they propose possible extensions, such as Bayesian hierarchical models that jointly estimate class and temporal averages, or clustering assets into finer subclasses before averaging. For the investment side, they suggest optimizing the long‑ and short‑term windows separately for each stock rather than using uniform 120‑day and 40‑day thresholds.
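The claim that expected values are the more reliable target under high variability can be made precise with a standard decomposition of the mean squared error. For an uncertain return $X$ and a point forecast $c$,

$$
\mathbb{E}\big[(X - c)^2\big] = \operatorname{Var}(X) + \big(\mathbb{E}[X] - c\big)^2,
$$

which is minimized at $c = \mathbb{E}[X]$. When the variance term dominates any exploitable signal, as the EMH assumes, reporting the expected value is the best one can do in the mean-squared-error sense, and attempting to call the actual outcome only adds the second, avoidable error term.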

In conclusion, the study provides empirical evidence that modest, low‑variance forecasting techniques—rooted in quasi‑averages and regression‑to‑the‑mean principles—can achieve competitive performance in a real‑world forecasting competition, even under the constraints of the EMH. This challenges the notion that only complex, high‑frequency models can add value, and it highlights the practical utility of simplicity and robustness in financial prediction and portfolio construction.

