A hybrid-Hill estimator enabled by heavy-tailed block maxima

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

When analysing extreme values, two alternative statistical approaches have historically been held in contention: the block maxima method (or annual maxima method, spurred by hydrological applications) and the peaks-over-threshold method. Decried by statisticians as wasteful of potentially informative data, the block maxima method gradually fell into disfavour whilst peaks-over-threshold-based methodologies climbed to the centre stage of extreme value statistics. This paper devises a hybrid method which reconciles these two hitherto disconnected approaches. Appealing in its simplicity, our main result introduces a new universality class of extreme value distributions that discards the customary requirement of a sufficiently large block size for the plausible block maxima-fit to an extreme value distribution. Natural extensions to dependent and/or non-stationary settings are mapped out. We advocate that inference should be drawn solely on larger block maxima, from which practice the mainstream peaks-over-threshold methodology coalesces: the asymptotic properties of the hybrid-Hill estimator herald not only its efficiency but also the viability of a fully-fledged, unified semi-parametric stream of statistics for extreme values. A reduced-bias off-shoot of the hybrid-Hill estimator provably outclasses the incumbent maximum likelihood estimation that relies on a numerical fit to the entire sample of block maxima.


💡 Research Summary

This paper proposes a novel “hybrid‑Hill” (H2) estimator that unifies the two dominant streams in extreme‑value analysis: the block‑maxima (BM) approach and the peaks‑over‑threshold (POT) approach. Traditional BM requires large block sizes to justify the Generalised Extreme Value (GEV) limit, which reduces the effective sample size, while POT relies on a high threshold whose choice can be arbitrary and data‑wasteful. The authors introduce a new universality class (Condition B) showing that, for any block size m ≥ 1, the tail of the block‑maxima distribution 1 − Fₘ, when normalised by a quantile function V(m(t‑½)), converges to a Pareto tail with the same extreme‑value index γ as the underlying distribution. This result removes the need for “sufficiently large” blocks and allows inference to be based on the larger block maxima only.
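A worked special case may help make the "same index for any block size" claim concrete. The derivation below is only an illustration under the assumption of an exact Pareto distribution with i.i.d. observations; it is the classical regular-variation argument, not the paper's Condition B or its normalisation.

```latex
% Assumption: F(x) = 1 - x^{-1/\gamma}, x \ge 1 (exact Pareto), i.i.d. sample,
% so the maximum of a block of size m has distribution F_m = F^m.
\begin{align*}
1 - F_m(x) &= 1 - \bigl(1 - x^{-1/\gamma}\bigr)^{m}
            = m\,x^{-1/\gamma}\bigl(1 + O(x^{-1/\gamma})\bigr),
            \qquad x \to \infty, \\
\lim_{t \to \infty} \frac{1 - F_m(tx)}{1 - F_m(t)} &= x^{-1/\gamma},
            \qquad x > 0,\ m \ge 1 .
\end{align*}
```

In this special case the block size m only rescales the tail by a constant factor, so the upper tail of the block maxima is regularly varying with the same γ for every fixed m.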

Based on this, the H2 estimator is defined as a Hill‑type statistic computed on the top k₀ order statistics of the block‑maxima sample, irrespective of m. The paper derives its asymptotic normality, variance (γ²/k₀), and bias under regular variation (Condition A1) and a second‑order refinement (Condition A2 with parameter ρ ≤ 0). A bias‑reduction scheme is then constructed, yielding a “bias‑reduced H2” that eliminates the leading second‑order bias term.
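As a rough sketch of the statistic described above, the following Python computes a standard Hill estimate on the top k₀ order statistics of non-overlapping block maxima. The function names, the use of non-overlapping blocks, and the choice of the (k₀+1)-th largest maximum as threshold are assumptions made for illustration; the paper's exact definition and its bias-reduced variant may differ.

```python
import numpy as np


def block_maxima(x, m):
    """Maxima of consecutive non-overlapping blocks of length m.

    Any incomplete trailing block is dropped (an illustrative choice).
    """
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // m
    return x[: n_blocks * m].reshape(n_blocks, m).max(axis=1)


def hill_on_block_maxima(x, m, k0):
    """Hill-type estimate of gamma from the k0 largest block maxima.

    Hypothetical helper: averages log-excesses of the top k0 maxima over
    the (k0 + 1)-th largest maximum, as in the classical Hill estimator.
    """
    maxima = np.sort(block_maxima(x, m))
    if not 1 <= k0 < maxima.size:
        raise ValueError("k0 must satisfy 1 <= k0 < number of blocks")
    threshold = maxima[-k0 - 1]
    if threshold <= 0:
        raise ValueError("threshold must be positive to take logarithms")
    return float(np.mean(np.log(maxima[-k0:]) - np.log(threshold)))
```

With m = 1 the block maxima are the observations themselves, so the function reduces to the ordinary Hill estimator on the raw sample.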

Theoretical results are complemented by extensive simulations using i.i.d. Pareto data (γ = ¼, n = 10 000) with block sizes ranging from 1 to 100. The H2 estimator consistently outperforms the conventional GEV maximum-likelihood estimator in mean-squared error, especially when the block size is small (the POT regime). The bias-reduced version improves on this further and is robust to the choice of k₀ (25 % of the block maxima were used in the illustrations).
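A single replication of this kind of experiment could be set up roughly as follows. This is a minimal sketch, not the authors' simulation code: the seed, the three block sizes, the helper function, and the inverse-transform Pareto generator are all assumptions, and neither the GEV maximum-likelihood comparison nor the bias-reduced variant is reproduced.

```python
import numpy as np

GAMMA = 0.25        # extreme-value index quoted in the summary
N = 10_000          # sample size quoted in the summary
K0_FRACTION = 0.25  # share of block maxima used, as in the illustrations
SEED = 0            # arbitrary seed, chosen here for reproducibility


def hill_on_block_maxima(x, m, k0):
    """Hill-type estimate from the k0 largest non-overlapping block maxima."""
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // m
    maxima = np.sort(x[: n_blocks * m].reshape(n_blocks, m).max(axis=1))
    if not 1 <= k0 < maxima.size:
        raise ValueError("k0 must satisfy 1 <= k0 < number of blocks")
    return float(np.mean(np.log(maxima[-k0:]) - np.log(maxima[-k0 - 1])))


rng = np.random.default_rng(SEED)
# Inverse-transform sampling: if U ~ Uniform(0, 1], then U**(-GAMMA)
# follows the Pareto law F(x) = 1 - x**(-1/GAMMA) on x >= 1.
sample = (1.0 - rng.random(N)) ** (-GAMMA)

estimates = {}
for m in (1, 10, 100):  # illustrative block sizes within the 1..100 range
    k0 = int(K0_FRACTION * (N // m))
    estimates[m] = hill_on_block_maxima(sample, m, k0)
    print(f"m = {m:3d}, k0 = {k0:4d}, estimate = {estimates[m]:.4f}")
```

Repeating this over many seeds and averaging the squared errors against γ would give a Monte Carlo mean-squared error for each block size; a single run only indicates how the estimate varies with m.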

The authors also discuss extensions to weakly dependent and non‑stationary time series, noting that Conditions A1 and A2 accommodate such settings, and that the block‑maxima construction can be applied with overlapping or adaptive blocks.

In summary, the hybrid‑Hill estimator provides a semi‑parametric, block‑size‑agnostic tool that merges the robustness of BM (insensitivity to short‑term irregularities) with the efficiency of POT (use of tail information). It reduces data waste, simplifies threshold selection, and offers a theoretically sound bias correction. Limitations include the need for practical guidance on selecting k₀ and further validation on multivariate or strongly dependent data, which are suggested as avenues for future research.

