Inference with Multivariate Heavy-Tails in Linear Models

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

Heavy-tailed distributions naturally occur in many real-life problems. Unfortunately, it is typically not possible to compute inference in closed form in graphical models that involve such heavy-tailed distributions. In this work, we propose a novel simple linear graphical model for independent latent random variables, called the linear characteristic model (LCM), defined in the characteristic function domain. Using stable distributions, a heavy-tailed family of distributions that generalizes the Cauchy, Lévy, and Gaussian distributions, we show for the first time how to compute both exact and approximate inference in such a linear multivariate graphical model. LCMs are not limited to stable distributions; in fact, LCMs are defined for any random variables (discrete, continuous, or a mixture of both). We provide a realistic problem from the field of computer networks to demonstrate the applicability of our construction. Another potential application is iterative decoding of linear channels with non-Gaussian noise.


💡 Research Summary

The paper tackles the long‑standing problem of performing exact and approximate inference in graphical models that involve heavy‑tailed distributions, which are common in many real‑world domains such as network traffic, finance, and geophysics. Traditional inference techniques work well for Gaussian or exponential‑family models because their probability density functions (pdfs) have closed‑form expressions and are closed under linear transformations. Heavy‑tailed families, especially the stable distributions (which include Gaussian, Cauchy, and Lévy as special cases), lack such tractable pdfs, making standard methods infeasible.

To overcome this limitation, the authors introduce the Linear Characteristic Model (LCM). An LCM represents a linear relationship Y = AX between latent variables X and observations Y, but instead of factorizing the joint pdf, it factorizes the joint characteristic function (cf) – the Fourier transform of the pdf. Since every random variable (discrete, continuous, or mixed) possesses a cf, the model is always well‑defined, even when the inverse Fourier transform does not exist. This contrasts with Convolutional Factor Graphs (CFG), which require the existence of an inverse transform, and with copula methods that operate in the cumulative distribution domain and cannot handle stable laws analytically.
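To make the cf-domain factorization concrete, here is a minimal sketch (our own illustration, not code from the paper) for the Gaussian special case α = 2, where the joint cf of Y = AX has a known closed form we can check against. The matrix `A` and the evaluation point `t` are arbitrary choices:

```python
import numpy as np

# Sketch: for Y = A X with independent standard-Gaussian X_i (the alpha = 2
# stable case), the joint cf of Y factorizes over the latent variables:
#   phi_Y(t) = prod_i phi_{X_i}((A^T t)_i),  with  phi_{X_i}(s) = exp(-s^2 / 2).
# We check this against the known closed form phi_Y(t) = exp(-t^T A A^T t / 2).

A = np.array([[1.0, 0.5],
              [0.2, 2.0]])
t = np.array([0.3, -0.7])

s = A.T @ t                                   # cf argument seen by each X_i
cf_factorized = np.prod(np.exp(-s**2 / 2.0))  # product of per-variable cfs
cf_closed_form = np.exp(-t @ A @ A.T @ t / 2.0)

assert np.isclose(cf_factorized, cf_closed_form)
```

The same product structure is what an LCM exploits for general α, where no closed-form pdf exists but the per-variable cfs are still available.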

The paper first reviews the stable distribution family S(α, β, γ, δ), emphasizing the key property that a linear combination of independent α‑stable variables remains α‑stable (with the same characteristic exponent α). This linearity underpins all subsequent derivations.
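The closure property is easiest to see in the cf domain. As a small sketch in our own notation (symmetric case β = 0, δ = 0): the cf of a symmetric α-stable variable with scale γ is exp(−(γ|t|)^α), so the product of two such cfs is again of the same form with a combined scale:

```python
import numpy as np

def sas_cf(t, alpha, gamma):
    """cf of a symmetric alpha-stable variable S(alpha, 0, gamma, 0)."""
    return np.exp(-(gamma * np.abs(t)) ** alpha)

# For independent X1, X2, the cf of X1 + X2 is the product of their cfs,
# which is again symmetric alpha-stable with scale
# (gamma1**alpha + gamma2**alpha)**(1/alpha).
alpha, g1, g2 = 1.7, 1.0, 2.0        # alpha = 1.7 mirrors the traffic-data fit
g_sum = (g1**alpha + g2**alpha) ** (1.0 / alpha)

ts = np.linspace(-5, 5, 101)
assert np.allclose(sas_cf(ts, alpha, g1) * sas_cf(ts, alpha, g2),
                   sas_cf(ts, alpha, g_sum))
```

Only for α = 2 (Gaussian), α = 1 (Cauchy), and α = 1/2 (Lévy) does this cf invert to a closed-form density, which is why the paper works in the cf domain throughout.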

The authors then formalize LCMs: the joint cf of all latent and observed variables is expressed as a product of factor cfs, each corresponding to a node in the linear model. Theorems 3.3 and 3.4 establish the duality between the cf representation and the traditional pdf representation via Fourier and inverse‑Fourier transforms. Consequently, inference can be performed entirely in the cf domain, sidestepping the need for closed‑form pdfs.
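The duality in question is the standard Fourier pair between a density and its characteristic function (shown here in scalar form for reference):

```latex
\varphi_X(t) \;=\; \mathbb{E}\!\left[e^{itX}\right]
\;=\; \int_{-\infty}^{\infty} e^{itx}\, p_X(x)\, dx,
\qquad
p_X(x) \;=\; \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\, \varphi_X(t)\, dt,
```

where the forward transform always exists, while the inversion formula requires the cf to be integrable. This asymmetry is exactly why the cf representation is more broadly defined than the pdf one.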

Two main inference strategies are presented.

  1. Exact inference (LCM‑Elimination) – By “slicing” (setting the cf argument of a variable to zero) and marginalizing sequentially, the algorithm computes the marginal cf of any variable. On tree‑structured graphs the procedure yields exact marginals; the order of elimination does not affect correctness, only computational efficiency.
  2. Approximate inference – Three algorithms are proposed:
    • Characteristic‑Sum‑Product (CSP) and Integral‑Convolution (IC), which are message‑passing schemes that are exact on trees but require numerical integration on loopy graphs.
    • Stable‑Jacobi, an iterative fixed‑point method that updates parameters of the marginal stable distributions using linear algebraic relations derived from the stability property. This method scales to large, sparse linear systems and converges under conditions related to the spectral radius of the transformation matrix and the stability exponent α.
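The paper's Stable-Jacobi updates all parameters of the stable marginals; as a simplified illustration of the underlying fixed-point recursion, here is classical Jacobi iteration applied only to location values (which coincides with the Gaussian α = 2 case). The matrix and vector below are our own toy example:

```python
import numpy as np

def jacobi(A, b, iters=200):
    """Fixed-point iteration x_{k+1} = D^{-1} (b - R x_k) for A x = b,
    where A = D + R and D is the diagonal part of A. Converges when the
    spectral radius of D^{-1} R is below one."""
    D = np.diag(A)
    R = A - np.diag(D)
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])      # diagonally dominant => convergent
b = np.array([1.0, 2.0, 3.0])

x = jacobi(A, b)
assert np.allclose(A @ x, b, atol=1e-8)
```

Stable-Jacobi runs an analogous recursion simultaneously on the scale, skewness, and location parameters, with the convergence condition tied to both the spectral radius and the exponent α, as the summary notes.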

The centerpiece of the theoretical contribution is Theorem 4.3, which gives closed‑form parameter transformations for the entire system when the latent variables X and the additive noise Z are i.i.d. α‑stable. If A is invertible, the observations Y are also α‑stable, with scale, skewness, and location parameters expressed as functions of |A|^α, sign(A), and the original parameters of X and Z. Moreover, the posterior distribution X | Y remains α‑stable, and explicit formulas for its parameters are derived. This result extends the well‑known Gaussian linear model (α = 2) to the full range of stable laws, providing a rare instance of exact Bayesian updating for heavy‑tailed linear systems.
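The scalar version of this closure property is well known and conveys the shape of the transformations (shown here in a common parametrization for α ≠ 1; the paper's Theorem 4.3 states the matrix form, and its exact notation may differ):

```latex
Y=\sum_i a_i X_i,\quad X_i \sim S(\alpha,\beta_i,\gamma_i,\delta_i)\ \text{independent}
\;\Longrightarrow\;
\gamma_Y^{\alpha}=\sum_i |a_i|^{\alpha}\gamma_i^{\alpha},\qquad
\beta_Y=\frac{\sum_i \operatorname{sign}(a_i)\,\beta_i\,|a_i|^{\alpha}\gamma_i^{\alpha}}{\gamma_Y^{\alpha}},\qquad
\delta_Y=\sum_i a_i\,\delta_i .
```

Note how the scale aggregates through |a_i|^α and the skewness through sign(a_i), matching the summary's description of parameters expressed via |A|^α and sign(A).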

Empirical validation uses real network traffic traces collected from the PlanetLab testbed. The authors fit stable distributions to flow sizes, finding characteristic exponents around 1.7, confirming the heavy‑tailed nature of the data. Applying LCM‑Elimination, they infer the traffic contribution of individual routers from aggregate measurements, achieving a mean absolute error reduction of over 30 % compared with a Gaussian‑based Bayesian estimator. In a separate simulation of linear channel decoding with non‑Gaussian (α‑stable) noise, the Stable‑Jacobi algorithm converges rapidly and yields a bit‑error rate substantially lower than that of conventional Gaussian‑assumption decoders.

In summary, the paper introduces a novel, mathematically rigorous framework for linear models with heavy‑tailed latent variables. By operating in the characteristic‑function domain, it circumvents the intractability of stable pdfs, provides exact inference formulas for a broad class of stable distributions, and offers scalable approximate algorithms for larger or loopy graphs. The work opens the door to principled Bayesian analysis in many domains where heavy tails dominate, including network tomography, financial risk modeling, and signal processing under impulsive noise.

