Spatiotemporal Autoregressive Models for Areal Compositional Data
Compositional data, such as regional shares of economic sectors or property transactions, are central to understanding structural change in economic systems across space and time. This paper introduces a spatiotemporal multivariate autoregressive model tailored for panel data with composition-valued responses at each areal unit and time point. The proposed framework enables the joint modelling of temporal dynamics and spatial dependence under compositional constraints, and is estimated via a quasi-maximum likelihood approach. We build on recent theoretical advances to establish the identifiability and asymptotic properties of the estimator as both the number of regions and the number of time points grow. The utility and flexibility of the model are demonstrated through two applications: analysing property transaction compositions in an intra-city housing market (Berlin), and regional sectoral compositions in Spain’s economy. These case studies highlight how the proposed framework captures key features of spatiotemporal economic processes that are often missed by conventional methods.
Research Summary
The paper introduces a novel statistical framework for analyzing areal compositional data observed over time. Compositional data consist of non‑negative parts that sum to a constant, which imposes a simplex constraint and precludes the direct use of standard regression or time‑series methods. To respect this constraint while modelling spatial and temporal dependence jointly, the authors proceed in three main steps.
First, they map the simplex to an unconstrained Euclidean space using the isometric log-ratio (ilr) transformation. The ilr is an isometric isomorphism between the Aitchison geometry of the simplex and ℝ^{D-1}: it preserves inner products and distances, and therefore allows the use of ordinary multivariate normal theory on the transformed data. The paper discusses the centred log-ratio (clr) as an intermediate step and notes that α-transformations can be employed when some components are zero, since log-ratios are undefined there.
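The transformation step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the pivot (Helmert-type) contrast matrix used here is only one of many valid orthonormal ilr bases, so the paper may well use a different one. All parts are assumed to be strictly positive.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio: log parts minus the mean log part."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def ilr_basis(D):
    """(D-1) x D matrix with orthonormal rows that each sum to zero.

    Assumption: pivot-type contrasts; any orthonormal basis of the
    clr hyperplane would serve equally well.
    """
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i
        V[i - 1, i] = -1.0
        V[i - 1] *= np.sqrt(i / (i + 1.0))
    return V

def ilr(x):
    """Map compositions with D parts to coordinates in R^(D-1)."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1]).T

def ilr_inv(z):
    """Map ilr coordinates back to the simplex."""
    z = np.asarray(z, dtype=float)
    D = z.shape[-1] + 1
    return closure(np.exp(z @ ilr_basis(D)))
```

Because the rows of the basis are orthonormal, Euclidean distances between ilr coordinates equal Aitchison distances between the original compositions, which is the isometry property the summary refers to.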
Second, on the transformed data they specify a multivariate simultaneous autoregressive (MSAR) model that incorporates spatial lags, temporal lags, and spatial‑temporal lags. The model can be written as
Y_t = λ W Y_t + γ Y_{t‑1} + ρ W Y_{t‑1} + X_t Π + V_t,
where Y_t stacks the ilr-transformed compositions of the n regions at time t, W is a pre-specified, row-standardised spatial weight matrix, λ captures contemporaneous spatial spillovers, γ the autoregressive effect within each region, and ρ the spillover of past values from neighbours; X_t are covariates, Π their coefficients, and V_t are i.i.d. errors. This structure simultaneously accounts for (i) current neighbour influence, (ii) each region's own past, and (iii) neighbours' past, thus disentangling three distinct channels of dependence.
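To make the three channels concrete, the model can be simulated from its reduced form, Y_t = (I - λW)^{-1} (γ Y_{t-1} + ρ W Y_{t-1} + X_t Π + V_t). The sketch below rests on several assumptions that are not taken from the paper: Y_t is stored as an n × (D-1) matrix, the coefficients λ, γ, ρ are scalars as in the equation above (the paper may allow matrix-valued coefficients), the errors are standard normal, and W describes a hypothetical ring of regions. The coefficient values reuse the magnitudes quoted for Berlin purely for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row-standardised W for a hypothetical ring of n >= 3 regions."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5  # left neighbour
        W[i, (i + 1) % n] = 0.5  # right neighbour
    return W

def simulate(W, lam, gamma, rho, X, Pi, rng, sigma=1.0):
    """Simulate Y_1..Y_T from the reduced form, starting at Y_0 = 0.

    X has shape (T + 1, n, k), with X[0] unused; Pi has shape (k, q).
    """
    T1, n, _ = X.shape
    q = Pi.shape[1]
    A_inv = np.linalg.inv(np.eye(n) - lam * W)
    Y = np.zeros((T1, n, q))
    for t in range(1, T1):
        V = sigma * rng.standard_normal((n, q))        # i.i.d. errors (assumed Gaussian)
        rhs = gamma * Y[t - 1] + rho * (W @ Y[t - 1]) + X[t] @ Pi + V
        Y[t] = A_inv @ rhs                             # solve for the contemporaneous lag
    return Y

rng = np.random.default_rng(0)
n, T, k, q = 6, 50, 2, 2                               # q = D - 1 for D = 3 parts
W = ring_weights(n)
X = rng.standard_normal((T + 1, n, k))
Pi = np.array([[0.5, -0.2], [0.1, 0.3]])               # illustrative coefficients
Y = simulate(W, lam=0.28, gamma=0.45, rho=0.22, X=X, Pi=Pi, rng=rng)
```

With a row-standardised W, a commonly used sufficient condition for a stable process is |λ| + |γ| + |ρ| < 1, which the illustrative values satisfy.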
Third, estimation proceeds via quasi-maximum likelihood (QML). Treating the ilr-transformed observations as approximately multivariate normal yields a tractable Gaussian criterion. The authors derive consistency and asymptotic normality of the estimator under a "large-n, large-T" asymptotic regime, in which both the number of regions n and the number of time periods T grow. Key technical conditions include a normalised spatial weight matrix, bounded eigenvalues of the spatial coefficient matrix, and independence of the error vectors across time and space.
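A Gaussian quasi-log-likelihood for the model equation above might look as follows. This is a sketch under stated assumptions rather than the paper's exact criterion: the coefficients are scalars, the rows of V_t are independent with a common covariance matrix Sigma, and the likelihood is conditional on the initial observation Y_0. The function name and argument layout are hypothetical.

```python
import numpy as np

def quasi_loglik(Y, X, W, lam, gamma, rho, Pi, Sigma):
    """Gaussian quasi-log-likelihood of Y_1..Y_T given Y_0.

    Y: (T + 1, n, q) transformed responses, X: (T + 1, n, k) covariates,
    Pi: (k, q) coefficients, Sigma: (q, q) error covariance (assumed).
    """
    T1, n, q = Y.shape
    T = T1 - 1
    A = np.eye(n) - lam * W
    _, logdet_A = np.linalg.slogdet(A)       # Jacobian term of the spatial lag
    _, logdet_S = np.linalg.slogdet(Sigma)
    S_inv = np.linalg.inv(Sigma)
    quad = 0.0
    for t in range(1, T1):
        # Residuals implied by the structural equation at time t
        V = A @ Y[t] - gamma * Y[t - 1] - rho * (W @ Y[t - 1]) - X[t] @ Pi
        quad += np.trace(V @ S_inv @ V.T)
    return (-0.5 * n * T * q * np.log(2.0 * np.pi)
            - 0.5 * n * T * logdet_S
            + T * q * logdet_A
            - 0.5 * quad)
```

The term T·q·log|I - λW| is what distinguishes this criterion from ordinary least squares: Y_t appears on both sides of the equation, so the change of variables from V_t to Y_t contributes a Jacobian. Maximising the function over (λ, γ, ρ, Π, Σ) with a numerical optimiser would give the QML estimate.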
A Monte‑Carlo simulation study explores finite‑sample performance across different numbers of components D, spatial dimensions n, and time lengths T. Results show small bias, accurate standard‑error estimates, and good separation of λ, γ, and ρ even when spatial dependence is moderate to strong.
The methodology is illustrated with two empirical applications.
- Berlin real‑estate market: Monthly transaction shares of three categories (condominiums, developed land, undeveloped land) are observed for 24 postal districts from 1995 to 2015. The estimated spatial coefficient λ≈0.28, temporal coefficient γ≈0.45, and spatial‑temporal lag ρ≈0.22 reveal moderate spatial autocorrelation and a clear dynamic pattern: rising condominium shares in one district tend to increase the share of developed land in neighbouring districts with a lag.
- Spanish sectoral composition: Annual shares of services, industry, and construction are recorded for 2,793 municipalities over 2012‑2021. Despite the short time horizon, the spatial lag remains significant, indicating that a municipality’s increase in service sector share is associated with a decrease in the industrial share of its neighbours.
The paper discusses limitations: handling of zeros (requiring α‑transformations), sensitivity to the choice of W, and the assumption of Gaussian errors. It suggests future extensions such as Bayesian hierarchical formulations, multi‑scale spatial weights, and non‑Gaussian error structures.
Overall, the study provides a rigorous, flexible, and computationally feasible approach for modelling spatiotemporal compositional panel data, opening new possibilities for policy analysis in urban economics, regional planning, and environmental studies where the relative composition of categories evolves over space and time.