Scalable spatial point process models for forensic footwear analysis
Shoe print evidence recovered from crime scenes plays a key role in forensic investigations. By examining shoe prints, investigators can determine details of the footwear worn by suspects. However, establishing that a suspect’s shoes match the make and model of a crime scene print may not be sufficient. Typically, thousands of shoes of the same size, make, and model are manufactured, any of which could be responsible for the print. Accordingly, a popular approach used by investigators is to examine the print for signs of “accidentals,” i.e., cuts, scrapes, and other features that accumulate on shoe soles after purchase due to wear. While some patterns of accidentals are common on certain types of shoes, others are highly distinctive, potentially distinguishing the suspect’s shoe from all others. Quantifying the rarity of a pattern is thus essential to accurately measuring the strength of forensic evidence. In this study, we address this task by developing a hierarchical Bayesian model. Our improvement over existing methods primarily stems from two advancements. First, we frame our approach in terms of a latent Gaussian model, thus enabling inference to be efficiently scaled to large collections of annotated shoe prints via integrated nested Laplace approximations. Second, we incorporate spatially varying coefficients to model the relationship between shoes’ tread patterns and accidental locations. We demonstrate these improvements through superior performance on held-out data, which enhances accuracy and reliability in forensic shoe print analysis.
Research Summary
This paper addresses a central challenge in forensic footwear analysis: quantifying how rare a particular pattern of “accidentals” (cuts, scrapes, and other wear‑induced marks) is on a shoe sole, and using that rarity to compute a random‑match probability (RMP) for a crime‑scene print. Traditional approaches have either assumed a uniform distribution of accidentals across the outsole or have aggregated accidentals from many shoes into a single heat map, ignoring the specific geometry of each shoe’s contact surface. While Spencer and Murray (2020) introduced a semi‑parametric hierarchical Bayesian Cox‑process model that linked accidentals to a discretized contact‑surface image, their inference relied on bespoke Markov‑chain Monte Carlo (MCMC) sampling, which becomes computationally prohibitive for large datasets.
The authors propose a new framework that reframes the problem as a latent Gaussian spatial point‑process model, enabling fast approximate Bayesian inference via Integrated Nested Laplace Approximation (INLA). By treating accidental locations as a Poisson point process whose log‑intensity is a sum of a global intercept, spatially varying regression coefficients, and a smooth Gaussian random field, the model captures both global trends and local effects of the shoe’s tread pattern. The contact‑surface image is retained at a relatively high resolution; each pixel (or small patch) contributes a covariate that can vary smoothly across the sole. Spatially varying coefficients are modeled with second‑order random walks or Matérn kernels, providing a flexible, non‑linear relationship between tread geometry and accidental propensity.
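The additive log-intensity described above can be illustrated on a pixel grid. The following is a minimal sketch, not the authors' implementation: the grid size, the intercept value, the heel-to-toe coefficient gradient, and all function and variable names are hypothetical, and the Gaussian random field is set to zero for simplicity.

```python
import numpy as np

def log_intensity(beta0, beta_field, contact, u_field):
    """Log-intensity per pixel: intercept + spatially varying
    coefficient * contact covariate + random-field term."""
    return beta0 + beta_field * contact + u_field

def expected_counts(log_lam, cell_area):
    """Expected number of accidentals per pixel under a Poisson process."""
    return np.exp(log_lam) * cell_area

rng = np.random.default_rng(0)
H, W = 20, 10  # hypothetical grid resolution

# Hypothetical binary contact surface: 1 where the tread touches the ground.
contact = (rng.random((H, W)) > 0.5).astype(float)

# Hypothetical spatially varying coefficient, increasing smoothly along the sole.
rows = np.linspace(0.0, 1.0, H)[:, None]
beta_field = 0.5 + 1.0 * rows * np.ones((1, W))

# Random-field term set to zero here; a fitted model would estimate it.
u_field = np.zeros((H, W))

log_lam = log_intensity(-3.0, beta_field, contact, u_field)
mu = expected_counts(log_lam, cell_area=1.0)

# Simulated accidental counts per pixel.
counts = rng.poisson(mu)
```

In this sketch the contact covariate changes the intensity by a factor of exp(beta_field) at each pixel, so the same tread feature can matter more in one region of the sole than in another.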
Key methodological contributions include:
- Scalable inference – INLA provides deterministic Laplace approximations to posterior marginals, delivering orders‑of‑magnitude speed‑ups over MCMC while preserving accuracy. The authors demonstrate that fitting the model to the WVU database (≈1,300 shoes and tens of thousands of accidental points) completes in a few hours on a standard workstation, compared with days required by previous MCMC‑based methods.
- Explicit use of shoe‑specific geometry – The spatially varying coefficients allow the intensity surface to adapt to each shoe’s unique tread layout, rather than relying on a single aggregated heat map. This yields shoe‑specific accidental density estimates, which are essential for computing meaningful RMPs.
- Improved predictive performance – Through 10‑fold cross‑validation, the proposed model outperforms both the registration‑aggregation baselines and the earlier hierarchical Bayesian model on log‑likelihood, area‑under‑the‑curve, and calibrated RMP estimates. In particular, the model correctly identifies high‑risk regions (e.g., toe‑strike zones, lateral edges) where accidentals are more likely to appear, leading to more conservative (higher) RMPs when such marks are observed on a crime‑scene print.
- Interpretability – The estimated spatially varying coefficients can be visualized as maps that highlight which tread features most strongly increase or decrease accidental intensity. This transparency is valuable for forensic experts presenting statistical evidence in court.
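To give a feel for the kind of Gaussian approximation that INLA builds on, the sketch below finds the posterior mode of a latent field under Poisson counts and a Gaussian prior, then uses the curvature at the mode as an approximate precision. It is only the inner Laplace step on a one-dimensional toy problem, not the full INLA algorithm and not the authors' code; the first-order random-walk prior, the ridge term, the prior mean, and all names are assumptions made for illustration.

```python
import numpy as np

def rw1_precision(n, tau=1.0, ridge=1e-3):
    """Precision of a first-order random walk, with a small ridge so it is proper."""
    D = np.diff(np.eye(n), axis=0)  # (n-1, n) first-difference matrix
    return tau * D.T @ D + ridge * np.eye(n)

def log_post(x, y, Q, m):
    """Unnormalised log posterior: Poisson log-likelihood plus Gaussian prior."""
    r = x - m
    return np.sum(y * x - np.exp(x)) - 0.5 * r @ Q @ r

def laplace_mode(y, Q, m, max_iter=100, tol=1e-8):
    """Newton iterations with step halving; returns the mode and the
    negative Hessian there (the precision of the Gaussian approximation)."""
    x = np.log(y + 0.5)
    for _ in range(max_iter):
        grad = y - np.exp(x) - Q @ (x - m)
        neg_hess = np.diag(np.exp(x)) + Q
        step = np.linalg.solve(neg_hess, grad)
        t = 1.0
        while log_post(x + t * step, y, Q, m) < log_post(x, y, Q, m) and t > 1e-8:
            t *= 0.5
        x = x + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return x, np.diag(np.exp(x)) + Q

rng = np.random.default_rng(1)
n = 30
true_x = np.log(3.0) + np.sin(np.linspace(0.0, np.pi, n))
y = rng.poisson(np.exp(true_x)).astype(float)

Q = rw1_precision(n, tau=5.0)
m = np.full(n, np.log(3.0))
x_hat, prec = laplace_mode(y, Q, m)

# Approximate posterior standard deviations of the latent log-intensity.
approx_sd = np.sqrt(np.diag(np.linalg.inv(prec)))
```

Because the approximation is obtained by deterministic optimisation rather than sampling, its cost is dominated by a handful of linear solves, which is the intuition behind the speed-ups over MCMC reported above.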
The paper also discusses practical considerations: the limited repeatability of exact contact‑surface images in existing databases, measurement error in accidental coordinates, and the need to define tolerance thresholds for “consistency” between a print and a shoe. The authors propose using the contact surface as a proxy for class characteristics and general wear, thereby modeling the conditional distribution $p(\text{accidentals} \mid \text{contact surface})$ directly.
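One way to make the conditional distribution concrete is to discretize the sole into pixels and score an observed accidental pattern with independent Poisson counts per pixel. The sketch below assumes that discretization, which may differ from the likelihood used in the paper; the function name and the toy intensity surfaces are hypothetical.

```python
from math import lgamma
import numpy as np

def poisson_grid_loglik(counts, mu):
    """Log-probability of per-pixel accidental counts given expected counts mu,
    assuming independent Poisson counts in each pixel."""
    counts = np.asarray(counts, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # log(k!) computed as lgamma(k + 1) for each pixel count
    log_fact = np.array([lgamma(k + 1.0) for k in counts.ravel()]).reshape(counts.shape)
    return float(np.sum(counts * np.log(mu) - mu - log_fact))

# Two hypothetical intensity surfaces with the same total expected count.
mu_a = np.array([[0.1, 0.9]])
mu_b = np.array([[0.9, 0.1]])

# One accidental observed in the right-hand pixel.
observed = np.array([[0, 1]])

ll_a = poisson_grid_loglik(observed, mu_a)
ll_b = poisson_grid_loglik(observed, mu_b)
```

Here the same pattern is more probable under `mu_a`, which places most of its intensity where the accidental was seen, than under `mu_b`. Summing such scores over held-out shoes gives a log-likelihood comparison of the kind used to evaluate competing models.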
Limitations are acknowledged. The current formulation treats the outsole as a two‑dimensional planar surface, ignoring three‑dimensional shape aspects such as heel height or curvature that could affect wear patterns. Moreover, the model assumes perfect detection of accidentals; in practice, low‑quality prints may miss subtle marks, suggesting future work should incorporate observation‑error models.
In conclusion, the study delivers a statistically rigorous, computationally scalable, and interpretable solution for forensic footwear analysis. By leveraging latent Gaussian Cox processes and INLA, it bridges the gap between rich, shoe‑specific geometric information and the need for rapid, data‑driven inference on large forensic databases. The approach promises to enhance the reliability of random‑match probability estimates, thereby strengthening the evidential weight of shoe‑print evidence in judicial settings and paving the way for automated, real‑time forensic matching systems.