Estimation of Contextual Exposure to HIV from GPS Data
We present a comprehensive statistical methodological framework for estimating contextual exposure to HIV that includes local (grid-cell level) estimation of HIV prevalence and human activity space estimation based on GPS data. The development of our framework was necessary to analyze HIV surveillance and sociodemographic survey data in conjunction with GPS data collected in rural KwaZulu-Natal, South Africa, to study the mobility patterns of young people. Based on mobility and contextual exposure measures, we examine whether the sex and age of study participants systematically influence the extent and structure of their mobility patterns. We discuss techniques for investigating how the study participants’ contextual exposure to HIV changes as their activity spaces expand beyond residential locations, as well as methods for identifying study participants who may be at increased risk of acquiring HIV.
KEYWORDS: Contextual HIV exposure; GPS-based mobility analysis; Activity space; HIV prevalence mapping
Research Summary
This paper presents a comprehensive statistical framework for estimating contextual exposure to HIV by integrating high‑resolution HIV surveillance data with GPS‑derived mobility information collected in rural KwaZulu‑Natal, South Africa. The authors first describe two primary data sources: (1) a longitudinal HIV cohort maintained by the Africa Health Research Institute (AHRI) covering the years 2011‑2023, which includes 12 annual demographic surveillance rounds, quarterly updates on births, deaths, and migrations, and precise residential geocodes (<2 m accuracy); and (2) the “Sesikhona” GPS study conducted from June 2021 to May 2025, which enrolled 207 participants aged 20‑30 years and recorded their smartphone locations at irregular intervals ranging from 10 minutes to 2 hours. After data cleaning, 204 participants provided usable mobility traces.
The methodological core consists of four interlocking components. First, the authors address the pervasive missingness of HIV status in the surveillance cohort. Rather than using the conventional midpoint imputation between the last negative and first positive test, they develop a Bayesian logistic‑regression multiple‑imputation scheme that incorporates individual covariates (sex, age, migration history) and spatial prevalence trends, thereby generating plausible HIV status trajectories for each participant across periods of non‑testing.
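For reference, the conventional midpoint rule that the authors move away from can be sketched in a few lines. This illustrates only the baseline, not the Bayesian multiple-imputation scheme; the function names and the example dates are hypothetical.

```python
from datetime import date


def midpoint_seroconversion(last_negative: date, first_positive: date) -> date:
    """Baseline rule: place seroconversion halfway between the two tests."""
    if first_positive < last_negative:
        raise ValueError("first positive test precedes last negative test")
    # date + timedelta ignores any fractional-day remainder of the halved gap.
    return last_negative + (first_positive - last_negative) / 2


def imputed_status(on: date, last_negative: date, first_positive: date) -> bool:
    """True if the person is treated as HIV-positive on the given date."""
    return on >= midpoint_seroconversion(last_negative, first_positive)


# Hypothetical test dates, for illustration only.
neg, pos = date(2015, 1, 1), date(2015, 1, 11)
print(midpoint_seroconversion(neg, pos))
print(imputed_status(date(2015, 1, 3), neg, pos))
```

The midpoint rule assigns a single deterministic date and ignores covariates, which is the limitation a covariate-aware multiple-imputation scheme is meant to address.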
Second, they estimate HIV prevalence at the grid‑cell level across the study region. Using a Bayesian spatial model (e.g., conditional autoregressive priors), they combine observed test results with auxiliary information such as population density, age‑sex structure, and neighboring cell prevalence to produce posterior prevalence maps with credible intervals. This fine‑scale mapping supersedes traditional administrative‑boundary approaches and captures micro‑level heterogeneity.
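The idea of borrowing strength from neighboring cells can be illustrated with a much simpler stand-in than a conditional autoregressive model. The sketch below shrinks each cell's raw prevalence toward the mean of its neighbors' raw prevalences using a beta-binomial style posterior mean. It is not the authors' model: the function name, the `prior_strength` parameter, and the toy counts are all assumptions made for illustration, and it produces point estimates only, without credible intervals.

```python
def smoothed_prevalence(positives, tested, neighbours, prior_strength=10.0):
    """Shrink each cell's prevalence toward its neighbours' mean raw prevalence.

    positives, tested: dict mapping cell id -> counts.
    neighbours: dict mapping cell id -> list of adjacent cell ids.
    prior_strength: pseudo-sample size given to the neighbour-based prior.
    """
    total_tested = sum(tested.values())
    if total_tested <= 0:
        raise ValueError("no tests recorded in any cell")
    overall = sum(positives.values()) / total_tested
    raw = {c: positives[c] / n for c, n in tested.items() if n > 0}

    estimates = {}
    for cell, n in tested.items():
        nb = [raw[k] for k in neighbours.get(cell, []) if k in raw]
        # Fall back to the overall prevalence when no neighbour has data.
        prior_mean = sum(nb) / len(nb) if nb else overall
        estimates[cell] = (positives[cell] + prior_strength * prior_mean) / (
            n + prior_strength
        )
    return estimates


# Toy example: cell C has no tests and inherits its neighbour's level.
est = smoothed_prevalence(
    positives={"A": 2, "B": 6, "C": 0},
    tested={"A": 10, "B": 10, "C": 0},
    neighbours={"A": ["B"], "B": ["A", "C"], "C": ["B"]},
)
print(est)
```

A full spatial model would estimate the degree of smoothing from the data rather than fixing it through a pseudo-sample size.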
Third, the GPS trajectories are processed to define individual activity spaces. Gaps are identified when the time between consecutive points exceeds 30 minutes; these gaps are interpolated using a hybrid approach that blends linear interpolation for short gaps with a Gaussian Markov movement model for longer intervals, preserving realistic path continuity while acknowledging uncertainty. The resulting smoothed trajectories are then converted into multi‑polygon activity spaces (e.g., kernel density contours or convex hulls) that reflect both the spatial extent and the temporal intensity of visits.
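The gap-detection step and the linear part of the interpolation can be sketched as follows. This is a minimal illustration that assumes fixes are stored as `(timestamp, latitude, longitude)` tuples and that filled points are placed every 10 minutes; both choices, along with the function names, are assumptions. The movement model for longer gaps is not shown.

```python
from datetime import datetime, timedelta

GAP_THRESHOLD = timedelta(minutes=30)  # threshold stated in the summary


def find_gaps(fixes, threshold=GAP_THRESHOLD):
    """Return indices i where the time from fix i to fix i+1 exceeds threshold."""
    return [
        i for i in range(len(fixes) - 1)
        if fixes[i + 1][0] - fixes[i][0] > threshold
    ]


def fill_gap_linear(start, end, step=timedelta(minutes=10)):
    """Interpolate positions strictly between two fixes at a fixed time step."""
    t0, lat0, lon0 = start
    t1, lat1, lon1 = end
    total = (t1 - t0).total_seconds()
    filled = []
    t = t0 + step
    while t < t1:
        frac = (t - t0).total_seconds() / total
        filled.append((t, lat0 + frac * (lat1 - lat0), lon0 + frac * (lon1 - lon0)))
        t += step
    return filled


# Hypothetical trace with a 60-minute gap between the second and third fix.
fixes = [
    (datetime(2022, 3, 1, 8, 0), -28.00, 32.00),
    (datetime(2022, 3, 1, 8, 10), -28.00, 32.00),
    (datetime(2022, 3, 1, 9, 10), -28.06, 32.06),
]
gaps = find_gaps(fixes)
filled = fill_gap_linear(fixes[gaps[0]], fixes[gaps[0] + 1])
print(gaps, len(filled))
```

Interpolating directly in latitude and longitude is a reasonable approximation only over short distances; projected coordinates would be the safer choice for longer gaps.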
Fourth, contextual HIV exposure is quantified by overlaying each participant’s activity space onto the grid‑level prevalence surface. The authors compute a time‑weighted exposure index: for each grid cell intersecting the activity space, the cell’s prevalence is multiplied by the proportion of time the participant spent within that cell, and the sum across all intersected cells yields an individual exposure score. Additional modifiers, such as distance travelled per day and frequency of visits to high‑prevalence zones, are incorporated through a generalized linear mixed‑effects model (GLMM) to assess the influence of demographic factors.
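The time-weighted index described here amounts to a weighted average of cell prevalences, with weights equal to the share of time spent in each cell. A minimal sketch, assuming time per cell has already been tabulated from the trajectory; the dictionary layout, function name, and toy values are hypothetical.

```python
def exposure_index(time_in_cell, prevalence):
    """Sum of cell prevalence times the proportion of time spent in that cell.

    time_in_cell: dict mapping cell id -> time spent there (any consistent unit).
    prevalence: dict mapping cell id -> estimated HIV prevalence in [0, 1].
    """
    total = sum(time_in_cell.values())
    if total <= 0:
        raise ValueError("participant has no recorded time in any cell")
    return sum(prevalence[cell] * t / total for cell, t in time_in_cell.items())


# Toy values: 6 hours in cell "a", 2 hours in cell "b"; cell "c" is never visited.
score = exposure_index(
    time_in_cell={"a": 6.0, "b": 2.0},
    prevalence={"a": 0.2, "b": 0.4, "c": 0.1},
)
print(round(score, 4))  # 0.25
```

Because the weights sum to one, the score stays on the prevalence scale and can be compared directly with the prevalence of the participant's home cell.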
Empirical results reveal clear patterns. Participants aged 27‑34 years exhibit higher mobility (average daily distance ≈ 15 km) and spend more days outside the AHRI core area than younger peers, leading to a 1.8‑fold increase in the exposure index. Sex differences are modest; however, men tend to travel slightly farther outside the study area. The exposure index rises sharply as activity spaces expand beyond residential neighborhoods, confirming the hypothesis that risk is not captured by static home‑based measures. High‑risk individuals are identified as those with prolonged external residence periods or frequent long‑distance trips to metropolitan hubs (e.g., Durban, Richards Bay).
The discussion emphasizes the public‑health relevance: contextual exposure metrics can guide targeted HIV testing, pre‑exposure prophylaxis distribution, and community outreach to the specific locales where at‑risk individuals spend the most time. Limitations include GPS data irregularities, privacy‑preserving coordinate perturbations, sensitivity to the chosen grid resolution, and the absence of concurrent environmental or social risk layers (e.g., alcohol outlet density, crime). The authors propose future extensions such as integrating real‑time exposure dashboards, refining movement models with accelerometer data, and coupling the framework with other health outcomes.
In conclusion, the study delivers a robust, reproducible pipeline for converting raw GPS traces into actionable contextual HIV exposure estimates, demonstrating that dynamic mobility information substantially enriches risk assessment beyond static residential analyses. This approach holds promise for informing precision public‑health interventions in high‑burden settings worldwide.