Data-Driven Trajectory Imputation for Vessel Mobility Analysis
Modeling vessel activity at sea is critical for a wide range of applications, including route planning, transportation logistics, maritime safety, and environmental monitoring. Over the past two decades, the Automatic Identification System (AIS) has enabled real-time monitoring of hundreds of thousands of vessels, generating huge amounts of data daily. One major challenge in using AIS data is the presence of large gaps in vessel trajectories, often caused by coverage limitations or intentional transmission interruptions. These gaps can significantly degrade data quality, resulting in inaccurate or incomplete analysis. State-of-the-art imputation approaches have mainly been devised to tackle gaps in vehicle trajectories, even when the underlying road network is not considered. However, the motion patterns of sailing vessels differ substantially: they involve smooth turns, maneuvering near ports, and navigating in adverse weather conditions. In this application paper, we propose HABIT, a lightweight, configurable H3 Aggregation-Based Imputation framework for vessel Trajectories. This data-driven framework provides a valuable means to impute missing trajectory segments by extracting, analyzing, and indexing motion patterns from historical AIS data. Our empirical study over AIS data across various timeframes, densities, and vessel types reveals that HABIT produces maritime trajectory imputations that are comparable to baseline methods in accuracy and better in latency, while accounting for vessel characteristics and their motion patterns.
💡 Research Summary
The paper addresses the pervasive problem of missing segments in Automatic Identification System (AIS) data, which hampers reliable maritime analytics such as route planning, traffic monitoring, and anomaly detection. While many trajectory‑imputation techniques have been developed for land‑based vehicle data, often relying on road networks or dense GPS sampling, these methods are ill‑suited for vessels that navigate open seas, make smooth, large‑radius turns, and report their positions at variable intervals. To address this limitation, the authors propose HABIT (H3 Aggregation‑Based Imputation framework for vessel Trajectories), a lightweight, configurable framework that leverages the hierarchical H3 hexagonal spatial index to aggregate historical AIS observations into a directed transition graph.
HABIT’s workflow consists of four stages. First, raw AIS records are cleaned and segmented into individual trips using temporal gaps, speed changes, and heading variations; erroneous or spoofed messages are filtered out. Second, each cleaned position is mapped to an H3 cell, and transitions between consecutive cells are recorded as weighted edges. Edge weights combine Euclidean distance, transition frequency, vessel‑type statistics, and maritime constraints (e.g., coastlines, protected zones). The resulting graph is stored in memory using DuckDB for fast analytics and NetworkX for graph operations.
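The first two stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: `to_cell` is a hypothetical square-grid stand-in for an H3 cell lookup, the 30-minute `max_gap_s` threshold is an assumed value, and the edge weight uses only inverse transition frequency rather than the composite weight described above. Plain dictionaries replace DuckDB and NetworkX to keep the example self-contained.

```python
import math
from collections import Counter


def to_cell(lat, lon, size_deg=0.1):
    """Stand-in for an H3 lookup: snap a position to a square grid cell id."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def split_trips(records, max_gap_s=1800):
    """Split time-ordered (t, lat, lon) records wherever the gap exceeds max_gap_s."""
    trips, current = [], []
    for rec in records:
        if current and rec[0] - current[-1][0] > max_gap_s:
            trips.append(current)
            current = []
        current.append(rec)
    if current:
        trips.append(current)
    return trips


def build_transition_graph(trips):
    """Count cell-to-cell transitions and convert the counts into edge weights."""
    counts = Counter()
    for trip in trips:
        cells = [to_cell(lat, lon) for _, lat, lon in trip]
        for a, b in zip(cells, cells[1:]):
            if a != b:  # skip consecutive reports that stay in the same cell
                counts[(a, b)] += 1
    graph = {}
    for (a, b), n in counts.items():
        # Assumed weight: frequently observed transitions are cheaper to traverse.
        graph.setdefault(a, {})[b] = 1.0 / n
    return graph


# Two trips separated by a long silence; both cross the same first cell boundary.
records = [(0, 0.05, 0.05), (60, 0.05, 0.15), (120, 0.05, 0.25),
           (10000, 0.05, 0.05), (10060, 0.05, 0.15)]
graph = build_transition_graph(split_trips(records))
```

In this toy input the transition seen in both trips ends up with a lower weight than the one seen only once, which is the property a shortest-path search can later exploit.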
In the third stage, when a gap is detected, the H3 cells containing the gap’s start and end points become source and target nodes. A shortest‑path algorithm (Dijkstra) runs on the weighted graph to propose a sequence of intermediate cells that best reflects typical vessel behavior in the region. Because H3 cells are coarse, the authors introduce a data‑driven correction step: for each selected cell, the most probable point inside the cell is inferred from historical AIS density, yielding a refined latitude‑longitude sequence rather than simply using cell centroids.
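The path search and the intra-cell correction can be sketched as follows. This is a hedged illustration under stated assumptions: the graph is a plain dict of dicts with hypothetical string cell ids, and `representative_point` uses a coordinate-wise median as a simple stand-in for the "most probable point", since the summary does not specify the estimator the authors use.

```python
import heapq
import itertools
from statistics import median


def shortest_cell_path(graph, source, target):
    """Dijkstra over {cell: {neighbor: weight}}; returns a cell list or None."""
    best = {source: 0.0}
    prev = {}
    tie = itertools.count()  # tie-breaker so cell ids are never compared
    heap = [(0.0, next(tie), source)]
    while heap:
        cost, _, cell = heapq.heappop(heap)
        if cell == target:
            break
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry
        for nxt, weight in graph.get(cell, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = cell
                heapq.heappush(heap, (new_cost, next(tie), nxt))
    if target not in best:
        return None  # no historical route connects the two cells
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def representative_point(history):
    """Coordinate-wise median of historical (lat, lon) positions seen in one cell."""
    lats, lons = zip(*history)
    return (median(lats), median(lons))


# Hypothetical three-cell graph: the detour through B is cheaper than A -> C.
graph = {"A": {"B": 1.0, "C": 5.0}, "B": {"C": 1.0}, "C": {}}
cells = shortest_cell_path(graph, "A", "C")
```

A `None` result signals that the historical data contains no route between the gap's endpoints, in which case a fallback such as linear interpolation would be needed. The median ignores antimeridian wrap-around, which a production version would have to handle.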
Finally, the imputed sequence is smoothed using techniques such as moving averages and Bézier curve fitting to eliminate unrealistic sharp turns and to ensure navigability. The smoothed trajectory is then validated against maritime constraints to guarantee that it does not intersect land or restricted areas.
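As a rough illustration of the smoothing step, the sketch below applies a centered moving average to a sequence of (lat, lon) points. The window size and the choice to pin both endpoints (so the imputed segment still joins the observed track) are assumptions made for this example. Bézier fitting and the land-intersection check are not shown.

```python
def smooth_track(points, window=3):
    """Centered moving average over (lat, lon) points; endpoints stay fixed."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(points)
    smoothed = []
    for i, point in enumerate(points):
        if i == 0 or i == n - 1:
            smoothed.append(point)  # keep the gap's boundary points untouched
            continue
        # The window is truncated near the ends of the sequence.
        chunk = points[max(0, i - half):min(n, i + half + 1)]
        smoothed.append((sum(p[0] for p in chunk) / len(chunk),
                         sum(p[1] for p in chunk) / len(chunk)))
    return smoothed


# A single sharp spike in longitude is pulled back toward its neighbors.
track = [(0.0, 0.0), (0.0, 3.0), (0.0, 0.0)]
result = smooth_track(track)
```

Averaging raw coordinates is only reasonable for short segments away from the poles and the antimeridian. Any smoothed point should still be checked against land and restricted areas, since smoothing can move a point out of its original cell.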
The authors evaluate HABIT on extensive AIS datasets covering multiple time spans, spatial densities, and vessel categories (passenger, cargo, fishing, etc.). They compare against several state‑of‑the‑art baselines: vehicle‑oriented methods (TrImpute, GTI, TERI), maritime‑specific approaches (DAISTIN, TrajDiff, MH‑GIN), and a naïve linear interpolation. Evaluation metrics include root‑mean‑square error (RMSE) of reconstructed points and processing latency. Results show that HABIT achieves comparable or slightly better RMSE than the best baselines while reducing latency by 30–45 % on datasets containing hundreds of millions of points. Moreover, HABIT runs entirely on a single machine using only DuckDB and NetworkX, demonstrating that high‑performance maritime imputation does not require heavyweight distributed systems.
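The accuracy metric can be made concrete with a short sketch. The summary does not state how imputed points are paired with ground-truth points or which distance is used, so this example assumes a one-to-one pairing by index and great-circle (haversine) distance in meters.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rmse_m(imputed, truth):
    """Root-mean-square of point-wise distances between two equal-length tracks."""
    if len(imputed) != len(truth) or not truth:
        raise ValueError("tracks must be non-empty and of equal length")
    squared = [haversine_m(p, q) ** 2 for p, q in zip(imputed, truth)]
    return math.sqrt(sum(squared) / len(squared))
```

In practice the imputed track would first be resampled at the timestamps of the withheld ground-truth points so that the index-based pairing is meaningful.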
Key contributions of the work are: (1) a novel application of the H3 hierarchical grid to maritime trajectory imputation, enabling compact representation and fast graph construction; (2) a composite edge‑weight scheme that captures vessel‑type preferences, historical movement patterns, and geographic constraints; (3) a data‑driven intra‑cell correction that refines H3‑based predictions to realistic coordinates; and (4) an end‑to‑end pipeline that is both scalable and lightweight, making it suitable for real‑time or near‑real‑time AIS processing.
The paper concludes with several avenues for future research: integrating external data sources such as weather, currents, and satellite imagery; extending the framework to handle very long gaps (several hours) using probabilistic diffusion models; developing online graph‑update mechanisms for streaming AIS feeds; and exploring adaptive H3 resolutions that dynamically balance accuracy and computational cost. Overall, HABIT represents a practical step forward in turning sparse, noisy AIS streams into high‑quality, navigable vessel trajectories for a wide range of maritime applications.