Kharita: Robust Map Inference using Graph Spanners
The widespread availability of GPS information in everyday devices such as cars, smartphones and smart watches makes it possible to collect large amounts of geospatial trajectory information. A particularly important, yet technically challenging, application of this data is to identify the underlying road network and keep it updated under various changes. In this paper, we propose efficient algorithms that can generate accurate maps in both batch and online settings. Our algorithms utilize techniques from graph spanners so that the maps they produce can effectively handle a wide variety of road and intersection shapes. We conduct a rigorous evaluation of our algorithms over two real-world datasets and under a wide variety of performance metrics. Our experiments show a significant improvement over prior work. In particular, we observe an increase in Biagioni f-score of up to 20% when compared to the state of the art while reducing the execution time by an order of magnitude. We also release our source code as open source to support reproducibility and to enable other researchers to build on our work.
💡 Research Summary
The paper introduces Kharita, a novel framework for inferring road networks from large‑scale GPS trajectory data. Recognizing the limitations of existing map‑construction methods—namely sensitivity to GPS noise, uneven sampling rates, data sparsity in low‑traffic areas, and difficulty handling complex intersections such as roundabouts—the authors propose a two‑phase pipeline that combines spatial clustering with graph‑spanner based sparsification.
In the first phase, raw GPS points (latitude, longitude, timestamp, speed, heading) are pre‑processed. A new distance metric dθ is defined that fuses the Vincenty geographic distance with an angular difference term, weighted by a parameter θ. This metric treats points that are close in space but opposite in heading as far apart, which helps to separate parallel lanes. To mitigate irregular sampling, an optional densification step inserts synthetic points every ~20 m along straight‑line segments when the heading change between consecutive points is small.
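As an illustration of how such a direction-aware metric might look, the sketch below combines a great-circle distance with a scaled heading difference. The exact formula is not given in this summary, so the Euclidean-style combination, the function names, and the default `theta=100.0` are assumptions; the haversine formula is used here as a simpler stand-in for the Vincenty distance.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (stand-in for Vincenty)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def heading_diff_deg(h1, h2):
    """Smallest absolute difference between two headings, in [0, 180]."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def d_theta(p, q, theta=100.0):
    """Assumed form of the metric: p and q are (lat, lon, heading_deg).

    A 180-degree heading difference contributes `theta` metres, so two
    co-located points travelling in opposite directions end up `theta`
    metres apart under this metric.
    """
    geo = haversine_m(p[0], p[1], q[0], q[1])
    ang = theta * heading_diff_deg(p[2], q[2]) / 180.0
    return math.hypot(geo, ang)
```

Under this assumed form, θ acts as a penalty in metres: the larger it is, the more strongly opposite-direction traffic on the same road is pushed into separate clusters.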
The second phase clusters the densified points using k‑means. Seed selection is performed with a uniform spacing constraint based on dθ, ensuring that each seed is at least a seed‑radius away from any other. Cluster centroids are computed as the mean latitude/longitude and the circular mean of headings. If a cluster exhibits high heading variance (e.g., >10°), it is split into more homogeneous sub‑clusters, a step crucial for correctly modeling roundabouts and other complex junctions.
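The seed spacing rule and the circular mean of headings can be sketched as follows. This is a minimal illustration, not the authors' implementation: `select_seeds` and `circular_mean_deg` are hypothetical names, and the distance function is passed in as a parameter so that any direction-aware metric can be plugged in.

```python
import math


def select_seeds(points, radius, dist):
    """Greedy seed selection: keep a point only if it lies at least
    `radius` away (under `dist`) from every seed chosen so far."""
    seeds = []
    for p in points:
        if all(dist(p, s) >= radius for s in seeds):
            seeds.append(p)
    return seeds


def circular_mean_deg(headings):
    """Mean of angles in degrees that handles the 359 -> 0 wrap-around
    by averaging unit vectors instead of raw angle values."""
    s = sum(math.sin(math.radians(h)) for h in headings)
    c = sum(math.cos(math.radians(h)) for h in headings)
    return math.degrees(math.atan2(s, c)) % 360.0
```

The circular mean matters because a plain arithmetic mean of headings 350° and 10° would give 180°, the opposite of the true travel direction, whereas the vector average gives a value at the 0°/360° boundary.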
After clustering, each trajectory is mapped to a sequence of cluster centroids. Consecutive centroids generate candidate directed edges (EC). Two filters prune spurious edges: self‑loops are discarded, and an edge is retained only if its support (the number of trajectories traversing it) exceeds a threshold derived from the point counts of its incident nodes. This reduces noise from GPS errors or illegal maneuvers.
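The edge-generation and filtering step might be sketched as below. The summary says the support threshold is derived from the point counts of the incident nodes but does not give the rule, so a fixed `threshold` parameter is used here as an assumption; the function name `candidate_edges` is also hypothetical.

```python
from collections import Counter


def candidate_edges(trajectories, threshold=1):
    """Build directed candidate edges from trajectories given as
    sequences of cluster ids.

    Self-loops are dropped, each trajectory contributes at most one
    unit of support per edge, and an edge is kept only if its support
    exceeds `threshold` (a fixed stand-in for the paper's rule).
    """
    support = Counter()
    for traj in trajectories:
        edges = {(u, v) for u, v in zip(traj, traj[1:]) if u != v}
        support.update(edges)
    return {e: n for e, n in support.items() if n > threshold}
```

For example, with trajectories `[1, 2, 3]`, `[1, 2, 2, 3]` and `[3, 2]`, the edges `(1, 2)` and `(2, 3)` each have support 2 and survive, while the single reverse traversal `(3, 2)` is pruned.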
The candidate graph GC = (V, EC) is then sparsified via an α‑spanner algorithm. An α‑spanner is a subgraph that preserves all pairwise shortest‑path distances within a factor α while using far fewer edges. The authors adapt a greedy spanner construction to respect directionality and to maintain road connectivity. By tuning α, users can trade off map detail against storage and computational efficiency.
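For intuition, the classic greedy spanner construction on a directed graph can be sketched as follows. This is the textbook algorithm, not the authors' adapted version: edges are considered in order of increasing weight, and an edge is kept only if the spanner built so far has no directed path between its endpoints of length at most α times the edge weight. The function names are hypothetical.

```python
import heapq


def shortest_path(adj, src, dst, cutoff):
    """Dijkstra from src to dst over directed adjacency lists,
    ignoring paths longer than `cutoff`. Returns inf if none exists."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd <= cutoff and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def greedy_spanner(edges, alpha):
    """edges: iterable of (u, v, weight) directed edges.
    Returns the subset kept by the greedy alpha-spanner rule."""
    adj, kept = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if shortest_path(adj, u, v, alpha * w) > alpha * w:
            adj.setdefault(u, []).append((v, w))
            kept.append((u, v, w))
    return kept
```

With edges a→b and b→c of length 1.0 and a shortcut a→c of length 1.9, choosing α = 1.5 drops the shortcut because the two-hop path (length 2.0) is within 1.5 × 1.9, while α = 1.0 keeps all three edges.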
Kharita* extends the offline pipeline to an online setting. New GPS points are streamed in; each point is assigned to the nearest existing cluster (or creates a new one) using dθ, and edges are added incrementally. The spanner structure is updated on‑the‑fly, allowing the map to evolve in near real‑time as roads are built, closed, or temporarily blocked.
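A rough sketch of the online assignment loop is given below. It covers only the cluster assignment and incremental edge counting, not the on-the-fly spanner update, and it simplifies by keeping each cluster centre fixed at its first point. The names `assign_online` and `stream_edges`, the `radius` parameter, and the `(trajectory_id, point)` stream layout are all assumptions made for illustration.

```python
def assign_online(clusters, point, radius, dist):
    """Attach `point` to the nearest cluster within `radius`, or create
    a new cluster centred on it. Returns the cluster index."""
    best_i, best_d = None, float("inf")
    for i, c in enumerate(clusters):
        d = dist(point, c["center"])
        if d < best_d:
            best_i, best_d = i, d
    if best_i is None or best_d > radius:
        clusters.append({"center": point, "count": 1})
        return len(clusters) - 1
    clusters[best_i]["count"] += 1
    return best_i


def stream_edges(stream, radius, dist):
    """Consume (trajectory_id, point) pairs in arrival order and count
    directed edges between consecutive clusters of each trajectory."""
    clusters, edges, last = [], {}, {}
    for traj_id, point in stream:
        cid = assign_online(clusters, point, radius, dist)
        prev = last.get(traj_id)
        if prev is not None and prev != cid:
            edges[(prev, cid)] = edges.get((prev, cid), 0) + 1
        last[traj_id] = cid
    return clusters, edges
```

The linear scan over clusters is kept for readability; a spatial index would be the natural replacement at realistic data volumes.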
The authors evaluate Kharita on two real‑world datasets from Doha and another urban area. They use the Biagioni f‑score (the de‑facto standard for map quality), average path‑length error, precision, recall, runtime, and memory consumption. Kharita achieves up to a 20 % increase in f‑score over the previous state‑of‑the‑art methods while reducing execution time by an order of magnitude (average 0.8 s for hundreds of thousands of points). The spanner‑based graph also dramatically lowers memory usage compared to dense candidate graphs.
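For reference, an f-score is the harmonic mean of precision and recall. The sketch below shows only that final arithmetic; the spatial sampling and matching procedure that produces the counts in the Biagioni evaluation is not reproduced, and the parameter names are assumptions.

```python
def f_score(n_matched, n_inferred, n_truth):
    """Harmonic mean of precision and recall.

    n_matched  -- samples of the inferred map matched to the ground truth
    n_inferred -- total samples taken on the inferred map
    n_truth    -- total samples taken on the ground-truth map
    """
    precision = n_matched / n_inferred if n_inferred else 0.0
    recall = n_matched / n_truth if n_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two values, a map that scores well on precision but misses many real roads still receives a low f-score.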
Key contributions of the work are: (1) a direction‑aware distance metric that distinguishes opposite‑direction lanes; (2) a clustering‑plus‑splitting strategy that accurately captures complex intersection geometries; (3) the novel application of graph spanners to road‑network inference, providing provable guarantees on distance preservation while achieving sparsity; (4) an online variant that supports continuous map updates; and (5) extensive empirical validation demonstrating superior accuracy and efficiency.
Overall, Kharita represents a significant advance in GPS‑based map inference, marrying concepts from computational geometry, clustering, and graph theory to produce high‑quality, updatable road maps. Its open‑source implementation further encourages reproducibility and future extensions in autonomous driving, smart‑city planning, and real‑time navigation systems.