Detecting Location Fraud in Indoor Mobile Crowdsensing
Mobile crowdsensing allows a large number of mobile devices to measure phenomena of common interest and form a body of knowledge about natural and social environments. To obtain location annotations for indoor mobile crowdsensing, reference tags are usually deployed, but these tags are susceptible to tampering and compromise by attackers. In this work, we consider three types of location-related attacks: tag forgery, tag misplacement, and tag removal. Different detection algorithms are proposed to deal with these attacks. First, we introduce location-dependent fingerprints as supplementary information for better location identification. A truth discovery algorithm is then proposed to detect falsified data. Moreover, visiting patterns are utilized for the detection of tag misplacement and removal. Experiments on both crowdsensed and emulated datasets show that the proposed algorithms can detect all three types of attacks with high accuracy.
💡 Research Summary
Mobile crowdsensing (MCS) has emerged as a powerful paradigm for collecting large‑scale sensor data using ordinary smartphones. In indoor environments, GPS is unavailable or unreliable, so most indoor MCS applications rely on physical reference tags (e.g., QR codes, NFC, RFID) to provide location annotations. While inexpensive, these tags are vulnerable to tampering: an adversary can duplicate a tag (forgery), relocate a tag to a different spot (misplacement), or simply remove a tag (removal). Such attacks corrupt the collected dataset, degrade model training, and potentially waste incentives paid to participants.
The paper first formalizes three attack types:
- Tag Forgery – an attacker creates counterfeit tags or copies existing ones and uses them to annotate data without physically visiting the claimed locations.
- Tag Misplacement – a legitimate tag is moved from its intended position; all subsequent measurements associated with that tag become systematically mis‑located.
- Tag Removal – a tag disappears, leading to missing data for the corresponding location.
To detect these threats, the authors propose a two‑layer detection framework that leverages location‑dependent fingerprints (Wi‑Fi RSS and magnetic field strength) as auxiliary information.
Coarse‑grained filtering is performed first. The system checks whether the Wi‑Fi SSIDs observed in a fingerprint match the known set associated with a tag. It also computes the implied movement speed between two consecutive uploads from the same user: if the distance between the two claimed locations divided by the time interval exceeds a bound on plausible human movement (ρ = 10 m/s), the pair is flagged as suspicious. This simple filter can catch naïve forgery attempts in which an attacker submits data from a single physical spot while claiming multiple locations. However, sophisticated attackers could manipulate SSIDs or mimic realistic speeds, so a more robust method is needed.
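The two checks above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the record layout (`user`, `tag`, `t`, `ssids`), the `tag_pos` and `tag_ssids` lookup tables, and the rule that an upload "matches" when it shares at least one SSID with the tag's known set are all assumptions made for the example. Only the bound ρ = 10 m/s comes from the summary.

```python
from math import hypot

def coarse_filter(uploads, tag_pos, tag_ssids, rho=10.0):
    """Return sorted indices of uploads flagged by the SSID or speed check.

    uploads   : list of dicts with keys "user", "tag", "t" (seconds), "ssids"
    tag_pos   : tag id -> (x, y) intended position in metres (assumed known)
    tag_ssids : tag id -> set of SSIDs expected near that tag (assumed known)
    rho       : speed bound in m/s
    """
    flagged = set()
    last = {}  # user -> index of that user's previous upload
    order = sorted(range(len(uploads)), key=lambda i: uploads[i]["t"])
    for i in order:
        up = uploads[i]
        # SSID check (assumed rule): no overlap with the tag's known SSIDs.
        if not (set(up["ssids"]) & tag_ssids[up["tag"]]):
            flagged.add(i)
        # Speed check between consecutive uploads of the same user.
        j = last.get(up["user"])
        if j is not None:
            prev = uploads[j]
            (x0, y0), (x1, y1) = tag_pos[prev["tag"]], tag_pos[up["tag"]]
            dist = hypot(x1 - x0, y1 - y0)
            dt = up["t"] - prev["t"]
            if dist > 0 and (dt <= 0 or dist / dt > rho):
                flagged.update((i, j))
        last[up["user"]] = i
    return sorted(flagged)
```

Both uploads of a too-fast pair are flagged because the filter alone cannot tell which of the two claimed locations is false.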
Truth‑discovery‑based detection constitutes the core contribution. The authors model the problem as a probabilistic graphical model with two Gaussian Mixture Models (GMMs):
- GMM_T (truthful) models fingerprints collected at genuine tags. Each mixture component corresponds to one of the K physical locations.
- GMM_F (falsified) models fingerprints that belong to forged locations. Here each component corresponds to a distinct user (U components), reflecting the assumption that a malicious user’s forged data will share a common underlying fingerprint pattern.
The hidden variable t_i indicates whether the i‑th record’s location tag is truthful (t_i = 1) or falsified (t_i = 0). User reliability β_u is also hidden; β_u = 1 means all of user u’s contributions are truthful, while lower values indicate suspicious behavior.
Using the Expectation‑Maximization (EM) algorithm, the system iteratively updates:
- E‑step: compute the posterior probability Q_i(t_i) = p(t_i | x_i, θ_T, θ_F, β).
- M‑step: maximize the expected complete‑data log‑likelihood with respect to the mixture weights α, means μ, and covariances Σ of both GMMs, as well as the reliability parameters β. Closed‑form updates are derived in the paper (Equations 13‑19).
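As a worked illustration of the E‑step, suppose that β_u acts as the prior probability that a record from user u is truthful. This reading is an assumption that is consistent with the description above but is not spelled out in this summary. Bayes' rule then gives, for record i uploaded by user u with claimed location k:

```latex
\[
Q_i(t_i = 1)
  = \frac{\beta_u \, p\bigl(x_i \mid \theta_T^{(k)}\bigr)}
         {\beta_u \, p\bigl(x_i \mid \theta_T^{(k)}\bigr)
          + (1 - \beta_u) \, p\bigl(x_i \mid \theta_F^{(u)}\bigr)},
\qquad
Q_i(t_i = 0) = 1 - Q_i(t_i = 1).
\]
```

Here θ_T^{(k)} denotes the component of GMM_T for location k and θ_F^{(u)} the component of GMM_F for user u. The superscript notation is introduced for this example only, and the paper's exact expression may differ, for instance in how the mixture weights α enter.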
Through this process, the model simultaneously learns the statistical signatures of each genuine location, the typical fingerprint pattern of each malicious user, and a reliability score for every participant. Records with high posterior probability of being falsified are flagged, and users with low β are identified as potential attackers.
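The EM loop can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes a fixed spherical variance `sigma2` instead of learned covariances, omits the mixture weights α, treats β_u as the prior probability that a record from user u is truthful, and uses hypothetical names throughout (`em_truth_discovery`, `beta0`, and so on). It also assumes location and user ids are contiguous integers starting at 0 and that every user has at least one record.

```python
import math

def em_truth_discovery(x, loc, user, n_iter=30, sigma2=1.0, beta0=0.8):
    """x: list of fingerprint vectors; loc/user: claimed location and user ids.

    Returns (q, beta): q[i] is the posterior that record i is truthful,
    beta[u] is the estimated reliability of user u.
    """
    n, dim = len(x), len(x[0])
    K, U = max(loc) + 1, max(user) + 1
    by_loc = [[i for i in range(n) if loc[i] == k] for k in range(K)]
    by_user = [[i for i in range(n) if user[i] == u] for u in range(U)]

    def wmean(idx, w, fallback):
        # Weighted mean of the records in idx; keep the old mean if weights vanish.
        total = sum(w[i] for i in idx)
        if total < 1e-9:
            return fallback
        return [sum(w[i] * x[i][d] for i in idx) / total for d in range(dim)]

    def sqdist(a, b):
        return sum((p - r) ** 2 for p, r in zip(a, b))

    ones = [1.0] * n
    mu_t = [wmean(idx, ones, None) for idx in by_loc]   # one mean per location
    mu_f = [wmean(idx, ones, None) for idx in by_user]  # one mean per user
    beta = [beta0] * U
    q = [beta0] * n
    for _ in range(n_iter):
        # E-step: posterior that each record is truthful, computed in log-odds form.
        for i in range(n):
            b = beta[user[i]]
            log_odds = (math.log(b) - math.log(1.0 - b)
                        + (sqdist(x[i], mu_f[user[i]])
                           - sqdist(x[i], mu_t[loc[i]])) / (2.0 * sigma2))
            log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
            q[i] = 1.0 / (1.0 + math.exp(-log_odds))
        # M-step: re-estimate means and reliabilities from the soft assignments.
        not_q = [1.0 - v for v in q]
        for k in range(K):
            mu_t[k] = wmean(by_loc[k], q, mu_t[k])
        for u in range(U):
            mu_f[u] = wmean(by_user[u], not_q, mu_f[u])
            b = sum(q[i] for i in by_user[u]) / len(by_user[u])
            beta[u] = min(1.0 - 1e-6, max(1e-6, b))  # keep the logs finite
    return q, beta
```

In this simplified form, a user who uploads near‑identical fingerprints while claiming different locations ends up with a low `beta`, because those records fit the user's own "falsified" component far better than any location component. A user who only ever visits one location is not distinguishable this way, since that user's own mean coincides with the location mean.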
Visiting‑pattern analysis addresses tag misplacement and removal. The authors construct a directed graph for each user where nodes are tag IDs and edges represent consecutive visits. Under normal operation, edges respect physical adjacency and realistic travel times. Anomalous edges—e.g., a jump from tag A to a distant tag C with an implausibly short time interval—suggest that either tag A or tag C has been moved. Moreover, a sudden drop in the visitation frequency of a particular tag across the whole crowd indicates possible removal. By monitoring these patterns, the system can alert administrators to relocate or replace compromised tags.
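A rough sketch of both ideas follows. It is an illustration under stated assumptions rather than the authors' method: visits are assumed to be `(user, tag, t_seconds)` tuples, `tag_pos` holds each tag's intended coordinates in metres, an edge is called implausible when the implied speed exceeds a hypothetical `max_speed`, and a tag counts as possibly removed when its recent visit count falls below a hypothetical fraction `ratio` of its baseline.

```python
from collections import Counter, defaultdict

def implausible_edge_votes(visits, tag_pos, max_speed=10.0):
    """Count, per tag, how many implausible consecutive-visit edges touch it."""
    by_user = defaultdict(list)
    for user, tag, t in visits:
        by_user[user].append((t, tag))
    votes = Counter()
    for seq in by_user.values():
        seq.sort()  # chronological order per user
        for (t0, a), (t1, b) in zip(seq, seq[1:]):
            if a == b:
                continue
            (xa, ya), (xb, yb) = tag_pos[a], tag_pos[b]
            dist = ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5
            dt = t1 - t0
            if dt <= 0 or dist / dt > max_speed:
                votes[a] += 1  # either endpoint may be the moved tag
                votes[b] += 1
    return votes

def low_traffic_tags(baseline, recent, ratio=0.2):
    """Tags whose recent visit count dropped below ratio * baseline count."""
    return sorted(tag for tag, base in baseline.items()
                  if recent.get(tag, 0) < ratio * base)
```

A single anomalous edge is ambiguous between its two endpoints, so the sketch accumulates votes over many users. A relocated tag tends to collect votes from edges with several different neighbours, which is why the highest vote count is the natural first candidate for inspection.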
Experimental evaluation uses two datasets: (1) a real‑world indoor crowdsensing campaign (temperature, humidity) where the authors injected synthetic attacks, and (2) a fully simulated environment with controlled attack scenarios. Metrics include accuracy, precision, recall, and F1‑score. Results show:
- Forgery detection: >96 % precision, >94 % recall.
- Misplacement detection: >93 % precision, >91 % recall.
- Removal detection: >98 % accuracy in identifying missing tags.
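For reference, the metrics named above follow the standard definitions, sketched here in Python. The function name and the counts in the usage note are illustrative only and are not taken from the paper's experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For example, a detector that flags 10 records, 8 of them truly falsified, while missing 2 falsified records, has precision 8/10 and recall 8/10.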
Even under combined attacks (e.g., a user forges tags and then moves a genuine tag), the EM‑based truth discovery maintains >90 % overall detection accuracy, thanks to the dynamic adjustment of user reliability scores.
The paper discusses limitations: reliance on Wi‑Fi and magnetic fingerprints makes the system sensitive to environmental changes, requiring periodic re‑training; EM convergence can be computationally intensive for very large crowds, suggesting future work on online or incremental updates; and privacy concerns arise from collecting fine‑grained sensor signatures, which may need anonymization or encryption mechanisms.
In conclusion, the authors present a comprehensive, low‑cost security layer for indoor mobile crowdsensing that combines simple heuristic filters, a probabilistic truth‑discovery framework, and graph‑based visitation analysis. The approach effectively mitigates three realistic location‑related attacks without requiring additional infrastructure beyond the already‑present Wi‑Fi and magnetic sensors in smartphones. Future directions include extending the model to incorporate other modalities (e.g., Bluetooth beacons), optimizing the EM algorithm for streaming data, and integrating privacy‑preserving techniques to protect participants while maintaining robust attack detection.