Detection and Identification of Sensor Attacks Using Partially Attack-Free Data

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

In this paper, we investigate data-driven attack detection and identification in a model-free setting. We consider a practically motivated scenario in which the available dataset may be compromised by malicious sensor attacks, but contains an unknown, contiguous, partially attack-free interval. The control input is assumed to include a small stochastic watermarking signal. Under these assumptions, we establish sufficient conditions for attack detection and identification from partially attack-free data. We also develop data-driven detection and identification procedures and characterize their computational complexity. Notably, the proposed framework does not impose a limit on the number of compromised sensors; thus, it can detect and identify attacks even when all sensor outputs are compromised outside the attack-free interval, provided that the attack-free interval is sufficiently long. Finally, we demonstrate the effectiveness of the proposed framework via numerical simulations.

💡 Research Summary

The paper addresses the problem of detecting and identifying sensor attacks in discrete‑time linear time‑invariant systems without requiring a prior mathematical model. The authors consider a realistic scenario where the available input‑output dataset may already be corrupted by malicious false‑data injection attacks, yet it contains an unknown contiguous interval during which no attacks occur (a “partially attack‑free interval”). The location and length of this clean interval are not known a priori. To enable data‑driven analysis, the control input is deliberately augmented with a small i.i.d. Gaussian watermark signal. This stochastic excitation serves two purposes: (i) it guarantees persistent excitation of arbitrary order with probability one, thereby satisfying the rank conditions needed for subspace‑based methods; and (ii) it improves the detectability of attacks by making the system’s response statistically distinguishable from an adversarially crafted attack.
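Purpose (i) can be checked numerically. The sketch below builds a block Hankel matrix from an i.i.d. Gaussian watermark and confirms that it has full row rank. The helper name `block_hankel`, the dimensions `m`, `q`, `T`, and the standard deviation `sigma` are assumptions chosen for illustration, not values taken from the paper.

```python
import numpy as np

def block_hankel(signal, depth, columns, start=0):
    """Return the (depth*dim) x columns block Hankel matrix of signal[start:].

    Column j stacks the samples signal[start+j], ..., signal[start+j+depth-1].
    """
    signal = np.asarray(signal, dtype=float)
    if start + depth + columns - 1 > signal.shape[0]:
        raise ValueError("window needs depth + columns - 1 samples")
    return np.column_stack(
        [signal[start + j : start + j + depth].reshape(-1) for j in range(columns)]
    )

# Hypothetical sizes: m inputs, q block rows, T columns.
rng = np.random.default_rng(0)
m, q, T = 2, 3, 20
sigma = 0.1  # assumed watermark standard deviation

# Exactly q + T - 1 samples are needed to fill one q x T block Hankel matrix.
watermark = sigma * rng.standard_normal((q + T - 1, m))
U = block_hankel(watermark, q, T)

# Full row rank means rank == m * q.
pe_rank = np.linalg.matrix_rank(U)
print(U.shape, pe_rank)
```

A deterministic input such as a constant would fail this check, which is one reason a random watermark is convenient here.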

The system model is
x(k+1)=A x(k)+B u(k), y(k)=C x(k)+a(k),
where a(k) denotes the sensor attack vector. The attacker is assumed to have full knowledge of the system state, inputs, outputs, and model, and can design a(k) arbitrarily, but the set of compromised sensors A* (size ℓ) is fixed over time and unknown to the defender.
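To make the model concrete, here is a minimal simulation sketch of the equations above. The matrices `A`, `B`, `C`, the clean interval, and the choice of compromised sensor are hypothetical; the attack is drawn at random purely for illustration, whereas the paper allows the attacker to design a(k) arbitrarily.

```python
import numpy as np

def simulate_attacked_outputs(A, B, C, u, attack, x0=None):
    """Simulate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + a(k)."""
    A, B, C = (np.asarray(M, dtype=float) for M in (A, B, C))
    x = np.zeros(A.shape[0]) if x0 is None else np.asarray(x0, dtype=float)
    y = np.empty((len(u), C.shape[0]))
    for k in range(len(u)):
        y[k] = C @ x + attack[k]   # sensor attack enters additively
        x = A @ x + B @ u[k]       # the state itself is not affected by a(k)
    return y

# Hypothetical 2-state, 1-input, 2-sensor system.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)

N = 100
clean = slice(40, 70)                   # assumed attack-free interval
u = 0.1 * rng.standard_normal((N, 1))   # watermark-only input

compromised = [1]                       # fixed compromised sensor set
attack = np.zeros((N, 2))
attack[:, compromised] = rng.standard_normal((N, len(compromised)))
attack[clean] = 0.0                     # no attack inside the clean interval

y_attacked = simulate_attacked_outputs(A, B, C, u, attack)
y_clean = simulate_attacked_outputs(A, B, C, u, np.zeros_like(attack))
```

Because the input here is open loop, the attacked and clean outputs differ exactly by a(k), and they coincide on the attack-free interval.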

Key contributions are:

Detection Condition – By constructing Hankel matrices of the input (U) and output (Y) with q block rows and T columns over sliding windows of the data, the authors prove that, under the watermarking input, U(q,T)(k) has full row rank almost surely for any window. Each such matrix spans q+T−1 consecutive samples, so when a sufficiently long attack‑free interval exists (τ ≥ q+T−1), at least one window lies entirely inside it. For that window, the left‑kernel of the clean output Hankel matrix can be identified, yielding a rank‑based test that distinguishes clean from attacked data. A heuristic detection algorithm is proposed that searches for a window where the residual of the rank test falls below a threshold, indicating the presence of the clean interval and thus confirming that an attack is occurring elsewhere.
Identification Condition – Building on the detection result, the paper derives an SVD‑based criterion: the singular values of the concatenated input‑output Hankel matrix Z(q,T)(k) reveal the dimension of the subspace spanned by the clean data. By comparing the observed singular spectrum with the expected rank (mq+pq−ℓ), the algorithm can uniquely recover the compromised sensor set A*. The method does not impose any upper bound on ℓ; even if all sensors are compromised outside the clean interval, identification remains possible provided the clean interval is long enough.
Complexity Analysis – The detection algorithm requires O(N · mq · T) operations for constructing Hankel matrices and performing rank checks, where N appears to denote the number of samples in the dataset, while the identification algorithm adds an SVD step of O((mq+pq)³) per candidate window. Both costs are polynomial in the problem dimensions, which keeps the procedures practical for moderate‑size systems.
Simulation Validation – Numerical experiments on a 4‑state, 6‑sensor system illustrate that (i) the detection algorithm reliably flags attacks when the clean interval length τ exceeds roughly 2 · (q+T), (ii) the identification algorithm correctly recovers the compromised sensor set even when ℓ = p (all sensors attacked) outside the clean interval, and (iii) the false‑alarm rate remains low for modest watermark variance φ².
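The rank-based idea behind the detection step can be illustrated with a short sketch. This is not the authors' algorithm. It relies on the standard fact that noise-free data from an n-state LTI system yield a stacked input-output Hankel matrix of rank at most mq + n, and it assumes that windows attaining the minimum observed rank are the attack-free ones. The system matrices, window sizes, tolerance `rel_tol`, and the random attack are all hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_ranks(u, y, q, T, rel_tol=1e-8):
    """Numerical rank of the stacked input-output Hankel matrix per window start."""
    z = np.hstack([u, y])  # one row [u(k), y(k)] per sample
    # rows[k] stacks z(k), ..., z(k+q-1); this is a row permutation of [U; Y],
    # which leaves the rank unchanged.
    rows = sliding_window_view(z, q, axis=0).transpose(0, 2, 1)
    rows = rows.reshape(-1, q * z.shape[1])
    n_windows = rows.shape[0] - T + 1
    ranks = np.empty(n_windows, dtype=int)
    for k in range(n_windows):
        s = np.linalg.svd(rows[k : k + T].T, compute_uv=False)
        ranks[k] = int(np.sum(s > rel_tol * s[0]))
    return ranks

# Hypothetical system: n = 2 states, m = 1 input, p = 2 sensors.
rng = np.random.default_rng(2)
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
n, m, p = 2, 1, 2
N, q, T = 180, 3, 20
clean_start, clean_end = 60, 120  # attack-free for clean_start <= k < clean_end

u = 0.1 * rng.standard_normal((N, m))   # watermark input
attack = rng.standard_normal((N, p))    # every sensor attacked ...
attack[clean_start:clean_end] = 0.0     # ... except inside the clean interval

x = np.zeros(n)
y = np.empty((N, p))
for k in range(N):
    y[k] = C @ x + attack[k]
    x = A @ x + B @ u[k]

ranks = window_ranks(u, y, q, T)
clean_windows = np.flatnonzero(ranks == ranks.min())   # candidate clean windows
attack_detected = bool(np.any(ranks > ranks.min()))    # rank jumps elsewhere
print(ranks.min(), ranks.max(), clean_windows[[0, -1]], attack_detected)
```

In this toy setup every sensor is attacked outside the clean interval, which mirrors the ℓ = p case mentioned above. The minimum-rank heuristic presupposes that at least one fully clean window exists; with measurement noise, the hard rank threshold would have to be replaced by a residual threshold.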

Overall, the work extends data‑driven security from the restrictive assumption of entirely clean historical data to a more practical setting where only a hidden clean segment is guaranteed. By leveraging random watermark excitation and rank‑based subspace techniques, the authors provide provable sufficient conditions for both detection and identification, together with concrete algorithms and complexity bounds. This contribution is significant for cyber‑physical systems where model identification is difficult, data may be partially compromised, and timely, accurate attack localization is essential for resilient operation.
