Multi-Station WiFi CSI Sensing Framework Robust to Station-wise Feature Missingness and Limited Labeled Data
We propose a WiFi Channel State Information (CSI) sensing framework for multi-station deployments that addresses two fundamental challenges in practical CSI sensing: station-wise feature missingness and limited labeled data. Feature missingness is commonly handled by resampling unevenly spaced CSI measurements or by reconstructing missing samples, while label scarcity is mitigated by data augmentation or self-supervised representation learning. However, these techniques are typically developed in isolation and do not jointly address long-term, structured station unavailability together with label scarcity. To bridge this gap, we explicitly incorporate station unavailability into both representation learning and downstream model training. Specifically, we adapt cross-modal self-supervised learning (CroSSL), a representation learning framework originally designed for time-series sensory data, to multi-station CSI sensing, learning representations from unlabeled data that are inherently invariant to station-wise feature missingness. Furthermore, we introduce Station-wise Masking Augmentation (SMA) during downstream model training, which exposes the model to realistic station unavailability patterns under limited labeled data. Our experiments show that neither missingness-invariant pre-training nor station-wise augmentation alone is sufficient; their combination is essential to achieve robust performance under both station-wise feature missingness and label scarcity. The proposed framework provides a practical and robust foundation for multi-station WiFi CSI sensing in real-world deployments.
💡 Research Summary
The paper tackles two practical challenges that arise simultaneously in multi‑station Wi‑Fi Channel State Information (CSI) sensing: (1) long‑term, station‑wise feature missingness caused by heterogeneous traffic and network contention, and (2) scarcity of labeled data, which is common because CSI is highly environment‑dependent and costly to annotate. Existing works address either missingness (through resampling or reconstruction) or label scarcity (through data augmentation or self‑supervised learning) but assume the other problem is solved, leading to poor performance when both occur together.
To bridge this gap, the authors propose a unified framework that integrates (a) a missingness‑invariant self‑supervised pre‑training stage and (b) a station‑wise masking augmentation (SMA) applied during downstream supervised training. The self‑supervised component adapts Cross‑modal Self‑Supervised Learning (CroSSL), originally designed for multi‑sensor time‑series, to the CSI domain. Each Wi‑Fi station is treated as a separate “view”. Random station‑mask patterns are applied to generate two views from the same unlabeled CSI segment, and a contrastive loss forces the encoder to produce consistent embeddings regardless of which stations are present. As a result, the learned representation is robust to any combination of missing stations.
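The masked-view contrastive idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy linear encoder, the keep probability, the embedding dimension, and the simple InfoNCE-style loss are all illustrative assumptions standing in for the real CroSSL encoder and training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_station_mask(n_stations, keep_prob=0.7, rng=rng):
    # Randomly drop whole stations; guarantee at least one stays visible.
    mask = rng.random(n_stations) < keep_prob
    if not mask.any():
        mask[rng.integers(n_stations)] = True
    return mask

def encode(csi, mask, W):
    # Toy encoder: zero out masked stations, mean-pool over the visible
    # ones, then apply a linear projection (stand-in for the real network).
    x = csi * mask[:, None]                 # (stations, features)
    pooled = x.sum(axis=0) / max(mask.sum(), 1)
    z = pooled @ W
    return z / (np.linalg.norm(z) + 1e-8)   # unit-normalized embedding

def info_nce(z1, z2, negatives, temp=0.1):
    # Pull the two masked views of the same segment together,
    # push them away from embeddings of other segments.
    pos = np.exp(z1 @ z2 / temp)
    neg = np.exp(negatives @ z2 / temp).sum()
    return -np.log(pos / (pos + neg))

n_stations, n_feat, dim = 4, 16, 8
W = rng.normal(size=(n_feat, dim))
segment = rng.normal(size=(n_stations, n_feat))  # one unlabeled CSI segment

# Two independently masked views of the SAME segment -> positive pair.
m1, m2 = random_station_mask(n_stations), random_station_mask(n_stations)
z1, z2 = encode(segment, m1, W), encode(segment, m2, W)

# Embeddings of other (random) segments serve as negatives.
negatives = np.stack([
    encode(rng.normal(size=(n_stations, n_feat)),
           random_station_mask(n_stations), W)
    for _ in range(7)
])
loss = info_nce(z1, z2, negatives)
```

Minimizing this loss over many segments drives the encoder to map any masked subset of stations to the same point in embedding space, which is exactly the missingness invariance the summary describes.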
During downstream training, SMA deliberately masks out whole stations in the labeled samples according to realistic missingness patterns observed in the network. By exposing the classifier/regressor to the same type of incompleteness it will face at inference, the model learns to rely on cross‑station correlations rather than on any single station’s data, thereby reducing over‑fitting when labeled data are scarce.
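A hedged sketch of what SMA could look like in a training loop: sample a realistic missingness pattern per labeled sample and zero out the corresponding stations. The pattern table, shapes, and function name below are hypothetical; the paper draws its patterns from actual network logs, whereas this example hard-codes a few for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical empirical missingness patterns (1 = station observed,
# 0 = station unavailable); in the paper these come from network logs.
observed_patterns = np.array([
    [1, 1, 1, 1],   # all four stations present
    [1, 0, 1, 1],   # station 1 dropped out
    [1, 1, 0, 0],   # stations 2 and 3 dropped out
], dtype=float)

def station_wise_masking(batch, patterns, rng=rng):
    """Apply Station-wise Masking Augmentation to a labeled batch.

    batch: (B, stations, features) CSI samples.
    Each sample gets one realistic pattern; masked stations are zeroed,
    mimicking the unavailability the model will face at inference.
    """
    idx = rng.integers(len(patterns), size=len(batch))
    mask = patterns[idx]                      # (B, stations)
    return batch * mask[:, :, None], mask

batch = rng.normal(size=(5, 4, 16))           # 5 labeled samples
augmented, mask = station_wise_masking(batch, observed_patterns)
```

Because the classifier only ever sees station subsets that actually occur in deployment, it learns to exploit cross-station redundancy instead of memorizing any single station's signal.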
The methodology is evaluated on two real‑world multi‑station CSI datasets collected with commodity 802.11 devices: an office‑like environment and a factory‑like environment. Both datasets include synchronized video frames used to generate ground‑truth labels for tasks such as human localization and activity classification. Experiments vary the proportion of labeled data from 1 % to 20 % and compare against several baselines: (i) self‑supervised pre‑training without missingness awareness (AutoFi, multi‑device SSL), (ii) traditional data augmentation (temporal shifting, noise injection), and (iii) reconstruction‑based missingness handling. Results show that (1) missingness‑invariant pre‑training alone yields limited gains when labels are abundant, (2) SMA alone is insufficient when labels are extremely limited, and (3) the combination of CroSSL‑based pre‑training and SMA consistently outperforms all baselines, achieving 8–12 % higher accuracy on average across tasks and label ratios. Notably, performance degradation remains minimal even when up to 30 % of stations are missing during inference.
A key practical contribution is that the masking patterns used in both pre‑training and SMA are sampled from actual network logs rather than synthetic distributions, ensuring that the learned robustness translates directly to real deployments. The authors also discuss the computational overhead, which is modest because the masking operation is cheap and the CroSSL encoder can be trained on standard GPUs.
In summary, the paper demonstrates that jointly addressing station‑wise feature missingness and label scarcity through a missingness‑aware self‑supervised objective and a targeted augmentation strategy yields a robust, label‑efficient CSI sensing pipeline suitable for real‑world multi‑station Wi‑Fi deployments. Future work may explore dynamic masking schedules informed by predicted missingness, meta‑learning for completely unlabeled scenarios, and extension to other wireless sensing modalities such as mmWave or BLE.