Fast Wrong-way Cycling Detection in CCTV Videos: Sparse Sampling is All You Need
Effective monitoring of unusual transportation behaviors, such as wrong-way cycling (i.e., riding a bicycle or e-bike against designated traffic flow), is crucial for optimizing law enforcement deployment and traffic planning. However, accurately recording all wrong-way cycling events is both unnecessary and infeasible in resource-constrained environments, as it requires high-resolution cameras for evidence collection and event detection. To address this challenge, we propose WWC-Predictor, a novel method for efficiently estimating the wrong-way cycling ratio, defined as the proportion of wrong-way cycling events relative to the total number of cycling movements over a given time period. The core innovation of our method lies in accurately detecting wrong-way cycling events in sparsely sampled frames using a lightweight detector, then estimating the overall ratio using an autoregressive moving average model. To evaluate the effectiveness of our method, we construct a benchmark dataset consisting of 35 minutes of video sequences with minute-level annotations. Our method achieves an average error rate of a mere 1.475% while consuming only 19.12% of the GPU time required by conventional tracking methods, validating its effectiveness in estimating the wrong-way cycling ratio. Our source code is publicly available at: https://github.com/VICA-Lab-HKUST-GZ/WWC-Predictor.
💡 Research Summary
The paper addresses the practical problem of monitoring wrong‑way cycling (riding a bicycle or e‑bike against the designated traffic flow) using existing CCTV infrastructure without the need for high‑resolution cameras or computationally intensive multi‑object tracking (MOT). The authors propose WWC‑Predictor, a two‑stage framework that first performs sparse frame sampling and a lightweight two‑frame detector, then estimates the overall wrong‑way cycling ratio with a statistical time‑series model.
In the sparse detection stage, the video is uniformly sampled at a fixed interval (e.g., one‑minute gaps) to produce pairs of consecutive frames. Each pair is processed by the “Two‑Frame Wrong‑Way Cycling Detector”. This detector consists of three components: (1) motion‑based orientation estimation, which uses YOLOv5‑tiny to detect non‑motor vehicles, matches detections across the two frames via an IoU matrix filtered by a high‑overlap threshold, applies the Hungarian algorithm for optimal bipartite matching, and computes a direction angle from the centroid displacement; (2) appearance‑based orientation estimation, which feeds a cropped vehicle image into a ResNet‑101 backbone followed by a Phase‑Shifting Coder (PSC). The PSC encodes the cyclic angle into a three‑dimensional continuous vector, allowing smooth regression of orientation; (3) ensemble validation (AND‑Strategy), which accepts a detection only if both motion‑based and appearance‑based estimates agree, thereby reducing false positives caused by occlusion or minimal motion. The detector outputs per‑pair counts of right‑way (D_R) and wrong‑way (D_W) events.
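The two geometric pieces of this detector, IoU-based cross-frame matching with the Hungarian algorithm plus a displacement angle, and the phase-shifting encoding of a cyclic angle, can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the IoU threshold value, and the specific three-phase cosine form of the PSC are assumptions for the sketch.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_matrix(a, b):
    """Pairwise IoU between two sets of (x1, y1, x2, y2) boxes."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    x1 = np.maximum(a[:, None, 0], b[None, :, 0])
    y1 = np.maximum(a[:, None, 1], b[None, :, 1])
    x2 = np.minimum(a[:, None, 2], b[None, :, 2])
    y2 = np.minimum(a[:, None, 3], b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def match_and_estimate(boxes_t0, boxes_t1, iou_thresh=0.3):
    """Hungarian matching on the IoU matrix across the two sampled frames;
    returns (i, j, angle_deg) for each accepted match, where the angle is
    computed from the centroid displacement (image coordinates)."""
    iou = iou_matrix(boxes_t0, boxes_t1)
    rows, cols = linear_sum_assignment(-iou)   # negate to maximize total IoU
    matches = []
    for i, j in zip(rows, cols):
        if iou[i, j] < iou_thresh:             # filter low-overlap pairs
            continue
        c0 = ((boxes_t0[i][0] + boxes_t0[i][2]) / 2,
              (boxes_t0[i][1] + boxes_t0[i][3]) / 2)
        c1 = ((boxes_t1[j][0] + boxes_t1[j][2]) / 2,
              (boxes_t1[j][1] + boxes_t1[j][3]) / 2)
        angle = np.degrees(np.arctan2(c1[1] - c0[1], c1[0] - c0[0])) % 360
        matches.append((i, j, angle))
    return matches

# Phase-shifting style encoding of a cyclic angle as a 3-D continuous
# vector (one common form: cosines at three evenly spaced phase shifts).
SHIFTS = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])

def psc_encode(theta):
    return np.cos(theta - SHIFTS)

def psc_decode(v):
    return np.arctan2((v * np.sin(SHIFTS)).sum(),
                      (v * np.cos(SHIFTS)).sum()) % (2 * np.pi)
```

Encoding the angle this way avoids the discontinuity at 0°/360° that plagues direct angle regression, which is why a smooth multi-dimensional code is preferred for orientation heads.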
In the temporal estimation stage, the sequence of (D_R, D_W) counts is treated as a time series. An Autoregressive Moving Average (ARMA) model is fitted to capture the temporal dependencies and to extrapolate the total number of events over the whole video. The final wrong‑way cycling ratio is computed as ΣD_W / (ΣD_R + ΣD_W).
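A minimal sketch of this stage is below. The paper fits a full ARMA model; as a simplified stand-in, the sketch fits an AR(1) recurrence by least squares to extrapolate per-interval counts, and then applies the stated ratio formula. Function names and the AR(1) simplification are assumptions, not the authors' code.

```python
import numpy as np

def ar1_forecast(x, horizon):
    """Simplified stand-in for ARMA extrapolation: fit x_t = c + phi * x_{t-1}
    by least squares, then roll the recurrence forward `horizon` steps."""
    x = np.asarray(x, float)
    A = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    c, phi = np.linalg.lstsq(A, x[1:], rcond=None)[0]
    preds, last = [], x[-1]
    for _ in range(horizon):
        last = c + phi * last
        preds.append(last)
    return np.array(preds)

def wwc_ratio(d_w, d_r):
    """Final video-level metric: sum(D_W) / (sum(D_R) + sum(D_W))."""
    return sum(d_w) / (sum(d_r) + sum(d_w))
```

For example, sampled counts `d_w = [1, 1]` and `d_r = [3, 3]` give a wrong-way ratio of 0.25; the forecast helper would supply estimated counts for the unsampled intervals before the sums are taken.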
To evaluate the approach, the authors built a benchmark consisting of 35 minutes of CCTV footage (four fully annotated videos) with minute‑level ground‑truth labels, 405 images for non‑motor vehicle detection, and 1,199 images for orientation annotation. Experiments show that WWC‑Predictor achieves an average absolute error of 1.475% in estimating the wrong‑way cycling ratio, while consuming only 19.12% of the GPU time required by conventional tracking‑based pipelines. Compared to dense‑sampling methods, WWC‑Predictor processes 6–10× fewer frames and uses 4–6× less computational resources, yet delivers comparable accuracy.
The paper’s contributions are threefold: (1) a novel two‑frame detector that robustly fuses motion and appearance cues for orientation estimation; (2) a temporal estimator that leverages ARMA modeling to convert sparse detections into a reliable video‑level metric; (3) a publicly released dataset and codebase to foster further research on wrong‑way cycling detection.
Strengths include the clear reduction in hardware requirements, the clever use of PSC to handle the periodic nature of orientation, and the statistical grounding provided by ARMA, which makes the system resilient to irregular sampling. The open‑source release enhances reproducibility.
Limitations involve the relatively small and homogeneous benchmark (only four locations, limited weather and lighting conditions), potential sensitivity of ARMA hyper‑parameters to traffic pattern changes, and reliance on a reasonably clear visual appearance for the PSC‑based model, which may degrade on very low‑resolution feeds.
Future directions suggested by the authors and inferred from the work are: expanding the dataset to diverse urban settings, exploring more expressive time‑series models (e.g., ARIMA, LSTM, Transformer) or hybrid statistical‑deep approaches, integrating multi‑camera graph‑signal processing to capture spatial correlations across intersections, and further compressing the appearance model for ultra‑low‑resolution scenarios.
Overall, WWC‑Predictor demonstrates that accurate city‑scale monitoring of wrong‑way cycling can be achieved with minimal sampling and lightweight computation, offering a practical solution for traffic authorities seeking scalable, cost‑effective safety analytics.