Experimental Demonstration of Online Learning-Based Concept Drift Adaptation for Failure Detection in Optical Networks


We present a novel online learning-based approach for concept drift adaptation in optical network failure detection, achieving up to a 70% improvement in performance over conventional static models while maintaining low latency.


💡 Research Summary

The paper introduces an online learning framework designed to handle concept drift (CD) in optical network failure detection. Traditional machine learning approaches for fault management are typically trained on static, historical datasets and struggle when the underlying data distribution changes due to equipment aging, unexpected malfunctions, or the appearance of rare hard failures. To address this, the authors propose a method that continuously updates the model as new telemetry arrives, thereby adapting to evolving network conditions.

The experimental setup consists of a testbed that generates two complementary datasets: a Soft Failure Dataset (SFD) representing normal and soft‑failure conditions, and a Hard Failure Dataset (HFD) that mimics rare, abrupt failures. Both datasets contain timestamps, device identifiers, BER, and OSNR for transmitter and receiver; only BER and OSNR are used for training. The SFD is used for batch training of both static and online models, while the HFD is streamed sample‑by‑sample. In the static scenario, the model only predicts on each new sample; in the online scenario, the model predicts and updates its parameters with the same sample, simulating a real‑time learning loop.
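The static-versus-online protocol described above is a prequential (test-then-train) loop: each streamed sample is first used for prediction and then, in the online scenario only, for a model update. The sketch below illustrates this with a hand-rolled SGD logistic regression; the class and function names are assumptions for illustration, not the paper's actual toolkit or models.

```python
import math

class OnlineLogReg:
    """Tiny SGD logistic regression for streaming data (illustrative only;
    the paper's online learners are not reproduced here)."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log-loss with respect to the linear score z.
        err = self.predict_proba(x) - y
        self.b -= self.lr * err
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]

def stream_eval(model, stream, online=True):
    """Prequential loop: predict on each sample first, then update the
    model with that same sample (online scenario only)."""
    correct = 0
    n = 0
    for x, y in stream:
        n += 1
        correct += (model.predict_proba(x) >= 0.5) == bool(y)
        if online:
            model.learn_one(x, y)
    return correct / n
```

With a separable two-class stream, the static variant (no `learn_one` calls) stays at chance level while the online variant adapts within a few dozen samples.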

Concept drift detection is performed with the Page‑Hinkley Test (PHT), applied primarily to the OSNR‑SPO2 feature because it shows the strongest correlation with the target label and the most pronounced distribution shift between soft and hard failures. PHT flags a drift when the cumulative sum of deviations exceeds a predefined threshold, allowing the system to recognize when the data distribution has changed.
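The Page-Hinkley Test tracks the cumulative deviation of a signal from its running mean and flags drift when that sum departs from its minimum by more than a threshold. A minimal incremental sketch (parameter values are illustrative, not the paper's settings):

```python
class PageHinkley:
    """Minimal Page-Hinkley drift detector for an increase in the mean.
    delta: tolerance per sample; threshold: drift sensitivity (lambda)."""
    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta
        self.threshold = threshold
        self.mean = 0.0
        self.n = 0
        self.cum = 0.0      # cumulative deviation m_t
        self.cum_min = 0.0  # running minimum M_t

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n  # incremental mean
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        # Drift is flagged when m_t rises far above its historical minimum.
        return (self.cum - self.cum_min) > self.threshold
```

Applied to a stream of OSNR readings, a stable signal keeps the statistic near zero, while an abrupt shift (as between soft and hard failures) pushes it past the threshold within a few samples.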

Three classifiers are evaluated: Adaptive Random Forest (ARF), Logistic Regression (LR), and Naïve Bayes (NB). These span a range of complexities from linear to ensemble methods, providing a broad view of online learning robustness. Experiments run on an Intel i5 13th‑gen CPU (16 GB RAM) without GPU acceleration. Performance is measured using rolling accuracy and Area Under the ROC Curve (AUC) computed over sliding windows of 500 samples.
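Rolling accuracy over a sliding window of recent predictions can be computed incrementally with a bounded buffer; a minimal sketch (`window=500` matches the paper's evaluation setting, the function name is an assumption):

```python
from collections import deque

def rolling_accuracy(outcomes, window=500):
    """Rolling accuracy over the most recent `window` prediction outcomes.
    `outcomes` is an iterable of booleans (prediction correct or not)."""
    buf = deque(maxlen=window)  # old outcomes fall off automatically
    out = []
    for correct in outcomes:
        buf.append(bool(correct))
        out.append(sum(buf) / len(buf))
    return out
```

The same windowing pattern applies to AUC, except that predicted scores and true labels must both be buffered so the ROC curve can be recomputed per window.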

Results show that online models consistently outperform their static counterparts. The online LR model achieves up to a 70% increase in rolling accuracy relative to the static LR, while online ARF improves accuracy by up to 55%. When hard‑failure samples are artificially oversampled, the online ARF quickly regains 100% accuracy, whereas the static ARF’s performance degrades further, illustrating the advantage of continual adaptation. In terms of AUC, static models hover around 0.5 (random guessing), whereas online models stabilize near 0.75 after initial drift periods.

Latency analysis reveals that online updates add only modest overhead: median per‑event prediction‑plus‑update times are 0.009 ms for LR, 0.083 ms for NB, and 0.404 ms for ARF. Even the most demanding model, ARF, incurs less than 1 ms of additional latency per event, which the authors argue is acceptable given the substantial accuracy gains.
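Per-event latency of this kind can be measured with a simple wall-clock harness; `step` below stands in for a hypothetical predict-and-update callable (the interface is an assumption, not the paper's code):

```python
import statistics
import time

def per_event_latency(step, events):
    """Median wall-clock time of one predict+update step per streamed event.
    `step` is any callable taking a single event (hypothetical interface)."""
    times = []
    for ev in events:
        t0 = time.perf_counter()
        step(ev)
        times.append(time.perf_counter() - t0)
    return statistics.median(times)
```

Using the median rather than the mean makes the figure robust to occasional scheduling spikes, which matters when comparing sub-millisecond per-event costs.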

The study concludes that online learning provides a practical, model‑agnostic solution for handling concept drift in optical network fault detection, delivering significant accuracy improvements while maintaining low computational cost. Limitations include the reliance on fully labeled streams, the use of a single drift‑sensitive feature (OSNR), and the fact that the datasets are generated in a controlled laboratory environment. Future work is suggested to explore unsupervised online learning, multi‑feature drift detection, and long‑term deployment in real operational networks.

