A Multi-directional Meta-Learning Framework for Class-Generalizable Anomaly Detection
In this paper, we address class-generalizable anomaly detection, where the objective is to train a single model on abundant normal data and a small amount of labeled anomaly data so that it can detect completely unseen anomalies, also referred to as out-of-distribution (OOD) classes. The challenge is compounded by the fact that anomaly data are rare and costly to label. To this end, we propose a multi-directional meta-learning algorithm: at the inner level, the model learns the manifold of the normal data (representation learning); at the outer level, the model is meta-tuned with a few anomaly samples to maximize the softmax confidence margin between normal and anomalous samples (decision-surface calibration), treating normals as in-distribution (ID) and anomalies as out-of-distribution (OOD). By repeating this process over many episodes, each containing predominantly normal samples and a small number of anomalies, we obtain a multi-directional meta-learning framework. This two-level optimization, strengthened by multi-directional training, yields stronger generalization to unseen anomaly classes.
💡 Research Summary
The paper tackles the problem of class‑generalizable anomaly detection, where a model must detect completely unseen anomalous classes (out‑of‑distribution, OOD) while being trained on abundant normal data and only a few labeled anomalies. To address the scarcity and labeling cost of anomalies, the authors propose a two‑level, multi‑directional meta‑learning framework.
In the inner loop, the model (encoder θ and classifier ϕ) is trained exclusively on normal samples using a one-class binary cross-entropy loss that pushes the softmax confidence on normal data toward 1. This step learns a compact manifold of normal data in the latent space; optionally, the temperature-scaling parameter T is learned as well.
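A minimal sketch of this inner-loop loss. The function name and the use of a single-logit sigmoid (equivalent to a two-class softmax over the logit difference) are our simplifications, not the paper's exact implementation:

```python
import math

def one_class_bce(logit, temperature=1.0):
    """Inner-loop loss on a normal sample: binary cross-entropy that
    pushes the temperature-scaled confidence toward 1.
    `logit` is the classifier's raw score for the normal class."""
    p = 1.0 / (1.0 + math.exp(-logit / temperature))  # scaled confidence
    return -math.log(p)  # BCE with target 1 reduces to -log p

# A confident normal sample incurs almost no loss; an uncertain one is penalized.
low = one_class_bce(5.0)   # high confidence -> small loss
high = one_class_bce(0.0)  # confidence 0.5 -> loss = ln 2
```

Minimizing this loss over normal data alone is what pulls the normal manifold toward a high-confidence region of the latent space.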
The outer loop performs meta‑tuning with a few anomaly (meta‑OOD) samples. Holding the encoder parameters θ fixed, the classifier ϕ and temperature T are updated to maximize the confidence margin between normal and anomalous inputs. The outer objective combines three terms: (i) a BCE loss that encourages high confidence on normals, (ii) a BCE loss that forces low confidence on anomalies, and (iii) a hinge‑margin loss that explicitly widens the average confidence gap to a predefined margin m, weighted by α.
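The three-term outer objective can be sketched as follows; the function names, reduction by averaging, and the single-logit sigmoid stand-in for the softmax confidence are our assumptions for illustration:

```python
import math

def sigmoid(z, T=1.0):
    """Temperature-scaled confidence for a raw logit."""
    return 1.0 / (1.0 + math.exp(-z / T))

def outer_loss(normal_logits, anomaly_logits, T=1.0, margin=0.5, alpha=1.0):
    """Meta-tuning objective (sketch): (i) BCE toward 1 on normals,
    (ii) BCE toward 0 on anomalies, (iii) a hinge term that widens the
    mean confidence gap up to `margin`, weighted by `alpha`."""
    p_n = [sigmoid(z, T) for z in normal_logits]
    p_a = [sigmoid(z, T) for z in anomaly_logits]
    bce_norm = -sum(math.log(p) for p in p_n) / len(p_n)        # term (i)
    bce_anom = -sum(math.log(1.0 - p) for p in p_a) / len(p_a)  # term (ii)
    gap = sum(p_n) / len(p_n) - sum(p_a) / len(p_a)
    hinge = max(0.0, margin - gap)                              # term (iii)
    return bce_norm + bce_anom + alpha * hinge
```

Well-separated normal and anomaly confidences drive all three terms toward zero, while overlapping confidences are penalized both by the BCE terms and by the unmet margin.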
The novelty lies in the “multi‑directional” episode construction. The dataset is partitioned into families (classes). Each episode samples a distinct pair (C_ID, C_meta): a set of normal families for the inner loop and a disjoint set of anomaly families for the outer loop. By repeatedly reshuffling these pairings across many episodes, the model experiences numerous transfer directions (e.g., normal‑A → anomaly‑B, normal‑C → anomaly‑D, etc.). This design forces the encoder θ to learn domain‑invariant representations while the classifier learns a robust decision surface that generalizes across many unseen OOD domains.
Mathematically, the bilevel problem is:

Inner loop (representation learning on normals):
θ*, ϕ̂ = arg min_{θ,ϕ} E_{x∼P_ID} [ L_BCE(s_{θ,ϕ,T}(x), 1) ]

Outer loop (decision-surface calibration, with θ fixed at θ*):
ϕ*, T* = arg min_{ϕ,T} E_{x∼P_ID} [ L_BCE(s(x), 1) ] + E_{x̃∼P_OOD} [ L_BCE(s(x̃), 0) ] + α · max(0, m − (E_{x∼P_ID}[s(x)] − E_{x̃∼P_OOD}[s(x̃)]))

where s(x) denotes the temperature-scaled softmax confidence assigned to input x, m is the target confidence margin, and α weights the hinge term.