MoTiC: Momentum Tightness and Contrast for Few-Shot Class-Incremental Learning


Few-Shot Class-Incremental Learning (FSCIL) must contend with the dual challenge of learning new classes from scarce samples while preserving old-class knowledge. Existing methods use a frozen feature extractor and class-averaged prototypes to mitigate catastrophic forgetting and overfitting. However, new-class prototypes suffer significant estimation bias due to extreme data scarcity, whereas base-class prototypes benefit from sufficient data. In this work, we theoretically demonstrate that aligning new-class priors with old-class statistics via Bayesian analysis reduces variance and improves prototype accuracy. Furthermore, we propose large-scale contrastive learning to enforce cross-category feature tightness. To further enrich feature diversity and inject prior information for new-class prototypes, we integrate momentum self-supervision and virtual categories into the Momentum Tightness and Contrast (MoTiC) framework, constructing a feature space with rich representations and enhanced inter-class cohesion. Experiments on three FSCIL benchmarks achieve state-of-the-art performance, particularly on the fine-grained CUB-200 task, validating our method's ability to reduce estimation bias and improve incremental-learning robustness.


💡 Research Summary

Few‑Shot Class‑Incremental Learning (FSCIL) requires a model to continuously incorporate new classes from only a few labeled examples while retaining knowledge of previously learned classes. Existing approaches typically freeze a backbone trained on a large base dataset and represent each class by the average of its feature embeddings (prototypes). This works well for base classes, which have abundant data, but new‑class prototypes suffer from severe estimation bias because only a handful of samples are available, leading to high variance and poor generalization.
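The prototype pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the embeddings, feature dimension, and the second "toy" class are invented for the example.

```python
import numpy as np

# Hypothetical embeddings: 5 support samples of a new class, feature dim 4.
rng = np.random.default_rng(0)
support = rng.normal(loc=1.0, scale=0.5, size=(5, 4))

# Class-averaged prototype: the mean of the support embeddings.
prototype = support.mean(axis=0)

# Nearest-prototype classification of a query embedding drawn from the
# same distribution as the support set (so it should match class 0).
query = rng.normal(loc=1.0, scale=0.5, size=(4,))
prototypes = np.stack([prototype, -prototype])  # two toy classes
pred = int(np.argmin(np.linalg.norm(prototypes - query, axis=1)))
print(pred)  # → 0
```

With only 5 samples the prototype is a noisy estimate of the true class mean, which is exactly the variance problem the Bayesian analysis below addresses.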

The paper first provides a Bayesian analysis of this problem. If a new class c has K support samples {fθ(xi)} and the feature encoder outputs are modeled as N(μc,σ²I), the maximum‑likelihood prototype μ̂MLE = (1/K)∑ fθ(xi) has variance σ²/K, which becomes large when K is small. By introducing a Gaussian prior centered on a well‑estimated prototype of a similar old class c′, μc ∼ N(μc′,τ²I), the posterior mean becomes

μ̂Bayes = (σ⁻²∑ fθ(xi) + τ⁻²μc′) / (Kσ⁻² + τ⁻²).

When τ² is small (i.e., the old class is semantically close or its representation is rich), the posterior variance drops sharply, showing that richer prior information can substantially improve few‑shot prototype estimation.
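The shrinkage behavior of μ̂Bayes can be verified numerically. The sketch below uses assumed toy values (1‑D features, K = 5, σ² = 1, τ² = 0.1, prior mean 2.0); it only checks the formulas from the analysis above, not the paper's actual encoder.

```python
import numpy as np

# Assumed toy setting: K support samples with likelihood variance sigma2,
# Gaussian prior centered on a similar old-class prototype mu_prior.
K, sigma2, tau2 = 5, 1.0, 0.1
mu_prior = 2.0
rng = np.random.default_rng(1)
samples = rng.normal(loc=2.5, scale=np.sqrt(sigma2), size=K)

mu_mle = samples.mean()  # MLE prototype, variance sigma2 / K
mu_bayes = (samples.sum() / sigma2 + mu_prior / tau2) / (K / sigma2 + 1 / tau2)
var_post = 1.0 / (K / sigma2 + 1 / tau2)  # posterior variance

# The posterior mean is a convex combination of the MLE and the prior mean,
# and its variance is strictly below the MLE's sigma2 / K.
print(mu_mle, mu_bayes, var_post)
```

With these numbers the posterior variance is 1/15 ≈ 0.067, well below the MLE variance σ²/K = 0.2, matching the claim that a tight prior (small τ²) dominates the estimate.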

Motivated by this insight, the authors design the Momentum Tightness and Contrast (MoTiC) framework, which supplies prior information from three complementary angles:

  1. Feature richness via MoCo‑style momentum contrast – A query encoder fθ and a key encoder fθ′ share the same architecture; the key encoder is updated by an exponential moving average of the query encoder (θ′←mθ′+(1−m)θ). A large, consistent feature queue stores key embeddings from recent batches, providing a global “memory” of the feature space. The self‑supervised contrastive loss L_ssc encourages each query to be close to its positive key while being distinguished from all other keys in the queue, enriching the representation without overfitting to the few new‑class samples.
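The momentum update and the queue-based contrastive loss in step 1 can be sketched as follows. This is a simplified NumPy illustration of the MoCo-style mechanism; the feature dimension, queue size, temperature, and the flat parameter vectors are assumptions, not the paper's configuration.

```python
import numpy as np

def ema_update(theta_k, theta_q, m=0.999):
    # Key encoder tracks the query encoder: theta' <- m*theta' + (1-m)*theta
    return m * theta_k + (1 - m) * theta_q

def info_nce(q, k_pos, queue, temperature=0.07):
    # Contrastive loss for one query: the positive key plus all queued
    # negatives form the logits of a softmax classification problem.
    q = q / np.linalg.norm(q)
    keys = np.vstack([k_pos, queue])
    keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = keys @ q / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])  # positive key sits at index 0

# Toy usage with assumed shapes (feature dim 8, queue of 32 old keys).
rng = np.random.default_rng(2)
theta_q, theta_k = rng.normal(size=16), rng.normal(size=16)
theta_k = ema_update(theta_k, theta_q)

q = rng.normal(size=8)
queue = rng.normal(size=(32, 8))
loss = info_nce(q, q.copy(), queue)  # perfectly aligned positive
```

Because the queue retains keys from many recent batches, each query is contrasted against far more negatives than a single mini-batch could supply, which is what gives the loss its "global memory" of the feature space.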

  2. Inter‑class tightness loss (L_MoTi) – Unlike conventional supervised contrastive learning that pulls same‑class samples together, MoTiC pushes samples from different classes closer. For a query q with label c_q, the loss

L_MoTi = −(1/B)∑ log ⋯

