Near-Universal Multiplicative Updates for Nonnegative Einsum Factorization
Despite the ubiquity of multiway data across scientific domains, there are few user-friendly tools for fitting tailored nonnegative tensor factorizations. Researchers must either use gradient-based automatic differentiation (which often struggles in nonnegative settings), choose from a limited set of methods with mature implementations, or implement their own model from scratch. As an alternative, we introduce NNEinFact, an einsum-based multiplicative-update algorithm that fits any nonnegative tensor factorization expressible as a tensor contraction by minimizing one of many user-specified loss functions (including the $(α,β)$-divergence). To use NNEinFact, the researcher simply specifies their model with a string. NNEinFact converges to a stationary point of the loss, supports missing data, and fits tensors with hundreds of millions of entries in seconds. Empirically, NNEinFact fits custom models that outperform standard ones on held-out prediction tasks on real-world tensor data by over $37\%$, and it attains less than half the test loss of gradient-based methods while converging up to 90 times faster.
💡 Research Summary
The paper addresses a practical bottleneck in modern data science: fitting customized non‑negative tensor factorizations (NTF) at scale. Existing tools either require deep expertise to hand‑code a model, rely on a narrow set of mature algorithms, or use automatic‑differentiation (AD) based gradient descent, which is notoriously slow and unstable under non‑negativity constraints. To bridge this gap, the authors introduce NNEinFact (Near‑Universal Multiplicative Updates for Nonnegative Einsum Factorization), a framework that lets a user specify any NTF model that can be expressed as an einsum contraction via a simple string, and then fits the model using a multiplicative‑update algorithm that works with a broad family of loss functions, including the (α,β)‑divergence, KL, IS, Hellinger, χ², and many others.
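To make the loss family concrete, here is a minimal NumPy sketch (not the paper's implementation) of the β-divergence, a well-known one-parameter slice of the (α,β)-divergence family: β = 1 recovers generalized KL, β = 0 recovers Itakura–Saito, and β = 2 gives half the squared Euclidean distance.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences d_beta(y || x).

    beta = 1 -> generalized Kullback-Leibler, beta = 0 -> Itakura-Saito,
    beta = 2 -> half the squared Euclidean distance.
    """
    y, x = np.asarray(y, float), np.asarray(x, float)
    if beta == 1:   # generalized KL
        return np.sum(y * np.log(y / x) - y + x)
    if beta == 0:   # Itakura-Saito
        return np.sum(y / x - np.log(y / x) - 1)
    return np.sum((y**beta + (beta - 1) * x**beta
                   - beta * y * x**(beta - 1)) / (beta * (beta - 1)))

y = np.array([1.0, 2.0, 3.0])
assert beta_divergence(y, y, 1) == 0.0  # every divergence vanishes at y == x
```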
Mathematical formulation
Given an M‑mode non‑negative data tensor Y ∈ ℝ^{I₁×…×I_M}_+, the goal is to approximate it by a reconstructed tensor Ŷ built from L factor tensors Θ^{(ℓ)}. Each factor’s indices are split into observed indices i_ℓ (shared with Y) and contracted indices r_ℓ (summed out). The generalized factorization has the element‑wise form
ŷ_i = Σ_r Π_{ℓ=1}^L θ^{(ℓ)}_{i_ℓ, r_ℓ}.
In einsum notation this is compactly written as a string such as “i1 r1, i2 r2, … → i1 i2 …”. The forward pass is therefore a single call to einsum(model_str, *Θ).
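As an illustration, a hypothetical 3-mode CP-style model (each factor shares one observed index with Y, all factors share one contracted index r) reduces the whole forward pass to one `einsum` call:

```python
import numpy as np

# Hypothetical 3-mode CP-style model: factors A, B, C carry observed
# indices i, j, k and a shared contracted index r.
I, J, K, R = 4, 5, 6, 3
rng = np.random.default_rng(0)
A = rng.random((I, R))  # Theta^(1), indices (i, r)
B = rng.random((J, R))  # Theta^(2), indices (j, r)
C = rng.random((K, R))  # Theta^(3), indices (k, r)

# The entire forward pass is a single einsum call on the model string.
Y_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
assert Y_hat.shape == (I, J, K)
```

Here the model string "ir,jr,kr->ijk" is exactly the ŷ_i = Σ_r Π_ℓ θ^{(ℓ)} formula above, written in einsum notation.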
Algorithmic core
The authors derive multiplicative updates via a majorization‑minimization (MM) framework. For a chosen loss L(Y, Ŷ) they construct an auxiliary function Q(Θ|Θ^{(t)}) that upper‑bounds the loss and touches it at the current iterate. Because Q is separable across each factor tensor, minimizing Q yields a closed‑form non‑negative update:
θ^{(ℓ)} ← θ^{(ℓ)} ⊙ [∇⁻_{θ^{(ℓ)}} L(Y, Ŷ) ⊘ ∇⁺_{θ^{(ℓ)}} L(Y, Ŷ)],
where ∇⁺ and ∇⁻ denote the positive and negative parts of the gradient, ⊙ is element-wise multiplication, and ⊘ is element-wise division. Each gradient part is itself an einsum contraction of the data (or reconstruction) against the remaining factors, so the update preserves nonnegativity by construction: a nonnegative iterate is multiplied by a ratio of nonnegative quantities.
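For intuition, here is an illustrative sketch (not the paper's code) of one such multiplicative update for a single factor of a CP-style model under the generalized KL loss, where both the numerator and the denominator are single einsum contractions:

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, R = 4, 5, 6, 3
Y = rng.random((I, J, K)) + 0.1          # strictly positive toy data
A, B, C = (rng.random((n, R)) + 0.1 for n in (I, J, K))

def kl_update_A(Y, A, B, C, eps=1e-12):
    """One multiplicative KL update A <- A * (numerator / denominator).

    Both gradient parts are single einsum contractions against the
    factors held fixed; eps guards against division by zero.
    """
    Y_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    num = np.einsum("ijk,jr,kr->ir", Y / (Y_hat + eps), B, C)  # negative part
    den = np.einsum("ijk,jr,kr->ir", np.ones_like(Y), B, C)    # positive part
    return A * num / (den + eps)

A_new = kl_update_A(Y, A, B, C)
assert np.all(A_new >= 0)  # nonnegativity is preserved by construction
```

Cycling such an update over each factor in turn is the usual MM scheme: every sweep is a handful of einsum calls, and the auxiliary-function argument guarantees the loss never increases.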