Collaborative Inference of Coexisting Information Diffusions

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Diffusion history inference, which aims to reconstruct the missing histories of information diffusion traces from incomplete observations, has recently emerged as a research topic because of its benefits for various applications. Existing methods, however, often focus on a single information diffusion trace, whereas in a real-world social network multiple information diffusions often coexist over the same network. In this paper, we propose a novel approach called the Collaborative Inference Model (CIM) for inferring coexisting information diffusions. By exploiting the synergism between the coexisting information diffusions, CIM holistically models multiple information diffusions as a sparse 4th-order tensor called the Coexisting Diffusions Tensor (CDT) without any prior assumption about diffusion models, and collaboratively infers the histories of the coexisting information diffusions via a low-rank approximation of the CDT fused with heterogeneous constraints generated from additional data sources. To improve efficiency, we further propose an optimal algorithm called the Time Window based Parallel Decomposition Algorithm (TWPDA), which speeds up the inference without compromising accuracy by exploiting the temporal locality of information diffusions. Extensive experiments conducted on real-world and synthetic datasets verify the effectiveness and efficiency of CIM and TWPDA.


💡 Research Summary

The paper tackles the problem of reconstructing the complete histories of multiple information diffusions that coexist on the same social network. Traditional diffusion‑history inference methods focus on a single cascade and rely on explicit diffusion models (e.g., Independent Cascade), which limits their applicability when several memes spread simultaneously and observations are extremely sparse.
To address this, the authors introduce the Collaborative Inference Model (CIM). It encodes all infections of M memes over N nodes across Q time steps in a fourth‑order tensor A∈ℝ^{N×N×M×Q}, called the Coexisting Diffusions Tensor (CDT). Each entry records how many times a source node infected a destination node with a particular meme at a specific time step. Because only a tiny fraction of the entries is observed, A is approximated by a Tucker decomposition: a core tensor G of rank R and four factor matrices S (source), D (destination), C (meme), and T (time).
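As a rough illustration of the representation described above, the sketch below stores a toy CDT sparsely and evaluates one entry of a Tucker model. The event list, the function names `build_cdt` and `tucker_entry`, and the toy factor values are all hypothetical and chosen only for this example; none of them come from the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical observed infections: (source, destination, meme, time_step).
events = [
    (0, 1, 0, 0),
    (0, 1, 0, 0),  # the same infection seen twice, so its count is 2
    (1, 2, 0, 1),
    (2, 0, 1, 1),
]

def build_cdt(events):
    """Sparse CDT: maps (src, dst, meme, t) to an infection count."""
    return Counter(events)

def tucker_entry(G, S, D, C, T, i, j, m, q):
    """Predict entry (i, j, m, q) from the core G and the factor matrices."""
    R = len(G)
    return sum(
        G[a][b][c][d] * S[i][a] * D[j][b] * C[m][c] * T[q][d]
        for a, b, c, d in product(range(R), repeat=4)
    )

cdt = build_cdt(events)

# Toy rank-2 model (assumed values): a superdiagonal core and small factors.
R = 2
G = [[[[1.0 if a == b == c == d else 0.0 for d in range(R)]
       for c in range(R)] for b in range(R)] for a in range(R)]
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N = 3 source rows
D = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N = 3 destination rows
C = [[1.0, 0.0], [0.0, 1.0]]              # M = 2 meme rows
T = [[1.0, 0.0], [0.5, 0.5]]              # Q = 2 time rows

predicted = tucker_entry(G, S, D, C, T, 2, 2, 1, 1)
```

Storing only the observed cells keeps memory proportional to the number of events rather than to N²·M·Q, while the low-rank model can still produce a value for any unobserved cell.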
The approximation is obtained by minimizing a loss function composed of (1) reconstruction error on observed cells, (2) L2 regularization, and (3) four heterogeneous constraints that inject external knowledge:
• Source‑Destination Affinity (SDA) – an asymmetric affinity matrix X derived from interaction logs, factorized as S·Dᵀ.
• Node‑Meme Affinity (NMA) – a matrix Y of destination‑meme infection frequencies, factorized as D·Cᵀ.
• Meme Correlation (MC) – a co‑occurrence matrix Z of memes, encouraging similar latent vectors in C via a trace term tr(Cᵀ(K−Z)C).
• Temporal Smoothness (TS) – a penalty ‖T−UT‖² that forces adjacent time factors to be close, where U links consecutive time steps.
These constraints are weighted by hyper‑parameters λ₂…λ₅, allowing the model to compensate for extreme sparsity.
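Putting the terms listed above together, the objective plausibly has a shape like the one below. This is a sketch and not the paper's exact formula. Several details are assumptions made here: the binary mask W that selects observed cells, the weight λ₁ on the L2 term, the factors of ½, the assignment of λ₂…λ₅ to the constraints in the order they are listed, and the reading of K as the diagonal degree matrix of Z.

```latex
\min_{G,S,D,C,T}\;
\tfrac{1}{2}\,\bigl\| W \odot \bigl(A - G \times_1 S \times_2 D \times_3 C \times_4 T\bigr) \bigr\|_F^2
+ \tfrac{\lambda_1}{2}\bigl(\|G\|_F^2 + \|S\|_F^2 + \|D\|_F^2 + \|C\|_F^2 + \|T\|_F^2\bigr)
+ \tfrac{\lambda_2}{2}\,\|X - S D^{\top}\|_F^2
+ \tfrac{\lambda_3}{2}\,\|Y - D C^{\top}\|_F^2
+ \tfrac{\lambda_4}{2}\,\operatorname{tr}\!\bigl(C^{\top}(K - Z)\,C\bigr)
+ \tfrac{\lambda_5}{2}\,\|T - U T\|_F^2
```

Under the assumption that Z is symmetric and K is its degree matrix, K−Z is a graph Laplacian and tr(Cᵀ(K−Z)C) = ½ Σᵢⱼ Zᵢⱼ‖cᵢ − cⱼ‖², where cᵢ is the i‑th row of C. This is why the MC term pulls the latent vectors of frequently co‑occurring memes together.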

Optimization is performed with a gradient‑descent based Native Decomposition Algorithm (NDA). However, directly applying NDA to the full CDT is infeasible for large networks because the tensor size grows as N²·M·Q. The authors therefore propose the Time‑Window based Parallel Decomposition Algorithm (TWPDA). TWPDA splits the time dimension into overlapping windows, runs NDA independently on each window in parallel, and stitches the results together using the temporal smoothness constraint to ensure consistency across window boundaries. This reduces memory consumption to O(N²·M·W) (W = window length) and yields near‑linear speed‑up on multi‑core or cluster environments.
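The windowing idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's TWPDA: `decompose_window` is a hypothetical placeholder for the per-window decomposition, and the stitching step simply averages estimates on overlapping time steps instead of using the temporal smoothness constraint. The function names and the `overlap` parameter are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def time_windows(Q, W, overlap):
    """Split time steps 0..Q-1 into windows of length W sharing `overlap` steps."""
    if Q <= 0:
        raise ValueError("Q must be positive")
    if not 0 <= overlap < W:
        raise ValueError("overlap must satisfy 0 <= overlap < W")
    step = W - overlap
    windows, start = [], 0
    while True:
        end = min(start + W, Q)
        windows.append((start, end))  # half-open interval [start, end)
        if end == Q:
            break
        start += step
    return windows

def decompose_window(window):
    """Placeholder for a per-window decomposition (NOT the paper's NDA).

    Returns one dummy estimate per time step so that stitching can be shown.
    """
    start, end = window
    return {t: float(start) for t in range(start, end)}

def stitch(results):
    """Average the estimates of time steps covered by more than one window."""
    sums, counts = {}, {}
    for res in results:
        for t, value in res.items():
            sums[t] = sums.get(t, 0.0) + value
            counts[t] = counts.get(t, 0) + 1
    return {t: sums[t] / counts[t] for t in sorted(sums)}

def run_windows(Q, W, overlap):
    windows = time_windows(Q, W, overlap)
    # Threads keep the sketch simple; CPU-bound numerical work would more
    # likely use ProcessPoolExecutor or a cluster scheduler.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(decompose_window, windows))
    return windows, stitch(results)

windows, merged = run_windows(Q=10, W=4, overlap=1)
```

With Q=10, W=4 and one shared step, the time axis is covered by the windows [0, 4), [3, 7) and [6, 10), so each worker only ever touches W time slices of the tensor.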

Experiments on real Twitter and Weibo datasets (hundreds of thousands of nodes, dozens of memes, tens of time steps) and on synthetic data demonstrate that CIM combined with TWPDA achieves substantially higher reconstruction accuracy (up to 18 % improvement in F1 score) than baseline single‑cascade methods, especially when observation rates drop below 5 %. Moreover, TWPDA accelerates the decomposition by a factor of three on average without sacrificing accuracy. Ablation studies confirm that each heterogeneous constraint contributes positively, with SDA and NMA providing the largest gains, while MC and TS improve robustness in very sparse temporal slices.

In summary, the paper makes three key contributions: (1) a unified tensor representation that captures synergistic information among coexisting diffusions, (2) a flexible regularization framework that leverages heterogeneous auxiliary data, and (3) an efficient parallel decomposition scheme that makes large‑scale collaborative diffusion inference practical. Future work is suggested on extending the approach to hypergraph networks and online streaming scenarios.

