A Multi-phase Approach for Improving Information Diffusion in Social Networks

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv paper.

For maximizing influence spread in a social network, given a certain budget on the number of seed nodes, we investigate the effects of selecting and activating the seed nodes in multiple phases. In particular, we formulate an appropriate objective function for two-phase influence maximization under the independent cascade model, investigate its properties, and propose algorithms for determining the seed nodes in the two phases. We also study the problem of determining an optimal budget-split and delay between the two phases.

💡 Research Summary

The paper tackles the classic influence‑maximization problem under a fixed seed budget k, but departs from the usual single‑phase approach by allowing the seed set to be split across two phases. Working with the Independent Cascade (IC) diffusion model, the authors formalize a two‑phase objective function g(S₁) that captures the expected total number of activated nodes after an initial seed set S₁ (size k₁) is deployed, the diffusion runs for a delay d, and a second seed set S₂ (size k₂, k₂ ≤ k − k₁) is added based on the observed partial spread. The function g is shown to be non‑negative and monotone for fixed k₂ and d, but it is neither submodular nor supermodular, making the classic (1‑1/e) greedy guarantee inapplicable.
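To make the diffusion model concrete, here is a minimal Python sketch of Monte Carlo spread estimation under the Independent Cascade model. It is an illustration under stated assumptions, not the authors' implementation: the graph representation (a dict mapping each node to its out-neighbors and edge probabilities) and the names `simulate_ic` and `estimate_spread` are hypothetical.

```python
import random

def simulate_ic(graph, seeds, rng, max_steps=None):
    """Run one Independent Cascade realization.

    graph: dict node -> {out_neighbor: activation_probability} (assumed format).
    Returns the set of nodes active when the cascade stops.
    """
    active = set(seeds)
    frontier = list(seeds)
    step = 0
    while frontier and (max_steps is None or step < max_steps):
        next_frontier = []
        for u in frontier:
            # Each newly active node gets exactly one chance per out-neighbor.
            for v, p in graph.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
        step += 1
    return active

def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Average the final cascade size over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs
```

Passing `max_steps=d` stops the cascade after d steps, which corresponds to the partial spread observed before the second phase begins.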

Because computing the exact optimal second‑phase seeds S_O(Y,k₂) for each possible observation Y is infeasible, the authors introduce two approximations. The first, f(S₁), replaces S_O with a greedy‑selected set S_G, yielding a (1‑1/e)(1‑ε) approximation to g, where ε depends on the number of Monte‑Carlo simulations. The second, h(S₁), uses a lightweight heuristic called Generalized Degree Discount (GDD). GDD iteratively picks the node v with the highest value of ∏_{x∈X} (1 − p_{xv}) · (1 + ∑_{y∈Y} p_{vy}), where X is the set of already‑selected in‑neighbors of v and Y is the set of out‑neighbors of v not yet selected. Empirically, h preserves the ordering of f and correlates well with it, which is crucial for algorithms that rely on ratios of function values, such as the Fully Adaptive Cross‑Entropy (FACE) method.
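The GDD selection rule can be sketched in Python as follows. This is one reading of the summary rather than the authors' code: it assumes the discount term multiplies (1 − p_{xv}) over all already‑selected in‑neighbors x of v, and the function name `gdd_select`, the dict‑based graph format, and the alphabetical tie‑breaking are assumptions made for illustration.

```python
def gdd_select(graph, k):
    """Pick k seeds with a Generalized-Degree-Discount style score.

    graph: dict node -> {out_neighbor: probability} (assumed format).
    Score of v (assumed form): prod over selected in-neighbors x of (1 - p_xv),
    times (1 + sum over unselected out-neighbors y of p_vy).
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)

    # Build the reverse adjacency so in-neighbors can be looked up quickly.
    in_edges = {v: {} for v in nodes}
    for u, nbrs in graph.items():
        for v, p in nbrs.items():
            in_edges[v][u] = p

    selected, chosen = [], set()
    for _ in range(min(k, len(nodes))):
        best, best_score = None, -1.0
        for v in sorted(nodes - chosen):  # sorted only for deterministic ties
            discount = 1.0
            for x, p in in_edges[v].items():
                if x in chosen:
                    discount *= 1.0 - p  # chance v is not already reached via x
            gain = 1.0 + sum(
                p for y, p in graph.get(v, {}).items() if y not in chosen
            )
            score = discount * gain
            if score > best_score:
                best, best_score = v, score
        selected.append(best)
        chosen.add(best)
    return selected
```

The discount lowers the score of nodes that existing seeds are already likely to activate, so the heuristic steers later picks toward parts of the graph the current seeds do not reach.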

Algorithm 1 presents a generic two‑phase framework: (1) use any single‑phase influence‑maximization algorithm A (greedy, PMIA, FACE, etc.) to select k₁ seeds, then run the IC process for d steps; (2) remove the already‑activated nodes A_Y from consideration, treat the recently activated set R_Y as a partial seed, and run A again to pick k₂ additional seeds. Two choices of objective are examined, “prophetic” (F₁ = h, F₂ = σ) and “myopic” (F₁ = σ, F₂ = σ), to study how the definition of the first‑phase utility influences the final spread.
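The two steps above can be sketched as a small Python driver. This is a simplified illustration, not Algorithm 1 itself: the weighted out‑degree selector `weighted_degree_select` is a hypothetical stand‑in for the single‑phase algorithm A, and all function names and the dict‑based graph format are assumptions.

```python
import random

def ic_steps(graph, frontier, active, rng, steps=None):
    """Advance an IC cascade; returns (active set, most recently activated nodes)."""
    active = set(active)
    frontier = list(frontier)
    t = 0
    while frontier and (steps is None or t < steps):
        nxt = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
        t += 1
    return active, frontier

def weighted_degree_select(graph, k, excluded):
    """Hypothetical stand-in for algorithm A: top-k nodes by the sum of
    outgoing probabilities towards nodes that are not excluded."""
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    candidates = sorted(nodes - set(excluded))

    def score(v):
        return sum(p for y, p in graph.get(v, {}).items() if y not in excluded)

    return sorted(candidates, key=score, reverse=True)[:k]

def two_phase(graph, k1, k2, d, select=weighted_degree_select, seed=0):
    rng = random.Random(seed)
    # Phase 1: pick k1 seeds and let the cascade run for d steps.
    s1 = select(graph, k1, set())
    active, recent = ic_steps(graph, s1, s1, rng, steps=d)
    # Phase 2: ignore already-active nodes, pick k2 more seeds, and continue
    # the cascade from the recently activated nodes plus the new seeds.
    s2 = select(graph, k2, active)
    active, _ = ic_steps(graph, recent + s2, active | set(s2), rng)
    return s1, s2, active
```

Any selector with the same `(graph, k, excluded)` signature can be passed as `select`, which mirrors the framework's independence from the particular single‑phase algorithm.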

Extensive experiments on the Les Miserables and High‑Energy‑Physics‑Theory collaboration networks, using both Weighted Cascade and Trivalency edge weight models, explore a range of delay values d and temporal decay factors δ (Γ(t)=δ^t). When δ = 1 (no decay), the optimal delay equals the diffusion horizon D, and an almost equal split of the budget (k₁≈k₂) yields the highest spread. For δ < 1 (later activations are less valuable), a short delay combined with allocating most of the budget to the first phase performs best. Across all settings, the two‑phase approach improves expected influence by roughly 5–10 % compared with the best single‑phase baseline, a gain that can be financially significant in marketing or customer‑retention contexts.
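The effect of the decay factor can be illustrated with a small Python calculation. It assumes that a node activated at time step t contributes Γ(t) = δ^t to the objective; the function name `decayed_value` and the example activation times are hypothetical.

```python
def decayed_value(activation_times, delta):
    """Sum of delta**t over all activated nodes.

    activation_times: dict node -> time step at which the node became active
    (assumed representation; seeds of the first phase have t = 0).
    """
    return sum(delta ** t for t in activation_times.values())

# Hypothetical outcome: same three nodes activated, with or without waiting.
early = {"a": 0, "b": 1, "c": 2}
late = {"a": 0, "b": 3, "c": 4}

no_decay = (decayed_value(early, 1.0), decayed_value(late, 1.0))
with_decay = (decayed_value(early, 0.5), decayed_value(late, 0.5))
```

With δ = 1 both schedules are worth the same, so waiting to observe more of the cascade costs nothing; with δ < 1 the later activations are worth less, which is consistent with the reported preference for short delays under decay.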

The authors also provide practical guidelines: under strict time constraints, stick to a single phase; under moderate constraints, use a short delay and allocate the majority of the budget to the first phase; when time is not an issue, employ a longer delay with a near‑equal budget split. They note that the myopic variant runs considerably faster than the prophetic one while achieving comparable spread.

Future work is outlined, emphasizing the need for scalable algorithms that jointly optimize k₁, d, and S₁, possibly exploiting the observed unimodal patterns in the performance curves. More realistic temporal decay functions and extensions to three or more phases are suggested as promising directions for further research.
