HGEN: Heterogeneous Graph Ensemble Networks
This paper presents HGEN, which pioneers ensemble learning for heterogeneous graphs. We argue that heterogeneity in node types, nodal features, and local neighborhood topology poses significant challenges for ensemble learning, particularly in accommodating diverse graph learners. Our HGEN framework ensembles multiple learners through a meta-path and transformation-based optimization pipeline to improve classification accuracy. Specifically, HGEN uses meta-paths combined with random dropping to create allele Graph Neural Networks (GNNs), whereby the base graph learners are trained and aligned for later ensembling. To ensure effective ensemble learning, HGEN introduces two key components: 1) a residual-attention mechanism that calibrates the allele GNNs of different meta-paths, forcing node embeddings to focus on more informative graphs and thereby improving base-learner accuracy, and 2) a correlation-regularization term that enlarges the disparity among embedding matrices generated from different meta-paths, thereby enriching base-learner diversity. We analyze the convergence of HGEN and show that it attains a higher regularization magnitude than simple voting. Experiments on five heterogeneous networks validate that HGEN consistently outperforms its state-of-the-art competitors by a substantial margin.
💡 Research Summary
The paper introduces HGEN (Heterogeneous Graph Ensemble Networks), the first framework that brings ensemble learning to heterogeneous graphs. The authors first decompose a heterogeneous graph into multiple homogeneous meta‑path graphs, each represented by an adjacency matrix A_i and node feature matrix X_i. For every meta‑path, they generate k “allele” GNNs by applying random feature dropout to X_i and initializing each GNN differently. This creates diverse base learners from the same structural view while preserving the underlying relational semantics of the meta‑path.
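The allele-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `make_allele_inputs` and the entry-wise dropout granularity are assumptions; the paper only states that random feature dropout is applied per allele GNN.

```python
import numpy as np

def make_allele_inputs(X, k, drop_rate=0.2, seed=0):
    """Create k 'allele' feature matrices by randomly zeroing entries of X.

    Hypothetical sketch of the paper's random feature dropout; whether the
    dropout acts on entries or whole feature columns is an assumption here.
    """
    rng = np.random.default_rng(seed)
    alleles = []
    for _ in range(k):
        # keep each entry with probability 1 - drop_rate
        mask = (rng.random(X.shape) >= drop_rate).astype(X.dtype)
        alleles.append(X * mask)
    return alleles

# toy meta-path view: 4 nodes with 3 features each
X = np.ones((4, 3))
views = make_allele_inputs(X, k=3)
```

Each of the `k` masked copies would then feed a differently initialized GNN, so diversity comes from both the input perturbation and the initialization.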
To fuse the outputs of the allele GNNs, HGEN employs a residual-attention mechanism. The embeddings from each allele GNN are projected with learnable weights, averaged across the k learners, and then centered by subtracting the row-wise mean. A min-max normalization yields a residual attention matrix Θ̃ that adaptively weights each learner: strong learners receive higher scores, weak learners lower scores. Importantly, the residual branch provides an identity-mapping shortcut that stabilizes training of the highly non-linear attention scores. The fused embedding for meta-path i, denoted H^∪_i, is a weighted sum of the allele embeddings and contains no additional learnable parameters beyond the attention projection.
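The fusion steps above (project, average, center, min-max normalize, reweight) can be sketched roughly as below. This is a loose interpretation under stated assumptions: the function name, the elementwise application of Θ̃, and the single shared projection `W` are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def residual_attention_fuse(H_list, W):
    """Hedged sketch of residual-attention fusion over k allele embeddings.

    H_list: list of k (n x d) allele embeddings; W: (d x d) projection.
    """
    k = len(H_list)
    # project each allele embedding and average across the k learners
    P = sum(H @ W for H in H_list) / k
    # residual step: center by subtracting the row-wise mean
    R = P - P.mean(axis=1, keepdims=True)
    # min-max normalization yields attention scores in [0, 1]
    theta = (R - R.min()) / (R.max() - R.min() + 1e-12)
    # fused embedding: attention-weighted sum of the allele embeddings
    return sum(theta * H for H in H_list)

rng = np.random.default_rng(0)
H_list = [rng.standard_normal((5, 4)) for _ in range(3)]
W = rng.standard_normal((4, 4))
fused = residual_attention_fuse(H_list, W)
```

Note that only `W` is learnable here, matching the claim that the fusion adds no parameters beyond the attention projection.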
Diversity across meta-paths is encouraged by a correlation regularizer. After pooling each H^∪_i into a graph-level vector H̃_i, the authors compute a correlation matrix S = H̃H̃ᵀ and add an L1 penalty λ‖S‖₁ to the loss. Larger λ forces the embeddings from different meta-paths to become more orthogonal, reducing redundancy and increasing the ensemble's effective variance reduction.
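The regularizer can be sketched as below; mean-pooling over nodes is an assumed choice of graph-level readout (the paper's pooling operator may differ), and the toy example only illustrates that identical meta-path embeddings incur a larger penalty than orthogonal ones.

```python
import numpy as np

def correlation_penalty(H_list, lam=0.1):
    """Sketch of the diversity regularizer lambda * ||S||_1.

    Pools each (n x d) meta-path embedding to a vector, stacks them into
    an (m x d) matrix, and penalizes the L1 norm of S = H_tilde @ H_tilde.T.
    """
    pooled = np.stack([H.mean(axis=0) for H in H_list])  # (m, d)
    S = pooled @ pooled.T                                # (m, m) correlations
    return lam * np.abs(S).sum()

# identical embeddings are fully correlated; orthogonal ones are not
H_same = [np.ones((4, 2)), np.ones((4, 2))]
H_orth = [np.tile([1.0, 0.0], (4, 1)), np.tile([0.0, 1.0], (4, 1))]
```

Since the L1 norm sums all pairwise correlations, driving it down pushes the off-diagonal entries of S toward zero, which is exactly the orthogonality the text describes.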
The overall objective combines the standard cross-entropy loss with the correlation regularizer. Training complexity is analyzed: a single GNN costs O(nf + e), where n, f, and e denote the numbers of nodes, features, and edges, and the whole HGEN pipeline scales as O(m·k·T), where m is the number of meta-paths, k the number of allele GNNs per meta-path, and T the cost of training one GNN. Since m and k are small constants in practice, the method remains linearly scalable in both ensemble size and graph size.
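Putting the two loss terms together gives an objective of the form CE + λ‖S‖₁, which might be sketched as follows. The function name and variable layout are illustrative; only the structure of the loss comes from the paper.

```python
import numpy as np

def hgen_objective(logits, labels, S, lam):
    """Sketch of the overall objective: softmax cross-entropy plus
    the correlation penalty lam * ||S||_1 (names are illustrative)."""
    # numerically stable log-softmax
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # mean negative log-likelihood of the true labels
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    return ce + lam * np.abs(S).sum()

logits = np.array([[2.0, 0.0], [0.0, 3.0]])
labels = np.array([0, 1])
S = np.eye(2)  # toy correlation matrix for two meta-paths
```

Setting λ = 0 recovers plain cross-entropy, and increasing λ trades some fit for inter-meta-path diversity.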
Experiments on five real heterogeneous datasets (academic citation, social networks, urban mobility, biomedical, etc.) compare HGEN against recent heterogeneous‑graph baselines such as GEN, GEL, and GNN‑Ensemble. HGEN consistently outperforms these methods by 4–7 percentage points in accuracy, F1, and AUC. Ablation studies show that removing feature dropout, residual‑attention, or the correlation regularizer each degrades performance, confirming that all four components (meta‑path decomposition, allele GNN generation, residual‑attention fusion, and correlation regularization) are essential. Moreover, varying λ demonstrates that stronger regularization widens the distribution of per‑meta‑path predictions, evidencing increased diversity.
In summary, HGEN provides a principled, theoretically grounded, and empirically validated solution for heterogeneous graph ensemble learning. By integrating meta‑path based graph splitting, stochastic allele generation, adaptive residual‑attention weighting, and explicit diversity regularization, it achieves superior predictive performance while maintaining linear scalability, opening new avenues for robust graph‑based AI in complex multi‑type network domains.