Poster
Mixture of Weak and Strong Experts on Graphs
Hanqing Zeng · Hanjia Lyu · Diyi Hu · Yinglong Xia · Jiebo Luo
Halle B
Realistic graphs contain both rich self-features and informative neighborhood structures, jointly handled by a GNN in the typical setup. We propose to decouple the two modalities via a mixture of weak and strong experts (Mowst), where the weak expert is a light-weight Multi-layer Perceptron (MLP), and the strong expert is an off-the-shelf Graph Neural Network (GNN). To adapt the experts' collaboration to different target nodes, we propose a "confidence" mechanism based on the dispersion of the weak expert's prediction logits. The strong expert is conditionally activated in the low-confidence region, when either the node's classification relies on neighborhood information or the weak expert has low model quality. We reveal interesting training dynamics by analyzing the influence of the confidence function on the loss: our training algorithm encourages specialization of each expert by effectively generating a soft splitting of the graph. In addition, our "confidence" design imposes a desirable bias towards the strong expert, to benefit from the better generalization capability of GNNs. Mowst is easy to optimize and achieves strong expressive power, with computation cost comparable to that of a single GNN. Empirically, Mowst shows significant accuracy improvement on 6 standard node classification benchmarks (including both homophilous and heterophilous graphs).
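The confidence-gated collaboration described above can be sketched at inference time as follows. This is a minimal illustration, not the paper's implementation: the concrete confidence function (here, the variance of the weak expert's softmax probabilities), the `threshold` value, and the `weak_logits`/`strong_logits` inputs are all hypothetical stand-ins, and the actual Mowst additionally uses the confidence to shape the training loss of both experts.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def confidence(logits):
    # Dispersion of the weak expert's predicted distribution.
    # Here: variance of the softmax probabilities (a hypothetical
    # choice; a uniform prediction gives zero dispersion, a peaked
    # prediction gives high dispersion).
    return softmax(logits).var(axis=-1)

def mowst_predict(weak_logits, strong_logits, threshold=0.01):
    # Per-node gating: high-confidence nodes keep the cheap MLP's
    # prediction; low-confidence nodes fall back to the strong GNN.
    c = confidence(weak_logits)
    use_weak = c >= threshold
    out = np.where(use_weak[:, None],
                   softmax(weak_logits),
                   softmax(strong_logits))
    return out, use_weak
```

Because the GNN only needs to run on the low-confidence nodes, the overall cost stays comparable to a single GNN while easy nodes are handled by the MLP alone.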