AnchorGT: Efficient and Flexible Attention Architecture for Scalable Graph Transformers
Wenhao Zhu, Guojie Song, Liang Wang, Shaoguo Liu
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5707-5715.
https://doi.org/10.24963/ijcai.2024/631
Graph Transformers (GTs) have significantly advanced the field of graph representation learning by overcoming the limitations of message-passing graph neural networks (GNNs) and demonstrating promising performance and expressive power. However, the quadratic complexity of the self-attention mechanism in GTs has limited their scalability, and previous approaches to this problem often suffer from degraded expressiveness or a lack of versatility. To address this issue, we propose AnchorGT, a novel attention architecture for GTs with a global receptive field and almost linear complexity, which serves as a flexible building block to improve the scalability of a wide range of GT models. Inspired by anchor-based GNNs, we employ a structurally important k-dominating node set as anchors and design an attention mechanism that focuses on the relationship between individual nodes and anchors, while retaining the global receptive field for all nodes. With its intuitive design, AnchorGT can easily replace the attention module in various GT models with different network architectures and structural encodings, reducing computational overhead without sacrificing performance. In addition, we theoretically prove that AnchorGT attention can be strictly more expressive than the Weisfeiler-Lehman test, showing its superiority in representing graph structures. Our experiments on three state-of-the-art GT models demonstrate that their AnchorGT variants achieve similar results while being faster and significantly more memory efficient.
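As a rough illustration of the anchor idea, the sketch below builds a k-dominating set (a set of nodes such that every node lies within k hops of at least one member) and counts the node-to-anchor attention pairs. It is a minimal sketch under stated assumptions, not the method from the paper: the abstract does not specify how anchors are selected or the exact attention rule, so the greedy degree-based heuristic, the "every node attends to every anchor" pattern, and the function names `k_hop_ball`, `greedy_k_dominating_set`, and `anchor_attention_pairs` are all hypothetical.

```python
from collections import deque


def k_hop_ball(adj, source, k):
    """Return the set of nodes within k hops of `source` (source included)."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


def greedy_k_dominating_set(adj, k):
    """Greedy heuristic (an assumption, not the paper's procedure):
    repeatedly pick the highest-degree uncovered node as an anchor and
    mark everything within k hops of it as covered."""
    uncovered = set(adj)
    anchors = []
    while uncovered:
        # Integer node ids are assumed; ties go to the smallest id.
        pick = max(uncovered, key=lambda v: (len(adj[v]), -v))
        anchors.append(pick)
        uncovered -= k_hop_ball(adj, pick, k)
    return anchors


def anchor_attention_pairs(adj, anchors):
    """Assumed pattern: each node attends to each anchor, giving
    |V| * |A| query-key pairs instead of |V|^2 for full attention."""
    return [(v, a) for v in sorted(adj) for a in anchors]


# Toy example: a path graph 0-1-2-3-4-5-6 stored as an adjacency dict.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
anchors = greedy_k_dominating_set(path, k=1)
pairs = anchor_attention_pairs(path, anchors)
print(anchors, len(pairs), len(path) ** 2)
```

On this toy graph the anchor-based pattern scores fewer pairs than full self-attention over all node pairs. The saving depends on the anchor set staying small relative to the number of nodes.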
Keywords:
Machine Learning: ML: Sequence and graph learning