MPGraf: a Modular and Pre-trained Graphformer for Learning to Rank at Web-scale (Extended Abstract)

Yuchen Li, Haoyi Xiong, Linghe Kong, Zeyi Sun, Hongyang Chen, Shuaiqiang Wang, Dawei Yin

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Sister Conferences Best Papers. Pages 8439–8443. https://doi.org/10.24963/ijcai.2024/937

Both Transformers and Graph Neural Networks (GNNs) have been used in learning to rank (LTR); however, they adhere to two distinct yet complementary problem formulations: ranking score regression over query-webpage pairs and link prediction within query-webpage bipartite graphs, respectively. Although it is possible to separately pre-train GNNs or Transformers on source datasets and fine-tune them on sparsely annotated LTR datasets, the source-target distribution shifts across the pair and bipartite-graph domains make it extremely difficult to integrate these diverse models into a single LTR framework at web scale. We introduce MPGraf, a novel model that adopts a modular, capsule-based pre-training approach to cohesively combine the regression capacity of Transformers with the link-prediction capability of GNNs. We conduct extensive experiments to evaluate the performance of MPGraf using real-world datasets collected from large-scale search engines. The results show that MPGraf outperforms baseline algorithms on several major metrics. Furthermore, we deploy and evaluate MPGraf on a large-scale search engine with realistic web traffic via A/B tests, where we again observe significant improvements. MPGraf performs consistently well in both offline and online evaluations.
Keywords:
Data Mining: DM: Mining text, web, social media
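
The abstract names two complementary problem formulations: Transformer-based ranking score regression over query-webpage pairs, and GNN-based link prediction over a query-webpage bipartite graph. The sketch below illustrates what these two heads could look like side by side. It is a minimal PyTorch sketch under our own assumptions, not the authors' released implementation: all module names, dimensions, and the hand-rolled mean-aggregation graph layer are illustrative choices.

```python
import torch
import torch.nn as nn

class TransformerRanker(nn.Module):
    """Regression formulation: score a batch of query-webpage pairs."""
    def __init__(self, hidden: int = 64, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(hidden, 1)

    def forward(self, pair_tokens: torch.Tensor) -> torch.Tensor:
        # pair_tokens: (batch, seq_len, hidden) feature tokens of one pair.
        pooled = self.encoder(pair_tokens).mean(dim=1)
        return self.score(pooled).squeeze(-1)    # (batch,) ranking scores

class BipartiteLinkGNN(nn.Module):
    """Graph formulation: predict query-webpage links on a bipartite graph.
    A single mean-aggregation layer keeps the sketch dependency-free."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.update = nn.Linear(2 * hidden, hidden)

    def forward(self, q_emb, w_emb, adj):
        # adj: (num_queries, num_webpages) binary bipartite adjacency.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = adj @ w_emb / deg                 # mean of linked webpages
        q_upd = torch.relu(self.update(torch.cat([q_emb, neigh], dim=-1)))
        return torch.sigmoid(q_upd @ w_emb.t())   # link probabilities

# Toy usage: 8 pairs of 6 tokens; a 3-query x 5-webpage bipartite graph.
if __name__ == "__main__":
    ranker, gnn = TransformerRanker(), BipartiteLinkGNN()
    scores = ranker(torch.randn(8, 6, 64))
    links = gnn(torch.randn(3, 64), torch.randn(5, 64),
                (torch.rand(3, 5) > 0.5).float())
    print(scores.shape, links.shape)  # torch.Size([8]) torch.Size([3, 5])
```

In MPGraf itself, these two capacities are assembled through modular, capsule-based pre-training rather than trained in isolation; the sketch only shows the two heads that such a framework would need to reconcile.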