Beyond the Click-Through Rate: Web Link Selection with Multi-level Feedback

Kun Chen, Kechao Cai, Longbo Huang, John C.S. Lui

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 3308-3314. https://doi.org/10.24963/ijcai.2018/459

The web link selection problem is to select a small subset of web links from a large pool and place them on a web page that can accommodate only a limited number of links, e.g., advertisements, recommendations, or news feeds. Although the click-through rate, which reflects the attractiveness of the link itself, has long been the main concern, revenue is obtained only from user actions after clicks, e.g., purchases made after recommendation links direct users to product pages. Web links therefore have an intrinsic multi-level feedback structure. Based on this observation, we consider the context-free web link selection problem, where the objective is to maximize revenue while ensuring that the attractiveness is no less than a preset threshold. The key challenge is that each link's multi-level feedback is stochastic and unobservable unless the link is selected. We model this problem as a constrained stochastic multi-armed bandit and design an efficient link selection algorithm, the Constrained Upper Confidence Bound algorithm (Con-UCB). We prove O(sqrt(T ln(T))) bounds on both the regret and the violation of the attractiveness constraint. We also conduct extensive experiments on three real-world datasets and show that Con-UCB outperforms state-of-the-art context-free bandit algorithms with respect to the multi-level feedback structure.
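To make the setting concrete, the following is a minimal Python sketch of a UCB-style selector with an attractiveness constraint. It is an illustrative simplification and not the Con-UCB algorithm from the paper: the function names (`ucb`, `select_links`, `simulate`), the confidence radius, the brute-force subset search, the fallback rule, and the assumption that a link's expected revenue is its click-through rate times its Bernoulli post-click revenue are all hypothetical choices made for this example.

```python
import itertools
import math
import random


def ucb(mean, pulls, t):
    """Optimistic estimate: empirical mean plus a confidence radius, capped at 1.
    The radius constant 1.5 is an assumed choice, not taken from the paper."""
    if pulls == 0:
        return 1.0  # unexplored quantities get the most optimistic value
    return min(1.0, mean + math.sqrt(1.5 * math.log(t) / pulls))


def select_links(ctr_ucb, rev_ucb, num_slots, threshold):
    """Choose num_slots links that maximize optimistic revenue, subject to the
    optimistic total click-through rate being at least threshold.
    Brute force over subsets, so only suitable for small link pools."""
    indices = range(len(ctr_ucb))
    best, best_rev = None, -1.0
    for subset in itertools.combinations(indices, num_slots):
        if sum(ctr_ucb[i] for i in subset) < threshold:
            continue  # violates the attractiveness constraint
        # Assumed revenue model: revenue accrues only after a click.
        rev = sum(ctr_ucb[i] * rev_ucb[i] for i in subset)
        if rev > best_rev:
            best, best_rev = subset, rev
    if best is None:
        # Assumed fallback: no feasible subset, so take the most attractive links.
        top = sorted(indices, key=lambda i: -ctr_ucb[i])[:num_slots]
        best = tuple(sorted(top))
    return best


def simulate(true_ctr, true_rev, num_slots, threshold, horizon, seed=0):
    """Run the selector against Bernoulli clicks and Bernoulli post-click revenue.
    Feedback for a link is observed only in rounds where that link is shown."""
    rng = random.Random(seed)
    k = len(true_ctr)
    shows, clicks, rev_sum = [0] * k, [0] * k, [0.0] * k
    for t in range(1, horizon + 1):
        ctr_ucb = [ucb(clicks[i] / shows[i] if shows[i] else 0.0, shows[i], t)
                   for i in range(k)]
        rev_ucb = [ucb(rev_sum[i] / clicks[i] if clicks[i] else 0.0, clicks[i], t)
                   for i in range(k)]
        for i in select_links(ctr_ucb, rev_ucb, num_slots, threshold):
            shows[i] += 1
            if rng.random() < true_ctr[i]:       # first-level feedback: click
                clicks[i] += 1
                if rng.random() < true_rev[i]:   # second-level feedback: revenue
                    rev_sum[i] += 1.0
    return shows
```

With optimistic estimates `ctr_ucb = [0.9, 0.2, 0.5]` and `rev_ucb = [0.1, 0.9, 0.6]`, two slots, and a threshold of 1.0, the pair of links with the highest revenue (indices 1 and 2) is rejected because its total click-through rate is only 0.7, so the selector settles for links 0 and 2. This is the trade-off between revenue and attractiveness that the constraint is meant to capture.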
Keywords:
Machine Learning Applications: Applications of Reinforcement Learning
Multidisciplinary Topics and Applications: AI and the Web