Probabilistic Masked Attention Networks for Explainable Sequential Recommendation

Huiyuan Chen, Kaixiong Zhou, Zhimeng Jiang, Chin-Chia Michael Yeh, Xiaoting Li, Menghai Pan, Yan Zheng, Xia Hu, Hao Yang

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 2068-2076. https://doi.org/10.24963/ijcai.2023/230

Transformer-based models are powerful for modeling temporal dynamics of user preference in sequential recommendation. Most of the variants adopt the Softmax transformation in the self-attention layers to generate dense attention probabilities. However, real-world item sequences are often noisy, containing a mixture of true-positive and false-positive interactions. Such dense attentions inevitably assign probability mass to noisy or irrelevant items, leading to sub-optimal performance and poor explainability. Here we propose a Probabilistic Masked Attention Network (PMAN) to identify the sparse pattern of attentions, which is more desirable for pruning noisy items in sequential recommendation. Specifically, we employ a probabilistic mask to achieve sparse attentions under a constrained optimization framework. As such, PMAN can select, in a data-driven fashion, which information is critical to retain and which can be dropped. Experimental studies on real-world benchmark datasets show that PMAN is able to improve the performance of Transformers significantly.
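
The contrast between dense Softmax attention and masked sparse attention can be illustrated with a minimal NumPy sketch. This is not the PMAN method: the abstract does not specify how the mask is parameterized or how the constrained optimization is solved, so the sketch simply assumes a Bernoulli mask with fixed keep probabilities. The names `softmax`, `masked_attention`, and `keep_prob` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(scores):
    """Row-wise Softmax; every output entry is strictly positive (dense)."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def masked_attention(scores, keep_prob, rng):
    """Sample a binary mask and renormalize the surviving attention weights.

    Assumption: each entry is kept independently with probability keep_prob.
    In the paper the mask is learned under constraints; here it is fixed.
    """
    dense = softmax(scores)
    mask = (rng.random(scores.shape) < keep_prob).astype(dense.dtype)
    # Always keep the highest-scoring position so that no row is fully masked.
    top = scores.argmax(axis=-1)
    mask[np.arange(scores.shape[0]), top] = 1.0
    sparse = dense * mask
    return sparse / sparse.sum(axis=-1, keepdims=True), mask


rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 6))    # toy attention logits: 4 queries, 6 items
keep_prob = np.full((4, 6), 0.5)    # illustrative value, not from the paper

dense = softmax(scores)
sparse, mask = masked_attention(scores, keep_prob, rng)
```

Under these assumptions, `dense` assigns nonzero probability to every item, while `sparse` is exactly zero wherever `mask` is zero and each of its rows still sums to one.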
Keywords:
Data Mining: DM: Collaborative filtering
Data Mining: DM: Information retrieval
Data Mining: DM: Recommender systems
AI Ethics, Trust, Fairness: ETF: Explainability and interpretability