Unveiling Concepts Learned by a World-Class Chess-Playing Agent

Aðalsteinn Pálsson, Yngvi Björnsson

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4864-4872. https://doi.org/10.24963/ijcai.2023/541

In recent years, state-of-the-art agents for playing abstract board games such as chess have moved from intricate hand-crafted models for evaluating the merits of individual game states toward neural networks (NNs). This development has eased the encapsulation of relevant domain-specific knowledge and resulted in much-improved playing strength. However, it has come at the cost of making the resulting models hard to interpret and challenging to understand and use for enhancing human knowledge. Using a world-class superhuman-strength chess-playing engine as our testbed, we show how recent model-probing interpretability techniques can shed light on concepts learned by the engine's NN. Furthermore, to gain additional insight, we contrast the game-state evaluations of the NN with those of its hand-crafted counterpart evaluation model, and we identify and explain some of the main differences.
Keywords:
Multidisciplinary Topics and Applications: MDA: Game playing
Machine Learning: ML: Explainable/Interpretable machine learning
Search: S: Game playing