EMOTE: An Explainable Architecture for Modelling the Other through Empathy

Manisha Senadeera, Thommen Karimpanal George, Stephan Jacobs, Sunil Gupta, Santu Rana

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4876-4884. https://doi.org/10.24963/ijcai.2024/539

Empathy allows us to assume that others are like us and have goals analogous to our own. This can, at times, also be applied to multi-agent games, e.g., Agent 1's attraction to green balls is analogous to Agent 2's attraction to red balls. Drawing inspiration from empathy, we propose EMOTE, a simple and explainable inverse reinforcement learning (IRL) approach designed to model another agent's action-value function and, from it, infer a unique reward function. This is done by referencing the learning agent's own action-value function, removing the need to maintain independent action-value estimates for the modelled agents while simultaneously addressing the ill-posed nature of IRL by inferring a unique reward function. We experiment on Minigrid environments, showing that EMOTE: (a) produces more consistent reward estimates than other IRL baselines; (b) is robust in scenarios with composite reward and action-value functions; and (c) produces human-interpretable states, helping to explain how the agent views other agents.
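To illustrate the core idea (this is a minimal sketch, not the authors' algorithm): if an empathy mapping re-expresses the other agent's situation in the self agent's own terms (e.g., "their red ball" is treated like "my green ball"), then the other agent's action-values can be read off the self agent's own Q-function, and a point-wise reward can be recovered from the Bellman identity under deterministic transitions. All function and variable names below are hypothetical.

```python
import numpy as np

GAMMA = 0.99  # assumed discount factor

def empathetic_q(self_q, map_other_to_self, other_state, action):
    """Estimate the other agent's Q(s, a) by looking up the self agent's
    Q-table at the analogous ("empathised") state."""
    analogous_state = map_other_to_self(other_state)
    return self_q[analogous_state][action]

def empathetic_value(self_q, map_other_to_self, other_state):
    """Greedy value max_a' Q(s', a') under the same empathy mapping."""
    return np.max(self_q[map_other_to_self(other_state)])

def inferred_reward(self_q, map_other_to_self, other_state, action,
                    other_next_state):
    """Recover r(s, a) = Q(s, a) - gamma * max_a' Q(s', a'),
    assuming deterministic transitions."""
    q_sa = empathetic_q(self_q, map_other_to_self, other_state, action)
    v_next = empathetic_value(self_q, map_other_to_self, other_next_state)
    return q_sa - GAMMA * v_next
```

Because the modelled agent's Q-values are anchored to the learning agent's own Q-function rather than learned separately, the reward read off the identity above is pinned down uniquely, which is the sense in which the ill-posedness of IRL is sidestepped in this toy version.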
Keywords:
Machine Learning: ML: Multiagent Reinforcement Learning
Agent-based and Multi-agent Systems: MAS: Multi-agent learning
AI Ethics, Trust, Fairness: ETF: Explainability and interpretability
Machine Learning: ML: Reinforcement learning