Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes

Shun Zhang, Edmund H. Durfee, Satinder Singh

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4867-4873. https://doi.org/10.24963/ijcai.2018/676

As an autonomous agent achieves a goal on behalf of its human user, its actions may have side effects that change features of the environment in ways that negatively surprise the user. An agent that can be trusted to operate safely should thus change only those features the user has explicitly permitted. We formalize this problem and develop a planning algorithm that avoids potentially negative side effects given what the agent knows about (un)changeable features. Further, we formulate a provably minimax-regret querying strategy with which the agent selectively asks the user about features it has not explicitly been told about. We empirically show how much faster this strategy is than a more exhaustive approach, and how much better its queries are than those found by the best known heuristic.
Keywords:
Planning and Scheduling: Markov Decision Processes
Planning and Scheduling: Planning with Incomplete Information
Humans and AI: Human-AI Collaboration