A Survey of Constraint Formulations in Safe Reinforcement Learning

Akifumi Wachi, Xun Shen, Yanan Sui

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Survey Track. Pages 8262-8271. https://doi.org/10.24963/ijcai.2024/913

Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent’s policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specified safety constraints. Despite recent efforts to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and the limited exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mutual mathematical relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research.
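
For reference, the constrained criterion mentioned in the abstract is most commonly formalized as a constrained Markov decision process (CMDP). The following is a minimal sketch of that objective, assuming a single cost function c and a safety threshold d (symbols not defined in the abstract itself):

\[
\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{subject to} \quad
\mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le d,
\]

where the expectation is over trajectories induced by the policy \(\pi\) and \(\gamma \in [0,1)\) is the discount factor. Other constraint formulations surveyed in the paper replace or augment this expected-cumulative-cost constraint.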
Keywords:
Machine Learning: ML: Reinforcement learning
AI Ethics, Trust, Fairness: ETF: Safety and robustness
Planning and Scheduling: PS: Markov decision processes