Safety verification of model-based reinforcement learning controllers using reachability analysis
Gupta, Akshita (2019)

Reinforcement Learning (RL) is a data-driven technique that is finding increasing application in the development of controllers for sequential decision-making problems. The wide adoption of these controllers can be attributed to the fact that their development does not require knowledge of the system, so they can be used even when the environment dynamics are unknown. Model-based RL controllers explicitly model the system dynamics from the observed (training) data using a function approximator and then apply a path-planning algorithm to obtain the optimal control sequence. While these controllers have proven successful in simulation, the lack of strong safety guarantees in the presence of noise makes them ill-suited for deployment on hardware, especially in safety-critical systems. The proposed work aims to bridge this gap by providing a verification framework to evaluate the safety guarantees of a model-based RL controller. Our method builds upon reachability analysis to determine whether there is any action that can drive the system into a constrained (unsafe) region. Consequently, our method provides a binary yes-or-no answer to whether the entire initial set of states is safe (or unsafe) to propagate trajectories from in the presence of bounded noise.
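
To make the kind of check described in the abstract concrete, the following is a minimal illustrative sketch, not the thesis's actual framework: a simple linear model stands in for the learned function approximator, interval (box) arithmetic stands in for the reachability tool, and all function names, the toy dynamics, and the unsafe region are assumptions introduced here for illustration. It forward-propagates a box of initial states under a bounded action set and bounded noise and reports whether any over-approximated reachable set touches the unsafe region.

```python
import numpy as np

def box_reach_step(lo, hi, A, B, a_lo, a_hi, w_bound):
    """Over-approximate the one-step image of the state box [lo, hi] under
    x' = A x + B a + w, with action a in [a_lo, a_hi] and noise |w| <= w_bound
    (elementwise), using interval arithmetic."""
    center, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    a_center, a_half = (a_lo + a_hi) / 2.0, (a_hi - a_lo) / 2.0
    new_center = A @ center + B @ a_center
    # Interval arithmetic: half-widths propagate through |A| and |B|,
    # and the bounded noise inflates the box by w_bound.
    new_half = np.abs(A) @ half + np.abs(B) @ a_half + w_bound
    return new_center - new_half, new_center + new_half

def intersects(lo, hi, unsafe_lo, unsafe_hi):
    """True if the two axis-aligned boxes overlap."""
    return bool(np.all(lo <= unsafe_hi) and np.all(unsafe_lo <= hi))

def verify(lo, hi, A, B, a_lo, a_hi, w_bound, unsafe_lo, unsafe_hi, horizon):
    """Binary answer: True if no over-approximated reachable set intersects the
    unsafe box within the horizon; False if a violation cannot be ruled out."""
    for _ in range(horizon):
        lo, hi = box_reach_step(lo, hi, A, B, a_lo, a_hi, w_bound)
        if intersects(lo, hi, unsafe_lo, unsafe_hi):
            return False
    return True

if __name__ == "__main__":
    # Toy double-integrator-like dynamics with a bounded scalar action and noise.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    safe = verify(
        lo=np.array([0.0, 0.0]), hi=np.array([0.1, 0.1]),
        A=A, B=B,
        a_lo=np.array([-1.0]), a_hi=np.array([1.0]),
        w_bound=np.array([0.01, 0.01]),
        unsafe_lo=np.array([2.0, -np.inf]), unsafe_hi=np.array([np.inf, np.inf]),
        horizon=20,
    )
    print("safe" if safe else "possibly unsafe")
```

Because the propagation over-approximates the reachable sets, a "safe" answer is sound under the stated noise bound, while "possibly unsafe" may be conservative; the thesis's method applies the same idea to dynamics represented by a learned function approximator rather than the toy linear model used here.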