Safety verification of model based reinforcement learning controllers using reachability analysis
Thesis posted on 13.08.2019 by Akshita Gupta
Reinforcement Learning (RL) is a data-driven technique that is finding increasing application in the development of controllers for sequential decision-making problems. Its wide adoption can be attributed to the fact that developing these controllers does not require prior knowledge of the system, so they can be used even when the environment dynamics are unknown. Model-Based RL controllers explicitly model the system dynamics from the observed (training) data using a function approximator, and then use a path planning algorithm to obtain the optimal control sequence. While these controllers have been shown to be successful in simulation, the lack of strong safety guarantees in the presence of noise makes them ill-suited for deployment on hardware, especially in safety-critical systems. The proposed work aims to bridge this gap by providing a verification framework to evaluate the safety guarantees of a Model-Based RL controller. Our method builds upon reachability analysis to determine whether there is any action that can drive the system into a constrained (unsafe) region. Consequently, our method can provide a binary yes-or-no answer to whether every state in a given initial set is safe to propagate trajectories from in the presence of bounded noise.
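To make the reachability idea concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a known linear model x' = A x + B u + w standing in for the learned function approximator, and axis-aligned boxes for the initial set, the admissible actions, the bounded noise, and the unsafe region. All function names and numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def propagate_box(A, B, x_lo, x_hi, u_lo, u_hi, w_lo, w_hi):
    """One-step interval over-approximation of x' = A x + B u + w.

    Assumption: linear dynamics; a learned nonlinear model would need
    a different bounding technique.
    """
    # Centre/radius form of the state and action boxes.
    xc, xr = (x_lo + x_hi) / 2.0, (x_hi - x_lo) / 2.0
    uc, ur = (u_lo + u_hi) / 2.0, (u_hi - u_lo) / 2.0
    # The image of a box under a linear map is enclosed by this box.
    centre = A @ xc + B @ uc
    radius = np.abs(A) @ xr + np.abs(B) @ ur
    # Additive noise widens the box by its own bounds.
    return centre - radius + w_lo, centre + radius + w_hi

def boxes_intersect(a_lo, a_hi, b_lo, b_hi):
    """True if two axis-aligned boxes overlap in every dimension."""
    return bool(np.all(a_lo <= b_hi) and np.all(b_lo <= a_hi))

def verify_safe(A, B, x0, u_box, w_box, unsafe, horizon):
    """Return True if no over-approximated reachable box touches the unsafe box.

    True is a guarantee for the assumed model. False only means "possibly
    unsafe", because the boxes over-approximate the true reachable set.
    """
    lo, hi = x0
    if boxes_intersect(lo, hi, *unsafe):
        return False
    for _ in range(horizon):
        lo, hi = propagate_box(A, B, lo, hi, *u_box, *w_box)
        if boxes_intersect(lo, hi, *unsafe):
            return False
    return True

# Hypothetical 1-D example: an integrator with bounded actions and noise.
A = np.array([[1.0]])
B = np.array([[1.0]])
x0 = (np.array([0.0]), np.array([0.1]))        # initial set
u_box = (np.array([-0.1]), np.array([0.1]))    # admissible actions
w_box = (np.array([-0.01]), np.array([0.01]))  # bounded noise
unsafe = (np.array([1.0]), np.array([2.0]))    # constrained region

safe_short = verify_safe(A, B, x0, u_box, w_box, unsafe, horizon=5)
safe_long = verify_safe(A, B, x0, u_box, w_box, unsafe, horizon=10)
print(safe_short, safe_long)
```

In this toy setting the upper bound of the reachable box grows by 0.11 per step, so the initial set is verified safe over a short horizon but not over a longer one. The answer is binary in the same sense as described above, with the caveat that a negative result from an over-approximation is conservative.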