Safe, human-like decision-making for autonomous driving

Abstract: Autonomous driving technology can significantly improve transportation by saving lives, reducing social costs, and increasing traffic efficiency and availability. Decision-making is a critical component of driving ability. Complex traffic environments and interactions between road users pose many challenges for decision-making. Besides safety and efficiency, decision-making should also adapt to various driving scenarios and social norms. Human drivers’ behavior provides examples of how to resolve intensive interactions and observe driving courtesies. In this thesis, we introduce a systematic solution for decision-making that is safe, efficient, and human-like. We formulate tactical decision-making in driving as a sequential decision-making problem and describe it with Markov decision processes (MDPs). Reinforcement learning (RL) techniques are adopted as the backbone for solving the MDP. To ensure safety, we propose a system architecture that combines shielding with RL; shielding is a formal method that prevents learning methods from taking dangerous actions. Furthermore, human driving experience is used to improve the data efficiency of RL methods and to make the driving policy more human-like. Although RL methods can solve decision-making problems, their performance depends heavily on the reward function. Since the true reward function for driving is unknown, we address this problem with imitation learning: adversarial inverse reinforcement learning (AIRL) extracts both the reward function and the driving policy from expert driving demonstrations. To improve and stabilize performance, we propose reward augmentations for AIRL and demonstrate better results.
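For reference, the standard MDP formulation behind this setup (the notation here is the textbook convention, not taken from the thesis): an MDP is a tuple $(\mathcal{S}, \mathcal{A}, P, R, \gamma)$ of states, actions, transition dynamics, reward function, and discount factor, and RL seeks a policy that maximizes the expected discounted return:

\[
\pi^{*} = \arg\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t, a_t)\right]
\]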
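To make the shielding idea concrete, here is a minimal sketch of shielded action selection. All names (`is_safe`, the action set, the braking fallback) are illustrative assumptions, not the thesis’s implementation; in actual shielding the safety predicate is derived from a formal model rather than hand-written rules.

```python
# Hypothetical sketch: a shield filters the RL agent's ranked actions
# so that only actions certified as safe reach the environment.

ACTIONS = ["keep_lane", "change_left", "change_right", "brake"]

def is_safe(state, action):
    # Placeholder safety predicate; a real shield computes this from a
    # formal specification of the environment (e.g., via model checking).
    return not (action == "change_left" and state["left_occupied"])

def shielded_action(state, q_values):
    # Rank actions by the learned Q-values, then execute the
    # highest-ranked action that the shield certifies as safe.
    ranked = sorted(ACTIONS, key=lambda a: q_values[a], reverse=True)
    for action in ranked:
        if is_safe(state, action):
            return action
    return "brake"  # conservative fallback if nothing passes the shield

state = {"left_occupied": True}
q_values = {"change_left": 1.0, "keep_lane": 0.5,
            "change_right": 0.2, "brake": 0.1}
print(shielded_action(state, q_values))  # -> keep_lane
```

The design point is that the shield acts as a runtime filter: the agent learns as usual, but unsafe actions are overridden before execution, so exploration never violates the safety specification.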
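For context, this is the discriminator form that standard AIRL (Fu et al., 2018) trains against the policy, shown as a sketch of the published method rather than of the thesis’s augmented variant:

\[
D_{\theta}(s, a) = \frac{\exp\big(f_{\theta}(s, a)\big)}{\exp\big(f_{\theta}(s, a)\big) + \pi(a \mid s)}
\]

At optimality, $f_{\theta}$ recovers (a shaping of) the expert’s reward, which is what allows AIRL to return both a reward function and a policy.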
