Decision-Making in Autonomous Driving using Reinforcement Learning

Abstract: The main topic of this thesis is tactical decision-making for autonomous driving. An autonomous vehicle must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. This thesis therefore considers learning-based strategies and introduces several approaches based on reinforcement learning (RL). A general decision-making agent, derived from the Deep Q-Network (DQN) algorithm, is proposed. With few modifications, this method can be applied to different driving environments, which is demonstrated for various simulated highway and intersection scenarios. A more sample-efficient agent can be obtained by incorporating more domain knowledge, which is explored by combining planning and learning in the form of Monte Carlo tree search and RL. In different highway scenarios, the combined method outperforms both a purely planning-based and a purely learning-based strategy, while requiring an order of magnitude fewer training samples than the DQN method. A drawback of many learning-based approaches is that they create black-box solutions, which do not indicate the confidence of the agent's decisions. Therefore, the Ensemble Quantile Networks (EQN) method is introduced, which combines distributional RL with an ensemble approach to provide an estimate of both the aleatoric and the epistemic uncertainty of each decision. The results show that the EQN method can balance risk and time efficiency in different occluded intersection scenarios, while also identifying situations that the agent has not been trained for. Thereby, the agent can avoid making unfounded, potentially dangerous decisions outside of the training distribution. Finally, this thesis introduces a neural network architecture that is invariant to permutations of the order in which surrounding vehicles are listed. This architecture improves the sample efficiency of the agent by a factor equal to the factorial of the number of surrounding vehicles.
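
The permutation-invariance property mentioned last can be illustrated with a minimal sketch. The abstract does not specify the architecture, so the code below is only an assumption: it applies a shared per-vehicle encoder and then max-pools over vehicles, which is one common way to make a network's output independent of input order. All names, sizes, and the random weights are hypothetical, and the ego vehicle's own state is left out for brevity.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_VEHICLES, FEAT_DIM, HIDDEN, N_ACTIONS = 4, 3, 16, 5

# Randomly initialised weights stand in for trained parameters.
W_phi = rng.normal(size=(FEAT_DIM, HIDDEN))   # shared per-vehicle encoder
W_rho = rng.normal(size=(HIDDEN, N_ACTIONS))  # head applied after pooling


def q_values(vehicles: np.ndarray) -> np.ndarray:
    """Map an (n_vehicles, FEAT_DIM) array to N_ACTIONS action values."""
    per_vehicle = np.maximum(vehicles @ W_phi, 0.0)  # same weights for every row, then ReLU
    pooled = per_vehicle.max(axis=0)                 # max over vehicles ignores their order
    return pooled @ W_rho


vehicles = rng.normal(size=(N_VEHICLES, FEAT_DIM))
shuffled = vehicles[rng.permutation(N_VEHICLES)]     # same vehicles, different listing order

# True when both orderings give the same action values.
same_output = np.allclose(q_values(vehicles), q_values(shuffled))

# Number of distinct orderings of the surrounding vehicles.
n_orderings = math.factorial(N_VEHICLES)
```

Because every ordering of the same vehicles produces the same output, the network does not need to see each of the `n_orderings` listings separately during training, which is the intuition behind the factorial factor stated above.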