Search: "Ronald Ortner"
Found 1 thesis containing the words Ronald Ortner.
1. Minimizing Regret in Combinatorial Bandits and Reinforcement Learning
Abstract: This thesis investigates sequential decision making tasks that fall in the framework of reinforcement learning (RL). These tasks involve a decision maker repeatedly interacting with an environment modeled by an unknown finite Markov decision process (MDP), who wishes to maximize a notion of reward accumulated during her experience.