Search: "Regret Minimization"

Found 5 theses containing the words Regret Minimization.

  1. Regret Minimization in Structured Reinforcement Learning

    Authors: Damianos Tranos; Alexandre Proutiere; Yevgeny Seldin; KTH
    Keywords: ENGINEERING AND TECHNOLOGY; Reinforcement Learning; Electrical Engineering

    Abstract: We consider a class of sequential decision-making problems in the presence of uncertainty, which belongs to the field of Reinforcement Learning (RL). Specifically, we study discrete Markov decision processes (MDPs), which model a decision maker or agent that interacts with a stochastic and dynamic environment and receives feedback from it in the form of a reward.

  2. Minimizing Regret in Combinatorial Bandits and Reinforcement Learning

    Authors: Mohammad Sadegh Talebi Mazraeh Shahi; Alexandre Proutiere; Mikael Johansson; Ronald Ortner; KTH
    Keywords: Multi-armed Bandits; Reinforcement Learning; Regret Minimization; Statistics; Electrical Engineering

    Abstract: This thesis investigates sequential decision-making tasks that fall within the framework of reinforcement learning (RL). These tasks involve a decision maker who repeatedly interacts with an environment modeled by an unknown finite Markov decision process (MDP) and wishes to maximize a notion of reward accumulated during her experience.

  3. Inference and Online Learning in Structured Stochastic Systems

    Authors: Kaito Ariu; Alexandre Proutiere; Mikael Johansson; Wouter Koolen; KTH
    Keywords: ENGINEERING AND TECHNOLOGY; Electrical Engineering

    Abstract: This thesis contributes to the field of stochastic online learning with a collection of six papers, each addressing unique aspects of online learning and inference problems under specific structures. The first four papers focus on exploration and inference problems, uncovering fundamental information-theoretic limits and efficient algorithms under various structures.

  4. Bandit Methods for Network Optimization: Safety, Exploration, and Coordination

    Authors: Filippo Vannella; Alexandre Proutiere; Vincent Tan; KTH
    Keywords: ENGINEERING AND TECHNOLOGY; Electrical Engineering

    Abstract: The increasing complexity of modern mobile networks poses unprecedented challenges to their optimization. Mobile Network Operators (MNOs) need to control a large number of network parameters to satisfy the users’ demands.

  5. Combinatorial Semi-Bandit Methods for Navigation of Electric Vehicles

    Authors: Niklas Åkerblom; Chalmers tekniska högskola
    Keywords: NATURAL SCIENCES; energy-efficient navigation; online learning; multi-armed bandit problem; Thompson sampling; combinatorial semi-bandit problem

    Abstract: Climate change is one of the most urgent global challenges humanity is currently facing. As major contributors to greenhouse gas emissions, the transport and automotive sectors have crucial roles to play in solving the problem.