Search: "Mohammad Sadegh Talebi Mazraeh Shahi"

Found 2 theses containing the words Mohammad Sadegh Talebi Mazraeh Shahi.

  1. Minimizing Regret in Combinatorial Bandits and Reinforcement Learning

    Authors: Mohammad Sadegh Talebi Mazraeh Shahi; Alexandre Proutiere; Mikael Johansson; Ronald Ortner; KTH
    Keywords: Multi-armed Bandits; Reinforcement Learning; Regret Minimization; Statistics; Electrical Engineering

    Abstract: This thesis investigates sequential decision making tasks that fall within the framework of reinforcement learning (RL). These tasks involve a decision maker who repeatedly interacts with an environment modeled by an unknown finite Markov decision process (MDP) and who wishes to maximize a notion of reward accumulated during her experience.

  2. Online Combinatorial Optimization under Bandit Feedback

    Authors: Mohammad Sadegh Talebi Mazraeh Shahi; Alexandre Proutiere; Vianney Perchet; KTH
    Keywords: Combinatorial Optimization; Online Learning; Multi-armed Bandits; Sequential Decision Making; Mathematics; Computer Science

    Abstract: Multi-Armed Bandits (MAB) constitute the most fundamental model for sequential decision making problems with an exploration vs. exploitation trade-off. In such problems, the decision maker selects an arm in each round and observes a realization of the corresponding unknown reward distribution.