Search: "Odalric-Ambrym Maillard"

Found 1 thesis containing the words Odalric-Ambrym Maillard.

1. Efficient Online Learning under Bandit Feedback

Author: Stefan Magureanu; Alexandre Proutiere; Odalric-Ambrym Maillard; KTH
Keywords: ENGINEERING AND TECHNOLOGY; multi-armed bandits; reinforcement learning; learning to rank; Electrical Engineering

Abstract: In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. In particular, we investigate the case where the expected rewards are a Lipschitz function of the arm, and extend these results to bandits with arbitrary structure that is known to the decision maker.
