February 25, 2025 at 2:30 pm
Solving pure exploration tasks in bandit models and beyond
Bandit models are well studied in the machine learning community due to numerous applications to online content optimization. In a multi-armed bandit model, an agent sequentially tries different actions, called arms. Each arm is associated with an unknown probability distribution. In a pure exploration task, the agent wants to learn something about these distributions (e.g., which arm has the largest expectation) by querying as few samples as possible from them. I will first present a generic algorithm, called Track and Stop, that attains the minimal sample complexity in the asymptotic regime in which the error probability goes to zero. The high computational cost of Track and Stop led to the development of other types of algorithms that are computationally more appealing while keeping a (near-)optimal sample complexity. In particular, I will advocate the use of the so-called Top Two algorithms. Finally, I will present some results about pure exploration in Markov decision processes, in which the agent's state can evolve with the chosen action and the goal is to learn a good policy (a mapping from states to actions).
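For readers unfamiliar with Top Two methods, here is a minimal sketch of one representative instance, Top Two Thompson Sampling, for best-arm identification with Bernoulli arms: at each round a posterior sample designates a "leader", and with some probability a re-sampled "challenger" is pulled instead. The fixed budget, the beta = 0.5 parameter, and the empirical-mean recommendation are illustrative simplifications of my own (the algorithms discussed in the talk use a statistical stopping rule rather than a fixed budget).

```python
# Illustrative sketch of Top Two Thompson Sampling for Bernoulli arms.
# Not the exact algorithm from the talk: the fixed budget and the final
# empirical-mean recommendation replace a proper stopping rule.
import numpy as np

rng = np.random.default_rng(0)

def top_two_thompson(means, budget=2000, beta=0.5):
    k = len(means)
    successes = np.ones(k)  # Beta(1, 1) prior on each arm's mean
    failures = np.ones(k)
    for _ in range(budget):
        # Leader: arm maximizing one joint posterior sample.
        leader = np.argmax(rng.beta(successes, failures))
        arm = leader
        if rng.random() > beta:
            # Challenger: re-sample the posterior (boundedly) until a
            # different arm wins, and pull that arm instead.
            for _ in range(100):
                arm = np.argmax(rng.beta(successes, failures))
                if arm != leader:
                    break
        reward = rng.random() < means[arm]  # Bernoulli observation
        successes[arm] += reward
        failures[arm] += 1 - reward
    # Recommend the arm with the highest empirical mean.
    return int(np.argmax(successes / (successes + failures)))

print(top_two_thompson([0.3, 0.5, 0.6]))  # typically recommends arm 2
```

The beta parameter balances how often the leader versus the challenger is sampled; beta = 0.5 is a common default in the Top Two literature.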
Amphi Ircica - 50 avenue Halley - Haute Borne - Villeneuve d'Ascq