Thesis by Romain Warlop

Novel Learning and Exploration-Exploitation Methods for Effective Recommender Systems

In this Ph.D. thesis, we intend to address this problem by developing novel and more sophisticated recommendation strategies in which data collection and performance improvement are treated as a single process, where the trade-off between the quality of the data and the performance of the recommendation strategy is optimized over time. In particular, we plan to leverage the multi-armed bandit framework, which focuses precisely on how to trade off exploring different options (i.e., the items) to gain information against exploiting the currently available data to maximize performance. While much theory and many solutions have been developed for the exploration-exploitation dilemma, they are often designed for relatively simple problems and cannot be directly applied to recommender systems (RSs). As a result, the core of this Ph.D. program is to develop novel algorithmic solutions that effectively integrate results from the field of multi-armed bandits with the technologies available in RSs. The benefit of such solutions is that the RS itself is in charge of constantly generating data that are useful for improving its own performance. This would greatly improve the overall recommendation process, which could automatically and smoothly improve over time by learning directly from data with very limited need for external supervision.

Jury

Thesis supervisors: Jérémie MARY, Alessandro LAZARIC
Reviewers: Michèle SEBAG, Liva RALAIVOLA
Members: Florence D'ALCHE-BUC, Rémi GILLERON

Team thesis, defended on 19/10/2018