Determinantal point processes (DPPs) generate random configurations of points in which the points tend to repel each other. The notion of repulsion is encoded by the sub-determinants of a kernel matrix, in the sense of kernel methods in machine learning. This special algebraic form makes DPPs attractive in both statistical and computational terms. This thesis focuses on sampling from such processes, that is, on developing simulation methods for DPPs. Applications include numerical integration, recommender systems, and the summarization of a large corpus of data. In the finite setting, we establish the correspondence between sampling from a specific type of DPP, called a projection DPP, and solving a randomized linear program. In this light, we devise an efficient Markov chain-based sampling method. In the continuous case, some classical DPPs can be sampled by computing the eigenvalues of carefully randomized tridiagonal matrices. We provide an elementary and unifying treatment of such models, from which we derive an approximate sampling method for more general models. In higher dimensions, we consider a special class of DPPs used for numerical integration. We implement a tailored version of a known exact sampler, which allows us to compare the properties of Monte Carlo estimators in new regimes. In the context of reproducible research, we develop an open-source Python toolbox, named DPPy, which implements the state-of-the-art sampling methods for DPPs.
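To illustrate the statement that some classical continuous DPPs can be sampled from the eigenvalues of randomized tridiagonal matrices, here is a minimal sketch. It assumes the Dumitriu-Edelman tridiagonal model for the Hermite β-ensemble, whose eigenvalues for β = 2 follow the law of the eigenvalues of the Gaussian Unitary Ensemble, a standard example of a DPP on the real line. The function name `sample_hermite_ensemble` and its parameters are hypothetical; this is not DPPy's API and not necessarily the construction used in the thesis.

```python
import numpy as np


def sample_hermite_ensemble(n, beta=2.0, seed=None):
    """Sample n points from the Hermite beta-ensemble (hypothetical helper).

    Assumption: Dumitriu-Edelman tridiagonal model, i.e. the eigenvalues of
    a symmetric tridiagonal matrix with N(0, 2) diagonal entries and
    chi-distributed off-diagonal entries, all scaled by 1/sqrt(2).
    """
    rng = np.random.default_rng(seed)

    # Diagonal: i.i.d. Gaussians with variance 2.
    diag = rng.normal(loc=0.0, scale=np.sqrt(2.0), size=n)

    # Off-diagonal: chi variables with degrees of freedom
    # (n-1)*beta, (n-2)*beta, ..., beta, taken as square roots of chi-square.
    dof = beta * np.arange(n - 1, 0, -1)
    off_diag = np.sqrt(rng.chisquare(dof))

    # Assemble the dense symmetric tridiagonal matrix.
    matrix = (np.diag(diag) + np.diag(off_diag, 1) + np.diag(off_diag, -1))
    matrix /= np.sqrt(2.0)

    # The eigenvalues, returned in ascending order, form the sample.
    return np.linalg.eigvalsh(matrix)


points = sample_hermite_ensemble(n=50, beta=2.0, seed=0)
```

The dense eigensolver costs O(n³) and is used here only for brevity; a solver that exploits the tridiagonal structure would be the natural choice for large n.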
Thesis supervisors:
Michal VALKO, Research Scientist, INRIA, Lille - DeepMind, Paris
Rémi BARDENET, CNRS Research Scientist, Université de Lille
Reviewers:
Agnès Desolneux, CNRS Research Director, ENS Paris-Saclay
Romain Couillet, Full Professor, CentraleSupélec Paris
Members:
Pierre-Olivier Amblard, CNRS Research Director, Université de Grenoble-Alpes
Frédéric Lavancier, Associate Professor, Université de Nantes
Sheehan Olver, Reader, Imperial College, London
Michalis Titsias, Research Scientist, DeepMind, London
Thesis of the SIGMA team, defended on 19/05/2020