Developing intelligent systems that understand human behavior has been a very active research topic in recent decades. Indeed, machines must understand human behavior in order to interact with, assist, and help humans in their daily lives. Recent breakthroughs in computer vision and machine learning have made this possible. For instance, human-related computer vision problems can be approached by first detecting and tracking 2D or 3D landmark points from visual data. Two relevant examples are the facial landmarks detected on the human face and the skeletons tracked along videos of human bodies. These techniques generate temporal sequences of landmark configurations, which are subject to several distortions that complicate their analysis, especially in uncontrolled environments, due to view variations, inaccurate detection and tracking, missing data, etc. In this thesis, we propose two novel space-time representations of human landmark sequences, along with suitable computational tools for human behavior understanding. First, we propose a representation based on trajectories of Gram matrices of human landmarks. Gram matrices are positive semi-definite matrices of fixed rank and lie on a nonlinear manifold where standard computational and machine learning techniques cannot be applied in a straightforward way. To overcome this issue, we make use of some notions of Riemannian geometry and derive suitable computational tools for analyzing Gram trajectories. We evaluate the proposed approach in several human-related applications involving 2D and 3D landmarks of human faces and bodies, such as emotion recognition from facial expressions and body movements, and action recognition from skeletons. Second, we propose another representation based on the barycentric coordinates of 2D facial landmarks.
While related to the Gram trajectory representation and robust to view variations, the barycentric representation makes it possible to work directly with standard computational tools. This second approach is evaluated on two face analysis tasks, namely facial expression recognition and depression severity level assessment. The results obtained with the two proposed approaches on real benchmarks are competitive with recent state-of-the-art methods.
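To make the two representations more concrete, the following minimal Python sketch illustrates the invariance properties mentioned above on a toy 2D landmark configuration. It is only an illustration under stated assumptions, not the implementation used in the thesis: the centering step, the choice of the first three landmarks as the reference triangle, and all function names (`gram_matrix`, `barycentric`, `rotate_2d`, `affine_2d`) are hypothetical.

```python
import math


def center(points):
    """Subtract the centroid (assumed preprocessing, removes translation)."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    return [[p[k] - mean[k] for k in range(dim)] for p in points]


def gram_matrix(points):
    """n x n matrix of pairwise inner products of the centered landmarks."""
    z = center(points)
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in z] for zi in z]


def rotate_2d(points, angle):
    """Rotate every 2D landmark by `angle` radians about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y, s * x + c * y] for x, y in points]


def affine_2d(points, a, t):
    """Apply the affine map p -> a p + t to every 2D landmark."""
    return [[a[0][0] * x + a[0][1] * y + t[0],
             a[1][0] * x + a[1][1] * y + t[1]] for x, y in points]


def barycentric(p, a, b, c):
    """Barycentric coordinates of 2D point p in the triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    l1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (l1, l2, 1.0 - l1 - l2)


def max_abs_diff(m1, m2):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


# Toy configuration of four 2D landmarks (made-up values).
landmarks = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.5], [0.5, 0.7]]

# The Gram matrix is unchanged by a translation followed by a rotation.
moved = rotate_2d([[x + 3.0, y - 1.0] for x, y in landmarks], 0.7)
gram_gap = max_abs_diff(gram_matrix(landmarks), gram_matrix(moved))

# Barycentric coordinates of the fourth landmark with respect to the first
# three are unchanged by an invertible affine map of the whole configuration.
warped = affine_2d(landmarks, [[1.2, 0.3], [-0.4, 0.9]], [5.0, -2.0])
bary_before = barycentric(landmarks[3], *landmarks[:3])
bary_after = barycentric(warped[3], *warped[:3])
```

In this sketch, `gram_gap` is zero up to floating-point error, and `bary_before` and `bary_after` agree up to the same tolerance. The Gram matrices of 2D landmarks built this way have rank at most 2, which is what places them on a nonlinear manifold, whereas the barycentric coordinates are ordinary vectors that can be passed to standard tools.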
Thesis supervisor: Mr. Mohamed DAOUDI, Prof. IMT Lille Douai. Reviewers: Mr. Nicu SEBE, Prof. University of Trento; Mr. Frédéric JURIE, Prof. Université de Caen. Thesis co-supervisor: Mr. Boulbaba BEN AMOR, Prof. IMT Lille Douai. Examiners: Mr. Juan Carlos ALVAREZ PAIVA, Prof. Université de Lille; Ms. Catherine ACHARD, Associate Professor/HDR, Sorbonne Université; Ms. Tinne TUYTELAARS, Prof. KU Leuven.
Thesis of the team 3D SAM defended on 12/12/2018