Object classification in images and videos

Yi Ji

Abstract

In this dissertation, we address the problem of generative object categorization in computer vision. We then apply our approach to the classification of facial expressions. In the first part, we draw on Hierarchical Dirichlet Processes to generate intermediate mixture components that improve recognition and categorization, since the problem shares two aspects with topic modelling of documents: its nonparametric nature and its hierarchical nature. Once we have obtained the set of components, instead of boosting the features as Viola and Jones do, we boost the components in the intermediate layer to find the most distinctive ones. We consider these components more important for object class recognition than the others and use them to improve the classification. Our goal is to classify objects correctly, and also to discover the essential latent themes shared across multiple object categories and the particular distribution of latent themes for a specific category. In the second part, building on the relation between basic expressions and the corresponding facial deformation models, we propose two new textons, VTB and moments on the spatiotemporal plane, to describe the transformation of the human face during facial expressions. These descriptors aim to capture both general shape changes and motion texture details. The dynamic deformation of facial components is thus captured by modelling the temporal behaviour of facial expressions. Finally, an SVM-based system is used to efficiently recognize the expression in each single image of a sequence; the weighted probabilities of all the frames are then used to predict the class of the current sequence. The thesis also involves finding proper methods to describe the static and dynamic aspects of facial expressions. We further aim to design new descriptors that denote the characteristics of facial muscle movements and, from these, to identify the category of emotion.
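The last step of the second part, combining per-frame class probabilities into one prediction for the sequence, can be illustrated with a minimal sketch. The abstract does not say how the frame weights are chosen or how the probabilities are obtained from the SVM, so the function name `predict_sequence_class`, the weighted-mean rule, and the example weights are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def predict_sequence_class(
    frame_probs: Sequence[Sequence[float]],
    frame_weights: Sequence[float],
) -> Tuple[int, List[float]]:
    """Fuse per-frame class probabilities into one sequence-level label.

    frame_probs[t][c] is the probability of class c for frame t (for
    example, from a probabilistic SVM). frame_weights[t] is a hypothetical
    non-negative weight for frame t; the weighting scheme is an assumption.
    Returns the winning class index and the weighted mean score per class.
    """
    if not frame_probs or len(frame_probs) != len(frame_weights):
        raise ValueError("need one weight per frame and at least one frame")
    total = float(sum(frame_weights))
    if total <= 0:
        raise ValueError("frame weights must sum to a positive value")

    n_classes = len(frame_probs[0])
    scores = [0.0] * n_classes
    for probs, weight in zip(frame_probs, frame_weights):
        for c, p in enumerate(probs):
            # Accumulate the normalized, weighted probability of class c.
            scores[c] += weight * p / total

    label = max(range(n_classes), key=scores.__getitem__)
    return label, scores


if __name__ == "__main__":
    # Three frames, two classes; the middle frame is trusted twice as much.
    probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
    weights = [1.0, 2.0, 1.0]
    print(predict_sequence_class(probs, weights))
```

A weighted mean keeps the scores interpretable as probabilities; other fusion rules (for instance a product of probabilities) would be equally compatible with the description above.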

In this thesis, we address the problem of object classification and then apply it to the classification and recognition of facial expressions. First, we draw on Dirichlet processes, viewed as distributions over the space of distributions, which generate intermediate components that improve object categorization. This model, used notably in the semantic classification of documents, is characterized by being both nonparametric and hierarchical. In a first phase, the set of basic intermediate components is extracted using Bayesian learning by MCMC; AdaBoost then iteratively selects the most distinctive weak classifiers among all the components. Our objective is to identify the distributions of the latent components, both those shared by the different classes and those associated with a particular category. In the second part, we sought to apply our classification approach to facial expressions. This work consisted in finding suitable methods to describe the static and dynamic aspects of facial expressions, and hence in designing new descriptors able to represent the characteristics of facial muscle movements and thereby identify the category of the expression.
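The component-selection step, in which AdaBoost iteratively picks the most distinctive weak classifiers among the intermediate components, can be sketched as follows. This is a generic discrete AdaBoost over threshold stumps, not the thesis implementation: the representation of each image as a vector of component weights, the stump form of the weak classifiers, and the name `adaboost_select` are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

# One selected weak classifier: (component index, threshold, polarity, alpha).
Stump = Tuple[int, float, int, float]


def stump_predict(x: Sequence[float], j: int, thr: float, pol: int) -> int:
    """Predict +pol if component j is at or above the threshold, else -pol."""
    return pol if x[j] >= thr else -pol


def adaboost_select(
    X: Sequence[Sequence[float]], y: Sequence[int], n_rounds: int
) -> List[Stump]:
    """Select one stump per round with discrete AdaBoost.

    X[i][j] is the (assumed) weight of component j in image i, and y[i] is
    the binary class label in {-1, +1}.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # uniform example weights to start
    chosen: List[Stump] = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for thr in sorted({x[j] for x in X}):
                for pol in (1, -1):
                    # Weighted error of this stump under the current weights.
                    err = sum(
                        wi for wi, x, yi in zip(w, X, y)
                        if stump_predict(x, j, thr, pol) != yi
                    )
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((j, thr, pol, alpha))
        # Increase the weight of misclassified examples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(x, j, thr, pol))
            for wi, x, yi in zip(w, X, y)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen


if __name__ == "__main__":
    # Toy data: component 1 separates the two classes, component 0 does not.
    X = [[0.5, 0.1], [0.4, 0.2], [0.5, 0.8], [0.4, 0.9]]
    y = [-1, -1, 1, 1]
    print(adaboost_select(X, y, n_rounds=1))
```

The indices of the selected stumps identify the components that the boosting procedure treats as most distinctive for the class in question.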
