Structured Mixture of Linear Mappings in High Dimension

Chun-Chen Tu 1, Florence Forbes 2, Benjamin Lemasson 3, Naisyin Wang 1
2 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 Equipe 5 : NeuroImagerie Fonctionnelle et Perfusion Cérébrale
UJF - Université Joseph Fourier - Grenoble 1, CEA - Commissariat à l'énergie atomique et aux énergies alternatives, INSERM - Institut National de la Santé et de la Recherche Médicale : U836, [GIN] Grenoble Institut des Neurosciences
Abstract: When analyzing data with complex structures such as high dimensionality and non-linearity, one often needs sophisticated models to capture the intrinsic complexity. However, implementing such models in practice can be difficult. Striking a balance between parsimony and model flexibility is essential to tackle data complexity while maintaining feasibility and satisfactory prediction performance. In this work, we propose the Structured Mixture of Gaussian Locally Linear Mapping (SMoGLLiM) for settings in which high-dimensional predictors are used to predict low-dimensional responses and the underlying associations may be heterogeneous or non-linear. Besides using mixtures of linear associations to approximate non-linear patterns locally and using inverse regression to mitigate the complications due to high-dimensional predictors, SMoGLLiM also aims at achieving robustness by adopting cluster-size constraints and trimming abnormal samples. Its hierarchical structure enables covariance matrices and latent factors to be shared across smaller clusters, which effectively reduces the number of parameters. An Expectation-Maximization (EM) algorithm is devised for parameter estimation and, with analytical solutions, the estimation process is computationally efficient. Numerical results obtained from three real-world datasets demonstrate the flexibility of SMoGLLiM and its ability to accommodate complex data structures. They include using high-dimensional face images to predict the parameters under which the images were taken, predicting sucrose levels from high-dimensional hyperspectral measurements obtained from different types of orange juice, and a magnetic resonance vascular fingerprinting (MRvF) study in which researchers are interested in using the so-called MRv fingerprints at the voxel level to predict the microvascular properties of the brain.
The three datasets bear different features and present different types of challenges. For example, the size of the MRv fingerprint dataset demands special consideration to reduce the computational burden. With the hierarchical structure of SMoGLLiM, we are able to adopt parallel computing techniques to reduce the model building time by 97%. These examples illustrate the wide range of applicability of SMoGLLiM in handling different kinds of complex data structures.
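
The idea of approximating a non-linear pattern by a mixture of local linear associations can be illustrated with a minimal sketch. This is not the authors' SMoGLLiM implementation: it omits inverse regression, EM estimation, the hierarchical structure, cluster-size constraints and trimming. The function names, the isotropic Gaussian gates and all parameter values are assumptions chosen by hand for illustration only.

```python
import math

def responsibilities(x, weights, centers, variances):
    """Posterior probability of each component given x.

    Assumption: each component gates on x with an isotropic Gaussian.
    """
    d = len(x)
    log_terms = []
    for w, c, v in zip(weights, centers, variances):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        log_terms.append(
            math.log(w) - 0.5 * d * math.log(2 * math.pi * v) - 0.5 * sq_dist / v
        )
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_terms)
    exps = [math.exp(t - m) for t in log_terms]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, weights, centers, variances, slopes, intercepts):
    """Responsibility-weighted average of the local affine predictions."""
    r = responsibilities(x, weights, centers, variances)
    local = [
        sum(a * xi for a, xi in zip(A, x)) + b
        for A, b in zip(slopes, intercepts)
    ]
    return sum(rk * yk for rk, yk in zip(r, local))

# Hypothetical hand-set parameters: two local lines approximating y = |x|.
WEIGHTS = [0.5, 0.5]
CENTERS = [[-2.0], [2.0]]
VARIANCES = [0.25, 0.25]
SLOPES = [[-1.0], [1.0]]
INTERCEPTS = [0.0, 0.0]

if __name__ == "__main__":
    for x in ([-2.0], [0.0], [2.0]):
        y = predict(x, WEIGHTS, CENTERS, VARIANCES, SLOPES, INTERCEPTS)
        print(x, y)
```

Near each center a single component dominates, so the prediction follows that component's line; between centers the prediction blends the local lines according to the responsibilities.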
Document type:
Preprint, Working paper
2018

https://hal.archives-ouvertes.fr/hal-01700053
Contributor: Florence Forbes
Submitted on: Saturday, February 3, 2018 - 10:23:29
Last modified on: Wednesday, April 11, 2018 - 01:57:47
Document(s) archived on: Thursday, May 3, 2018 - 09:30:47

File

SMoGLLiM_manuscript_20171221.p...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01700053, version 1

Citation

Chun-Chen Tu, Florence Forbes, Benjamin Lemasson, Naisyin Wang. Structured Mixture of Linear Mappings in High Dimension. 2018. 〈hal-01700053〉
