Optimal Rates of Statistical Seriation

Nicolas Flammarion (1, 2), Cheng Mao (3), Philippe Rigollet (3)
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
Abstract: Given a matrix, the seriation problem consists in permuting its rows in such a way that all its columns have the same shape, for example, monotone increasing. We propose a statistical approach to this problem in which the matrix of interest is observed with noise, and we study the corresponding minimax rate of estimation of the matrices. Specifically, when the columns are either unimodal or monotone, we show that the least squares estimator is optimal up to logarithmic factors and adapts to matrices with a certain natural structure. Finally, we propose a computationally efficient estimator in the monotonic case and study its performance both theoretically and experimentally. Our work is at the intersection of shape-constrained estimation and recent work that involves permutation learning, such as graph denoising and ranking.
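To make the problem statement in the abstract concrete, here is a minimal sketch of seriation in the monotone case. It is an illustration only, under stated assumptions: the signal matrix, the Gaussian noise level, and the row-sum sorting heuristic are all hypothetical choices made for this example, and the heuristic is not claimed to be the estimator proposed in the paper. All function names are invented for the sketch.

```python
import random


def make_monotone_matrix(n, m):
    # Hypothetical signal: every column increases with the row index i.
    return [[(i + 1) * (j + 1) / n for j in range(m)] for i in range(n)]


def observe(signal, perm, sigma, rng):
    # Observed row i is signal row perm[i] plus independent Gaussian noise.
    m = len(signal[0])
    return [[signal[p][j] + rng.gauss(0.0, sigma) for j in range(m)]
            for p in perm]


def seriate_by_row_sums(noisy):
    # Assumed heuristic: order the observed rows by their row sums.
    # For columns that all increase, row sums increase too, so sorting
    # them recovers the ordering when the noise is small enough.
    return sorted(range(len(noisy)), key=lambda i: sum(noisy[i]))


rng = random.Random(0)
n, m = 8, 50
signal = make_monotone_matrix(n, m)

# Hide the true row order behind a random permutation.
perm = list(range(n))
rng.shuffle(perm)
noisy = observe(signal, perm, sigma=0.1, rng=rng)

# order[k] is the index of the observed row placed at position k.
order = seriate_by_row_sums(noisy)

# Map the sorted observed rows back to the signal rows they came from.
recovered = [perm[i] for i in order]
print(recovered)
```

In this toy setting the gaps between consecutive row sums are far larger than the accumulated noise, so the sorted rows come back in their original order. The regime studied in the paper, where noise is comparable to the signal differences, is the harder and more interesting one.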
Document type:
Preprint, working paper
V2 corrects an error in Lemma A.1, v3 corrects appendix F on unimodal regression where the bounds.. 2016

https://hal.archives-ouvertes.fr/hal-01405738
Contributor: Nicolas Flammarion
Submitted on: Wednesday, November 30, 2016 - 13:08:00
Last modified on: Thursday, April 26, 2018 - 10:29:13


Identifiers

  • HAL Id : hal-01405738, version 1
  • ARXIV : 1607.02435

Citation

Nicolas Flammarion, Cheng Mao, Philippe Rigollet. Optimal Rates of Statistical Seriation. V2 corrects an error in Lemma A.1, v3 corrects appendix F on unimodal regression where the bounds.. 2016. 〈hal-01405738〉
