
Online clustering of individual sequences

Abstract : It is well known that $\ell_0$-penalized methods have good theoretical properties but a high computational cost. In contrast, convex relaxations such as the Lasso are computationally tractable, but their theoretical guarantees hold only for restricted models. To resolve this impasse, \cite{dalalyantsybakov2,dalalyantsybakov3} introduce sparsity priors in a PAC-Bayesian framework. These priors lead to good theoretical properties, namely sparsity oracle inequalities, which are attained by computationally attractive sequential procedures. In this paper, we investigate this issue in clustering. We construct online \textit{clustering} algorithms which learn according to the following game protocol. At each trial $t\geq 1$, nature reveals a deterministic $x_t\in\R^d$, $d\geq 1$. A forecaster predicts the next value with several proposals, using as few as possible. Then, nature reveals the next value and the forecaster pays the minimal distance between this value and its set of proposals. To deal with this problem, we use the PAC-Bayesian theory with group-sparsity priors. This yields sparsity regret bounds and allows us to perform online clustering of a possibly non-stationary process, without any knowledge of the number of clusters. These results can be applied to the classical i.i.d. case to deal with the problem of model selection in clustering as well as high-dimensional clustering.
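The game protocol described in the abstract can be illustrated with a minimal sketch. It is not the PAC-Bayesian procedure of the paper: the Euclidean distance, the function names, and the `last_point_forecaster` baseline are all assumptions made for illustration, and the sketch does not model the preference for few proposals.

```python
import math

def min_distance_loss(proposals, x):
    """Loss paid by the forecaster: the smallest distance between the
    revealed point x and any of its proposals (Euclidean distance is
    an assumption; the abstract only says 'minimal distance')."""
    return min(math.dist(c, x) for c in proposals)

def play(sequence, forecaster):
    """Run the protocol on a deterministic sequence of points.

    Before each new point is revealed, the forecaster sees the past
    points and outputs a list of proposals; it then pays the minimal
    distance to the revealed point. Returns the cumulative loss.
    """
    total = 0.0
    history = []
    for x in sequence:
        if history:  # nothing to predict from before the first point
            proposals = forecaster(history)
            total += min_distance_loss(proposals, x)
        history.append(x)
    return total

def last_point_forecaster(history):
    """Hypothetical baseline: one proposal, the latest observed point."""
    return [history[-1]]
```

For example, `play([(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)], last_point_forecaster)` pays 5.0 on the second point and 0.0 on the third. A forecaster with more proposals can lower this loss, which is why the number of proposals has to be controlled.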
Contributor : Sébastien Loustau
Submitted on : Friday, February 7, 2014 - 2:47:40 PM
Last modification on : Wednesday, October 20, 2021 - 3:18:45 AM
Long-term archiving on : Monday, May 12, 2014 - 12:05:30 PM




  • HAL Id : hal-00943384, version 1



Sébastien Loustau. Online clustering of individual sequences. 2014. ⟨hal-00943384⟩


