# Online clustering of individual sequences

Abstract : $\ell_0$-penalized methods are known to have good theoretical properties but a high computational cost. Conversely, convex relaxations such as the Lasso are computationally cheaper, but their theoretical guarantees hold only for restricted models. To resolve this impasse, \cite{dalalyantsybakov2,dalalyantsybakov3} introduced sparsity priors in a PAC-Bayesian framework. These priors yield good theoretical properties, namely sparsity oracle inequalities, which are attained by computationally attractive sequential procedures. In this paper, we investigate this issue in clustering. We construct online *clustering* algorithms which learn according to the following game protocol. At each trial $t\geq 1$, nature reveals a deterministic $x_t\in\mathbb{R}^d$, $d\geq 1$. A forecaster predicts the next value with a set of proposals, kept as small as possible. Then, nature reveals the next value and the forecaster pays the minimal distance between this value and its set of proposals. To deal with this problem, we use the PAC-Bayesian theory with group-sparsity priors. It gives sparsity regret bounds and allows us to perform online clustering of a possibly non-stationary process, without any knowledge of the number of clusters. These results can be applied to the classical i.i.d. case to deal with the problem of model selection in clustering as well as high-dimensional clustering.
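The game protocol described in the abstract can be illustrated with a minimal Python sketch. It only shows the interaction loop and the loss (the distance from the revealed value to the nearest proposal); it does not implement the paper's PAC-Bayesian procedure with group-sparsity priors. The names `min_distance_loss`, `play`, and `last_value_forecaster` are hypothetical, and the Euclidean distance is an assumption.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def min_distance_loss(proposals: List[Point], x: Point) -> float:
    """Loss paid by the forecaster: Euclidean distance (assumed here)
    from the revealed value x to its nearest proposal."""
    return min(math.dist(c, x) for c in proposals)


def play(sequence: List[Point],
         forecaster: Callable[[List[Point]], List[Point]]) -> float:
    """Run the protocol on a fixed sequence and return the cumulative loss."""
    total = 0.0
    for t in range(1, len(sequence)):
        # The forecaster sees only the values revealed so far.
        proposals = forecaster(list(sequence[:t]))
        # Nature then reveals the next value and the loss is paid.
        total += min_distance_loss(proposals, sequence[t])
    return total


def last_value_forecaster(history: List[Point]) -> List[Point]:
    """Hypothetical baseline (not from the paper): a single proposal
    equal to the most recently revealed value."""
    return [history[-1]]


if __name__ == "__main__":
    xs = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
    print(play(xs, last_value_forecaster))
```

Any forecaster that maps the past values to a finite list of proposals can be passed to `play`; the number of proposals it returns plays the role of the number of clusters, which the abstract says is not known in advance.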
Document type :
Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-00943384
Contributor : Sébastien Loustau
Submitted on : Friday, February 7, 2014 - 2:47:40 PM
Last modification on : Monday, March 9, 2020 - 6:15:54 PM
Long-term archiving on : Monday, May 12, 2014 - 12:05:30 PM

### File

onlineclustering.pdf
Files produced by the author(s)

### Identifiers

• HAL Id : hal-00943384, version 1

### Citation

Sébastien Loustau. Online clustering of individual sequences. 2014. ⟨hal-00943384⟩
