Noisy Quantization: theory and practice

Abstract : The effect of errors in variables on quantization is investigated. Given a noisy sample $Z_i=X_i+\epsilon_i$, $i=1,\ldots,n$, where $(X_i)_{i=1,\ldots,n}$ are i.i.d. with law $P$, we want to find the best approximation of the probability distribution $P$ by $k\geq 1$ points, called codepoints. We prove general excess risk bounds with fast rates for an empirical minimization based on a deconvolution kernel estimator. These rates depend on the behaviour of the density of $P$ and on the asymptotic behaviour of the characteristic function of the noise $\epsilon$. This general study applies to the problem of $k$-means clustering with noisy data. For this purpose, we introduce a deconvolution $k$-means stochastic minimization which reaches fast rates of convergence under Pollard's standard regularity assumptions. We also introduce a new algorithm for $k$-means clustering with errors in variables. Following the theoretical study, the algorithm combines tools from the inverse problem literature and the machine learning community. Roughly, it is based on a two-step procedure: (1) a deconvolution step to deal with noisy inputs and (2) Newton's iterations, as in the popular $k$-means algorithm.
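The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes one-dimensional data, a Gaussian kernel, and Laplace noise with known scale (a case where the deconvolution kernel has a closed form), and it swaps the Newton iterations for Lloyd-style weighted centroid updates on a grid. All function names and parameters (`deconvolution_density`, `weighted_lloyd`, `noisy_kmeans_1d`, `bandwidth`, `laplace_scale`, `grid_size`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def deconvolution_density(grid, z, bandwidth, laplace_scale):
    """Step 1 (assumed setting): deconvolution kernel density estimate.

    For Laplace(0, b) noise the characteristic function is 1 / (1 + b^2 t^2),
    so with a Gaussian kernel K the deconvolution kernel is
    K(u) * (1 - (b / h)^2 * (u^2 - 1)), where h is the bandwidth.
    """
    u = (grid[:, None] - z[None, :]) / bandwidth
    gauss = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    correction = 1.0 - (laplace_scale / bandwidth) ** 2 * (u**2 - 1.0)
    return (gauss * correction).mean(axis=1) / bandwidth


def weighted_lloyd(grid, weights, init_codepoints, n_iter=50):
    """Step 2 (stand-in): Lloyd-style updates on grid points weighted by
    the estimated density, used here in place of Newton's iterations."""
    codepoints = np.array(init_codepoints, dtype=float)
    for _ in range(n_iter):
        # Assign every grid point to its nearest codepoint.
        dist = np.abs(grid[:, None] - codepoints[None, :])
        labels = np.argmin(dist, axis=1)
        # Move each codepoint to the weighted mean of its cell.
        for j in range(codepoints.size):
            cell = labels == j
            mass = weights[cell].sum()
            if mass > 0:
                codepoints[j] = (weights[cell] * grid[cell]).sum() / mass
    return np.sort(codepoints)


def noisy_kmeans_1d(z, k, bandwidth, laplace_scale, grid_size=512):
    """Deconvolve the noisy sample, then quantize the estimated density."""
    grid = np.linspace(z.min(), z.max(), grid_size)
    density = deconvolution_density(grid, z, bandwidth, laplace_scale)
    # The deconvolution estimate can be negative; clip so weights are valid.
    density = np.clip(density, 0.0, None)
    # Initialize codepoints at evenly spaced quantiles of the noisy sample.
    init = np.quantile(z, (np.arange(k) + 0.5) / k)
    return weighted_lloyd(grid, density, init)
```

As a usage illustration under the same assumptions, one could draw `x` from two well-separated clusters, add `rng.laplace(scale=0.5, size=x.size)` noise to obtain `z`, and call `noisy_kmeans_1d(z, k=2, bandwidth=0.4, laplace_scale=0.5)`. The bandwidth is a tuning parameter; the value here is arbitrary and not taken from the paper.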
Document type : Preprints, Working Papers, ...
Contributor : Sébastien Loustau
Submitted on : Wednesday, September 10, 2014 - 3:35:40 PM
Last modification on : Monday, March 9, 2020 - 6:15:54 PM
Long-term archiving on : Thursday, December 11, 2014 - 11:45:20 AM


Files produced by the author(s)


  • HAL Id : hal-01060380, version 1



Camille Brunet, Sébastien Loustau. Noisy Quantization: theory and practice. 2014. ⟨hal-01060380⟩


