Unsupervised Neural Segmentation and Clustering for Unit Discovery in Sequential Data

Abstract : We study the problem of unsupervised segmentation and clustering of handwritten lines, with applications to character discovery. We propose a constrained variant of the Vector Quantized Variational Autoencoder (VQ-VAE) that produces a discrete, piecewise-constant encoding of the data. We show that the constrained quantization task is dual to a Markovian dynamics prior placed on the latent codes. This view facilitates a probabilistic interpretation of the constraints and allows efficient inference. We demonstrate the effectiveness of the proposed method on unsupervised discovery of handwritten characters in scanned 17th-century manuscripts.
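
The idea of a piecewise-constant discrete encoding can be illustrated with a small sketch. It is not the authors' implementation: this record does not give the exact constraint, so the sketch assumes a fixed non-negative penalty for every change of code between consecutive frames, and solves the resulting problem with Viterbi-style dynamic programming. Under that assumption the penalty plays the role of a negative log transition probability in a Markov chain over codes, which is one way to read the duality mentioned in the abstract. The names `piecewise_constant_quantize` and `jump_penalty` are hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def piecewise_constant_quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
    jump_penalty: float,
) -> List[int]:
    """Assign one codebook index per frame, minimizing
    sum_t ||x_t - e_{z_t}||^2 + jump_penalty * (number of t with z_t != z_{t-1}).

    Assumption (not taken from the paper): the constraint is a fixed,
    non-negative cost per code change.
    """
    if not frames:
        return []
    n_codes = len(codebook)
    # cost[k]: best total cost of any assignment that ends in code k.
    cost = [squared_distance(frames[0], e) for e in codebook]
    backpointers = []
    for frame in frames[1:]:
        # The cheapest code to switch from is the same for every target code.
        best_prev = min(range(n_codes), key=cost.__getitem__)
        new_cost, pointers = [], []
        for k, e in enumerate(codebook):
            stay = cost[k]
            switch = cost[best_prev] + jump_penalty
            if stay <= switch:
                prev, base = k, stay
            else:
                prev, base = best_prev, switch
            new_cost.append(base + squared_distance(frame, e))
            pointers.append(prev)
        cost = new_cost
        backpointers.append(pointers)
    # Backtrack from the cheapest final code.
    k = min(range(n_codes), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(backpointers):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Toy 1-D "encoder outputs" and a two-entry codebook (made-up values).
frames = [[0.0], [0.1], [0.8], [0.1], [1.0], [1.1]]
codebook = [[0.0], [1.0]]
print(piecewise_constant_quantize(frames, codebook, 0.0))  # [0, 0, 1, 0, 1, 1]
print(piecewise_constant_quantize(frames, codebook, 1.0))  # [0, 0, 0, 0, 1, 1]
```

With a zero penalty the result is ordinary nearest-neighbour quantization, frame by frame. With a penalty of 1.0 the isolated switch at the third frame is no longer worth its cost, so the output collapses into two constant segments, each of which can be read as one discovered unit.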
Cited literature [14 references]

https://hal.archives-ouvertes.fr/hal-02399138
Contributor : Ricard Marxer
Submitted on : Sunday, December 8, 2019 - 9:57:22 PM
Last modification on : Tuesday, December 17, 2019 - 2:26:48 AM

File

PGR009.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02399138, version 1

Citation

Jan Chorowski, Nanxin Chen, Ricard Marxer, Hans Dolfing, Adrian Łańcucki, et al. Unsupervised Neural Segmentation and Clustering for Unit Discovery in Sequential Data. NeurIPS 2019 workshop - Perception as generative reasoning - Structure, Causality, Probability, Dec 2019, Vancouver, Canada. ⟨hal-02399138⟩
