Triplet CNN-based word spotting of historical Arabic documents

Abstract : Word spotting in historical Arabic documents is a challenging task due to the complexity of document layouts. This paper proposes a novel word spotting approach that learns a feature representation to describe word images. The objective is to investigate optimal embedding spaces from which to extract a discriminative word image representation. The proposed approach consists of two steps: i) construct a CNN-based embedding space with a triplet loss, and then ii) match embedding representations using the Euclidean distance. For training, the CNN takes as input a set of triplet samples (anchor, positive sample and negative sample). The triplet loss then shapes the embedding space by minimizing intra-class distances and maximizing inter-class distances. The proposed approach is evaluated on the VML-HD dataset, and the experiments show its effectiveness compared to the state of the art.
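
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact loss formulation, so the hinge form with squared Euclidean distances, the margin value, and the function names triplet_loss and rank_by_euclidean are all assumptions made for illustration. The embeddings are taken as given, standing in for the output of the CNN.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss over a batch of embeddings of shape (n, d).

    Assumption: squared Euclidean distances and a margin of 1.0; the
    abstract does not state which variant the paper uses.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor to same-class sample
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor to other-class sample
    # The loss is zero once the negative is farther than the positive by the margin.
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def rank_by_euclidean(query, gallery):
    """Rank gallery embeddings (m, d) by Euclidean distance to a query (d,)."""
    dists = np.linalg.norm(gallery - query, axis=1)
    order = np.argsort(dists)  # closest candidates first
    return order, dists[order]


if __name__ == "__main__":
    # Hypothetical 2-D embeddings standing in for CNN outputs.
    a = np.array([[0.0, 0.0]])
    p = np.array([[0.0, 1.0]])
    n = np.array([[3.0, 0.0]])
    print(triplet_loss(a, p, n))

    gallery = np.array([[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    order, dists = rank_by_euclidean(np.array([0.0, 0.0]), gallery)
    print(order, dists)
```

In this sketch a well-separated triplet contributes no loss, while a triplet whose negative lies closer to the anchor than its positive is penalized. At query time, spotting reduces to sorting candidate word images by their distance to the query embedding.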
https://hal.archives-ouvertes.fr/hal-02473637
Contributor : Mohamed Ibn Khedher
Submitted on : Monday, February 10, 2020 - 7:07:18 PM
Last modification on : Tuesday, February 18, 2020 - 2:36:01 PM

Identifiers

  • HAL Id : hal-02473637, version 1

Citation

Abir Fathallah, Mohamed Ibn Khedher, Mounim El Yacoubi, Najoua Essoukri Ben Amara. Triplet CNN-based word spotting of historical Arabic documents. ICONIP 2019: 26th International Conference on Neural Information Processing of the Asia-Pacific Neural Network Society, Dec 2019, Sydney, Australia. ⟨hal-02473637⟩
