Conference papers

Segmentation-free speech text recognition for comic books

Abstract : Speech text in comic books is written in a particular manner by the scriptwriter, which raises unusual challenges for text recognition. We first detail these challenges and present different approaches to solve them. We compare the performance of a pre-trained OCR and a segmentation-free approach on speech text of comic books written in Latin script. We demonstrate that a few good-quality pre-trained OCR output samples, combined with other unlabeled data in the same writing style, can feed a segmentation-free OCR and improve text recognition. This is made possible by the lexicality measure, which automatically accepts or rejects the pre-trained OCR output as pseudo ground truth for subsequent segmentation-free OCR training and recognition.
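
The accept/reject step mentioned in the abstract can be pictured with a minimal sketch. The abstract does not define the lexicality measure, so the definition below (the fraction of alphabetic tokens found in a word list) is only an assumption for illustration; the function names, the `lexicon` argument and the `threshold` value are hypothetical and are not taken from the paper.

```python
import re


def lexicality(text, lexicon):
    """Assumed definition: share of alphabetic tokens of `text` present in `lexicon`."""
    # Tokens are maximal runs of letters; digits and symbols act as separators.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


def select_pseudo_ground_truth(ocr_lines, lexicon, threshold=0.8):
    """Keep the (image_id, text) pairs whose lexicality reaches `threshold`.

    The retained pairs would serve as pseudo ground truth for training a
    segmentation-free recognizer; the threshold is a hypothetical value.
    """
    return [
        (image_id, text)
        for image_id, text in ocr_lines
        if lexicality(text, lexicon) >= threshold
    ]
```

Under this assumed definition, a transcription such as "What is that?" is accepted against a lexicon containing its words, while a noisy output such as "Wh@t 1s th4t" breaks into fragments that are not in the lexicon and is rejected.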

Cited literature : 17 references

https://hal.archives-ouvertes.fr/hal-01719619
Contributor : Christophe Rigaud
Submitted on : Friday, March 2, 2018 - 9:52:58 AM
Last modification on : Monday, May 4, 2020 - 4:42:03 PM
Document(s) archived on : Thursday, May 31, 2018 - 1:00:59 PM

Citation

Christophe Rigaud, Jean-Christophe Burie, Jean-Marc Ogier. Segmentation-free speech text recognition for comic books. 2nd International Workshop on coMics Analysis, Processing, and Understanding (MANPU), Nov 2017, Kyoto, Japan. ⟨10.1109/ICDAR.2017.288⟩. ⟨hal-01719619⟩

Metrics

Record views : 108
Files downloads : 416