Conference papers

Segmentation-free speech text recognition for comic books

Abstract : Speech text in comic books is written in a particular manner by the scriptwriter, which raises unusual challenges for text recognition. We first detail these challenges and present different approaches to solve them. We compare the performance of a pre-trained OCR and a segmentation-free approach on speech text of comic books written in Latin script. We demonstrate that a few good-quality pre-trained OCR output samples, combined with other unlabeled data in the same writing style, can feed a segmentation-free OCR and improve text recognition. This is made possible by a lexicality measure that automatically accepts or rejects the pre-trained OCR output as pseudo ground truth for subsequent segmentation-free OCR training and recognition.
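
The accept/reject step mentioned in the abstract could look roughly like the sketch below. The abstract does not define the lexicality measure, so this assumes it is the fraction of recognized word tokens found in a lexicon; the function names, the 0.8 threshold, and the toy lexicon are all hypothetical and are not taken from the paper.

```python
import re


def lexicality(text, lexicon):
    """Assumed definition: share of word tokens in `text` present in `lexicon`."""
    # Lower-case the OCR output and keep only alphabetic runs (and apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in lexicon)
    return known / len(tokens)


def select_pseudo_ground_truth(ocr_outputs, lexicon, threshold=0.8):
    """Keep (image_id, text) pairs whose lexicality reaches the threshold.

    The threshold value is an illustrative assumption, not a reported setting.
    """
    return [
        (image_id, text)
        for image_id, text in ocr_outputs
        if lexicality(text, lexicon) >= threshold
    ]


# Toy data: one clean pre-trained OCR output and one noisy one.
lexicon = {"what", "are", "you", "doing", "here"}
ocr_outputs = [
    ("line_001", "WHAT ARE YOU DOING HERE?"),
    ("line_002", "WH4T AR3 Y0U D01NG"),
]
accepted = select_pseudo_ground_truth(ocr_outputs, lexicon)
```

Under these assumptions, only the accepted pairs would be passed on as pseudo ground truth for training the segmentation-free recognizer; the rejected lines would remain unlabeled.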

Contributor : Christophe Rigaud
Submitted on : Friday, March 2, 2018 - 9:52:58 AM
Last modification on : Thursday, May 12, 2022 - 3:37:34 PM
Long-term archiving on : Thursday, May 31, 2018 - 1:00:59 PM




Christophe Rigaud, Jean-Christophe Burie, Jean-Marc Ogier. Segmentation-free speech text recognition for comic books. 2nd International Workshop on coMics Analysis, Processing, and Understanding (MANPU), Nov 2017, Kyoto, Japan. ⟨10.1109/ICDAR.2017.288⟩. ⟨hal-01719619⟩


