Conference papers

A Comprehensive Neural-Based Approach for Text Recognition in Videos using Natural Language Processing

Khaoula Elagouni 1 Christophe Garcia 2 Pascale Sébillot 3
2 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
3 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This work aims at helping multimedia content understanding by deriving benefit from textual clues embedded in digital videos. For this, we developed a complete video Optical Character Recognition (OCR) system, specifically adapted to detect and recognize embedded texts in videos. Based on a neural approach, this new method outperforms related work, especially in terms of robustness to style and size variabilities, to background complexity and to low resolution of the image. A language model that drives several steps of the video OCR is also introduced in order to remove ambiguities due to a local letter-by-letter recognition and to reduce segmentation errors. This approach has been evaluated on a database of French TV news videos and achieves an outstanding character recognition rate of 95%, corresponding to 78% of words correctly recognized, which enables its incorporation into an automatic video indexing and retrieval system.
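As an illustration of how a language model can remove ambiguities left by letter-by-letter recognition, the following is a minimal sketch only, not the authors' system. It assumes the recognizer outputs a probability per candidate character at each position and combines those scores with a character bigram model through Viterbi decoding. The names `viterbi_decode`, `toy_bigram_logp`, `TOY_BIGRAMS` and `lm_weight`, as well as all probabilities, are hypothetical and chosen for the example.

```python
import math

# Hypothetical toy bigram probabilities; unseen pairs get a small default.
TOY_BIGRAMS = {("l", "e"): 0.5, ("1", "e"): 0.01}


def toy_bigram_logp(prev, cur):
    """Log-probability of character `cur` following `prev` (toy model)."""
    return math.log(TOY_BIGRAMS.get((prev, cur), 0.05))


def viterbi_decode(char_scores, bigram_logp, lm_weight=1.0):
    """Pick the character sequence maximizing recognizer + language model scores.

    char_scores: list (one entry per character position) of dicts mapping a
                 candidate character to the recognizer's probability for it.
    bigram_logp: function (prev_char, cur_char) -> log-probability.
    lm_weight:   weight of the language model term (0 disables it).
    """
    # best[c] = (log-score, path) of the best sequence ending in character c
    best = {c: (math.log(p), c) for c, p in char_scores[0].items()}
    for scores in char_scores[1:]:
        new_best = {}
        for c, p in scores.items():
            # Best predecessor for c, taking the bigram transition into account
            prev_score, prev_path = max(
                (score + lm_weight * bigram_logp(prev, c), path)
                for prev, (score, path) in best.items()
            )
            new_best[c] = (prev_score + math.log(p), prev_path + c)
        best = new_best
    return max(best.values())[1]


# The recognizer slightly prefers the digit "1" over the letter "l".
observations = [{"l": 0.45, "1": 0.55}, {"e": 0.9, "c": 0.1}]
with_lm = viterbi_decode(observations, toy_bigram_logp, lm_weight=1.0)
without_lm = viterbi_decode(observations, toy_bigram_logp, lm_weight=0.0)
```

In this toy case the purely local decision (`lm_weight=0.0`) yields "1e", while the bigram term steers the decoder to the French word "le". The paper's language model is described as also driving segmentation, which this sketch does not attempt.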

Cited literature [24 references]
Contributor : Pascale Sébillot
Submitted on : Sunday, November 27, 2011 - 4:06:56 PM
Last modification on : Wednesday, June 16, 2021 - 3:35:01 AM
Long-term archiving on : Tuesday, February 28, 2012 - 2:21:39 AM




  • HAL Id : hal-00645219, version 1


Khaoula Elagouni, Christophe Garcia, Pascale Sébillot. A Comprehensive Neural-Based Approach for Text Recognition in Videos using Natural Language Processing. ACM International Conference on Multimedia Retrieval, ICMR, Apr 2011, Trento, Italy. 8 p., 2 columns. ⟨hal-00645219⟩


