A Comprehensive Neural-Based Approach for Text Recognition in Videos using Natural Language Processing

Khaoula Elagouni 1 Christophe Garcia 2 Pascale Sébillot 3
2 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
3 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This work aims at helping multimedia content understanding by deriving benefit from textual clues embedded in digital videos. To this end, we developed a complete video Optical Character Recognition (OCR) system, specifically adapted to detect and recognize embedded text in videos. Based on a neural approach, this new method outperforms related work, especially in terms of robustness to variability in style and size, to background complexity, and to low image resolution. A language model that drives several steps of the video OCR is also introduced in order to remove ambiguities due to local letter-by-letter recognition and to reduce segmentation errors. This approach has been evaluated on a database of French TV news videos and achieves an outstanding character recognition rate of 95%, corresponding to 78% of words correctly recognized, which enables its incorporation into an automatic video indexing and retrieval system.
Cited literature [24 references]

https://hal.archives-ouvertes.fr/hal-00645219
Contributor : Pascale Sébillot
Submitted on : Sunday, November 27, 2011 - 4:06:56 PM
Last modification on : Tuesday, February 26, 2019 - 11:20:53 AM
Long-term archiving on : Tuesday, February 28, 2012 - 2:21:39 AM

File

Paper_74.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00645219, version 1

Citation

Khaoula Elagouni, Christophe Garcia, Pascale Sébillot. A Comprehensive Neural-Based Approach for Text Recognition in Videos using Natural Language Processing. ACM International Conference on Multimedia Retrieval, ICMR, Apr 2011, Trento, Italy. 8 p., 2 columns. ⟨hal-00645219⟩
