Document Image Analysis solutions for Digital libraries

Abstract : The development of digital libraries is reaching technological limits because of the difficulty of automatically processing a growing mass of digitized document images from different origins. The main problem is the high cost of the digitization and retro-conversion processes, which include image capture and indexing, metadata extraction, image storage, conversion into reusable electronic form, publication on the Internet, and reduction of image file sizes for faster access. To reduce the cost of digitization and retro-conversion, we need to break technological bottlenecks such as the development of "intelligent" digitizers that reduce manual intervention and produce the best-quality images. Retro-conversion needs efficient software that analyzes image contents and automatically extracts all the information necessary for image indexing. Other technological bottlenecks must also be considered, such as the need for an open file format that can describe digitized documents as heterogeneous media. This article is not a state-of-the-art survey of the domain; it describes some cases that we have studied in our laboratory over the past years.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01552445
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Sunday, July 2, 2017 - 10:34:54 PM
Last modification on : Friday, January 11, 2019 - 5:08:46 PM

Identifiers

  • HAL Id : hal-01552445, version 1

Citation

Frank Le Bourgeois, Éric Trinh, Bénédicte Allier, Véronique Eglin, Hubert Emptoz. Document Image Analysis solutions for Digital libraries. IEEE International Conference on Document Image Analysis for Libraries (DIAL'04), January 23-24, 2004, Palo Alto, California, United States, pp. 2-24. ⟨hal-01552445⟩
