Textual indexation of ancient documents

Abstract : In recent years, many levels of indexation have been developed to allow fast retrieval of digitized documents. Among all the ways of indexing a document, textual indexation allows the finest queries on the documents' content. Usually, the plain-text transcription of a digitized document is obtained by applying OCR (Optical Character Recognition) software to it. What if the OCR fails? OCR systems are inefficient on low-quality printed documents and are unsuited to the processing of ancient fonts. Furthermore, OCR is not applicable to manuscript text recognition. In this paper we introduce two alternative methods of accessing text through the image: Computer Assisted Transcription and Word Spotting.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01592827
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Monday, September 25, 2017 - 2:35:16 PM
Last modification on : Wednesday, October 31, 2018 - 12:24:07 PM

Citation

Yann Leydier, Frank Le Bourgeois, Hubert Emptoz. Textual indexation of ancient documents. ACM Symposium on Document Engineering, DocEng'05, Nov 2005, Bristol, United Kingdom. pp.111-117, ⟨10.1145/1096601.1096630⟩. ⟨hal-01592827⟩
