Textual indexation of ancient documents

Yann Leydier; Frank Le Bourgeois; Hubert Emptoz

doi:10.1145/1096601.1096630

Conference paper, Year: 2005

Yann Leydier

Role: Author

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Frank Le Bourgeois

Role: Author
PersonId : 7699
IdHAL : frank-le-bourgeois

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Hubert Emptoz

Role: Author

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Abstract

In recent years, many levels of indexation have been developed to allow fast retrieval of digitized documents. Among all the ways of indexing a document, textual indexation allows the finest queries on the documents' content. Usually, the plain-text transcription of a digitized document is obtained by applying OCR (Optical Character Recognition) software to it. What if the OCR fails? Indeed, OCR systems are inefficient on low-quality printed documents and are unsuited to the processing of ancient fonts. Furthermore, OCR is not applicable to manuscript text recognition. In this paper we introduce two alternative methods of accessing text through the image: Computer Assisted Transcription and Word Spotting.

Domains

Computer Science [cs]

https://hal.science/hal-01592827

Submitted on: Monday, 25 September 2017, 14:35:16

Last modified on: Wednesday, 5 July 2023, 15:28:04

Dates and versions

hal-01592827 , version 1 (25-09-2017)

Identifiers

HAL Id : hal-01592827 , version 1
DOI : 10.1145/1096601.1096630

Cite

Yann Leydier, Frank Le Bourgeois, Hubert Emptoz. Textual indexation of ancient documents. ACM Symposium on Document Engineering, DocEng'05, Nov 2005, Bristol, United Kingdom. pp.111-117, ⟨10.1145/1096601.1096630⟩. ⟨hal-01592827⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS INSA-GROUPE UDL
