Conference paper. Year: 2009

Particular Words Mining and Article Spotting in Old French Gazettes

Abstract

This paper describes a method to extract particular words from a printed journal of the eighteenth century and then to retrieve those words in other journals of the same historical period. After extracting the words from the Gazette de Leyde dataset, we link them to the pages of contemporary journals, which facilitates the humanists' work. We present two original methods: one to detect the typographic style of the words, and the other to retrieve words in old document images. Because character segmentation is not trivial for these documents, we detect Italic style by analysing the vertical projection of the word image. The word retrieval method extracts zones of interest from the query images and tries to map them onto the documents being searched.
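The abstract does not say how the vertical projection is analysed, so the following is only an illustrative sketch and not the authors' algorithm. It assumes a binarised word image (1 = ink) and uses a common slant-estimation heuristic: shear the image by several candidate slants and keep the one that makes the column-wise ink counts most sharply peaked. The function names, the candidate range and the `threshold` value are all hypothetical.

```python
import numpy as np


def vertical_projection(img):
    """Count ink pixels in each column of a binary word image (1 = ink)."""
    return img.sum(axis=0)


def deslant(img, slant):
    """Undo a rightward slant by shifting each row to the left.

    Row y is shifted by -slant * (height above the bottom row), so a stroke
    leaning right with that slant becomes vertical.
    """
    h, w = img.shape
    pad = int(np.ceil(abs(slant) * (h - 1)))
    out = np.zeros((h, w + 2 * pad), dtype=img.dtype)
    for y in range(h):
        dx = -int(round(slant * (h - 1 - y)))
        out[y, pad + dx : pad + dx + w] = img[y]
    return out


def peakiness(img):
    """Sum of squared column counts: large when ink concentrates in few columns."""
    p = vertical_projection(img).astype(float)
    return float((p ** 2).sum())


def estimate_slant(img, candidates=None):
    """Return the candidate slant whose correction gives the peakiest projection."""
    if candidates is None:
        candidates = np.linspace(-0.5, 0.5, 21)  # assumed search range
    scores = [peakiness(deslant(img, s)) for s in candidates]
    return float(candidates[int(np.argmax(scores))])


def looks_italic(img, threshold=0.15):
    """Flag a word as Italic when its estimated slant exceeds an assumed threshold."""
    return estimate_slant(img) > threshold
```

In practice the threshold would have to be tuned on labelled words from the target typeface, and real scans would need binarisation and noise removal before this step.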
No file deposited

Dates and versions

hal-01437703, version 1 (17-01-2017)

Identifiers

  • HAL Id: hal-01437703, version 1

Cite

Loris Eynard, Yann Leydier, Hubert Emptoz. Particular Words Mining and Article Spotting in Old French Gazettes. Machine Learning and Data Mining in Pattern Recognition (MLDM) 2009, Jul 2009, Leipzig, Germany. pp.176-188. ⟨hal-01437703⟩