Automatic transcription of 17th century English text in Contemporary English with NooJ: Method and Evaluation

Abstract: Since 2006 we have been describing the differences between 17th century English and contemporary English (CE) with the help of NLP software. Studying a corpus spanning the whole century (accounts by English travellers in the Ottoman Empire, Mary Astell's essay A Serious Proposal to the Ladies, and other literary texts) has enabled us to highlight various lexical, morphological and grammatical singularities. Using the NooJ linguistic platform, we created dictionaries that index the lexical variants together with their CE transcriptions. These transcriptions often result from validating forms recognized dynamically by morphological graphs. We also built syntactic graphs that transcribe certain archaic forms into contemporary English. Our previous research involved a succession of elementary steps alternating between textual analysis and validation of results. We were able to provide examples of transcriptions, but we have not yet created a global tool for automatic transcription. We therefore need to take stock of the results obtained so far, study the conditions for creating such a tool, and analyze the likely difficulties. In this paper, we discuss the technical and linguistic aspects not covered in our previous work. Building on our earlier results, we propose a transcription method for words or sequences identified as archaic.
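The dictionary-plus-validation idea described in the abstract can be illustrated with a minimal Python sketch. It is not NooJ and not the authors' implementation: the names `VARIANTS`, `CE_LEXICON`, `ICK_SUFFIX`, `transcribe_word` and `transcribe` are hypothetical, the word lists are small illustrative samples rather than entries from the authors' dictionaries, and a single regular expression stands in for a morphological graph.

```python
import re

# Hypothetical variant dictionary: archaic form -> contemporary English (CE).
# Illustrative entries only, not taken from the authors' NooJ dictionaries.
VARIANTS = {
    "doe": "do",
    "hath": "has",
    "doth": "does",
    "publick": "public",
}

# Hypothetical CE lexicon used to validate forms proposed by the rule below.
CE_LEXICON = {"music", "logic", "public", "critic"}

# Stand-in for a morphological graph: a word ending in "-ick" is a
# candidate for "-ic" (e.g. "musick" -> "music").
ICK_SUFFIX = re.compile(r"^(\w+)ick$")


def transcribe_word(word):
    """Return (transcription, source) for a single word."""
    lower = word.lower()
    if lower in VARIANTS:
        return VARIANTS[lower], "dictionary"
    match = ICK_SUFFIX.match(lower)
    if match:
        candidate = match.group(1) + "ic"
        # Validation step: accept the candidate only if it is a known CE word,
        # so that "sick" is not turned into "sic".
        if candidate in CE_LEXICON:
            return candidate, "rule"
    return word, "unchanged"


def transcribe(text):
    """Transcribe every alphabetic token; punctuation and spacing are kept."""
    def replace(match):
        word = match.group(0)
        new, _source = transcribe_word(word)
        # Preserve an initial capital letter from the original token.
        if word[0].isupper():
            new = new[0].upper() + new[1:]
        return new

    return re.sub(r"[A-Za-z]+", replace, text)


print(transcribe("He hath a love of musick."))
```

The lexicon check is a rough analogue of the validation step mentioned in the abstract: a dynamically recognized form is kept only once it has been confirmed. In this sketch the confirmation is automatic, whereas the abstract describes a validation of results between analysis steps.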
Document type:
Conference paper
Nooj '2011 Conference, Jun 2011, Dubrovnik, Croatia

Contributor: Odile Piton
Submitted on: Saturday, 1 September 2012 - 07:00:59
Last modified on: Monday, 27 November 2017 - 14:14:02
Document(s) archived on: Sunday, 2 December 2012 - 02:25:08




  • HAL Id: hal-00625884, version 1
  • arXiv: 1109.4906



Odile Piton, Slim Mesfar, Hélène Pignot. Automatic transcription of 17th century English text in Contemporary English with NooJ: Method and Evaluation. Nooj '2011 Conference, Jun 2011, Dubrovnik, Croatia. 〈hal-00625884〉




