Automatic transcription of 17th century English text in Contemporary English with NooJ: Method and Evaluation

Abstract: Since 2006 we have been describing the differences between 17th century English and contemporary English (CE) using NLP software. Studying a corpus spanning the whole century (tales of English travellers in the Ottoman Empire, Mary Astell's essay A Serious Proposal to the Ladies, and other literary texts) has enabled us to highlight various lexical, morphological and grammatical singularities. With the NooJ linguistic platform, we created dictionaries indexing the lexical variants and their transcription in CE. These transcriptions are often obtained by validating forms recognized dynamically by morphological graphs. We also built syntactic graphs that transcribe certain archaic forms into contemporary English. Our previous research involved a succession of elementary steps alternating textual analysis and result validation. We managed to provide examples of transcriptions, but we have not yet created a global tool for automatic transcription. We therefore need to review the results obtained so far, study the conditions for creating such a tool, and analyze the possible difficulties. In this paper, we discuss technical and linguistic aspects not covered in our previous work. Building on the results of previous research, we propose a transcription method for words or sequences identified as archaic.
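The two-stage idea in the abstract (a dictionary of known variants, plus forms recognized dynamically and then validated) can be illustrated with a minimal Python sketch. This is not NooJ's dictionary or graph formalism, and it is not the authors' implementation: the names `VARIANTS`, `CONTEMPORARY`, `transcribe_word` and `transcribe`, the sample entries, and the single "-eth" rule are all assumptions chosen for illustration.

```python
import re

# Hypothetical variant dictionary: archaic spelling -> contemporary form.
VARIANTS = {
    "doe": "do",
    "onely": "only",
    "publick": "public",
    "hath": "has",
}

# Hypothetical contemporary lexicon used to validate dynamically built forms.
CONTEMPORARY = {"makes", "thinks", "comes"}


def transcribe_word(word):
    """Return a contemporary form for one word, or the word unchanged."""
    lower = word.lower()
    result = None
    if lower in VARIANTS:
        # Stage 1: direct dictionary lookup.
        result = VARIANTS[lower]
    elif lower.endswith("eth") and len(lower) > 4:
        # Stage 2: illustrative morphological rule for third-person "-eth".
        # Candidates are kept only if the lexicon validates them.
        stem = lower[:-3]
        for candidate in (stem + "s", stem + "es"):
            if candidate in CONTEMPORARY:
                result = candidate
                break
    if result is None:
        return word
    # Preserve an initial capital letter from the source word.
    return result.capitalize() if word[0].isupper() else result


def transcribe(text):
    """Transcribe every alphabetic token, leaving punctuation untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: transcribe_word(m.group(0)), text)


print(transcribe("He maketh the publick good his onely care."))
```

Unvalidated candidates are left as they are rather than guessed, which mirrors the alternation of recognition and validation described above; sequences that need syntactic context are outside the scope of this word-level sketch.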
Document type:
Conference paper
Nooj '2011 Conference, Jun 2011, Dubrovnik, Croatia

Cited literature: 13 references

https://hal.archives-ouvertes.fr/hal-00625884
Contributor: Odile Piton
Submitted on: Saturday, 1 September 2012 - 07:00:59
Last modified on: Monday, 9 February 2015 - 01:00:44
Document(s) archived on: Sunday, 2 December 2012 - 02:25:08

File

Automatic_transcription_of_17t...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00625884, version 1
  • arXiv: 1109.4906

Citation

Odile Piton, Slim Mesfar, Hélène Pignot. Automatic transcription of 17th century English text in Contemporary English with NooJ: Method and Evaluation. Nooj '2011 Conference, Jun 2011, Dubrovnik, Croatia. 〈hal-00625884〉
