Computerization of African languages-French dictionaries

Abstract: This paper reports on work carried out in the DiLAF project, which converts five bilingual African language-French dictionaries, originally in Word format, into XML following the LMF model. The languages covered are Bambara, Hausa, Kanuri, Tamajaq and Songhai-Zarma, all still considered under-resourced with respect to Natural Language Processing tools. Once converted, the dictionaries are made available online on the Jibiki platform for lookup and modification. The paper first presents the DiLAF project and describes each dictionary. It then presents the methodology for converting .doc files to XML, with a specific discussion of Unicode usage, and details each step of the conversion into XML and LMF. The last part presents Jibiki, the lexical resource management platform used for the project.
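To make the target of the conversion more concrete, here is a minimal sketch of what one bilingual entry might look like once expressed as LMF-style XML. It is not the DiLAF conversion pipeline: the helper names (`feat`, `build_entry`, `entry_to_xml`), the choice of NFC normalization, and the sample Hausa word are assumptions made for illustration, and the element layout only follows the general `feat att/val` convention of the LMF DTD.

```python
import unicodedata
import xml.etree.ElementTree as ET


def feat(parent, att, val):
    """Attach an LMF-style <feat att="..." val="..."/> child to parent."""
    return ET.SubElement(parent, "feat", {"att": att, "val": val})


def build_entry(headword, pos, french):
    """Build one LexicalEntry element (hypothetical layout, not DiLAF's own).

    All text is normalized to Unicode NFC so that a letter typed as
    base character + combining mark is stored as one precomposed character.
    """
    def nfc(text):
        return unicodedata.normalize("NFC", text)

    entry = ET.Element("LexicalEntry")
    feat(entry, "partOfSpeech", nfc(pos))
    lemma = ET.SubElement(entry, "Lemma")
    feat(lemma, "writtenForm", nfc(headword))
    sense = ET.SubElement(entry, "Sense")
    equivalent = ET.SubElement(sense, "Equivalent")
    feat(equivalent, "language", "fra")
    feat(equivalent, "writtenForm", nfc(french))
    return entry


def entry_to_xml(entry):
    """Serialize an entry to a Unicode string."""
    return ET.tostring(entry, encoding="unicode")


# Illustrative entry (Hausa "ruwa", French "eau"); not taken from the
# DiLAF dictionaries themselves.
print(entry_to_xml(build_entry("ruwa", "noun", "eau")))
```

Normalizing at the point where entries are built is one way to keep headword lookup consistent later on; other normalization forms or a separate cleaning pass would work as well.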
Document type:
Conference paper
CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era, May 2014, Reykjavik, Iceland. pp.121, 2014

Cited literature: 22 references

https://hal.archives-ouvertes.fr/hal-00994821
Contributor: Mathieu Mangeot
Submitted on: Thursday, 22 May 2014 - 11:07:52
Last modified on: Saturday, 15 December 2018 - 01:49:36

File

ENGUEHARD_DiLAF_WSLREC2014_fin...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00994821, version 1
  • arXiv: 1405.5893

Citation

Chantal Enguehard, Mathieu Mangeot. Computerization of African languages-French dictionaries. CCURL 2014 : Collaboration and Computing for Under Resourced Languages in the Linked Open Data Era, May 2014, Reykjavik, Iceland. pp.121, 2014. 〈hal-00994821〉
