Conference paper, Year: 2017

Creating Parallel Arabic Dialect Corpus: Pitfalls to Avoid

Abstract

Creating parallel corpora is a difficult problem that many researchers try to address. In the context of under-resourced languages such as Arabic dialects, this problem is more complicated because of the nature of these spoken languages. In this paper, we share our experience of creating a parallel corpus that contains several dialects and Modern Standard Arabic (MSA). We attempt to highlight the most important choices that we made and how good these choices were.
Main file
paper 258.pdf (210.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01557405, version 1 (06-07-2017)

Identifiers

  • HAL Id: hal-01557405, version 1

Cite

Salima Harrat, Karima Meftouh, Kamel Smaïli. Creating Parallel Arabic Dialect Corpus: Pitfalls to Avoid. 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING), Apr 2017, Budapest, Hungary. ⟨hal-01557405⟩