Conference paper, Year: 2015

POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools

Abstract

Developing natural language processing tools usually requires a large number of resources (lexica, annotated corpora, etc.), which often do not exist for less-resourced languages. One way to overcome this lack of resources is to devote substantial effort to building new ones from scratch. Another is to exploit existing resources for closely related languages. In this paper, we focus on developing a part-of-speech tagger for the Tunisian Arabic dialect (TUN), a low-resource language, by exploiting its closeness to Modern Standard Arabic (MSA), which has many state-of-the-art resources and tools. Our system achieved an accuracy of 89% (∼20% absolute improvement over an MSA tagger baseline).
Main file
W15-3207.pdf (172.72 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01464860, version 1 (15-02-2017)

Cite

Ahmed Hamdi, Alexis Nasr, Nizar Habash, Núria Gala. POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools. Workshop on Arabic Natural Language Processing, Jul 2015, Beijing, China. pp. 59-68, ⟨10.18653/v1/W15-3207⟩. ⟨hal-01464860⟩