A semi-automatically generated TAG for Arabic: Dealing with linguistic phenomena

Abstract: Arabic is a challenging language when it comes to grammar production and parsing. It combines complex linguistic phenomena with a rich morphology, which makes its processing particularly ambiguous. This led us to choose the Tree-Adjoining Grammar (TAG) formalism. Indeed, TAG provides sufficient constraints for handling diverse linguistic phenomena and seems adequate for representing Arabic syntactic structures. In this paper, we present a semi-automatically generated TAG for Modern Standard Arabic, produced with XMG (eXtensible MetaGrammar), a metagrammatical description language and its associated compiler. We focus on the linguistic coverage of our grammar and show how we used the properties of TAG and XMG to define different linguistic phenomena expressively and concisely. To check the coverage of our grammar, we set up a development environment that includes a parser and uses a test corpus of linguistic phenomena containing both grammatical and ungrammatical sentences.
Document type: Conference paper
19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2018), Mar 2018, Hanoi, Vietnam. 2018
https://hal.archives-ouvertes.fr/hal-01762597
Contributor: Yannick Parmentier
Submitted on: Tuesday, April 10, 2018 - 11:39:18
Last modified on: Friday, October 26, 2018 - 10:27:13

Identifiers

  • HAL Id : hal-01762597, version 1

Citation

Chérifa Ben Khelil, Chiraz Zribi, Denys Duchier, Yannick Parmentier. A semi-automatically generated TAG for Arabic: Dealing with linguistic phenomena. 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2018), Mar 2018, Hanoi, Vietnam. 2018. 〈hal-01762597〉
