Conference paper, Year: 2008

Alignment Of Bilingual Named Entities In French–Arabic Parallel Corpora

Abstract

Research on Named Entity recognition and alignment is of strong interest for various natural language processing applications, such as cross-lingual information retrieval, document management, question-answering systems, and data mining. For Arabic, however, the task is particularly difficult, and few resources are available to cope with these difficulties. In this paper, we present a simple method of character transcoding, a kind of transliteration that we call character reduction, which could improve an alignment system for Named Entities such as anthroponyms and toponyms. This system has been applied and evaluated on a French-Arabic parallel corpus that was used during the Arcade 2 evaluation campaign. The purpose of the method is to bring the graphic forms of the two languages as close together as possible, in order to increase alignment precision. One outcome of such alignment is the ability to project onto the target language (Arabic) annotations that have been made on the source language, for which more tools and resources are available (French, English, etc.).
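To make the idea of character reduction concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give their transcoding table, so the mappings below (`ARABIC_TO_REDUCED`, `LATIN_TO_REDUCED`, `DROPPED`) and the function names are illustrative assumptions. The sketch maps both scripts to a small shared consonant alphabet, drops vowels and weak letters, and compares the reduced forms with a standard string-similarity ratio.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical reduction tables (assumptions, NOT the tables from the paper).
# Arabic letters absent from this map (alif, waw, ya, hamza, ...) are dropped.
ARABIC_TO_REDUCED = {
    "ب": "b", "ت": "t", "ط": "t", "د": "d", "ض": "d",
    "ر": "r", "ز": "z", "س": "s", "ص": "s", "ش": "s",
    "ف": "f", "ق": "k", "ك": "k", "ل": "l", "م": "m",
    "ن": "n", "ج": "j", "غ": "g",
}
# Latin consonants merged into the class that Arabic script would use.
LATIN_TO_REDUCED = {"p": "b", "v": "f", "c": "k", "q": "k"}
# Latin letters dropped entirely: vowels plus weak or silent letters.
DROPPED = set("aeiouywh")


def collapse(chars):
    """Join characters, merging runs of identical adjacent characters."""
    result = []
    for ch in chars:
        if not result or result[-1] != ch:
            result.append(ch)
    return "".join(result)


def reduce_french(word):
    """Reduce a French word to the shared consonant alphabet."""
    out = []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch):  # strip accents
            continue
        ch = LATIN_TO_REDUCED.get(ch, ch)
        if ch in DROPPED or not ch.isalpha():
            continue
        out.append(ch)
    return collapse(out)


def reduce_arabic(word):
    """Reduce an Arabic word, keeping only letters present in the table."""
    return collapse(ARABIC_TO_REDUCED[ch] for ch in word if ch in ARABIC_TO_REDUCED)


def similarity(french, arabic):
    """Similarity in [0, 1] between the reduced forms of two words."""
    return SequenceMatcher(None, reduce_french(french), reduce_arabic(arabic)).ratio()


print(reduce_french("Paris"), reduce_arabic("باريس"))
print(similarity("Liban", "لبنان"))
```

Under these assumed tables, a toponym and its Arabic counterpart tend to reduce to the same or a nearby string, which is the kind of surface closeness an aligner could exploit as a matching cue. A real system would need a carefully designed table and a similarity threshold tuned on the corpus.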
Main file
ACIT_2008.Abdoulhay.Kraif.final.pdf (98.74 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01073705, version 1 (14-03-2019)

Identifiers

  • HAL Id: hal-01073705, version 1

Cite

Authoul Abdulhay, Olivier Kraif. Alignment Of Bilingual Named Entities In French–Arabic Parallel Corpora. ACIT 2008, 2008, Hammamet, Tunisia. pp.1-8. ⟨hal-01073705⟩

Collections

UGA LIDILEM
56 views
36 downloads
