Outillage de l'accès aux textes par la lecture active étymologique multilingue pour apprenants berbérophones et arabophones

Abstract : Helping access to texts through etymological multilingual active reading for learners of Berber and Arabic.

In the context of the Etymolo project of LIG-GETALP, we are interested in active reading for Berber speakers who are schooled first in Arabic, then in French, and who later study Berber formally. In this type of situation, an aid such as active reading can be very useful, and it should be multilingual. These languages share many "cognates" (complete synonyms, or words spelt in the same manner but not having the same meaning), in both spoken and written form, because of etymological relationships or borrowings. We therefore propose to support learners' lexical learning by showing them the "cognates" of the words in the text they are reading, which leads to the concept of etymological active reading (LAE, from the French "lecture active étymologique").

We identified the possible forms of help in ten real situations and specified an extension of the CESSELIN/LECTURE interface (dedicated to Japanese-French) that shows the lexical equivalents in 1 to 4 target languages and includes the "etymological" aspect, so that it constitutes a significant help. On the basis of CESSELIN/LECTURE, we have produced a simulation of this extended interface, with Berber as the source language and French, Arabic and the Algerian Arabic dialect as targets.

Since we handle under-resourced language pairs, our first problem is to collect or build the necessary resources: a large set of cognates between these languages, a lexical base usable for each language pair, and a lemmatizer for each language. In the case of Berber, we are collecting dictionaries and loading them into the JIBIKI platform, along with monolingual corpora in Berber (written in tifinagh or tamamritrit-latin), all in XML format and UTF-8 encoding.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-02054881
Contributor : Mathieu Mangeot
Submitted on : Saturday, March 2, 2019 - 4:38:56 PM
Last modification on : Tuesday, April 2, 2019 - 1:47:18 AM
Long-term archiving on : Friday, May 31, 2019 - 1:08:15 PM

File

TALAf2018_SAVBMMCB.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02054881, version 1

Citation

Slimane Abdellaoui, Valérie Bellynck, Mathieu Mangeot, Christian Boitet. Outillage de l'accès aux textes par la lecture active étymologique multilingue pour apprenants berbérophones et arabophones. Traitement Automatique des Langues Africaines TALAf 2018, Sep 2018, Grenoble, France. ⟨hal-02054881⟩
