Construction of an open-source multilingual lexical system targeted on French and Japanese through contributive and automatic methods: Research project in Japan

Abstract: This research falls within natural language processing (NLP), at the intersection of computer science and linguistics, and more specifically within multilingual lexicography and lexicology. During a first long stay in Japan from November 2001 to March 2004, we observed that French-Japanese lexical resources available on the Web were almost nonexistent. This observation gave rise to the Papillon project to build a multilingual lexical database with a pivot structure (Sérasset et al., 2001). Since then, progress has been made in several areas (technical, academic, social) (Mangeot, 2006), but the production of actual data has advanced very little. Meanwhile, the reuse of lexical resources has become a popular approach (word sense disambiguation, use of open-source resources such as Wiktionary or DBpedia, fusion with ontologies, etc.). Although such experiments can consolidate and extend the coverage of existing resources, they always start from data created manually by lexicographers. Based on this observation, we defined the following project, which involves building a multilingual lexical system with a focus on the French-Japanese language pair. Construction will rely on the reuse of existing resources (French-Japanese lexicons, Wiktionary), on automatic operations (reification of translation links, word sense disambiguation), and on a community of volunteer contributors working on the Web. Contributors will be asked to take part either through serious lexical games or directly on dictionary entries, according to their level of expertise in lexicography or bilingual translation. The resulting resources will be royalty free and designed to be used both by humans, through bilingual dictionaries, and by tools for automatic language processing (analysis, machine translation, etc.). We begin with a brief inventory of bilingual lexicography in general, focusing on French-Japanese in particular. We then present recent advances in the online construction of lexical resources, and describe in more detail the lexical system that we plan to build. We conclude with a description of the steps involved in this construction.
Document type:
Report
[Research Report] Laboratoire d'Informatique de Grenoble. 2015
Cited literature [20 references]

https://hal.archives-ouvertes.fr/hal-01294562
Contributor: Mathieu Mangeot
Submitted on: Tuesday, 29 March 2016 - 14:31:22
Last modified on: Thursday, 11 October 2018 - 08:48:03
Document(s) archived on: Thursday, 30 June 2016 - 16:35:02

File

ResearchProjectInJapanMathieuM...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01294562, version 1

Citation

Mathieu Mangeot. Construction of an open-source multilingual lexical system targeted on French and Japanese through contributive and automatic methods: Research project in Japan. [Research Report] Laboratoire d'Informatique de Grenoble. 2015. 〈hal-01294562〉

Metrics

Record views: 173

File downloads: 151