
Collaborative construction of a good quality, broad coverage and copyright free Japanese-French dictionary

Abstract : This research project belongs to the field of natural language processing (NLP), at the intersection of computer science and linguistics, and more specifically to multilingual lexicography and lexicology. On the Web, although French and Japanese are each well-resourced languages (Berment, 2004), the same cannot be said of the French-Japanese pair:
- Electronic French-Japanese bilingual dictionaries (denshi jishô) cannot be copied to a computer or reused;
- There is a French-Japanese dictionary on the Web, but it contains only 40,000 entries, has no examples and is not available for download.
By contrast, there are collaborative Web dictionaries such as the Japanese-English JMdict project led by Jim Breen (2004), which contains over 173,000 items. These resources are freely downloadable, which shows that such projects are feasible. During a first stay in Japan from November 2001 to March 2004, we had already noticed the lack of French-Japanese bilingual resources on the Web, which gave rise to the Papillon project on the construction of a multilingual lexical database with a pivot structure (Sérasset et al., 2001). Since then, progress has been made in several areas (technical, theoretical, social) (Mangeot, 2006), but the actual production of data has advanced very little. On the other hand, there is a new trend toward reusing existing lexical resources (word sense disambiguation, use of open-source resources such as Wiktionary and DBpedia, merging with ontologies, etc.). Although these experiments make it possible to consolidate and expand the coverage of existing resources, they still rely on data created by hand by professional lexicographers. There are printed French-Japanese dictionaries that are of good quality and old enough to be royalty-free. It should be possible to reuse these resources as part of our project to build a good-quality, broad-coverage dictionary available on the Web.
Based on this observation, we defined the following project: to build a rich multilingual lexical system, giving priority to the French-Japanese language pair. The construction will proceed first by reusing existing resources (printed Japanese-French dictionaries, dictionaries between Japanese and other languages, Wikipedia) and applying automatic operations (scanning and corrections, calculating translation links), and then through volunteer contributors working as a community on the Web. Contributors will work on dictionary articles according to their level of expertise and knowledge in lexicography or bilingual translation. The resulting resources will be royalty-free and intended for use both by humans, through conventional bilingual dictionaries, and by machines, in natural language processing tools (analysis, machine translation, etc.).
Document type :
Book sections

Contributor : Mathieu Mangeot
Submitted on : Tuesday, March 29, 2016 - 2:35:42 PM
Last modification on : Wednesday, July 6, 2022 - 4:24:10 AM
Long-term archiving on : Thursday, June 30, 2016 - 4:30:37 PM


  • HAL Id : hal-01294566, version 1


Mathieu Mangeot. Collaborative construction of a good quality, broad coverage and copyright free Japanese-French dictionary. Hosei University International Found Foreign Scholar Fellowship Report, 2016, Volume XVI 2013-2014. ⟨hal-01294566⟩


