Collaborative Construction of a Good Quality, Broad Coverage and Copyright Free Japanese-French Dictionary

Abstract: Although French and Japanese are each regarded as well-resourced languages in terms of tools and linguistic resources, French-Japanese is considered an under-resourced language pair as far as availability on the Web is concerned. Indeed, few bilingual electronic lexical resources are both of good quality and free of royalties and copyright. French-Japanese aligned bilingual corpora and machine translation systems are, unsurprisingly, just as rare. Fortunately, there are printed French-Japanese dictionaries of good quality that are old enough to be royalty-free. It should be possible to reuse these resources as part of our project to build a good-quality, broad-coverage dictionary available on the Web. To update this data, whose vocabulary may be dated, we could reuse existing electronic resources such as Wikipedia or Japanese-English electronic resources. The resulting resource could then be made available on the Web for lookup and correction by volunteer contributors. This methodology could be applied to other language pairs in a similar situation, with good printed dictionaries but few electronic resources. We first take an inventory of Japanese bilingual dictionaries (printed or electronic) and their historical evolution. We then describe the resource we want to build. The next part concerns the conversion of three resources: the Cesselin Japanese-French printed dictionary, the language links between Japanese, French and English Wikipedia pages, and the JMdict Japanese-English electronic dictionary. The Cesselin dictionary was scanned, processed with OCR, and parsed to detect headwords and entries. Several error-correction passes were then performed on the French and Japanese text. New entries were created from Wikipedia links, and JMdict entries missing from the resulting resource were then converted and added. Finally, we released the resource on a Web site built around the Jibiki platform, which allows articles to be viewed and edited online. A French-Japanese bilingual corpus and an active reading module are also available. The resulting resources (dictionaries and corpora) are available for download on the project website. The data is released into the public domain.
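The merge order described in the abstract (Cesselin entries first, then new entries from Wikipedia links, then JMdict entries still missing from the result) might be sketched roughly as follows. This is only an illustrative reading of the abstract, not the project's actual code: the `Entry` record, its fields, the `merge_sources` function and the rule that an earlier source always wins on a shared headword are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Entry:
    """Hypothetical minimal entry record (not the project's real schema)."""
    headword: str        # Japanese headword
    glosses: List[str]   # translations of the headword
    gloss_lang: str      # "fra" for French glosses, "eng" for English ones
    source: str          # resource the entry was taken from


def merge_sources(cesselin: Iterable[Entry],
                  wikipedia: Iterable[Entry],
                  jmdict: Iterable[Entry]) -> Dict[str, Entry]:
    """Merge three resources in priority order.

    Assumption: a later source only contributes headwords that are
    still missing, so an existing entry is never overwritten.
    """
    merged: Dict[str, Entry] = {}
    for entries in (cesselin, wikipedia, jmdict):
        for entry in entries:
            # setdefault keeps the first entry seen for a headword
            merged.setdefault(entry.headword, entry)
    return merged


# Toy data for illustration only; not taken from the real resources.
cesselin = [Entry("犬", ["chien"], "fra", "cesselin")]
wikipedia = [
    Entry("犬", ["Chien"], "fra", "wikipedia"),
    Entry("インターネット", ["Internet"], "fra", "wikipedia"),
]
jmdict = [
    Entry("インターネット", ["Internet"], "eng", "jmdict"),
    Entry("携帯電話", ["mobile phone"], "eng", "jmdict"),
]

merged = merge_sources(cesselin, wikipedia, jmdict)
for headword, entry in merged.items():
    print(headword, entry.glosses, entry.gloss_lang, entry.source)
```

In this sketch, entries converted from JMdict keep their English glosses and are tagged with `gloss_lang`, so that a contributor could later supply French translations. Whether the real resource handles them this way is not stated in the abstract.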
Document type:
Journal article
International Journal of Lexicography, Oxford University Press (OUP), 2016, 31 (1), pp.78-112. 〈https://academic.oup.com/ijl/article-abstract/31/1/78/2555494〉. 〈10.1093/ijl/ecw035〉
https://hal.archives-ouvertes.fr/hal-01712271
Contributor: Mathieu Mangeot
Submitted on: Monday, 19 February 2018 - 12:17:49
Last modified on: Thursday, 11 October 2018 - 08:48:03

Citation

Mathieu Mangeot-Nagata. Collaborative Construction of a Good Quality, Broad Coverage and Copyright Free Japanese-French Dictionary. International Journal of Lexicography, Oxford University Press (OUP), 2016, 31 (1), pp.78-112. 〈https://academic.oup.com/ijl/article-abstract/31/1/78/2555494〉. 〈10.1093/ijl/ecw035〉. 〈hal-01712271〉
