Conference papers

Clustering Comparable Corpora For Bilingual Lexicon Extraction

Abstract : This paper studies how to enhance the comparability of bilingual corpora in order to improve the quality of bilingual lexicons extracted from comparable corpora. We introduce a clustering-based approach for enhancing corpus comparability that exploits the homogeneity of the corpus and preserves most of the vocabulary of the original corpus. Our experiments support the soundness of this method and show that the bilingual lexicons obtained from the homogeneous corpus are of better quality than those obtained with previous approaches.
Cited literature [16 references]

https://hal.archives-ouvertes.fr/hal-00742264
Contributor : Eric Gaussier
Submitted on : Wednesday, October 17, 2012 - 4:12:22 PM
Last modification on : Monday, April 20, 2020 - 11:24:01 AM
Document(s) archived on : Friday, January 18, 2013 - 3:44:37 AM

File

Li-Gaussier-Azawa-ACL_11_web.p...
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00742264, version 1

Citation

Li Bo, Éric Gaussier, Akiko Aizawa. Clustering Comparable Corpora For Bilingual Lexicon Extraction. ACL-HLT 2011, Jun 2011, Portland, Oregon, United States. pp.473-478. ⟨hal-00742264⟩
