Deep Investigation of Cross-Language Plagiarism Detection Methods

Abstract : This paper presents a deep investigation of cross-language plagiarism detection methods on a recently introduced open dataset, which contains parallel and comparable collections of documents with varied characteristics (different genres, languages, and text sizes). We evaluate these methods on 6 language pairs and at 2 granularities of text units in order to draw robust conclusions about the best methods, while analyzing in depth the correlations across document styles and languages.

https://hal.archives-ouvertes.fr/hal-01531346
Contributor : Jérémy Ferrero
Submitted on : Thursday, June 1, 2017 - 3:43:45 PM
Last modification on : Tuesday, February 12, 2019 - 1:31:24 AM
Document(s) archived on : Wednesday, September 6, 2017 - 7:10:53 PM

File

bucc.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01531346, version 1

Citation

Jérémy Ferrero, Laurent Besacier, Didier Schwab, Frédéric Agnès. Deep Investigation of Cross-Language Plagiarism Detection Methods. BUCC, 10th Workshop on Building and Using Comparable Corpora, Aug 2017, Vancouver, Canada. ⟨hal-01531346⟩
