Deep Investigation of Cross-Language Plagiarism Detection Methods

Abstract: This paper presents an in-depth investigation of cross-language plagiarism detection methods on a recently introduced open dataset containing parallel and comparable document collections with varied characteristics (different genres, languages, and text sizes). We evaluate these methods on 6 language pairs at 2 granularities of text units in order to draw robust conclusions about the best-performing methods, and we analyze in depth how the results correlate across document styles and languages.
Document type:
Conference paper
BUCC, 10th Workshop on Building and Using Comparable Corpora, Aug 2017, Vancouver, Canada.

Cited literature: 18 references

https://hal.archives-ouvertes.fr/hal-01531346
Contributor: Jérémy Ferrero
Submitted on: Thursday, June 1, 2017 - 15:43:45
Last modified on: Thursday, October 11, 2018 - 08:48:03
Document(s) archived on: Wednesday, September 6, 2017 - 19:10:53

File

bucc.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01531346, version 1

Citation

Jérémy Ferrero, Laurent Besacier, Didier Schwab, Frédéric Agnès. Deep Investigation of Cross-Language Plagiarism Detection Methods. BUCC, 10th Workshop on Building and Using Comparable Corpora, Aug 2017, Vancouver, Canada. BUCC, 10th Workshop on Building and Using Comparable Corpora, 2017. 〈hal-01531346〉

Metrics

Record views

129

File downloads

127