Detecting Derivatives using Specific and Invariant Descriptors - HAL open archive
Conference paper, Year: 2011

Detecting Derivatives using Specific and Invariant Descriptors

Fabien Poulard
  • Role: Author

Abstract

This paper explores the detection of derivation links between texts (otherwise called plagiarism, near-duplication, revision, etc.) at the document level. We evaluate the use of textual elements implementing the ideas of specificity and invariance, as well as their combination, to characterize derivatives. We built a French press corpus based on Wikinews revisions to run this evaluation. We obtain performance similar to that of the state-of-the-art method (n-gram overlap) while reducing the signature size and thus the processing costs. To ensure the verifiability and reproducibility of our results, we make our code and our corpus available to the community.
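
The abstract names n-gram overlap as the state-of-the-art baseline but does not spell it out. The following is a minimal sketch of one common form of that idea, not the authors' implementation: it assumes whitespace tokenization, word trigrams, and a containment score (the share of the candidate's n-grams that also occur in the source). The function names `word_ngrams` and `ngram_containment` are hypothetical.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in `text` (assumed: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_containment(candidate, source, n=3):
    """Fraction of the candidate's n-grams that also appear in the source.

    This is one possible overlap measure (an assumption here); a score near 1
    suggests the candidate may derive from the source.
    """
    cand = word_ngrams(candidate, n)
    src = word_ngrams(source, n)
    if not cand:
        # Candidate shorter than n tokens: no n-grams to compare.
        return 0.0
    return len(cand & src) / len(cand)


if __name__ == "__main__":
    source = "the minister announced a new budget for schools on monday"
    revision = "the minister announced a new budget for hospitals on monday"
    print(ngram_containment(revision, source))
```

In this sketch the signature of a document is its full set of n-grams, which grows with document length; the abstract's point is that smaller signatures built from specific and invariant descriptors can reach similar performance at lower cost.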
File not deposited

Dates and versions

hal-01160955 , version 1 (08-06-2015)

Identifiers

  • HAL Id: hal-01160955, version 1

Cite

Fabien Poulard, Nicolas Hernandez, Béatrice Daille. Detecting Derivatives using Specific and Invariant Descriptors. Twelfth International Conference on Computational Linguistics and Intelligent Text Processing (CICLING 2010), Mar 2010, Tokyo, Japan. pp. 7-13. ⟨hal-01160955⟩