Sequence Covering Similarity for Symbolic Sequence Comparison

Pierre-François Marteau 1
1 EXPRESSION - Expressiveness in Human Centered Data/Media
UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract : This paper introduces the sequence covering similarity, which we formally define to evaluate the similarity between a symbolic sequence (string) and a set of symbolic sequences (strings). From this covering similarity we derive a pairwise distance for comparing two symbolic sequences. We show that this covering distance is a semimetric. A few examples show how this string metric, computable in $O(n \cdot \log n)$, compares with the Levenshtein distance, which is in $O(n^2)$. A final example presents its application to plagiarism detection.
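For orientation, the quadratic baseline named in the abstract, the Levenshtein distance, can be sketched as follows. This is the standard textbook dynamic program, not code taken from the paper; the function name `levenshtein` is an assumption chosen for illustration. The covering similarity itself is not sketched here, since its definition is given in the paper and not in this record.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, in O(len(a) * len(b)) time."""
    # prev[j] is the distance between the prefix of a handled so far and b[:j].
    # Initially that prefix is empty, so the distance is j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Transforming a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca by cb, or match
            ))
        prev = curr
    return prev[-1]
```

The two nested loops over the characters of both strings are what give the $O(n^2)$ cost for strings of comparable length $n$; keeping only two rows limits the memory to $O(n)$.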
Document type :
Preprints, Working Papers, ...
Contributor : Pierre-François Marteau
Submitted on : Thursday, March 8, 2018 - 3:42:48 PM
Last modification on : Friday, April 19, 2019 - 4:55:10 PM
Long-term archiving on : Saturday, June 9, 2018 - 2:17:05 PM


Files produced by the author(s)


  • HAL Id : hal-01689286, version 3
  • ARXIV : 1801.07013


Pierre-François Marteau. Sequence Covering Similarity for Symbolic Sequence Comparison. 2018. ⟨hal-01689286v3⟩


