A Proposition for Sequence Mining Using Pattern Structures

Victor Codocedo (1), Guillaume Bosc (1), Mehdi Kaytoue (1), Jean-François Boulicaut (1), Amedeo Napoli (2)
(1) DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
(2) ORPAILLEUR - Knowledge representation, reasoning
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In this article we present a novel approach to rare sequence mining using pattern structures. In particular, we are interested in mining closed sequences, a type of maximal sub-element that provides a succinct description of the patterns in a sequence database. We present and describe a sequence pattern structure model in which rare closed subsequences can be easily encoded. We also discuss and characterize the search space of closed sequences and, through the notion of sequence alignments, provide an intuitive implementation of a similarity operator for the sequence pattern structure based on directed acyclic graphs. Finally, we provide an experimental evaluation of our approach against state-of-the-art closed sequence mining algorithms, showing that our approach can largely outperform them when dealing with large regions of the search space.
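
To make the notion of a closed sequence concrete, the following is a minimal brute-force sketch of the standard definition: a subsequence is closed when no proper supersequence has the same support in the database. It is not the pattern structure approach of the paper. It assumes, for illustration only, that sequences are made of single items rather than itemsets, and the names `is_subsequence`, `support`, `all_subsequences`, and `closed_sequences` are hypothetical helpers introduced here.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern can be obtained from sequence by deleting items."""
    it = iter(sequence)
    # 'item in it' advances the iterator, so the order of items is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def all_subsequences(database):
    """Every non-empty subsequence occurring in at least one sequence."""
    patterns = set()
    for seq in database:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                patterns.add(tuple(seq[i] for i in idx))
    return patterns


def closed_sequences(database, min_support=1):
    """Map each closed subsequence to its support (exponential; toy data only)."""
    patterns = all_subsequences(database)
    supp = {p: support(p, database) for p in patterns}
    closed = {}
    for p in patterns:
        if supp[p] < min_support:
            continue
        # p is closed if no proper supersequence has the same support.
        has_equal_super = any(
            len(q) > len(p) and supp[q] == supp[p] and is_subsequence(p, q)
            for q in patterns
        )
        if not has_equal_super:
            closed[p] = supp[p]
    return closed


if __name__ == "__main__":
    toy_database = ["abc", "acb", "ab"]  # hypothetical toy data
    for pattern, count in sorted(closed_sequences(toy_database).items()):
        print("".join(pattern), count)
```

The enumeration is exponential in the sequence length, so this sketch is only meant to illustrate the definition on toy data; it says nothing about how the search space is actually explored in the paper.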

https://hal.archives-ouvertes.fr/hal-01549107
Contributor: Mehdi Kaytoue
Submitted on: Wednesday, June 28, 2017 - 3:07:31 PM
Last modification on: Thursday, February 7, 2019 - 3:27:56 PM
Long-term archiving on: Wednesday, January 17, 2018 - 8:33:35 PM

File

ICFCA2017.pdf
Files produced by the author(s)

Citation

Victor Codocedo, Guillaume Bosc, Mehdi Kaytoue, Jean-François Boulicaut, Amedeo Napoli. A Proposition for Sequence Mining Using Pattern Structures. ICFCA 2017 - 14th International Conference on Formal Concept Analysis, Peggy Cellier and Sébastien Ferré, Jun 2017, Rennes, France. pp.106-121, ⟨10.1007/978-3-319-59271-8_7⟩. ⟨hal-01549107⟩
