Anytime mining of sequential discriminative patterns in labeled sequences

Romain Mathonat 1, Diana Nurbakova 2, Jean-François Boulicaut 1, Mehdi Kaytoue 1
1 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
2 DRIM - Distribution, Recherche d'Information et Mobilité
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: It is extremely useful to exploit labeled datasets not only to learn models and perform predictive analytics but also to improve our understanding of a domain and its available targeted classes. The subgroup discovery task has been considered for more than two decades. It concerns the discovery of patterns covering sets of objects having interesting properties, e.g., they characterize or discriminate a given target class. Though many subgroup discovery algorithms have been proposed for both transactional and numerical data, discovering subgroups within labeled sequential data has been much less studied. First, we propose an anytime algorithm, SeqScout, that discovers interesting subgroups w.r.t. a chosen quality measure. This is a sampling algorithm that mines discriminant sequential patterns using a multi-armed bandit model. For a given budget, it finds a collection of local optima in the search space of descriptions and thus, subgroups. It requires only light configuration and is independent of the quality measure used for pattern scoring. We also introduce a second anytime algorithm, MCTSExtent, that pushes further the idea of a better trade-off between exploration and exploitation of a sampling strategy over the search space. [...] sequential data mining setting. We have conducted a thorough and comprehensive evaluation of our algorithms on several datasets to illustrate their added value, and we discuss their qualitative and quantitative results.
Document type: Journal articles

https://hal.archives-ouvertes.fr/hal-03000696
Contributor: Romain Mathonat
Submitted on: Thursday, November 12, 2020 - 9:31:11 AM
Last modification on: Tuesday, June 1, 2021 - 2:08:09 PM
Long-term archiving on: Saturday, February 13, 2021 - 6:37:00 PM

File

KAIS_MCTSExtent_V2.pdf
Files produced by the author(s)

Citation

Romain Mathonat, Diana Nurbakova, Jean-François Boulicaut, Mehdi Kaytoue. Anytime mining of sequential discriminative patterns in labeled sequences. Knowledge and Information Systems (KAIS), Springer, 2020, pp. 439-476. ⟨10.1007/s10115-020-01523-7⟩. ⟨hal-03000696⟩

Metrics

Record views: 95
File downloads: 220