Extraction of Recurrent Patterns from Stratified Ordered Trees

Jean-Gabriel Ganascia 1
1 APA - Apprentissage et Acquisition des connaissances
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: This paper proposes a new algorithm for extracting patterns from Stratified Ordered Trees (SOT). It first describes the SOT data structure, which makes it possible to represent structured sequential data. It then shows how clusters of similar recurrent patterns can be extracted from any SOT. The similarity measure underlying the clustering algorithm is a generalized edit distance, which is also described in the paper. The algorithms have been tested on a text-mining task: detecting recurrent syntactic motifs in texts drawn from classical literature. The algorithm may also apply to many other fields where data are naturally sequential (e.g. financial data, molecular biology, traces of computation).
Document type: Conference paper

https://hal.archives-ouvertes.fr/hal-01571846
Contributor: Lip6 Publications
Submitted on: Thursday, August 3, 2017 - 5:05:46 PM
Last modification on: Thursday, March 21, 2019 - 1:09:52 PM

Citation

Jean-Gabriel Ganascia. Extraction of Recurrent Patterns from Stratified Ordered Trees. ECML 2001 - 12th European Conference on Machine Learning, Sep 2001, Freiburg, Germany. pp.167-178, ⟨10.1007/3-540-44795-4_15⟩. ⟨hal-01571846⟩
