Conference papers

What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics?

Solen Quiniou 1 Peggy Cellier 2 Thierry Charnois 3 Dominique Legallois 1
2 LIS - Logical Information Systems
3 Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image et Instrumentation de Caen
Abstract: In this paper, we study the use of data mining techniques for stylistic analysis, from a linguistic point of view, by considering emerging sequential patterns. First, we show that mining sequential patterns of words with gap constraints gives new relevant linguistic patterns with respect to patterns built on n-grams. Then, we investigate how sequential patterns of itemsets can provide more generic linguistic patterns. We validate our approach from a linguistic point of view by conducting experiments on three corpora of various types of French texts (Poetry, Letters, and Fictions). By considering more particularly poetic texts, we show that characteristic linguistic patterns can be identified using data mining techniques. We also discuss how to improve our proposed approach so that it can be used more efficiently for linguistic analyses.
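To make the abstract's two central notions concrete, here is a minimal Python sketch of matching a word pattern under a gap constraint and of scoring it as an emerging pattern by its growth rate (support in one corpus divided by support in another). It is an illustration only, not the authors' implementation: the function names, the `max_gap` parameter, and the toy English sentences are all assumptions introduced here, and a real miner would enumerate candidate patterns rather than test a single given one.

```python
from typing import Sequence


def occurs_with_gap(pattern: Sequence[str], sentence: Sequence[str], max_gap: int) -> bool:
    """True if the pattern's words appear in order in the sentence, with at
    most `max_gap` tokens skipped between two consecutive matched words."""

    def match_from(p_idx: int, start: int, limit: int) -> bool:
        # All pattern words have been placed: the pattern occurs.
        if p_idx == len(pattern):
            return True
        # Try every allowed position for the next pattern word.
        for pos in range(start, min(limit, len(sentence))):
            if sentence[pos] == pattern[p_idx]:
                # The next word may sit at pos+1 .. pos+1+max_gap (inclusive).
                if match_from(p_idx + 1, pos + 1, pos + 2 + max_gap):
                    return True
        return False

    # The first word of the pattern may start anywhere in the sentence.
    return match_from(0, 0, len(sentence))


def support(pattern: Sequence[str], corpus: Sequence[Sequence[str]], max_gap: int) -> float:
    """Fraction of sentences in the corpus that contain the pattern."""
    if not corpus:
        return 0.0
    hits = sum(occurs_with_gap(pattern, s, max_gap) for s in corpus)
    return hits / len(corpus)


def growth_rate(pattern, target, background, max_gap: int) -> float:
    """Support in the target corpus divided by support in the background corpus."""
    s_target = support(pattern, target, max_gap)
    s_background = support(pattern, background, max_gap)
    if s_background == 0.0:
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_background


# Toy, made-up sentences (hypothetical data, for illustration only).
poetry = [
    ["the", "moon", "is", "a", "pale", "flower"],
    ["the", "sea", "is", "a", "mirror"],
]
other = [
    ["the", "train", "is", "late"],
    ["the", "house", "is", "a", "ruin"],
    ["she", "reads"],
    ["we", "left"],
]
pattern = ("the", "is", "a")
```

With `max_gap=0` the pattern behaves like a contiguous n-gram and does not match "the moon is a pale flower"; with `max_gap=1` it does, which is the kind of pattern an n-gram approach would miss. On this toy data, `growth_rate(pattern, poetry, other, max_gap=1)` compares a support of 1.0 against 0.25.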
Contributor: Solen Quiniou
Submitted on: Thursday, March 1, 2012 - 2:46:31 PM
Last modification on: Wednesday, December 15, 2021 - 1:42:02 PM
Long-term archiving on: Monday, November 26, 2012 - 10:21:32 AM


Publisher files allowed on an open archive



Solen Quiniou, Peggy Cellier, Thierry Charnois, Dominique Legallois. What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics? CICLing 2012: Computational Linguistics and Intelligent Text Processing, Mar 2012, New Delhi, India. pp.166-177, ⟨10.1007/978-3-642-28604-9_14⟩. ⟨hal-00675578⟩


