Conference papers

Fouille de données pour la stylistique : cas des motifs séquentiels émergents [Data mining for stylistics: the case of emerging sequential patterns]

Solen Quiniou 1, 2 Peggy Cellier 3 Thierry Charnois 1 Dominique Legallois 2 
1 Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image et Instrumentation de Caen
3 LIS - Logical Information Systems
IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract : In this paper, we study the use of data mining techniques for stylistic analysis from a linguistic point of view, focusing on emerging sequential patterns. First, we show that mining sequential patterns of words with gap constraints yields new, relevant linguistic patterns compared with patterns built on state-of-the-art n-grams. Then, we investigate how sequential patterns of itemsets can provide more generic linguistic patterns. We validate our approach both quantitatively and linguistically through experiments on three corpora of French texts of different types (poetry, letters, and fiction). Focusing on poetic texts in particular, we show that characteristic linguistic patterns can be identified using data mining techniques.
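
The two notions named in the abstract, sequential patterns with gap constraints and emerging patterns, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_gap` parameter, and the use of the standard growth-rate ratio (support in a target corpus divided by support in a reference corpus) are assumptions made for illustration, and the paper's exact definitions may differ.

```python
from typing import Sequence


def occurs_with_gap(pattern: Sequence[str], sentence: Sequence[str], max_gap: int) -> bool:
    """True if `pattern` occurs in `sentence` in order, skipping at most
    `max_gap` tokens between two consecutive pattern items (assumed constraint)."""

    def match_from(p_idx: int, s_idx: int) -> bool:
        # p_idx: next pattern item to match; s_idx: position of the previous match.
        if p_idx == len(pattern):
            return True
        # The next item must lie within max_gap + 1 positions of the previous match.
        last = min(len(sentence), s_idx + max_gap + 2)
        for j in range(s_idx + 1, last):
            if sentence[j] == pattern[p_idx] and match_from(p_idx + 1, j):
                return True
        return False

    if not pattern:
        return True
    return any(
        sentence[i] == pattern[0] and match_from(1, i)
        for i in range(len(sentence))
    )


def support(pattern: Sequence[str], corpus: Sequence[Sequence[str]], max_gap: int) -> float:
    """Fraction of sentences in `corpus` that contain `pattern`."""
    if not corpus:
        return 0.0
    hits = sum(occurs_with_gap(pattern, s, max_gap) for s in corpus)
    return hits / len(corpus)


def growth_rate(
    pattern: Sequence[str],
    target: Sequence[Sequence[str]],
    reference: Sequence[Sequence[str]],
    max_gap: int,
) -> float:
    """Support in `target` divided by support in `reference`.
    A pattern absent from `reference` but present in `target` gets infinity."""
    sup_target = support(pattern, target, max_gap)
    sup_reference = support(pattern, reference, max_gap)
    if sup_reference == 0.0:
        return float("inf") if sup_target > 0.0 else 0.0
    return sup_target / sup_reference
```

With `max_gap = 0` the check reduces to contiguous n-gram matching, which is why gap-constrained patterns can surface regularities that n-grams miss. A pattern whose growth rate exceeds a chosen threshold would be treated as characteristic of the target corpus (for example poetry versus the other text types).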

Cited literature [10 references]

https://hal.archives-ouvertes.fr/hal-00675586
Contributor : Solen Quiniou
Submitted on : Thursday, March 1, 2012 - 3:08:39 PM
Last modification on : Saturday, June 25, 2022 - 9:46:51 AM
Long-term archiving on : Thursday, June 14, 2012 - 5:05:45 PM

File

jadt2012.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00675586, version 1

Citation

Solen Quiniou, Peggy Cellier, Thierry Charnois, Dominique Legallois. Fouille de données pour la stylistique : cas des motifs séquentiels émergents. 11es Journées Internationales d'Analyse Statistique des Données Textuelles (JADT'12), Jun 2012, Liège, Belgique. pp.821-833. ⟨hal-00675586⟩

Metrics

  • Record views : 493
  • File downloads : 611