What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics?

Solen Quiniou (1), Peggy Cellier (2), Thierry Charnois (3), Dominique Legallois (1)
(2) LIS - Logical Information Systems, IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
(3) Equipe CODAG - Laboratoire GREYC - UMR6072, GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract: In this paper, we study the use of data mining techniques for stylistic analysis, from a linguistic point of view, by considering emerging sequential patterns. First, we show that mining sequential patterns of words with gap constraints gives new relevant linguistic patterns with respect to patterns built on n-grams. Then, we investigate how sequential patterns of itemsets can provide more generic linguistic patterns. We validate our approach from a linguistic point of view by conducting experiments on three corpora of various types of French texts (Poetry, Letters, and Fictions). By considering more particularly poetic texts, we show that characteristic linguistic patterns can be identified using data mining techniques. We also discuss how to improve our proposed approach so that it can be used more efficiently for linguistic analyses.
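To make the contrast between n-grams and gap-constrained sequential patterns concrete, here is a minimal Python sketch. It is not the mining algorithm used in the paper: it only checks, by brute force, whether a given word pattern occurs in a tokenized sentence when up to `max_gap` words may be skipped between consecutive pattern items, and counts how many sentences contain it. The function names (`occurs_with_gap`, `support`) and the toy English corpus are assumptions made purely for illustration.

```python
from typing import Sequence


def occurs_with_gap(pattern: Sequence[str], tokens: Sequence[str], max_gap: int) -> bool:
    """Return True if `pattern` occurs in `tokens` in order, skipping at most
    `max_gap` tokens between two consecutive pattern items."""

    def search(p_idx: int, start: int, end: int) -> bool:
        # All pattern items have been matched.
        if p_idx == len(pattern):
            return True
        # Try every allowed position for the current pattern item.
        for i in range(start, min(end, len(tokens))):
            if tokens[i] == pattern[p_idx]:
                # The next item must appear within max_gap tokens after position i.
                if search(p_idx + 1, i + 1, i + 2 + max_gap):
                    return True
        return False

    # The first item may appear anywhere in the sentence.
    return search(0, 0, len(tokens))


def support(pattern: Sequence[str], corpus: Sequence[Sequence[str]], max_gap: int) -> int:
    """Number of sentences in `corpus` that contain `pattern` under the gap constraint."""
    return sum(occurs_with_gap(pattern, sentence, max_gap) for sentence in corpus)


# Hypothetical toy corpus (not taken from the paper's data).
corpus = [
    ["the", "heart", "of", "the", "night"],
    ["the", "dark", "heart", "of", "winter"],
    ["a", "heart", "full", "of", "sorrow"],
]

# max_gap=0 behaves like a contiguous n-gram match.
ngram_support = support(("heart", "of"), corpus, max_gap=0)
# max_gap=1 also accepts "heart full of".
gapped_support = support(("heart", "of"), corpus, max_gap=1)
```

With `max_gap=0` the check reduces to a contiguous n-gram match, so raising the gap can only keep or increase the support of a pattern. This is one simple way to see why gap constraints may surface patterns that n-grams miss.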
Document type: Conference paper
International Conference on Intelligent Text Processing and Computational Linguistics (CICLing'12), Mar 2012, New Delhi, India. pp.166-177, 2012

https://hal.archives-ouvertes.fr/hal-00675578
Contributor: Solen Quiniou
Submitted on: Thursday, March 1, 2012 - 14:46:31
Last modified on: Thursday, February 22, 2018 - 01:24:51
Document(s) archived on: Monday, November 26, 2012 - 10:21:32

File

cicling2012.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-00675578, version 1

Citation

Solen Quiniou, Peggy Cellier, Thierry Charnois, Dominique Legallois. What About Sequential Data Mining Techniques to Identify Linguistic Patterns for Stylistics? International Conference on Intelligent Text Processing and Computational Linguistics (CICLing'12), Mar 2012, New Delhi, India. pp. 166-177, 2012. 〈hal-00675578〉

Metrics

Record views: 684

File downloads: 618