Discovering linguistic patterns using sequence mining

Nicolas Béchet (1), Peggy Cellier (2), Thierry Charnois (1), Bruno Crémilleux (1)
(1) Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
(2) LIS - Logical Information Systems
IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract: In this paper, we present a method based on data mining techniques to automatically discover linguistic patterns matching appositive qualifying phrases. We develop an algorithm that mines sequential patterns made of itemsets under gap and linguistic constraints. The itemsets allow several kinds of information to be associated with one term. The advantage is that the extracted linguistic patterns are more expressive than the usual sequential patterns. In addition, the constraints make it possible to prune irrelevant patterns automatically. In order to manage the set of generated patterns, we propose a solution based on a partial ordering. A human user can thus easily validate them as relevant linguistic patterns. We illustrate the efficiency of our approach over two corpora coming from a newspaper

Cited literature: 18 references

https://hal.archives-ouvertes.fr/hal-01023109
Contributor: Greyc Référent
Submitted on: Tuesday, July 15, 2014 - 9:15:45 AM
Last modification on: Tuesday, February 26, 2019 - 6:06:03 PM
Long-term archiving on: Thursday, November 20, 2014 - 6:15:11 PM

File

ACTI-BECHET-2012-1.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01023109, version 1

Citation

Nicolas Béchet, Peggy Cellier, Thierry Charnois, Bruno Crémilleux. Discovering linguistic patterns using sequence mining. 13th Int. Conf. on Intelligent Text Processing and Computational Linguistics (CICLing'12), Mar 2012, New Delhi, India. pp.154-165. ⟨hal-01023109⟩

Metrics

Record views: 506
File downloads: 366