Efficiently mining frequent itemsets applied for textual aggregation

Mustapha M Bouakkaz; Youcef Ouinten; Sabine Loudcher; Philippe Fournier-Viger

doi:10.1007/s10489-017-1050-9

Journal article, Applied Intelligence, Year: 2017


Mustapha M Bouakkaz

Role: Author

Université Amar Telidji - Laghouat

Youcef Ouinten

Role: Author

Université Amar Telidji - Laghouat

Sabine Loudcher

Role: Author
PersonId : 869063
IdHAL : sabine-loudcher
IdRef : 112760937

Entrepôts, Représentation et Ingénierie des Connaissances

Philippe Fournier-Viger

Role: Author
PersonId : 992449

Université de Moncton

Abstract

Text mining approaches are commonly used to discover relevant information and relationships in huge amounts of text data. The term data mining refers to methods for analyzing data with the objective of finding patterns that aggregate the main properties of the data. Combining data mining approaches with on-line analytical processing (OLAP) tools allows us to refine the techniques used in textual aggregation. In this paper, we propose a novel aggregation function for textual data based on the discovery of frequent closed patterns in a generated documents/keywords matrix. Our contribution uses a data mining technique, namely a closed pattern mining algorithm, to aggregate keywords. An experimental study on a real corpus of more than 700 scientific papers collected from Microsoft Academic Search shows that the proposed algorithm largely outperforms four state-of-the-art textual aggregation methods in terms of recall, precision, F-measure and runtime.
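To make the notion of frequent closed patterns over a documents/keywords matrix concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper, which this record does not describe. It is a brute-force enumeration over a hypothetical toy corpus, and the names `docs`, `support`, and `closed_frequent_keyword_sets` are assumptions introduced only for illustration. A keyword set is frequent if at least `min_support` documents contain it, and closed if no proper superset has the same support.

```python
from itertools import combinations

# Hypothetical toy corpus: each document is reduced to its set of keywords.
# This is the set-based view of a documents/keywords matrix.
docs = {
    "d1": {"olap", "text mining", "aggregation"},
    "d2": {"olap", "aggregation"},
    "d3": {"olap", "text mining", "aggregation"},
    "d4": {"data mining", "olap"},
}


def support(itemset, docs):
    """Count the documents whose keyword set contains every keyword in itemset."""
    return sum(1 for keywords in docs.values() if itemset <= keywords)


def closed_frequent_keyword_sets(docs, min_support):
    """Return {keyword set: support} for every frequent closed keyword set.

    Brute force: exponential in the vocabulary size, so suitable for
    toy data only (an assumption of this sketch, not the paper's method).
    """
    vocabulary = sorted(set().union(*docs.values()))

    # Step 1: collect every frequent keyword set with its support.
    frequent = {}
    for size in range(1, len(vocabulary) + 1):
        for combo in combinations(vocabulary, size):
            itemset = frozenset(combo)
            count = support(itemset, docs)
            if count >= min_support:
                frequent[itemset] = count

    # Step 2: keep only closed sets, i.e. those with no proper superset
    # of equal support. Such a superset would itself be frequent, so it
    # is enough to search among the frequent sets.
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == frequent[other] for other in frequent)
    }


if __name__ == "__main__":
    for itemset, count in closed_frequent_keyword_sets(docs, min_support=2).items():
        print(sorted(itemset), count)
```

On this toy corpus with `min_support=2`, seven keyword sets are frequent but only three are closed, which illustrates why closed patterns give a more compact aggregate than the full list of frequent sets.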

Keywords

Data mining; Closed keywords; Textual aggregation; OLAP

Domains

Methods and statistics


https://shs.hal.science/halshs-01577035

Submitted on: Friday, 25 August 2017, 12:18:06

Last modified on: Wednesday, 25 October 2023, 14:38:02

Dates and versions

halshs-01577035 , version 1 (25-08-2017)

Identifiers

HAL Id: halshs-01577035, version 1
DOI: 10.1007/s10489-017-1050-9

Cite

Mustapha M Bouakkaz, Youcef Ouinten, Sabine Loudcher, Philippe Fournier-Viger. Efficiently mining frequent itemsets applied for textual aggregation. Applied Intelligence, 2017, ⟨10.1007/s10489-017-1050-9⟩. ⟨halshs-01577035⟩


Collections

UNIV-LYON1 UNIV-LYON2 ERIC LABEXIMU UDL
