Efficiently mining frequent itemsets applied for textual aggregation - HAL open archive
Journal article, Applied Intelligence, Year: 2017

Efficiently mining frequent itemsets applied for textual aggregation

Abstract

Text mining approaches are commonly used to discover relevant information and relationships in large volumes of text data. The term data mining refers to methods for analyzing data with the objective of finding patterns that summarize the main properties of the data. Combining data mining approaches with on-line analytical processing (OLAP) tools allows us to refine the techniques used in textual aggregation. In this paper, we propose a novel aggregation function for textual data based on the discovery of frequent closed patterns in a generated document/keyword matrix. Our contribution is to use a data mining technique, specifically a closed pattern mining algorithm, to aggregate keywords. An experimental study on a real corpus of more than 700 scientific papers collected from Microsoft Academic Search shows that the proposed algorithm largely outperforms four state-of-the-art textual aggregation methods in terms of recall, precision, F-measure and runtime.
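To make the idea of frequent closed patterns in a document/keyword matrix concrete, here is a minimal sketch. It is not the algorithm proposed in the paper: it is a brute-force enumeration that only works on tiny inputs, and the function name `closed_frequent_itemsets`, the toy documents and the threshold are all assumptions made for illustration. An itemset is closed when no superset has the same support, which is equivalent to saying it equals the intersection of all documents that contain it.

```python
from itertools import combinations


def closed_frequent_itemsets(docs, min_support):
    """Return {closed keyword set: support} by brute force.

    docs        -- iterable of keyword sets, one per document (the rows of
                   a binary document/keyword matrix); hypothetical input
    min_support -- minimum number of documents a pattern must occur in
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    docs = [frozenset(d) for d in docs]
    vocabulary = sorted(set().union(*docs))
    closed = {}
    # Illustrative only: this loop visits every keyword subset, so its
    # cost grows exponentially with the vocabulary size.
    for size in range(1, len(vocabulary) + 1):
        for combo in combinations(vocabulary, size):
            items = frozenset(combo)
            covering = [d for d in docs if items <= d]
            if len(covering) < min_support:
                continue
            # The closure of a frequent itemset is the intersection of
            # the documents containing it; it has the same support.
            closed[frozenset.intersection(*covering)] = len(covering)
    return closed


# Toy corpus (made up for this example): four documents and their keywords.
docs = [
    {"olap", "text", "mining"},
    {"olap", "text"},
    {"text", "mining"},
    {"olap", "text", "aggregation"},
]
patterns = closed_frequent_itemsets(docs, min_support=2)
for keywords, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(keywords), support)
```

On this toy corpus the closed frequent patterns are {text} (4 documents), {olap, text} (3) and {mining, text} (2). The singleton {olap} is frequent but not closed, because every document containing "olap" also contains "text"; keeping only closed patterns is what lets a small set of keyword groups stand in for the whole collection.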
No file deposited

Dates and versions

halshs-01577035, version 1 (25-08-2017)

Identifiers

HAL Id: halshs-01577035
DOI: 10.1007/s10489-017-1050-9

Cite

Mustapha M Bouakkaz, Youcef Ouinten, Sabine Loudcher, Philippe Fournier-Viger. Efficiently mining frequent itemsets applied for textual aggregation. Applied Intelligence, 2017, ⟨10.1007/s10489-017-1050-9⟩. ⟨halshs-01577035⟩
89 views
2 downloads
