Efficient Generation of Reliable Estimated Linguistic Summaries

Grégory Smits 1, Pierre Nerzic 1, Olivier Pivert 1, Marie-Jeanne Lesot 2
1 SHAMAN - Symbolic and Human-centric view of dAta MANagement
IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
2 LFI - Learning, Fuzzy and Intelligent systems
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: Summarizing data with linguistic statements is a crucial and topical issue that has been widely addressed by the soft computing community. The goal of summarization is to generate statements that linguistically describe the properties observed in a dataset. This paper addresses the issue of efficiently extracting these summaries and rendering them to the final user in the case where the data to be summarized are stored in a relational database: it proposes a novel strategy that leverages the statistics about the data distribution maintained by the database system. The paper shows that reliable summaries can be estimated very efficiently from these statistics alone, without any costly data access. Additionally, it proposes a visualization of the set of extracted summaries that offers the user a fruitful interactive exploration tool. Experiments performed on two real databases show the relevance and efficiency of the proposed approach: with a negligible loss of accuracy, we provide the first linguistic summarization approach whose processing time does not depend on the size of the dataset. The generation of estimated linguistic summaries takes less than one second even for datasets containing millions of tuples.
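To make the idea of estimating a summary from distribution statistics more concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a hypothetical equi-width histogram of an `age` attribute (the kind of statistic a database system might maintain), a hypothetical trapezoidal fuzzy term "young", and a hypothetical quantifier "most", and it evaluates the statement "most tuples are young" with the classical relative sigma-count. All names and numeric values are illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical histogram: one (low, high, tuple_count) triple per bucket.
Histogram = List[Tuple[float, float, int]]
# Trapezoidal fuzzy term given by its four breakpoints (a, b, c, d).
Term = Tuple[float, float, float, float]


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree: 1 on [b, c], 0 outside (a, d), linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def estimated_proportion(hist: Histogram, term: Term) -> float:
    """Estimate the fuzzy proportion of tuples satisfying `term`.

    Assumption: every bucket is represented by its midpoint, so only the
    histogram is read and the underlying tuples are never accessed.
    """
    total = sum(count for _, _, count in hist)
    if total == 0:
        return 0.0
    sigma = sum(count * trapezoid((lo + hi) / 2, *term) for lo, hi, count in hist)
    return sigma / total


def most(p: float) -> float:
    """Hypothetical quantifier 'most': 0 up to 0.5, 1 from 0.8, linear between."""
    return min(1.0, max(0.0, (p - 0.5) / 0.3))


# Illustrative data only: 1000 tuples spread over four age buckets.
AGE_HIST: Histogram = [(0, 20, 100), (20, 40, 600), (40, 60, 250), (60, 80, 50)]
YOUNG: Term = (0, 0, 30, 50)

if __name__ == "__main__":
    p = estimated_proportion(AGE_HIST, YOUNG)
    print(f"estimated proportion of young tuples: {p:.2f}")
    print(f"truth degree of 'most tuples are young': {most(p):.2f}")
```

The cost of this estimate depends on the number of histogram buckets rather than on the number of tuples, which is the property the abstract emphasizes. The midpoint rule is the crudest possible choice; integrating the membership function over each bucket would be a natural refinement.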
Document type:
Conference paper
IEEE International Conference on Fuzzy Systems, Jul 2018, Rio de Janeiro, Brazil

https://hal.archives-ouvertes.fr/hal-01854298
Contributor: Grégory Smits
Submitted on: Monday, 6 August 2018 - 15:55:25
Last modified on: Friday, 31 August 2018 - 09:25:57

File

180201.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01854298, version 1

Citation

Grégory Smits, Pierre Nerzic, Olivier Pivert, Marie-Jeanne Lesot. Efficient Generation of Reliable Estimated Linguistic Summaries. IEEE International Conference on Fuzzy Systems, Jul 2018, Rio de Janeiro, Brazil. 〈hal-01854298〉
