Efficient Generation of Reliable Estimated Linguistic Summaries

Grégory Smits 1, Pierre Nerzic 1, Olivier Pivert 1, Marie-Jeanne Lesot 2
1 SHAMAN - Symbolic and Human-centric view of dAta MANagement
IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
2 LFI - Learning, Fuzzy and Intelligent systems
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : Summarizing data with linguistic statements is a crucial and topical issue that has been widely addressed by the soft computing community. The goal of summarization is to generate statements that linguistically describe the properties observed in a dataset. This paper addresses the issue of efficiently extracting these summaries and rendering them to the end user when the data to be summarized are stored in a relational database: it proposes a novel strategy that leverages the statistics about the data distribution maintained by the database system. The paper shows that reliable summaries can be estimated very efficiently from these statistics alone, without any costly data access. Additionally, it proposes a visualization of the set of extracted summaries that offers the user a fruitful interactive exploration tool. Experiments performed on two real databases show the relevance and efficiency of the proposed approach: with a negligible loss of accuracy, we provide the first linguistic summarization approach whose processing time does not depend on the size of the dataset. Generating estimated linguistic summaries takes less than one second, even for datasets containing millions of tuples.
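The idea of estimating a summary from distribution statistics rather than from the tuples themselves can be illustrated with a small sketch. The abstract does not say which statistics or which estimator the paper uses, so everything below is an assumption made for illustration: a histogram of (low, high, count) buckets such as a database system might maintain, values represented by bucket midpoints, a trapezoidal fuzzy set for the linguistic term, and a linear ramp for the quantifier "most". The names `Bucket`, `trapezoid`, `ramp`, `estimated_proportion` and the sample histogram are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Bucket:
    low: float   # lower bound of the bucket
    high: float  # upper bound of the bucket
    count: int   # number of tuples whose value falls in the bucket


def trapezoid(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Membership function with support [a, d] and core [b, c] (assumed shape)."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def ramp(lo: float, hi: float) -> Callable[[float], float]:
    """Increasing relative quantifier: 0 below lo, 1 above hi, linear in between."""
    def q(p: float) -> float:
        if p <= lo:
            return 0.0
        if p >= hi:
            return 1.0
        return (p - lo) / (hi - lo)
    return q


def estimated_proportion(histogram: List[Bucket], mu: Callable[[float], float]) -> float:
    """Estimate the fuzzy proportion of tuples satisfying mu from bucket counts only.

    Assumption: every tuple in a bucket is represented by the bucket midpoint.
    The cost depends on the number of buckets, not on the number of tuples.
    """
    total = sum(b.count for b in histogram)
    if total == 0:
        return 0.0
    sigma = sum(b.count * mu((b.low + b.high) / 2) for b in histogram)
    return sigma / total


# Hypothetical histogram on an "age" attribute (made-up counts).
AGE_HISTOGRAM = [
    Bucket(0, 20, 100),
    Bucket(20, 40, 300),
    Bucket(40, 60, 400),
    Bucket(60, 80, 200),
]

young = trapezoid(0, 0, 25, 45)  # hypothetical definition of "young"
most = ramp(0.3, 0.8)            # hypothetical definition of "most"

proportion = estimated_proportion(AGE_HISTOGRAM, young)
truth = most(proportion)
print(f"estimated proportion of 'young': {proportion:.3f}")
print(f"estimated truth of 'most tuples are young': {truth:.3f}")
```

The loop runs over four buckets regardless of how many tuples the table holds, which is the property the abstract highlights; the price is the approximation error introduced by the midpoint assumption.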

https://hal.archives-ouvertes.fr/hal-01854298
Contributor : Grégory Smits
Submitted on : Monday, August 6, 2018 - 3:55:25 PM
Last modification on : Wednesday, March 27, 2019 - 1:34:21 AM
Document(s) archived on : Wednesday, November 7, 2018 - 1:59:16 PM

File

180201.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01854298, version 1

Citation

Grégory Smits, Pierre Nerzic, Olivier Pivert, Marie-Jeanne Lesot. Efficient Generation of Reliable Estimated Linguistic Summaries. IEEE International Conference on Fuzzy Systems, Jul 2018, Rio de Janeiro, Brazil. ⟨hal-01854298⟩
