Using Histograms for Skyline Size Estimation

Nicolas Hanusse; Patrick Kamnang Wanko; Sofian Maabout

Conference paper. Year: 2016


Nicolas Hanusse

Role: Author
PersonId : 740797
IdHAL : nicolas-hanusse

Laboratoire Bordelais de Recherche en Informatique

Patrick Kamnang Wanko

Role: Author
PersonId : 976702

Sofian Maabout

Role: Author
PersonId : 3544
IdHAL : sofian-maabout
IdRef : 148141978

Laboratoire Bordelais de Recherche en Informatique

Abstract

Let T be a table of n points described by a set of d attributes (dimensions), and let p and q be two points of T. We say that p dominates q iff p is at least as good as q in every dimension and strictly better than q in at least one. A point p is a skyline point of T iff no point of T dominates it. A skyline query returns the set of all skyline points. To integrate skyline queries into database management systems, an estimate of the skyline cardinality is needed for query optimization. We propose techniques for estimating skyline cardinality when the data distribution is known. We first provide an unbiased estimator that requires a single traversal of the whole data set, which is much faster than computing the exact skyline. We then show that this estimator can be applied to a sample of the underlying data while preserving estimation quality, i.e., it remains unbiased. Next, we provide a convergent estimator that requires no data access, only the data distribution. It estimates the expected skyline cardinality over data sets that follow this distribution. These solutions are easy to implement and, in contrast to other proposals, require no costly subskyline queries. We implemented our solutions and report experiments showing both the accuracy of the estimates and the efficiency with which they are obtained.
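The dominance relation and the skyline defined in the abstract can be illustrated with a minimal Python sketch. It is not the estimator proposed in the paper: it is the naive exact computation, which compares every pair of points, and it assumes that smaller values are better in every dimension (the abstract does not fix a preference order). The names `dominates` and `skyline` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q.

    Assumption for this sketch: smaller is better in every dimension.
    p must be at least as good as q everywhere and strictly better somewhere.
    """
    at_least_as_good = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return at_least_as_good and strictly_better


def skyline(table: Sequence[Point]) -> List[Point]:
    """Naive exact skyline: keep every point that no other point dominates.

    This performs a quadratic number of dominance tests in the table size.
    """
    return [p for p in table if not any(dominates(q, p) for q in table)]


if __name__ == "__main__":
    table = [(1, 4), (2, 2), (3, 3), (4, 1), (2, 5)]
    print(skyline(table))       # the skyline points
    print(len(skyline(table)))  # the skyline cardinality
```

In this toy table, (3, 3) is dominated by (2, 2) and (2, 5) is dominated by (1, 4), so three points remain. The quantity the paper aims to estimate is the length of this result, without paying for the pairwise comparisons.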

Domains

Databases [cs.DB]


https://hal.science/hal-01316242

Submitted on: Monday, 16 May 2016, 09:56:06

Last modified on: Friday, 24 March 2023, 14:53:02

Dates and versions

hal-01316242, version 1 (16-05-2016)

Identifiers

HAL Id: hal-01316242, version 1

Cite

Nicolas Hanusse, Patrick Kamnang Wanko, Sofian Maabout. Using Histograms for Skyline Size Estimation. 20th International Database Engineering & Applications Symposium (IDEAS'16), Jul 2016, Montreal, Canada. ⟨hal-01316242⟩


Collections

CNRS LABRI-MABIOVIS

