Emerging Cubes: Borders, size estimations and lossless reductions

Abstract : Discovering trend reversals between two data cubes provides users with novel and interesting knowledge when the real-world context fluctuates: what is new? Which trends appear or emerge? Which tendencies are immersing or disappearing? With the concept of the Emerging Cube, we capture such trend reversals by enforcing an emergence constraint. We revisit the classical borders for the Emerging Cube and introduce a new one that optimizes both storage space and computation time and provides a simple characterization of the size of Emerging Cubes, as well as classification and cube navigation tools. We rigorously state the connection between the classical and proposed borders by using cube transversals. Knowing the size of Emerging Cubes without computing them is of great interest, in particular for tuning the underlying emergence constraint. We address this issue by studying an upper bound and characterizing the exact size of Emerging Cubes. We propose two strategies for quickly estimating their size: one based on analytical estimation, without database access, and one based on probabilistic counting, using the proposed borders as the input of the near-optimal algorithm HyperLogLog. Because the estimation algorithm is efficient, several iterations can be performed to calibrate the emergence constraint. Moreover, we propose reduced and lossless representations of the Emerging Cube by using the concept of cube closure. Finally, we perform experiments on different data distributions to measure, on the one hand, the size of the introduced condensed and concise representations and, on the other hand, the performance (accuracy and computation time) of the proposed estimation method.
Document type : Journal articles

https://hal.archives-ouvertes.fr/hal-00464123
Contributor : Sébastien Nedjar
Submitted on : Tuesday, March 16, 2010 - 9:59:26 AM
Last modification on : Friday, March 9, 2018 - 11:26:21 AM

Citation

Sébastien Nedjar, Alain Casali, Rosine Cicchetti, Lotfi Lakhal. Emerging Cubes: Borders, size estimations and lossless reductions. Information Systems, Elsevier, 2009, 34 (6), pp. 536-550. ⟨10.1016/j.is.2009.03.001⟩. ⟨hal-00464123⟩
