OLAP-Sequential Mining: Summarizing Trends from Historical Multidimensional Data using Closed Multidimensional Sequential Patterns

Marc Plantevit 1, Anne Laurent 1, Maguelonne Teisseire 1
1 TATOO - Fouille de données environnementales
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Data warehouses are now well recognized as the way to store historical data that will then be available for future queries and analysis. In this context, several challenges remain open, among them the problem of mining such data. OLAP mining, introduced by J. Han in 1997, aims at coupling data mining techniques with data warehousing. These techniques have to take the specificities of such data into account. One specificity that classical data mining methods often fail to address is that data warehouses describe data through several dimensions. Moreover, the data are stored through time, and we thus argue that sequential patterns are one of the best ways to summarize the trends from such databases. Sequential pattern mining aims at discovering correlations among events through time. However, the number of patterns can become very large when several analysis dimensions are taken into account, as is the case with multidimensional databases. This is why we propose a condensed representation without loss of information: closed multidimensional sequential patterns. This representation introduces properties that allow the search space to be pruned deeply. In this paper, we also define algorithms that do not require candidate set maintenance. Experiments on synthetic and real data are reported and demonstrate the value of our proposal.
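Illustration : The notion of a closed pattern mentioned in the abstract can be sketched on a toy example. The sketch below is a deliberate simplification and not the authors' algorithm: it assumes one-dimensional sequences with a single item per event, and it enumerates candidates by brute force, whereas the paper targets multidimensional data and avoids candidate set maintenance. All function names and the sample database are hypothetical. A frequent pattern is kept as closed when no longer pattern containing it has the same support.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Brute-force enumeration (illustration only): every subsequence of
    every data sequence is a candidate, filtered by its support."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_patterns(frequent):
    """Keep the patterns that have no longer super-pattern with equal support."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
    }


# Hypothetical toy database: each tuple is one customer's ordered events.
db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c")]
frequent = frequent_patterns(db, min_support=2)
closed = closed_patterns(frequent)
print(len(frequent), "frequent patterns,", len(closed), "closed patterns")
```

On this toy database, 7 frequent patterns reduce to 4 closed ones, and the support of any dropped pattern can still be recovered as the largest support among the closed patterns that contain it, which is the sense in which the representation loses no information.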
Document type :
Journal articles

Cited literature [25 references]

https://hal.archives-ouvertes.fr/hal-00283426
Contributor : Marc Plantevit
Submitted on : Friday, October 25, 2019 - 9:38:11 AM
Last modification on : Tuesday, October 29, 2019 - 2:25:23 PM

File

hal-00283426v1.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00283426, version 1

Citation

Marc Plantevit, Anne Laurent, Maguelonne Teisseire. OLAP-Sequential Mining: Summarizing Trends from Historical Multidimensional Data using Closed Multidimensional Sequential Patterns. Annals of Information Systems, Springer, 2008, special issue in New Trends in Data Warehouses and Data Analysis. ⟨hal-00283426⟩
