Numerical Pattern Mining Through Compression

Abstract : Pattern Mining (PM) has a prominent place in Data Science and is applied in a wide range of domains. Different methods have been proposed to avoid the exponential explosion of patterns. They rely on assumptions about interestingness and usually return very different pattern sets. In this paper, we propose using a compression-based objective as a well-justified and robust interestingness measure. We define description lengths for datasets and use the Minimum Description Length (MDL) principle to find the patterns that ensure the best compression. Our experiments show that applying MDL to numerical data yields small, characteristic subsets of patterns that describe the data compactly.
Document type :
Conference papers

Cited literature [19 references]

https://hal.archives-ouvertes.fr/hal-02162927
Contributor : Tatiana Makhalova
Submitted on : Sunday, June 23, 2019 - 7:30:44 AM
Last modification on : Friday, July 26, 2019 - 3:22:19 PM

File

DCC_makhalova.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02162927, version 1

Citation

Tatiana Makhalova, Sergei Kuznetsov, Amedeo Napoli. Numerical Pattern Mining Through Compression. DCC 2019 - 2019 Data Compression Conference, Mar 2019, Snowbird, United States. pp.112-121. ⟨hal-02162927⟩
