Numerical Pattern Mining Through Compression

Abstract : Pattern Mining (PM) has a prominent place in Data Science and finds application in a wide range of domains. To avoid the exponential explosion of patterns, different methods have been proposed. They rely on assumptions about interestingness and usually return very different pattern sets. In this paper we propose to use a compression-based objective as a well-justified and robust interestingness measure. We define description lengths for datasets and use the Minimum Description Length (MDL) principle to find patterns that ensure the best compression. Our experiments show that applying MDL to numerical data yields small and characteristic subsets of patterns that describe the data compactly.
Document type : Conference papers

Contributor : Tatiana Makhalova
Submitted on : Sunday, June 23, 2019 - 7:30:44 AM
Last modification on : Friday, July 26, 2019 - 3:22:19 PM




  • HAL Id : hal-02162927, version 1



Tatiana Makhalova, Sergei Kuznetsov, Amedeo Napoli. Numerical Pattern Mining Through Compression. DCC 2019 - 2019 Data Compression Conference, Mar 2019, Snowbird, United States. pp.112-121. ⟨hal-02162927⟩


