Conference papers

Gibbs Sampling Subjectively Interesting Tiles

Abstract: The local pattern mining literature has long struggled with the so-called pattern explosion problem: the size of the set of patterns found exceeds the size of the original data. This causes computational problems (enumerating a large set of patterns will inevitably take a substantial amount of time) as well as problems for interpretation and usability (trawling through a large set of patterns is often impractical). Two complementary research lines aim to address this problem. The first aims to develop better measures of interestingness, in order to reduce the number of uninteresting patterns that are returned [6, 10]. The second aims to avoid an exhaustive enumeration of all 'interesting' patterns (where interestingness is quantified in a more traditional way, e.g. frequency), by directly sampling from this set in such a way that more 'interesting' patterns are sampled with higher probability [2]. Unfortunately, the first research line does not reduce computational cost, while the second may miss out on the most interesting patterns. In this paper, we combine the best of both worlds for mining interesting tiles [8] from binary databases. Specifically, we propose a new pattern sampling approach based on Gibbs sampling, where the probability of sampling a pattern is proportional to its subjective interestingness [6], an interestingness measure reported to better represent true interestingness. The experimental evaluation confirms the theory, but also reveals an important weakness of the proposed approach, which we speculate is shared with any other pattern sampling approach. We thus conclude with a broader discussion of this issue, and a forward look.
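
To make the sampling idea in the abstract concrete, the following is a minimal sketch of Gibbs sampling over tiles (a set of rows and a set of columns) of a binary matrix. It is not the authors' algorithm: the names `tile_weight` and `gibbs_sample_tile` are hypothetical, and the weight function is a simple stand-in that rewards 1s and penalises 0s inside the tile, used here in place of the subjective interestingness measure of [6]. Each step resamples the membership of one row or column conditional on the rest of the tile, so that in the long run tiles are visited with probability proportional to their weight.

```python
import math
import random


def tile_weight(data, rows, cols):
    """Unnormalised weight of a tile (ASSUMPTION: a stand-in score,
    not the subjective interestingness used in the paper)."""
    ones = sum(data[i][j] for i in rows for j in cols)
    zeros = len(rows) * len(cols) - ones
    return math.exp(ones - 2 * zeros)


def gibbs_sample_tile(data, n_sweeps=200, seed=0):
    """Run n_sweeps Gibbs sweeps and return the final tile as
    (sorted row indices, sorted column indices)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(data), len(data[0])
    rows, cols = set(), set()
    # One sweep visits every row indicator, then every column indicator.
    variables = [("r", i) for i in range(n_rows)]
    variables += [("c", j) for j in range(n_cols)]
    for _ in range(n_sweeps):
        for kind, index in variables:
            target = rows if kind == "r" else cols
            # Weight of the tile without this row/column ...
            target.discard(index)
            w_out = tile_weight(data, rows, cols)
            # ... and with it, all other memberships held fixed.
            target.add(index)
            w_in = tile_weight(data, rows, cols)
            # Keep the element with its conditional probability.
            if rng.random() >= w_in / (w_in + w_out):
                target.discard(index)
    return sorted(rows), sorted(cols)


if __name__ == "__main__":
    toy = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
    ]
    print(gibbs_sample_tile(toy, n_sweeps=100, seed=42))
```

In this sketch a single call returns one sampled tile; collecting the state after every sweep would give a (correlated) sample of many tiles. Swapping `tile_weight` for another positive score changes the target distribution without changing the sampler.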

https://hal.archives-ouvertes.fr/hal-02960847
Contributor: Marc Plantevit
Submitted on: Thursday, October 8, 2020 - 8:23:41 AM
Last modification on: Tuesday, June 1, 2021 - 2:08:09 PM
Long-term archiving on: Saturday, January 9, 2021 - 6:07:35 PM

Files produced by the author(s)

Citation

Anes Bendimerad, Jefrey Lijffijt, Marc Plantevit, Céline Robardet, Tijl de Bie. Gibbs Sampling Subjectively Interesting Tiles. Advances in Intelligent Data Analysis - 18th International Symposium on Intelligent Data Analysis (IDA 2020), Apr 2020, Konstanz (on line), Germany. ⟨10.1007/978-3-030-44584-3_7⟩. ⟨hal-02960847⟩
