
Hide and Mine in Strings: Hardness and Algorithms

Abstract : We initiate a study on the fundamental relation between data sanitization (i.e., the process of hiding confidential information in a given dataset) and frequent pattern mining, in the context of sequential (string) data. Current methods for string sanitization hide confidential patterns introducing, however, a number of spurious patterns that may harm the utility of frequent pattern mining. The main computational problem is to minimize this harm. Our contribution here is twofold. First, we present several hardness results, for different variants of this problem, essentially showing that these variants cannot be solved or even be approximated in polynomial time. Second, we propose integer linear programming formulations for these variants and algorithms to solve them, which work in polynomial time under certain realistic assumptions on the problem parameters.
Document type : Conference papers
Contributor : Garance Gourdel
Submitted on : Thursday, December 17, 2020 - 10:35:10 AM
Last modification on : Friday, January 21, 2022 - 3:19:00 AM
Long-term archiving on : Thursday, March 18, 2021 - 6:15:35 PM




  • HAL Id : hal-03070560, version 1


Giulia Bernardini, Alessio Conte, Garance Gourdel, Roberto Grossi, Grigorios Loukides, et al. Hide and Mine in Strings: Hardness and Algorithms. ICDM 2020 - 20th IEEE International Conference on Data Mining, Nov 2020, Sorrento, Italy. pp. 1-6. ⟨hal-03070560⟩


