
Cleaning Data with Llunatic

Abstract: Data cleaning (or data repairing) is considered a crucial problem in many database-related tasks. It consists of making a database consistent with respect to a given set of constraints. In recent years, repairing methods have been proposed for several classes of constraints. These methods, however, tend to hard-code the strategy used to repair conflicting values and are specialized toward specific classes of constraints. In this paper we develop a general chase-based repairing framework, referred to as LLUNATIC, in which repairs can be obtained for a large class of constraints using different strategies to select preferred values. The framework is based on an elegant formalization in terms of labeled instances and partially ordered preference labels. In this context, we revisit concepts such as upgrades, repairs, and the chase. In LLUNATIC, various repairing strategies can be slotted in without changing the underlying implementation. Furthermore, LLUNATIC is the first data-repairing system that is DBMS-based. We report experimental results that confirm its good scalability and show that various instantiations of the framework produce repairs of good quality.
Document type: Journal articles

https://hal.archives-ouvertes.fr/hal-03560719
Contributor: Centre De Documentation Eurecom
Submitted on: Monday, February 7, 2022 - 4:57:50 PM
Last modification on: Tuesday, February 8, 2022 - 3:34:32 AM
Long-term archiving on: Sunday, May 8, 2022 - 7:10:05 PM

File

publi-6107.pdf
Files produced by the author(s)

Citation

Floris Geerts, Giansalvatore Mecca, Paolo Papotti, Donatello Santoro. Cleaning Data with Llunatic. The VLDB Journal, Springer, 2019, 29 (4), pp.867-892. ⟨10.1007/s00778-019-00586-5⟩. ⟨hal-03560719⟩
