YAM: a Step Forward for Generating a Dedicated Schema Matcher

Fabien Duchateau (1), Zohra Bellahsene (2)
(1) BD - Base de Données, LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
(2) FADO - Fuzziness, Alignments, Data & Ontologies, LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semi-automatic: an expert must tune certain parameters (thresholds, weights, etc.). These tools mainly rely on aggregation methods to combine similarity measures. The tuning of a matcher, especially of its aggregation function, has a strong impact on the quality of the resulting correspondences, and it makes it difficult to integrate a new similarity measure or to match domain-specific schemas. In this paper, we present YAM (Yet Another Matcher), a matcher factory that generates a dedicated schema matcher for a given schema matching scenario. For this purpose, we formulate the schema matching task as a classification problem. Based on this machine learning framework, YAM automatically selects and tunes the best method to combine similarity measures (e.g., a decision tree or an aggregation function). In addition, we describe how user inputs, such as a preference for recall or precision, can be closely integrated during the generation of the dedicated matcher. Extensive experiments comparing matchers generated by YAM with traditional matching tools confirm the benefits of a matcher factory and the significant impact of user preferences.
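
To make the classification view in the abstract concrete, here is a minimal sketch, not YAM's actual implementation. It assumes each candidate pair of element names is turned into a vector of similarity scores and that a classifier is trained on expert-labelled pairs. The two measures, the one-rule classifier (a decision stump), the training pairs, and all function names are hypothetical choices made only for illustration. The `beta` parameter stands in for a user preference between precision and recall.

```python
from difflib import SequenceMatcher

def trigram_sim(a, b):
    """Jaccard overlap of the character trigrams of two names."""
    grams = lambda s: {s[i:i + 3] for i in range(max(len(s) - 2, 1))}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / len(ga | gb)

def edit_sim(a, b):
    """Similarity ratio from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical pool of similarity measures; a real matcher would use many more.
MEASURES = [trigram_sim, edit_sim]

def features(pair):
    """One similarity score per measure for a pair of element names."""
    return [measure(*pair) for measure in MEASURES]

def f_beta(precision, recall, beta):
    """F-measure; beta > 1 favours recall, beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def train_stump(pairs, labels, beta=1.0):
    """Pick the (measure, threshold) rule with the best F-beta on the training pairs."""
    rows = [features(p) for p in pairs]
    best = (-1.0, 0, 1.0)  # (score, measure index, threshold)
    for j in range(len(MEASURES)):
        for t in sorted({row[j] for row in rows}):
            pred = [row[j] >= t for row in rows]
            tp = sum(p and y for p, y in zip(pred, labels))
            fp = sum(p and not y for p, y in zip(pred, labels))
            fn = sum(not p and y for p, y in zip(pred, labels))
            precision = tp / (tp + fp) if tp + fp else 0.0
            recall = tp / (tp + fn) if tp + fn else 0.0
            score = f_beta(precision, recall, beta)
            if score > best[0]:
                best = (score, j, t)
    _, j, t = best
    return lambda pair: features(pair)[j] >= t

# Hypothetical expert-labelled pairs: True means the two elements correspond.
pairs = [("customerName", "cust_name"), ("zipCode", "postal_code"),
         ("phone", "phoneNumber"), ("price", "address"),
         ("orderDate", "city"), ("id", "name")]
labels = [True, True, True, False, False, False]

matcher = train_stump(pairs, labels, beta=1.0)
```

Calling `matcher(("phone", "phoneNumber"))` then classifies a candidate pair as a correspondence or not. Retraining with `beta=2.0` weights recall more heavily than precision when choosing the rule, which is one simple way a user preference could influence the generated matcher.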
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-01411174
Contributor : Zohra Bellahsene
Submitted on : Wednesday, December 7, 2016 - 10:48:07 AM
Last modification on : Wednesday, November 20, 2019 - 3:02:19 AM

Citation

Fabien Duchateau, Zohra Bellahsene. YAM: a Step Forward for Generating a Dedicated Schema Matcher. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2016, Transactions on Large-Scale Data- and Knowledge-Centered Systems XXV, LNCS (9620), pp.150-185. ⟨10.1007/978-3-662-49534-6_5⟩. ⟨hal-01411174⟩