XML Structure Mapping, Application to the PASCAL/INEX 2006 XML Document Mining Track

Abstract : We address the problem of learning to automatically map flat and semi-structured documents onto a mediated target XML schema. We propose a machine learning approach in which the mapping between input and target documents is learned from examples. Complex transformations can be learned using only pairs of input documents and their corresponding target documents. From a machine learning point of view, the structure mapping task raises significant complexity challenges. We therefore propose an original model that scales well to real-world applications, and we provide learning and inference procedures with low complexity. The model builds the target XML document sequentially by processing the input document node by node. We demonstrate the efficiency of our model on two structure mapping tasks. To our knowledge, no other model is yet able to solve these tasks.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01335987
Contributor : Lip6 Publications
Submitted on : Wednesday, June 22, 2016 - 3:17:17 PM
Last modification on : Thursday, March 21, 2019 - 1:05:17 PM

Citation

Francis Maes, Ludovic Denoyer, Patrick Gallinari. XML Structure Mapping, Application to the PASCAL/INEX 2006 XML Document Mining Track. Advances in XML Information Retrieval and Evaluation: Fifth Workshop of the INitiative for the Evaluation of XML Retrieval (INEX'06), Dec 2006, Dagstuhl, Germany. pp.540-551, ⟨10.1007/978-3-540-73888-6_49⟩. ⟨hal-01335987⟩
