Document structure matching for heterogeneous corpora

Abstract : Querying heterogeneous XML document collections is an open problem. Doing so requires establishing correspondences between the DTDs of the different sources. We consider here the problem of matching the structure of XML documents from different sources. To this end, we introduce a stochastic structured document model and describe preliminary experiments performed on the INEX collection.
Contributor : Ludovic Denoyer
Submitted on : Tuesday, August 30, 2016 - 10:17:31 AM
Last modification on : Thursday, March 21, 2019 - 2:18:56 PM


  • HAL Id : hal-01357592, version 1


Ludovic Denoyer, Guillaume Wisniewski, Patrick Gallinari. Document structure matching for heterogeneous corpora. SIGIR 2004 workshop on XML and Information Retrieval, Jul 2004, Sheffield, United Kingdom. ⟨hal-01357592⟩


