Modeling, Encoding and Querying Multi-Structured Documents

Abstract : The issue of multi-structured documents became prominent with the emergence of the digital humanities as a field of practice. Many distinct structures may be defined simultaneously on the same original content to serve different documentary tasks. For example, a document may have both a structure for the logical organization of content (logical structure) and a structure expressing a set of content formatting rules (physical structure). In this paper, we present MSDM, a generic model for multi-structured documents, in which several important features are established. We also address the problem of efficiently encoding multi-structured documents by introducing MultiX, a new XML formalism based on the MSDM model. Finally, we propose a library of XQuery functions for querying MultiX documents. We illustrate all of these contributions with a use case based on a fragment of an old manuscript.
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-01353379
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Tuesday, March 7, 2017 - 10:26:56 AM
Last modification on : Thursday, February 7, 2019 - 2:26:52 PM
Long-term archiving on : Thursday, June 8, 2017 - 12:42:47 PM

File

Liris-5448.pdf
Files produced by the author(s)

Citation

Pierre-Edouard Portier, Noureddine Chatti, Sylvie Calabretto, Elod Egyed-Zsigmond, Jean-Marie Pinon. Modeling, Encoding and Querying Multi-Structured Documents. Information Processing and Management, Elsevier, 2012, 48 (5), pp.931-955. ⟨10.1016/j.ipm.2011.11.004⟩. ⟨hal-01353379⟩
