
An accessible and transparent pipeline for publishing historical egodocuments

Abstract : Automating the processing of documents intended for online publication and exploration by humanities researchers speeds up tasks such as transcription, but it should also be an opportunity to make the experimentation and the resulting corpora sustainable and reusable. The DAHN project (Dispositif de soutien à l’Archivistique et aux Humanités Numériques) is an interdisciplinary collaboration between Inria, the EHESS and the University of Le Mans. Taking egodocuments as an example, the project aims to create a ready-to-use digital and scientific publishing pipeline that goes from the material archive to an online publication. In this presentation, we introduce our method and guidelines for processing non-digital-native textual documents using open-source and easily hackable tools that keep every step of the pipeline visible and accessible, thus offering an alternative to black boxes and scattered tools, which tend to be hard to maintain in the long run.
Contributor : Alix Chagué
Submitted on : Thursday, March 25, 2021 - 10:58:00 AM
Last modification on : Wednesday, June 8, 2022 - 12:50:06 PM
Long-term archiving on : Saturday, June 26, 2021 - 6:29:53 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : hal-03180669, version 1



Alix Chagué, Floriane Chiffoleau. An accessible and transparent pipeline for publishing historical egodocuments. 2021. ⟨hal-03180669⟩


