
Automatic page classification in a large collection of manuscripts based on the International Image Interoperability Framework

Abstract : In heritage institutions such as libraries and archives, making use of the vast number of recently digitized documents remains a challenge. Most of these documents are freely accessible as images, but their textual content remains largely inaccessible and unknown. Research projects dedicated to specific collections can produce metadata or even transcriptions through volunteers or crowdsourcing. However, the vast majority of documents cannot be manually transcribed or indexed: automatic, large-scale indexing processes are needed. The increasing adoption of the International Image Interoperability Framework (IIIF) by heritage institutions is a technological enabler for such services. Images are accessible through a single protocol across institutions, and both images and data can be presented with standard tools. In this paper, we describe an architecture for the automatic processing of historical documents that are owned by different institutions but processed and presented through IIIF. We implemented this architecture and processed a large collection of books of hours with a page classifier trained on an annotated sample. The result is freely distributed and can be viewed with any IIIF-compatible viewer.
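
To illustrate the "single protocol" the abstract refers to, here is a minimal Python sketch of how page images could be addressed through IIIF. It is not the architecture described in the paper. It assumes a IIIF Presentation API 2.x manifest already loaded as a dictionary, and builds IIIF Image API request URLs following the standard `{region}/{size}/{rotation}/{quality}.{format}` pattern. The function names, the `example.org` URIs, and the `!512,512` size are hypothetical choices made for illustration only.

```python
def iiif_image_url(service_id, region="full", size="max", rotation="0",
                   quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL from an image service base URI."""
    return f"{service_id.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


def page_image_services(manifest):
    """Yield (canvas label, image service URI) pairs from a
    Presentation API 2.x manifest given as a dict."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                service = image["resource"]["service"]
                yield canvas.get("label", ""), service["@id"]


# Hypothetical one-page manifest; a real one would be fetched from an institution.
manifest = {
    "sequences": [{
        "canvases": [{
            "label": "f. 1r",
            "images": [{
                "resource": {
                    "service": {"@id": "https://example.org/iiif/book1-f1r"}
                }
            }],
        }]
    }]
}

# Request each page scaled to fit within 512x512 pixels,
# e.g. as input to a page classifier (an assumed use, not the paper's setting).
page_urls = [
    (label, iiif_image_url(service_id, size="!512,512"))
    for label, service_id in page_image_services(manifest)
]
```

Because every compliant server accepts the same URL pattern, the same two functions would apply to manifests from different institutions; only the manifest source changes.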
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-02426404
Contributor : Dominique Stutzmann
Submitted on : Thursday, January 2, 2020 - 12:21:05 PM
Last modification on : Friday, January 3, 2020 - 1:25:24 AM

Citation

Emanuela Boroş, Alexis Toumi, Erwan Rouchet, Bastien Abadie, Dominique Stutzmann, et al. Automatic page classification in a large collection of manuscripts based on the International Image Interoperability Framework. 2019 International Conference on Document Analysis and Recognition (ICDAR), Sep 2019, Sydney, Australia. pp.756-762, ⟨10.1109/ICDAR.2019.00126⟩. ⟨hal-02426404⟩
