In Search of a Sustainable Model for Digital Heritage Repositories: A Case Study



Introduction
A wide range of initiatives for developing research and data infrastructures have been funded in recent years. There is growing concern within the academic community about maintaining the resources invested beyond the period of the original research funding. While technical progress has been made in preserving the data themselves, few conceptual or operational solutions exist for the institutions that create, disseminate, curate and preserve the data. How can their existence be ensured over the medium or long term? This paper is a case study: it addresses the sustainability issues faced by Persée, a French platform dedicated to digitized documentary heritage that was launched in 2003. Through this example, the aim is to present, in practical terms, how an organization has to adapt and change in order to sustain itself over time. Sustainability is a multi-faceted concept with no agreement on its means. For this paper, we define sustainability as "(a) the continued operation of an organization that offers data collections and services, (b) where operations relates to technology, preservation, users, institutional relations, business models, and other facets, (c) in the face of on-going internal and external challenges which may or may not be resolved, (d) where stakeholders recognize continuity in the mission, data sources, and value of the data repository, (e) while keeping in mind that each of those elements might evolve over time in response to (c)" (Eschenfelder and Shankar, 2017).

We first give some background on Persée (historical and institutional origins, main achievements). We then examine the different and complementary aspects of the strategy that was implemented in 2011 after a major crisis.
Persée responded to this funding and institutional challenge by enhancing its technical framework, by fostering the engagement of the research community and collaboration with other platforms and infrastructures, and by reshaping its strategy around the need for reliable revenue sources to maintain its Open Access repositories. This experience shows that sustainability is not a steady state but rather a continuous and resource-intensive process for any organization.

Background elements
Persée was launched by the French Ministry of Higher Education, Research and Innovation (MESRI) in 2003 in order to develop a complete platform for the digitization, documentation, online publishing and long-term preservation of back files of scientific journals in the humanities and social sciences. In 2005, the Persée portal (www.persee.fr) opened with 7 journals. Currently, 184 partnerships exist with academic publishers, and 740,000 full-text documents and 570,000 illustrations are available in open access.

The original purpose of Persée was expanded in two directions: a wider disciplinary scope (earth and life sciences in addition to HSS) and a new range of services. Persée provides software tools, its digital platform and its expertise to researchers, and leads projects in the field of Digital Humanities and Open Data.
• Perseids is the generic name of a range of websites developed by Persée in partnership with laboratories and heritage institutions as part of research projects. At the heart of a Perseid is a corpus that can be composed of primary sources, archives, grey literature and published documents. This content is digitized, precisely encoded and disseminated through a dedicated website with specialized indexes and tools for text analysis. Emphasis is given to the standardization of item description and document structuring to ensure the use and reuse of the data from the digitized sources.
• Data Persée (data.persee.fr) is the triplestore of Persée. All the metadata produced by Persée, through the Persée portal or the Perseids, are available by exploring a graph. Some collections have been enriched with data from external repositories (idRef, GBIF, Geonames, BnF, Cairo Gazetteer, etc.) and the pairings are certified by the Persée team.
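To make the idea of "exploring a graph" and following certified pairings more concrete, the following is a minimal sketch in plain Python, not a description of how Data Persée is actually implemented or queried. It assumes metadata can be viewed as (subject, predicate, object) triples and that a pairing with an external repository is expressed with an `owl:sameAs`-style link. Every identifier in it (`ex:article/42`, `geonames:0000000`, and so on) is a hypothetical placeholder, as are the function names `match` and `external_links`.

```python
# Each triple is (subject, predicate, object). All identifiers below are
# hypothetical placeholders, not real Persée or external-repository URIs.
TRIPLES = {
    ("ex:article/42", "dcterms:title", "Sample article title"),
    ("ex:article/42", "dcterms:spatial", "ex:place/cairo"),
    ("ex:article/42", "dcterms:creator", "ex:person/7"),
    ("ex:place/cairo", "owl:sameAs", "geonames:0000000"),
    ("ex:person/7", "owl:sameAs", "idref:000000000"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def external_links(triples, subject):
    """Follow one hop from `subject`, then collect owl:sameAs pairings."""
    links = {}
    for _, _, obj in match(triples, s=subject):
        for _, _, ext in match(triples, s=obj, p="owl:sameAs"):
            links[obj] = ext
    return links


if __name__ == "__main__":
    # Which entities attached to the article are paired with an external record?
    for local, external in external_links(TRIPLES, "ex:article/42").items():
        print(local, "->", external)
```

In a real triplestore the same traversal would typically be written as a query over RDF data; the point here is only that enrichment amounts to adding links from local entities to records held elsewhere.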

In terms of content, Persée focuses on documentary heritage. As defined by the Memory of the World Programme (UNESCO), a document is composed of its information content (signs, codes, sounds, images) and the carrier on which it is recorded (which can take a wide variety of forms and can be conserved, reproduced and transported). These characteristics exclude elements that are part of a fixed structure such as a building or natural site. Persée deals mainly with the "collections of old, rare and valuable documents housed by the libraries" (Desgraves, 1982), that is to say printed documents (in particular, but not exclusively, academic journals, books and proceedings), archives, grey literature, maps and iconography. Research communities were closely involved in defining the indexing methods and the search and visualization tools (Fargier and Néouze, 2008). The originality of the Persée portal therefore lies in an accurate, structured description at the article level and a set of tools similar to those associated with born-digital publications. Researchers can search full-text documents, browse through tables of contents, article layouts or lists of illustrations, use a cited-by linking service, export citations or explore different indexes.

Persée was launched as a project after a call for bids. It outlasted the start-up phase and entered an operational phase because it had proved the concept and met researchers' expectations. During this period, Persée was financially supported by the French MESRI and a host institution (a university). In 2011, the university argued that supporting a national platform such as Persée was not part of its mission, objected to the cost of this support and cut its funding. At the time, it was covering approximately 40% of Persée's costs.

There were two responses to this threat: an emergency one, with actions taken by parties outside Persée, and a long-term one, with an internal change of strategy and governance. A consortium was set up with the French MESRI and three academic institutions. Complementary pathways were identified to ensure Persée's sustainability:
• the transformation from a digital library into a data and services infrastructure;
• better integration into the existing landscape of research infrastructures, data repositories and digital platforms;
• the diversification of funding sources.

This approach considered that various factors (technical, social, financial, organizational) had to be combined, with reciprocal influence, to achieve sustainability (Chowdhury, 2014).

Considering that "technical sustainability is directly related to the standards and best practices followed when creating the digital files" (Fyffe and Warner, 2005), Persée developed an integrated open source software package named jGalith. It manages every step of the process of converting printed documents to electronic format: digitization, documentation, quality control, XML schema compliance checks, online publishing and long-term preservation. Printed documents are collected from publishers, then digitized at high resolution (400 dpi, bitonal and grayscale), post-processed (OCR, image quality improvement), encoded (table of contents, author, abstract, keywords, illustrations, footnotes, annexes...) and linked (Crossref). Each document (article, review, preamble...) is individually described and author indexes are created. The process ends with four outputs: the Persée portal or the Perseids (bibliographic data + image mode + TEI + enriched PDF + index), the triplestore (RDF), the OAI-PMH repository (METS, MODS, DC) and the long-term archive (OAIS model with PNG and XML files).
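As an illustration of what the OAI-PMH output makes possible for other platforms, the sketch below builds harvesting requests with the Python standard library. It is a hedged example, not Persée's actual configuration: the base URL is a hypothetical placeholder, and while `oai_dc` is the metadata prefix that every OAI-PMH repository must support, the `mets` and `mods` prefixes are assumed here because prefix names are defined by each repository. The function names `list_records_url` and `resume_url` are likewise illustrative.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the source does not give the repository's base URL.
BASE_URL = "https://repository.example.org/oai"

# "oai_dc" is mandatory in OAI-PMH; "mets" and "mods" are assumed prefix
# names, since each repository declares its own via ListMetadataFormats.
ASSUMED_PREFIXES = {"oai_dc", "mets", "mods"}


def list_records_url(metadata_prefix, set_spec=None, from_date=None,
                     base_url=BASE_URL):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    if metadata_prefix not in ASSUMED_PREFIXES:
        raise ValueError(f"unexpected metadataPrefix: {metadata_prefix!r}")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec        # restrict to one collection
    if from_date is not None:
        params["from"] = from_date      # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


def resume_url(resumption_token, base_url=BASE_URL):
    """Build the follow-up request for the next page of a long result list."""
    # With a resumptionToken, the verb is the only other permitted argument.
    params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url("oai_dc", from_date="2011-01-01"))
    print(resume_url("token-from-previous-response"))
```

A harvester would fetch the first URL, read the records, and keep requesting `resume_url(...)` with the token returned in each response until no token is given.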
Furthermore, Persée has established partnerships based on interoperability: OpenEdition (links between the past issues available on the Persée portal and the current ones on OpenEdition Journals), Érudit (mutual access) and Huma-Num (Isidore).

This accurate documentation and complete process help to enhance Persée's brand image, which is based on quality and reliable service.

Towards a business model?
Developing a reliable revenue stream is the most challenging issue, as Persée faces a twofold difficulty: it has to build and maintain OA repositories of digitized documentary heritage with an extensive level of curation, and it has to do so without any payment for use.

We examined different funding models for OA digital data repositories (Kitchin, Collins and Frost, 2015) and identified four funding streams for Persée. The French MESRI covers the core operational costs through an annual grant. The members of the joint service unit pool human resources and funding, and facilitate capacity building.

• Philanthropic revenue
Persée applies for endowments from private companies.

• Research grants
Persée applies for research grants from national and European sources, with overheads used to fund core services.

ABSTRACTS
A wide range of initiatives for developing research and data infrastructures have been funded in recent years. There is growing concern within the academic community about maintaining the resources invested beyond the period of the original research funding. While technical progress has been made in preserving the data themselves, few conceptual or operational solutions exist for the institutions that create, disseminate, curate and preserve the data. How can their existence be ensured over the medium or long term? This paper is a case study: it addresses the sustainability issues faced by Persée, a French platform dedicated to digitized documentary heritage that was launched in 2003. Through this example, the aim is to present, in practical terms, how an organization has to adapt and change in order to sustain itself over time. Persée tested and combined various mechanisms (technical actions, user involvement, organizational evolution, marketing, funding models), with reciprocal influence, to achieve sustainability. Rather than a steady state, ensuring the long-term existence of a data infrastructure is an ongoing and resource-intensive process.