Software Heritage: Why and How to Preserve Software Source Code

Abstract : Software is now a key component of every aspect of our society, and its preservation has attracted growing attention within the digital preservation community in recent years. We claim that source code, the only representation of software that contains human-readable knowledge, is a precious digital object that needs special handling: it must be a first-class citizen in the preservation landscape, and we need to take action immediately, given the increasingly frequent incidents that result in permanent losses of source code collections. In this paper we present Software Heritage, an ambitious initiative to collect, preserve, and share the entire corpus of publicly accessible software source code. We discuss the archival goals of the project, its use cases, and its role as a participant in the broader digital preservation ecosystem, and we detail its key design decisions. We also report on the project roadmap and on the current status of the Software Heritage archive, which, as of early 2017, has collected more than 3 billion unique source code files and 700 million commits coming from more than 50 million software development projects.

https://hal.archives-ouvertes.fr/hal-01590958
Contributor : Stefano Zacchiroli
Submitted on : Wednesday, September 20, 2017 - 2:57:41 PM
Last modification on : Friday, January 4, 2019 - 5:33:38 PM

Identifiers

  • HAL Id : hal-01590958, version 1

Citation

Roberto Di Cosmo, Stefano Zacchiroli. Software Heritage: Why and How to Preserve Software Source Code. iPRES 2017 - 14th International Conference on Digital Preservation, Sep 2017, Kyoto, Japan. pp.1-10. ⟨hal-01590958⟩
