Revealing Historical Events out of Web Archives

Quentin Lobbé 1, 2
1 VALDA - Value from Data
DI-ENS - Département d'informatique de l'École normale supérieure, Inria de Paris
Abstract : As the living Web expands, the worldwide volume of Web archives constantly increases, making it difficult to identify relevant archived content. Here we propose an application for detecting historical events in a corpus of Web archives, based on an entity called the Web fragment: a semantic and syntactic subset of a given Web page. A Web fragment is distinctive in that it is indexed by its edition date rather than its archiving date. We apply our framework to an archived Moroccan forum and observe how it reacted to the Arab Spring at the end of 2010.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01895951
Contributor : Quentin Lobbé
Submitted on : Monday, October 15, 2018 - 4:16:19 PM
Last modification on : Monday, February 25, 2019 - 11:56:01 AM
Long-term archiving on : Wednesday, January 16, 2019 - 3:49:31 PM

File

camera-ready_4.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01895951, version 1

Citation

Quentin Lobbé. Revealing Historical Events out of Web Archives. 22nd International Conference on Theory and Practice of Digital Libraries (TPDL 2018), Sep 2018, Porto, Portugal. ⟨hal-01895951⟩
