Conference paper, Year: 2018

Searching for Truth in a Database of Statistics

Abstract

The proliferation of falsehood and misinformation, in particular through the Web, has led to increasing energy being invested in journalistic fact-checking. Fact-checking journalists typically check the accuracy of a claim against some trusted data source. Statistical databases such as those compiled by state agencies are often used as trusted data sources, as they contain valuable, high-quality information. However, their usability is limited when they are shared in a format such as HTML or spreadsheets: this makes it hard to find the most relevant dataset for checking a specific claim, or to quickly extract from a dataset the best answer to a given query. We present a novel algorithm enabling the exploitation of such statistical tables, by (i) identifying the statistical datasets most relevant for a given fact-checking query, and (ii) extracting from each dataset the best specific (precise) query answer it may contain. We have implemented our approach and experimented on the complete corpus of statistics obtained from INSEE, the French national statistics institute. Our experiments and comparisons demonstrate the effectiveness of our proposed method.
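The two steps named in the abstract, (i) finding the most relevant datasets and (ii) extracting the most specific answer from each, can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not detail: it assumes plain keyword overlap as the relevance score, and the names `tokens`, `rank_datasets`, `best_cell`, and `DATASETS`, along with the table layout and the cell values, are hypothetical placeholders invented for illustration.

```python
import re

# Toy corpus: titles, headers and cell values are made-up placeholders,
# not real INSEE figures.
DATASETS = [
    {"title": "Unemployment rate by region",
     "rows": ["Bretagne", "Normandie"], "cols": ["2016", "2017"],
     "cells": [[1.0, 2.0], [3.0, 4.0]]},
    {"title": "Population by age group",
     "rows": ["0-19", "20-59"], "cols": ["2016", "2017"],
     "cells": [[5.0, 6.0], [7.0, 8.0]]},
]

def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rank_datasets(query, datasets):
    """Step (i): order datasets by how many query tokens appear in their
    title or headers (assumed scoring), dropping datasets with no match."""
    q = tokens(query)
    scored = []
    for ds in datasets:
        vocab = tokens(ds["title"])
        for header in ds["rows"] + ds["cols"]:
            vocab |= tokens(header)
        scored.append((len(q & vocab), ds))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ds for score, ds in scored if score > 0]

def best_cell(query, ds):
    """Step (ii): return (row header, column header, value) for the cell
    whose headers share the most tokens with the query, or None."""
    q = tokens(query)
    best = None
    for i, row in enumerate(ds["rows"]):
        for j, col in enumerate(ds["cols"]):
            score = len(q & tokens(row)) + len(q & tokens(col))
            if best is None or score > best[0]:
                best = (score, row, col, ds["cells"][i][j])
    return best[1:] if best and best[0] > 0 else None

if __name__ == "__main__":
    query = "unemployment rate Normandie 2017"
    for ds in rank_datasets(query, DATASETS):
        print(ds["title"], best_cell(query, ds))
```

A real system would need more than token overlap, for example handling of synonyms, header hierarchies, and French-language text, but the sketch shows how dataset ranking and cell extraction are separate stages that share the same query.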
Main file
paper-hal.pdf (620.4 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01745768, version 1 (28-03-2018)

Identifiers

  • HAL Id: hal-01745768, version 1

Cite

Tien-Duc Cao, Ioana Manolescu, Xavier Tannier. Searching for Truth in a Database of Statistics. WebDB 2018 - 21st International Workshop on the Web and Databases, Jun 2018, Houston, United States. pp.1-6. ⟨hal-01745768⟩
