Tournesol: A quest for a large, secure and trustworthy database of reliable human judgments

Lê-Nguyên Hoang; Louis Faucon; Aidan Jungo; Sergei Volodin; Dalia Papuc; Orfeas Liossatos; Ben Crulis; Mariame Tighanimine; Isabela Constantin; Anastasiia Kucherenko; Alexandre Maurer; Felix Grimberg; Vlad Nitu; Chris Vossen; Sébastien Rouault; El-Mahdi El-Mhamdi

Preprint, Working Paper. Year: 2021

Lê-Nguyên Hoang

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Louis Faucon

Role: Author

Tournesol Association

Aidan Jungo

Role: Author

Tournesol Association

Sergei Volodin

Role: Author

Tournesol Association

Dalia Papuc

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Orfeas Liossatos

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Ben Crulis

Role: Author

Université de Tours

Mariame Tighanimine

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Isabela Constantin

Role: Author

Tournesol Association

Anastasiia Kucherenko

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Alexandre Maurer

Role: Author

Université Mohammed VI Polytechnique [Ben Guerir]

Felix Grimberg

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Vlad Nitu

Role: Author
PersonId : 749417
IdHAL : vlad-nitu
ORCID : 0000-0002-7996-3963
IdRef : 225604760

Centre National de la Recherche Scientifique

Chris Vossen

Role: Author

Tournesol Association

Sébastien Rouault

Role: Author

Ecole Polytechnique Fédérale de Lausanne

El-Mahdi El-Mhamdi

Role: Author
PersonId : 751506
IdHAL : el-mhamdi
ORCID : 0000-0001-5041-1260

Département d'informatique de l'École polytechnique

Abstract

Today's large-scale algorithms have become immensely influential, as they recommend and moderate the content that billions of humans are exposed to on a daily basis. These algorithms are the de-facto regulators of the information diet of billions of humans, from shaping opinions on public health information to organizing groups for social movements. This creates serious concerns, but also great opportunities to promote quality information [Hoa20, HFE21]. Addressing the concerns and seizing the opportunities is a challenging, enormous and fabulous endeavor [HE19], as intuitively appealing ideas often come with unforeseen unwanted side effects [EMH21], and as it requires us to think about what we truly and deeply prefer [Soa15]. To make progress, it is critical to understand how today's large-scale algorithms are built, and to determine what interventions will be most effective. Given that these algorithms rely heavily on machine learning, we make the following key observation: any algorithm trained on uncontrolled data must not be trusted. Indeed, a malicious entity could take control over the data, poison it with dangerously misleading or manipulative fabricated inputs, and thereby make the trained algorithm extremely unsafe. We thus argue that the first step towards safe and ethical large-scale algorithms must be the collection of a large, secure and trustworthy dataset of reliable human judgments. To achieve this, we introduce Tournesol, an open source platform available at https://tournesol.app. Tournesol aims to collect a large database of human judgments on what algorithms ought to widely recommend (and what algorithms ought to stop widely recommending). In this paper, we outline the structure of the Tournesol database, the key features of the Tournesol platform and the main hurdles that must be overcome to make it a successful project. Most importantly, we argue that, if successful, Tournesol may then serve as the essential foundation for any safe and ethical large-scale algorithm.

Domains

Computer Science [cs]

Main file

Tournesol__White_Paper_.pdf (16.58 MB)

Origin: Files produced by the author(s)

Contributor: Vlad Nitu

https://hal.science/hal-03390514

Submitted on: Thursday, October 21, 2021, 13:59:42

Last modified on: Monday, February 5, 2024, 17:04:55

Long-term archiving on: Saturday, January 22, 2022, 19:04:59

Dates and versions

hal-03390514 , version 1 (21-10-2021)

Identifiers

HAL Id: hal-03390514, version 1

Cite

Lê-Nguyên Hoang, Louis Faucon, Aidan Jungo, Sergei Volodin, Dalia Papuc, et al. Tournesol: A quest for a large, secure and trustworthy database of reliable human judgments. 2021. ⟨hal-03390514⟩

Collections

X UNIV-TOURS CNRS IP_PARIS

73 views

13 downloads