Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool

Abstract: Nucleotide sequence reagents are verifiable experimental reagents in biomedical publications, because their sequence identities can be independently verified and compared with their associated text descriptors. We have previously reported that incorrectly identified nucleotide sequence reagents are characteristic of highly similar human gene knockdown studies, some of which have been retracted from the literature on account of possible research fraud. Because manual verification of nucleotide sequences has limited throughput, we developed a semi-automated fact-checking tool, Seek & Blastn, to verify the targeting or non-targeting status of published nucleotide sequence reagents. From previously described and unknown corpora of 48 and 155 publications, respectively, Seek & Blastn correctly extracted 304/342 (88.9%) and 1066/1522 (70.0%) nucleotide sequences together with a predicted targeting/non-targeting status. Seek & Blastn correctly predicted the targeting/non-targeting status of 293/304 (96.4%) and 988/1066 (92.7%) of the correctly extracted nucleotide sequences. In the two literature corpora, 38/39 (97.4%) and 31/79 (39.2%) of Seek & Blastn predictions of incorrect nucleotide sequence reagent use, respectively, were correct. Combined Seek & Blastn and manual analyses identified a list of 91 misidentified nucleotide sequence reagents, which could be built upon through future studies. In summary, incorrect nucleotide sequence reagents represent an under-recognized source of error within the biomedical literature, and fact-checking tools such as Seek & Blastn may help to identify papers and manuscripts affected by these errors.
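The rates quoted in the abstract can be recomputed from the counts it reports. The short Python sketch below is only an arithmetic check, not part of Seek & Blastn: the dictionary layout, the labels "known" and "unknown" for the two corpora, and the helper name `percentage` are assumptions made for illustration.

```python
# Counts copied from the abstract as (correct, total) pairs.
# "known" = previously described corpus (48 publications),
# "unknown" = previously unknown corpus (155 publications).
# These labels and this layout are illustrative assumptions.
reported = {
    "extraction": {"known": (304, 342), "unknown": (1066, 1522)},
    "status prediction": {"known": (293, 304), "unknown": (988, 1066)},
    "flagged as incorrect": {"known": (38, 39), "unknown": (31, 79)},
}


def percentage(correct: int, total: int) -> float:
    """Return correct/total as a percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)


# Recompute every rate from its raw counts.
rates = {
    step: {corpus: percentage(*counts) for corpus, counts in corpora.items()}
    for step, corpora in reported.items()
}

for step, corpora in rates.items():
    for corpus, rate in corpora.items():
        print(f"{step:22s} {corpus:8s} {rate:5.1f}%")
```

Each recomputed value matches the percentage given in the abstract, including the much lower share of correct flags (39.2%) in the previously unknown corpus.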
Document type:
Journal article
PLoS ONE, Public Library of Science, 2019, 14 (3), pp.e0213266. 〈10.1371/journal.pone.0213266〉

https://hal.archives-ouvertes.fr/hal-02057036
Contributor: Cyril Labbé
Submitted on: Tuesday, 5 March 2019 - 09:08:02
Last modified on: Friday, 15 March 2019 - 01:04:00

File

journal.pone.0213266.pdf
Publisher files allowed on an open archive

Citation

Cyril Labbé, Natalie Grima, Thierry Gautier, Bertrand Favier, Jennifer Byrne. Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool. PLoS ONE, Public Library of Science, 2019, 14 (3), pp.e0213266. 〈10.1371/journal.pone.0213266〉. 〈hal-02057036〉
