Mirinho: An efficient and general plant and animal pre-miRNA predictor for genomic and deep sequencing data

Abstract: Background: Several methods exist for the prediction of precursor miRNAs (pre-miRNAs) in genomic or sRNA-seq (small RNA sequences) data produced by NGS (Next Generation Sequencing). One key piece of information used for this task is the characteristic hairpin structure adopted by pre-miRNAs, which is generally identified using RNA folders whose complexity is cubic in the size of the input. The vast majority of pre-miRNA predictors then rely on further information learned from previously validated miRNAs from the same or a closely related genome for the final prediction of new miRNAs. In this paper, we address three main issues. The first is methodological: obtaining a more time-efficient predictor. The second is doing so without losing accuracy: we aimed to better predict miRNAs at genome scale, but also from sRNA-seq data where, in some cases (notably in plants), current folding methods often infer the wrong structure. The third is that it is important to rely as little as possible on previously recorded examples of miRNAs; we therefore also sought a method that is less dependent on previous miRNA records.

Results: Concerning the first and second issues, we present a novel alternative to a classical folder, based on a thermodynamic Nearest-Neighbour (NN) model, for computing the free energy and predicting the classical hairpin structure of a pre-miRNA. We show that the free energies thus computed correlate well with those of RNAFOLD. This novel method, called MIRINHO, has quadratic instead of cubic complexity and is also much more efficient in practice. When applied to sRNA-seq data from plants, it generally gives better results than classical folders. On the third issue, we show that MIRINHO, whose only prior knowledge is the length of the loops and stem-arms and the free energy of the pre-miRNA hairpin, compares well with algorithms that require more information. The results, obtained with different datasets, are similar to those of the other approaches with which a comparison was possible, namely publicly available software that could be run on a large input. In some cases, MIRINHO is even better in terms of sensitivity or precision.

Conclusion: We provide a simpler and much faster method with very reasonable sensitivity and precision, which can be applied without special adaptation to the prediction of both animal and plant pre-miRNAs, using as input either genomic sequences or sRNA-seq data.
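To give an intuition for why restricting the search to a single hairpin can bring the cost down from cubic to quadratic, the following is a minimal sketch and not the published MIRINHO algorithm. It assumes the position of the terminal loop is already known and aligns the two arms with a simple dynamic programme whose table has one cell per (5' arm base, 3' arm base) combination. The function name `hairpin_stem_score` and the scores `pair`, `mismatch` and `gap` are hypothetical placeholders standing in for a real Nearest-Neighbour energy model.

```python
# Toy illustration only: this is NOT the published Mirinho method,
# and the integer scores below are NOT Nearest-Neighbour free energies.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def hairpin_stem_score(seq, loop_start, loop_len, pair=2, mismatch=-1, gap=-2):
    """Score the best stem formed around a fixed terminal loop.

    seq[loop_start:loop_start + loop_len] is treated as the loop.
    Both arms are read from the loop outward and aligned so that the
    stem is anchored at the loop and may stop wherever the score peaks.
    Time and memory are O(n * m) for arms of length n and m.
    """
    seq = seq.upper().replace("T", "U")
    left = seq[:loop_start][::-1]        # 5' arm, loop-proximal base first
    right = seq[loop_start + loop_len:]  # 3' arm, loop-proximal base first
    n, m = len(left), len(right)

    # best[i][j]: best score using the first i bases of the 5' arm
    # and the first j bases of the 3' arm.
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap             # unpaired bases next to the loop
    for j in range(1, m + 1):
        best[0][j] = j * gap

    top = 0                              # an empty stem scores 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = pair if (left[i - 1], right[j - 1]) in PAIRS else mismatch
            best[i][j] = max(
                best[i - 1][j - 1] + s,  # pair or mismatch (internal loop)
                best[i - 1][j] + gap,    # bulge on the 5' arm
                best[i][j - 1] + gap,    # bulge on the 3' arm
            )
            top = max(top, best[i][j])
    return top
```

For example, `hairpin_stem_score("GGGGAAAACCCC", 4, 4)` scores a four-base-pair stem around the loop `AAAA`. A genome-scale predictor would additionally have to enumerate candidate loop positions and filter on loop and stem-arm lengths, which this sketch leaves out.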
Document type:
Journal article
BMC Bioinformatics, BioMed Central, 2015, 16 (1), pp.179. 〈10.1186/s12859-015-0594-0〉

https://hal.archives-ouvertes.fr/hal-01166487
Contributor: Archive Ouverte Prodinra
Submitted on: Monday, 22 June 2015 - 20:09:55
Last modified on: Tuesday, 20 November 2018 - 09:32:15
Document(s) archived on: Tuesday, 25 April 2017 - 19:20:59

Citation

Susan Higashi, Cyril Fournier, Christian Gautier, Christine Gaspin, Marie-France Sagot. Mirinho: An efficient and general plant and animal pre-miRNA predictor for genomic and deep sequencing data. BMC Bioinformatics, BioMed Central, 2015, 16 (1), pp.179. 〈10.1186/s12859-015-0594-0〉. 〈hal-01166487〉
