# MiRNA and co: Methodologically exploring the world of small RNAs

BAMBOO - An algorithmic view on genomes, cells, and environments
Inria Grenoble - Rhône-Alpes, LBBE - Laboratoire de Biométrie et Biologie Evolutive - UMR 5558
Abstract : The main contribution of this thesis is the development of a reliable, robust, and much faster method for the prediction of pre-miRNAs, called Mirinho. With this method, we aimed mainly at two goals: efficiency and flexibility. Efficiency comes from the use of a quadratic algorithm: because most predictors use a cubic algorithm to verify the pre-miRNA hairpin structure, they may take too long when the input is large. Flexibility relies on two aspects, the input type and the organism clade. Mirinho accepts as input either a genome sequence or small RNA sequencing (sRNA-seq) data, from both animal and plant species. To change from one clade to another, it suffices to change the lengths of the stem-arms and of the terminal loop. Concerning the prediction of plant miRNAs, because their pre-miRNAs are longer, the existing methods for extracting the hairpin secondary structure are not as accurate as they are for shorter sequences. With Mirinho, we also addressed this problem, which enabled us to provide pre-miRNA secondary structures that are more similar to the ones in miRBase than those produced by the other available methods.

Mirinho served as the basis for two other issues we addressed. The first issue led to the treatment and analysis of sRNA-seq data of Acyrthosiphon pisum, the pea aphid. The goal was to identify the miRNAs that are expressed during the four developmental stages of this species, allowing further biological conclusions concerning the regulatory system of this organism. For this analysis, we developed a complete pipeline, called MirinhoPipe, in which Mirinho is integrated as the final step. We then moved on to the second issue, which involved problems related to the prediction and analysis of non-coding RNAs (ncRNAs) in the bacterium Mycoplasma hyopneumoniae. A method, called Alvinho, was thus developed for the prediction of targets in this bacterium, together with a pipeline for the segmentation of a numerical sequence and the detection of conservation among ncRNA sequences using a $k$-partite graph.

We finally addressed a problem related to motifs, that is, patterns that may be composed of one or more parts, that appear conserved in a set of sequences, and that may correspond to functional elements. This problem had already been addressed by a robust method called Smile. However, depending on the input parameters, the output may be too large to be tractable, as was observed in other work by the team. We therefore presented some clustering solutions to group the motifs that may correspond to the same biological element, and thus to better distinguish the biologically significant motifs from the noise that may be present in the often large outputs of many motif extraction algorithms.
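The abstract contrasts a quadratic hairpin check with the cubic folding algorithms used by most predictors, and says that switching clades only requires changing the stem-arm and terminal-loop lengths. The sketch below illustrates that general idea only; it is not Mirinho's algorithm, which the abstract does not describe. It assumes a fixed-size window split into a 5' arm, a loop, and a 3' arm, and scores the window with an LCS-style dynamic program that is quadratic in the arm length. The names `HairpinShape`, `max_stem_pairs`, `hairpin_candidates`, and the `min_pairs` threshold are hypothetical, and the lengths used in the example are placeholders rather than real clade settings.

```python
from dataclasses import dataclass

# Watson-Crick pairs plus the G-U wobble pair, on the RNA alphabet.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


@dataclass(frozen=True)
class HairpinShape:
    """Hypothetical clade settings: length of one stem-arm and of the terminal loop."""
    arm_len: int
    loop_len: int


def max_stem_pairs(five_prime: str, three_prime: str) -> int:
    """Return the maximum number of nested base pairs between the two arms.

    LCS-style dynamic program running in O(len(five_prime) * len(three_prime)).
    The 3' arm is read backwards so that the 5' arm faces it base by base;
    skipping a base on either arm models a bulge or mismatch.
    """
    rev = three_prime[::-1]
    prev = [0] * (len(rev) + 1)
    for a in five_prime:
        cur = [0]
        for j, b in enumerate(rev, start=1):
            best = max(prev[j], cur[j - 1])          # leave a or b unpaired
            if (a, b) in PAIRS:
                best = max(best, prev[j - 1] + 1)    # pair a with b
            cur.append(best)
        prev = cur
    return prev[-1]


def hairpin_candidates(seq: str, shape: HairpinShape, min_pairs: int):
    """Yield (start, pairs) for each window whose arms form at least min_pairs pairs."""
    seq = seq.upper().replace("T", "U")  # accept DNA input as well
    window = 2 * shape.arm_len + shape.loop_len
    for start in range(len(seq) - window + 1):
        five = seq[start:start + shape.arm_len]
        three = seq[start + shape.arm_len + shape.loop_len:start + window]
        pairs = max_stem_pairs(five, three)
        if pairs >= min_pairs:
            yield start, pairs
```

For instance, `list(hairpin_candidates("GGGGAAAACCCC", HairpinShape(arm_len=4, loop_len=4), min_pairs=4))` returns `[(0, 4)]`. Under this sketch, moving to a clade with longer precursors would amount to passing a different `HairpinShape`; suitable values would have to come from the thesis itself or from curated data such as miRBase.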
Document type : Theses

Cited literature [162 references]

https://hal.archives-ouvertes.fr/tel-01096833
Contributor : Marie-France Sagot
Submitted on : Tuesday, January 13, 2015 - 11:27:03 AM
Last modification on : Monday, October 19, 2020 - 11:03:13 AM
Long-term archiving on : Tuesday, April 14, 2015 - 10:15:54 AM

### Identifiers

• HAL Id : tel-01096833, version 1

### Citation

Susan Higashi. MiRNA and co: Methodologically exploring the world of small RNAs. Bioinformatics [q-bio.QM]. Université Claude Bernard Lyon 1, 2014. English. ⟨tel-01096833⟩
