Discovering novelty in sequential patterns: application for analysis of microarray data on Alzheimer disease

Abstract: Analyzing microarray data remains a major challenge because existing methods produce large volumes of unhelpful results. We propose a new method, NoDisco, for discovering novelty in gene sequences obtained by applying data-mining techniques to microarray data. Method: We identify popular genes, which are often cited in the literature, and innovative genes, which are linked to the popular genes in the sequences but are not mentioned in the literature. We also identify popular and innovative sequences containing these genes. Biologists can then select interesting sequences from the two sets and obtain the k-best documents. Results: We show the efficiency of this method by applying it to real data used to decipher the mechanisms underlying Alzheimer disease. Conclusion: The initial selection of sequences based on popularity and innovation helps experts focus on relevant sequences, while the top-k documents help them understand the sequences.
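
The distinction between popular and innovative genes described in the abstract can be illustrated with a minimal Python sketch. This is not the NoDisco implementation: the abstract does not give the scoring rules, so the function names (`classify_genes`, `select_sequences`), the citation-count threshold, the reading of "not mentioned" as a count of zero, the representation of a sequence as a flat list of gene names, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def classify_genes(
    sequences: List[List[str]],
    citation_counts: Dict[str, int],
    popularity_threshold: int = 10,  # hypothetical cut-off, not from the paper
) -> Tuple[Set[str], Set[str]]:
    """Split genes into 'popular' and 'innovative' sets (illustrative rule only)."""
    all_genes = {gene for seq in sequences for gene in seq}
    # Assumption: a gene is popular when its literature count reaches the threshold.
    popular = {
        g for g in all_genes if citation_counts.get(g, 0) >= popularity_threshold
    }
    innovative: Set[str] = set()
    for seq in sequences:
        genes = set(seq)
        if genes & popular:
            # Assumption: "not mentioned in the literature" means a count of zero,
            # and "linked" means co-occurring with a popular gene in one sequence.
            innovative |= {g for g in genes if citation_counts.get(g, 0) == 0}
    return popular, innovative


def select_sequences(sequences: List[List[str]], genes: Set[str]) -> List[List[str]]:
    """Return the sequences that contain at least one gene from `genes`."""
    return [seq for seq in sequences if set(seq) & genes]


# Hypothetical toy data: placeholder gene names and made-up citation counts.
sequences = [["G1", "G2"], ["G3", "G4"], ["G1", "G5"]]
citation_counts = {"G1": 50, "G2": 0, "G3": 2, "G4": 0, "G5": 12}

popular, innovative = classify_genes(sequences, citation_counts)
popular_sequences = select_sequences(sequences, popular)
innovative_sequences = select_sequences(sequences, innovative)
```

In this toy data, G4 has no citations but never co-occurs with a popular gene, so it is not counted as innovative. That mirrors the abstract's requirement that innovative genes be linked to popular ones.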

https://hal.archives-ouvertes.fr/hal-00558156
Contributor : Import Ws Irstea
Submitted on : Friday, January 21, 2011 - 10:40:55 AM
Last modification on : Wednesday, September 18, 2019 - 4:04:04 PM
Long-term archiving on : Friday, April 22, 2011 - 2:48:12 AM

File

MT2010-PUB00030587.pdf
Files produced by the author(s)

Citation

Sandra Bringay, Mathieu Roche, Maguelonne Teisseire, Pascal Poncelet, Ronza Abdel Rassoul, et al.. Discovering novelty in sequential patterns: application for analysis of microarray data on Alzheimer disease. MedInfo: Congress on Medical Informatics, Sep 2010, Cape Town, South Africa. pp.1314-1318. ⟨hal-00558156⟩