Journal articles

TCS: A new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction

Abstract: Multiple sequence alignment (MSA) is a key modeling procedure when analyzing biological sequences. Homology and evolutionary modeling are the most common applications of MSAs. Both are known to be sensitive to the underlying MSA accuracy. In this work we show how this problem can be partly overcome using the transitive consistency score (TCS), an extended version of the T-Coffee scoring scheme. Using this local evaluation function we show that one can identify the most reliable portions of an MSA, as judged from BAliBASE and PREFAB structure-based reference alignments. We also show how this measure can be used to improve phylogenetic tree reconstruction using both an established simulated dataset and a novel empirical yeast dataset. For this purpose, we describe a novel lossless alternative to site filtering that involves over-weighting the trustworthy columns. Our approach relies on the T-Coffee framework; it uses libraries of pairwise alignments to evaluate any third-party MSA. Pairwise projections can be produced using fast or slow methods, thus allowing a trade-off between speed and accuracy. We compared TCS to HoT, GUIDANCE, Gblocks and trimAl and found that it leads to significantly better estimates of structural accuracy as well as more accurate phylogenetic trees. Availability: TCS is part of the T-Coffee package, freeware whose open source code can be downloaded from http://www.tcoffee.org/Packages/Stable/Latest; a web server is also available at http://tcoffee.crg.cat/tcs.
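The idea of over-weighting trustworthy columns instead of filtering sites can be illustrated with a minimal sketch. It is not the T-Coffee implementation: the function name `overweight_columns`, the integer per-column scores, and the floor of one copy per column (so that no site is discarded) are all assumptions made for illustration.

```python
from typing import Dict, List


def overweight_columns(alignment: Dict[str, str],
                       column_scores: List[int]) -> Dict[str, str]:
    """Return a new alignment in which column i is repeated
    max(1, column_scores[i]) times.

    Hypothetical helper: the scores stand in for per-column
    reliability values; the floor of 1 is an assumption that keeps
    every original column, so nothing is filtered out.
    """
    for name, seq in alignment.items():
        if len(seq) != len(column_scores):
            raise ValueError(
                f"sequence {name!r} has length {len(seq)}, "
                f"expected {len(column_scores)}"
            )
    weighted = {}
    for name, seq in alignment.items():
        # Repeat each residue (or gap) according to its column's score.
        weighted[name] = "".join(
            residue * max(1, score)
            for residue, score in zip(seq, column_scores)
        )
    return weighted


if __name__ == "__main__":
    toy = {"seqA": "MK-L", "seqB": "MKAL"}   # toy two-sequence alignment
    scores = [3, 1, 0, 2]                    # made-up reliability scores
    for name, seq in overweight_columns(toy, scores).items():
        print(name, seq)
```

The weighted alignment can then be passed to any tree-building program unchanged; columns with higher scores simply contribute more sites, while low-scoring columns are retained once.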

https://hal.archives-ouvertes.fr/hal-00977584
Contributor: Jia Ming Chang
Submitted on: Friday, April 11, 2014 - 12:31:41 PM
Last modification on: Monday, October 11, 2021 - 1:22:23 PM
Long-term archiving on: Friday, July 11, 2014 - 12:26:13 PM

File

FullManuscript_TCS_Figures.pdf
Files produced by the author(s)

Citation

Jia Ming Chang, Paolo Di Tommaso, Cedric Notredame. TCS: A new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction. Molecular Biology and Evolution, Oxford University Press (OUP), 2014. ⟨10.1093/molbev/msu117⟩. ⟨hal-00977584⟩
