Using Data-Display Networks for Exploratory Data Analysis in Phylogenetic Studies
Abstract
Exploratory data analysis (EDA) is a frequently undervalued part of data analysis in biology. It involves evaluating the characteristics of the data before proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a data-display network. This type of network is designed to display any character (or tree) conflict in a dataset, without prior assumptions about the causes of those conflicts. The conflicts might be caused by (a) methodological issues in data collection or analysis, (b) homoplasy, or (c) horizontal gene flow of some sort. Here, I explore 13 published datasets using splits networks, as examples of using data-display networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. For every dataset, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has to date.
Subject areas
Molecular biology
Source: Files produced by the author(s)