Resolving deep nodes of eukaryote phylogeny

Abstract: My thesis aims to resolve deep nodes in the eukaryote tree of life (eToL) by developing new datasets and new approaches to analysing them. In Paper I, I describe a dataset of 76 universal eukaryotic proteins of bacterial descent (euBacs), assembled to test the relationships among the three main divisions of mitochondriate eukaryotes (Amorphea, Diaphoretickes and Discoba). I developed two protocols to identify problematic data. The conJac protocol analyses data by jackknifing to detect outlier sequences, while conWin uses a sliding window to find sequence fragments of potentially foreign origin. Phylogenetic analyses of the 76 euBacs, with and without conWin or conJac filtering, place Discoba as the sister group to Amorphea and Diaphoretickes. The results are largely consistent and highly supported under various evolutionary models, except for highly complex CAT models. In Paper II, I describe a dataset of 198 universal eukaryote proteins of archaeal ancestry (euArcs), which includes the remaining eukaryotes, informally referred to as amitochondriate excavates. These were excluded from the previous study because they lack euBacs. Phylogenetic analyses of the euArc dataset place the amitochondriate excavates as the first three branches of eToL, followed by Discoba, the only mitochondriate excavates, which appear as the sister group to the remaining eukaryotes. I also developed a protocol that uses predicted protein structures to improve model fit without inflating the parameter space, allowing me to conduct a series of control analyses that further support the multi-excavate root. In Paper III, I describe a new application of reciprocal rooting using concatenated sequences, which I then use to test the euArc root. I also developed two sampling protocols unique to this kind of data. The protocols confirm the multi-excavate euArc root, which indicates that eukaryotes arose from an excavate ancestor.
Paper IV describes a follow-up on the conWin results from Paper I. These show moderate to strong support for mosaicism in 16 euBac proteins from diverse metabolic pathways and donor lineages. In summary, this thesis presents a novel root for the eukaryote tree of life. The new root requires revision of fundamental theories of eukaryote evolution, including the source and timing of mitochondrial origins. The methods I have developed are applicable to many different kinds of phylogenetic studies, and the new protein structure model should make these analyses faster, more flexible, and more widely available.
