Computational methods for analysis of fragmented sequence data

This is a thesis from Chalmers University of Technology

Abstract: Recent developments in genomic and proteomic sequencing technologies have revolutionized research in life sciences, providing new opportunities for the study of biological systems. However, modern sequence data sets are large, diverse, and heavily fragmented, which presents new challenges for their analysis and interpretation. In this thesis we present six research papers that describe novel methods for studying bacteria and bacterial communities through the analysis of large data sets produced by modern DNA and protein sequencing technologies. In Paper I, we describe a method for discovering fragments of fluoroquinolone antibiotic resistance genes in short fragments of DNA. The resistance phenotypes of the predicted resistance genes were then validated by expression in an Escherichia coli host (Paper II). The method was further improved to handle larger and more fragmented data sets in Paper III. In Paper IV, we present Tentacle, an easy-to-use tool for high-performance gene quantification in metagenomes that can be run on distributed computing resources to enable fast and efficient gene quantification in terabase-sized metagenomes. In Paper V, we introduce proteotyping, an approach for microbial identification in clinical samples based on shotgun proteomics. Finally, in Paper VI we describe and evaluate a method for proteotyping analysis suited for application to clinical diagnostics of bacterial infections. The rapidly increasing volumes of data produced by new sequencing technologies provide new opportunities for understanding microbial biology. Unlocking the full potential of large sequence data sets requires novel methods and approaches such as those presented in this thesis.
