Bioinformatic Methods in Metagenomics

Abstract: Microbial organisms are a vital part of our global ecosystem, yet our knowledge of them is still lacking. Direct sequencing of microbial communities, i.e. metagenomics, has enabled detailed studies of these microscopic organisms by inspection of their DNA sequences without the need to culture them. Furthermore, the development of modern high-throughput sequencing technologies has made this approach more powerful and cost-effective. Taken together, this has shifted the field of microbiology from being centered on microscopy and culturing studies to consisting largely of computational analyses of DNA sequences. One such computational analysis, which is the main focus of this thesis, aims to reconstruct the complete DNA sequence of an organism, i.e. its genome, directly from short metagenomic sequences.

This thesis consists of an introduction to the subject followed by five papers. Paper I describes a large metagenomic data resource spanning the Baltic Sea microbial communities. This dataset is complemented by a web interface that allows researchers to easily extract and visualize detailed information. Paper II introduces a bioinformatic method, termed CONCOCT, that reconstructs genomes from metagenomic data. The method is applied to Baltic Sea metagenomic data in Paper III and Paper V, which enabled the reconstruction of a large number of genomes. Analysis of these genomes in Paper III led to the proposal of, and evidence for, a global brackish microbiome. Paper IV presents a comparison between genomes reconstructed from metagenomes and single-cell sequenced genomes. This further validated the technique presented in Paper II, as it was found to produce larger and more complete genomes than single-cell sequencing.