Identification of novel antibiotic resistance genes through large-scale data analysis

Abstract: Antibiotic resistance is increasing worldwide and is considered a serious threat to public health by, among others, the World Health Organization. Antibiotic resistance genes are hypothesized to originate from harmless bacteria in and around us, from which they are horizontally transferred into human pathogens. It is therefore of great importance to explore human-associated and environmental bacterial communities to identify novel antibiotic resistance genes before they reach clinical settings. The three papers presented in this thesis aim to identify novel antibiotic resistance genes in large genomic and metagenomic datasets. In paper I, the aim was to identify novel genes of the clinically important subclass B1 metallo-β-lactamases. By analyzing whole bacterial genomes as well as metagenomes from environmental and human-associated bacterial communities, 76 novel putative B1 genes were predicted. Twenty-one of these were selected for experimental validation, of which 18 expressed the predicted phenotype in E. coli. Phylogenetic analysis revealed that the novel genes formed 59 previously undescribed gene families. In paper II, a large volume of genomic and metagenomic data was searched for novel plasmid-mediated quinolone resistance (qnr) genes. In total, 611 qnr genes were predicted, of which 20 were putatively novel. Nine of these were experimentally tested in E. coli, of which eight expressed the predicted phenotype. In paper III, a new method for identification and reconstruction of novel antibiotic resistance genes from fragmented metagenomic data was presented. The method is based on gene-specific models, which are optimized for high sensitivity and specificity. The method is furthermore computationally efficient and can be applied to any class of resistance genes. The results of this thesis provide deeper insight into the diversity and evolutionary history of two types of clinically relevant antibiotic resistance genes. The thesis also provides new methods for efficient and reliable identification of novel resistance genes in fragmented metagenomic data.