Integrative omics data analysis to discover novel signatures in complex diseases
Abstract: Apart from diseases caused by a defect in a single gene, most diseases are highly complex and usually arise from a combination of biological and environmental factors. Cellular processes are often tightly connected across the molecular layers of the central dogma of biology, so examining a single layer is not sufficient to explain disease pathology, and the conclusions drawn from it can be limited. Combining biological observations from multiple layers broadens our perspective on the disease in question and may lead to novel discoveries that could not be deduced from a single-omics perspective. In this thesis, we focused on three aims: method development for single-cell transcriptomics to address the prime bias problem introduced by the new droplet-based technologies; integrative omics discovery of genomic signatures specific to different brain regions in normal individuals; and the use of multiple omics to identify potential biomarkers specific to the prognosis and diagnosis of amyotrophic lateral sclerosis (ALS). Research has been revolutionized by the advent of single-cell omics technologies in the past few decades, and new methods and tools have been developed to keep pace with them. These innovations, however, pose new challenges and can introduce bias and unforeseen problems if left unaddressed. Specifically, the prime bias problem introduced by the currently popular droplet-based single-cell sequencing technologies may lead to biased quantification. To resolve it, in Study I we presented a novel transcript quantification tool for droplet-based single-cell RNA-sequencing (scRNA-Seq) technologies and benchmarked it against other popular transcript and gene quantification tools. Our tool outperformed these tools in both transcript- and gene-level quantification.
In Study II, we investigated the association between splicing variants and genetic patterns in different regions of the brain in normal individuals, in order to identify quantitative trait loci (QTL) associated with the ratios of isoform expression within genes. We carried out genome-wide association studies (GWAS) on isoform ratios from 13 brain regions and identified isoform-ratio QTL (irQTL) specific to each brain region, together with their associated traits, which could have been missed by expression QTL derived from gene-level expression. In Study III, we used proteomics and genomics data for ALS to understand disease pathology from multiple perspectives and to identify potential protein biomarkers and protein QTL (pQTL) specific to different stages of the disease and to different tissue sites. In terms of proteomics, for each tissue site we identified potential protein biomarkers specific to disease prognosis, the survival of ALS patients, functional decline among ALS patients, and longitudinal changes after diagnosis. In terms of integrative omics, we performed GWAS of protein expression levels using genotyping data and identified tissue-site-specific pQTL signatures for ALS patients. In summary, our studies developed a single-cell transcript quantification tool that addresses potential bias problems with improved performance; identified novel irQTL signatures specific to various brain regions using an integrative omics approach; and discovered potential protein and genetic signatures for different tissue sites and pathological stages of ALS using multiple omics. We hope that our work can support methods development across omics research and that the novel signatures can serve as a resource for further research ideas and experimental validation.