Towards Precision Medicine: Exploiting Genetic Variation in Tumours by Inferring Multitype Gene Regulatory Networks

Abstract: Precision medicine aims to tailor treatment to a patient based on measured genetic or other molecular diagnostic data. In cancer, the optimal medical treatment depends on how far the disease has progressed, the type and subtype of cancer, and the individual tumour's circumstances. Finding large-scale genome-wide models for cancer tumours comes with several challenges including, but not limited to, the relatively small size of the cohort compared to the vast number of genes. In this licentiate thesis, methods were developed, using DNA copy number aberration (CNA), mRNA expression data and survival data, to address some of the issues along the path towards useful large-scale models. In paper I, two related models were suggested that incorporate these data types. To allow large-scale computations, a new LASSO solver based on cyclic coordinate descent was coded in C/BLAS for both R and Matlab. A set of validation techniques was used, and the solutions to the models identified both previously known genes involved in cancer and new candidate targets for intervention, predicted survival length and further elucidated the connectome. One of these candidate targets was verified in vitro. In paper II, the techniques and the software developed in paper I were further refined in the form of R packages and presented in a book chapter as a hands-on tutorial. In paper III, efforts were made to increase the likelihood of reproducibility and to save both human and machine time in calculations and report writing. Splitting calculations into interdependent blocks and caching computations results in reports that update dynamically when the data or the analysis changes. This allows certain in silico issues in the reproducibility process to be mitigated. In paper IV, a model was developed to find similarities and differences between cancer types or subtypes.
Potential benefits are to further elucidate the workings of the gene regulatory networks in cancer for multiple cohort clusters at any granularity. This is done by exploiting the accumulated statistical strength of coinciding cross-type subnetworks, which increases the available sample size while keeping the resolution at a type- or subtype-specific level. Known genes relevant to cancer appear in the models, and the inferred networks disclose candidate hub genes, connections of interest for candidate sub-hub interventions, and predictors of survival that are important for selection of therapy. A generalization to pairwise fused LASSO was used as a model, and a solver was implemented using Split Bregman optimization and parallel computations in C/BLAS. I conclude that the tools and models presented may aid in accelerating the systems biology loop and provide insights into the biology of cancers, be it by type, by subtype or, as more data come in, for even smaller groups.
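
For readers unfamiliar with the cyclic coordinate descent approach to LASSO mentioned for paper I, the following is a minimal illustrative sketch in pure Python. It is not the thesis's C/BLAS solver. The function names `soft_threshold` and `lasso_ccd`, the objective scaling (1/(2n))·||y − Xb||² + lam·||b||₁, and the stopping rule are all assumptions made for this example.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z towards zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_ccd(X, y, lam, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.

    X is a list of n rows with p entries each, y a list of n responses.
    Hypothetical helper for illustration only (no intercept, no scaling).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta because beta starts at zero
    # Mean squared column norms, reused in every coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):  # visit the coordinates in a fixed cyclic order
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot explain anything
            # Covariance of column j with the partial residual
            # (the residual with coordinate j's own contribution added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:  # assumed stopping rule: coefficients stopped moving
            break
    return beta


if __name__ == "__main__":
    # Toy data with two orthogonal predictors; the second has a weak signal.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 0.1, 2.0, 0.1]
    print(lasso_ccd(X, y, lam=0.1))
```

In this toy example the penalty shrinks the first coefficient and sets the weakly supported second coefficient exactly to zero. That sparsity is the property that makes LASSO attractive when genes far outnumber samples.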
