Predicting the route: from protein sequence to sorting in eukaryotic cell

Abstract: Proteins must be localised in the correct compartment of a eukaryotic cell to function correctly, so each protein needs to be transported to the right location. Specific signals present in the protein sequence direct proteins to different subcellular localisations. Correct transport is essential for the life of the cell, whereas errors during transport can cause irreversible damage and interfere with the activities of surrounding proteins. For more than 30 years, the development of methods to identify the localisation of proteins, using both experimental and computational approaches, has been an important research area. The objective of this thesis is to develop better computational methods for classifying the subcellular localisation of eukaryotic proteins. I first describe the development of a consensus method, SubCons, which improves the prediction of subcellular localisation for human proteins. Next, I present the SubCons web-server, together with an additional benchmark using protein annotations from novel mass-spectrometry studies in two eukaryotic organisms, Mus musculus and Drosophila melanogaster. Then, I present the new version of TargetP and show how deep learning can improve the identification of N-terminal sorting signals by focusing on relevant biological signatures. Finally, I describe the development of a novel method for sub-nuclear localisation prediction, and show that the performance of a deep convolutional neural network improves when using a dataset augmented with homologous proteins.
