Deciphering HIV genetic variability and evolution by massive parallel pyrosequencing and bioinformatics

This is a dissertation from Stockholm: Karolinska Institutet, Dept of Microbiology, Tumor and Cell Biology

Abstract: HIV-1 has a highly variable genome and can therefore adapt to new environments, for example by escaping immune pressure and suboptimal antiretroviral treatment. Next-generation sequencing (NGS), especially ultra-deep pyrosequencing (UDPS), has enabled in-depth sequencing studies with previously unattainable resolution. However, the technology is more error-prone than traditional sequencing, which makes UDPS results challenging to interpret. In this thesis we carried out comprehensive work to identify, characterize and reduce errors and to investigate UDPS performance (Papers II, III and IV). In Papers IV and V we used UDPS to study HIV-1 minority variants. Novel primer design software was developed in Paper I, and a new method to tag molecules was developed and evaluated in Paper VI. Primer design is of special importance in NGS because selective amplification may skew estimates of variant frequencies. We developed a computer program, PrimerDesign, to meet the changed requirements for primer design. PrimerDesign is tailored to design primers from a multiple alignment and is suitable for all types of NGS that are preceded by amplification. The new Primer ID methodology has the potential to provide highly accurate deep sequencing. We identified three major challenges: skewed resampling of Primer IDs, low recovery of templates and erroneous consensus sequences. Undetected, these problems would lead to an underestimation of quasispecies diversity and to skewed, incorrect results. Like many of our other findings, the methodology is not limited to HIV or virology. The resolution of UDPS analysis is primarily determined by the number of input DNA templates, the error frequency of the method and the efficiency of data cleaning.
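To illustrate why the number of input DNA templates bounds the resolution, the following is a minimal Python sketch. It assumes templates are sampled independently from the viral population, so the chance that a variant at frequency f is represented at least once among N templates is 1 - (1 - f)^N. The function names are hypothetical and are not taken from the thesis or its software; sequencing error and data cleaning are deliberately ignored here.

```python
import math


def detection_probability(variant_frequency: float, n_templates: int) -> float:
    """Probability that at least one of n_templates carries the variant.

    Assumes independent sampling of templates (a simplification).
    """
    if not 0.0 <= variant_frequency <= 1.0:
        raise ValueError("variant_frequency must be between 0 and 1")
    if n_templates < 0:
        raise ValueError("n_templates must be non-negative")
    return 1.0 - (1.0 - variant_frequency) ** n_templates


def templates_needed(variant_frequency: float, confidence: float = 0.95) -> int:
    """Smallest number of templates giving at least `confidence` chance
    of sampling the variant one or more times."""
    if not 0.0 < variant_frequency < 1.0:
        raise ValueError("variant_frequency must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - variant_frequency))


if __name__ == "__main__":
    # Templates required to sample a 1% variant with 95% probability.
    print(templates_needed(0.01))
```

Under this simplified model, a variant present in 1% of the population needs roughly 300 input templates to be sampled with 95% probability, regardless of how many reads the sequencer produces.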
In Papers II and IV we therefore optimized the pre-UDPS protocol and investigated the characteristics and sources of errors that occurred when UDPS was used to sequence a fragment of the HIV-1 pol gene. UDPS introduced indel errors located in homopolymeric regions, which were removed by our in-house data cleaning software. The remaining errors were primarily substitution errors introduced in the PCR that preceded UDPS. Transitions were significantly more frequent than transversions, which will limit detection of minor variants and mutations in HIV-1 as well as in other species. Further, we evaluated the quality and reproducibility of the UDPS technology in analysis of the same pol gene fragment. We concluded that UDPS repeatability was good for both major and minor variants. In our experimental settings, in vitro recombination and sequencing direction posed minor problems, but they still need to be considered, especially in studies of minor viral variants and linkage between mutations. Minority resistance mutations have been shown to affect the clinical outcome in treated patients. We examined the presence of pre-existing drug resistance mutations in treatment-naïve HIV-1 infected individuals and found very low levels of M184I, T215A and T215I, but no M184V, Y181C, Y188C or T215Y/F. This indicates that the natural occurrence of these mutations is very low. When the same individuals experienced treatment failure or treatment interruption, almost 100% of the wild-type virus and of the drug-resistant variants, respectively, were replaced. Other patients were followed from primary HIV infection (PHI) until their virus switched coreceptor use from CCR5 (R5) to CXCR4 (X4). We did not find any X4-using virus present as a minority population during PHI. The results indicate that the X4-using population most probably evolved in a stepwise fashion from the R5-using populations in each of the three patients.
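The distinction between transition and transversion errors mentioned above can be sketched in a few lines of Python. This is an illustrative example only, not the in-house data cleaning software described in the thesis; the function names are hypothetical, and the input is assumed to be a pair of already aligned sequences of equal length. A transition swaps two purines (A, G) or two pyrimidines (C, T); every other base substitution is a transversion.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("bases must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical, so there is no substitution")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def count_substitutions(reference: str, read: str) -> dict:
    """Count transitions and transversions between two aligned sequences.

    Positions with gaps or ambiguous bases (anything other than A, C, G, T)
    are skipped, since indels are a separate error class.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base not in BASES or read_base not in BASES:
            continue
        if ref_base != read_base:
            counts[classify_substitution(ref_base, read_base)] += 1
    return counts


if __name__ == "__main__":
    print(count_substitutions("ACGTA", "GCTTA"))
```

Tallying the two classes separately matters because a residual error profile dominated by transitions raises the background level for exactly those mutations, which in turn sets the threshold below which a true minor variant cannot be distinguished from error.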
In conclusion, we have developed and used new NGS and bioinformatic methods to study HIV-1 genetic variation. We have shown that UDPS can be used to gain new insights into HIV evolution and to detect minority drug resistance mutations as well as minority variants.
