Abstract
Genome-wide association studies (GWAS) have allowed us to identify thousands of common genetic variants underlying a number of diseases, but it has been difficult to understand their mechanisms of action because the vast majority of these loci are located in non-coding regions of the genome. Because it is estimated that only 25% of disease-associated genetic variants contribute to disease by affecting steady-state gene expression levels, it will be important to establish a more comprehensive understanding of alternative mechanisms through which genetic variants contribute to disease, such as RNA processing and RNA modification. Motivated by this, this dissertation outlines the development of computational methods and the assessment of existing tools to profile various RNA processing and RNA modification events across individuals, cell types, and developmental stages, which can ultimately be applied to large disease-cohort datasets in future studies. In the first chapter, we provide primers on quantitative genetics and on RNA processing and modifications to put this work in context. In the second chapter, we demonstrate that combining large quantities of RNA-seq data with small quantities of specialized data, including 3'-Seq and single-molecule real-time (SMRT) isoform sequencing (Iso-Seq), allows one to study alternative cleavage and polyadenylation without compromising affordability or accuracy. We apply this approach to explore inter-individual variation in polyadenylation site choice. In the third chapter, we profile the role of the RNA modification N6-methyladenosine (m6A) in oligodendrocyte lineage progression and its potential impact on human diseases, such as multiple sclerosis. Finally, in the fourth chapter, we develop a method to examine how genetic variants that increase risk of disease reduce the fidelity of RNA splicing.