Latent Variables in 'omic' Data

McKennan, Christopher Gordon

doi:10.6082/uchicago.1965

2019

Abstract

Nearly all high-throughput 'omic' data are influenced by technical and biological factors unknown to the researcher, which, if unaccounted for, can severely obscure estimation of and inference on the effects of interest. While the importance of this problem has precipitated the development of many methods that attempt to correct for these latent factors, most are designed for gene expression data and are not amenable to modern, complex experimental designs. In this thesis, we develop novel and provably accurate methodology to estimate and perform inference on the coefficients of interest in a multivariate linear model in the presence of latent covariates. Chapter 2 discusses this problem in the context of DNA methylation, in which latent cell type typically confounds the covariate of interest. In Chapters 3 and 4, we then provide the first methods amenable to experimental designs with complex sample correlation structures. Lastly, motivated by untargeted LC-MS metabolomic data, we present in Chapter 5 the first method to account for both unobserved covariates and non-random missing data.