Bayesian methods provide attractive approaches to selecting relevant variables in multiple regression models, particularly in settings with highly correlated variables. For example, they are popular in genetic fine-mapping problems, which aim to identify the genetic variants that causally affect a phenotype of interest. However, Bayesian methods are often limited by their computational cost and by the difficulty of interpreting the posterior distribution. Wang et al. (2020) presented a simple and computationally scalable approach to variable selection, the “Sum of Single Effects” (SuSiE) model, which provides a Credible Set for each selection, making the results easy to interpret. The SuSiE model requires access to individual-level genotypes and phenotypes. In this dissertation, we provide a method to fit the SuSiE model using summary statistics from univariate regression results. To improve the accuracy and power of variable selection, we further generalize the SuSiE framework to select variables jointly for multiple outcomes and to account for complicated effect size heterogeneity among outcomes. We provide multivariate variable selection methods using individual-level data, sufficient statistics, and summary statistics. We illustrate the power and flexibility of our methods using realistic numerical simulations and real data applications.



