Abstract
In genetic association analysis of complex traits, detection of interaction (either gene-gene, GxG, or gene-environment, GxE) can help to elucidate the genetic architecture and biological mechanisms underlying the trait. Detection of interaction in a genome-wide association study (GWAS) can be methodologically challenging for various reasons, including a high burden of multiple comparisons when testing for epistasis between all possible pairs of a set of genome-wide variants, as well as heteroscedasticity effects occurring in the presence of GxG or GxE interaction. In this paper, we address an even more striking phenomenon, which we call the "feast or famine" effect, that occurs when testing for interaction in a genome-wide context. We show that, in a GWAS to detect gene-gene or gene-environment interactions with a fixed genetic variant or environmental factor, the genome-wide p-values under the null hypothesis of no interaction do not follow the i.i.d. uniform distribution that is commonly assumed. This holds even in a simplified setting in which there is no interaction at all (and so no heteroscedasticity) and all SNPs are assumed independent. Using standard methods, even if all SNPs are independent, some GWASs will have systematically under-inflated p-values ("feast"), and others will have systematically over-inflated p-values ("famine"), which can lead to false detection of interaction, reduced power, inconsistent results across studies, and failure to replicate true signal. This is a surprising result that is specific to detection of interaction in a GWAS, and it may partly explain why such detection has so far proved challenging and difficult to replicate. We show theoretically that the key cause of this phenomenon is the choice of variables that are conditioned on in the analysis, which suggests correcting the problem by changing the way the conditioning is done.
Using this insight, we have developed the TINGA (Testing INteraction in GWAS with test statistic Adjustment) method, which adjusts the interaction test statistics to make their p-values closer to uniform under the null hypothesis. In simulations, we show that TINGA controls type 1 error, improves power, and reduces the "feast or famine" effect. TINGA allows for covariates and population structure through the use of a linear mixed model, and it accounts for heteroscedasticity. We apply TINGA to detection of epistasis in a study of flowering time in Arabidopsis thaliana.