000012912 001__ 12912
000012912 005__ 20240808085606.0
000012912 0247_ $$2doi$$a10.6082/uchicago.12912
000012912 037__ $$aTHESIS
000012912 037__ $$bDissertation
000012912 041__ $$aeng
000012912 245__ $$aGenetic Association Analysis of Phenotypes Jointly Influenced by a Pair of Interacting Organisms
000012912 260__ $$bUniversity of Chicago
000012912 269__ $$a2024-08
000012912 336__ $$aDissertation
000012912 502__ $$bPh.D.
000012912 520__ $$aThe virulence of infectious diseases is usually affected by a combination of a host and at least one pathogen organism. Previous experiments have shown that combining genetic information from different organisms identifies more relevant genetic variants than performing a separate association analysis on each organism. Hence, we are interested in performing a joint association analysis to test for the interaction effect of each possible pair of a host and a pathogen genetic variant on the phenotypic trait relating to the infectious disease. Three main issues may arise when performing this joint association analysis. First, the presence of a non-trivial interaction effect between one of the genetic variants being tested and some unaccounted-for factor, whether observed or unobserved, can lead to heteroscedasticity in the phenotypic trait. Failure to account for this heteroscedasticity may lead to inflated type I error rates when testing for an interaction effect between this genetic variant and any genetic variant from the other organism. We compare different methods to test and account for the potential heteroscedasticity in the phenotypic trait in the case where the genotype of the pathogen organism is a binary variable. Second, because the phenotypic trait is held fixed while the interacting genotypes vary across different interaction tests in a joint genome-wide association analysis, the collection of interaction test statistics corresponding to a fixed pathogen genetic variant may often display a tangible departure from the known distribution of the interaction test statistic. Under the global null hypothesis of no interaction, the collection of interaction p-values corresponding to a given pathogen genetic variant might turn out to be consistently smaller than expected under the uniform distribution, leading to a phenomenon which has been called the "feast" effect, since we end up with excess false discoveries.
Similarly, the collection of interaction p-values corresponding to another fixed pathogen genetic variant might turn out to be consistently larger than expected under the uniform distribution, leading to a phenomenon which has been called the "famine" effect, since it limits our ability to make any important discoveries. This "feast or famine" effect has been shown to result from improper conditioning in the construction of the interaction test statistic in a joint association analysis. The ordinary interaction test statistic conditions on the pair of genetic variants being tested for interaction. Instead, we take the approach of conditioning on the phenotypic trait and a fixed pathogen genetic variant in order to construct a corrected host-pathogen interaction test statistic which alleviates the feast or famine effect. We focus our efforts on the case of diploid host organisms where an appropriate discrete correction might be required to account for the binomially distributed host genotype. We present a diagnostic tool to predict the prevalence of the feast or famine effect given only the information about a phenotypic trait and a fixed pathogen genetic variant and demonstrate its relationship with the commonly used genomic control inflation factor. Lastly, accounting for population structure among patients infected with related strains of the same pathogen presents a significant challenge, owing to the presence of genetic variants with differing numbers of alleles within the pathogen genome. As the number of alleles in a genetic variant increases, some of the alleles may have excessively small observed allele frequencies, which introduce numerical instabilities in the existing methods of constructing a pathogen genetic relatedness matrix (GRM).
We build upon previous work to develop a novel pathogen GRM for organisms with multiallelic genetic variants which avoids filtering out genetic variants with exceedingly small observed allele frequencies by introducing an adjusted weighting for rare alleles. Through a range of simulation studies, we validate that our correction framework controls the type I error rate and rectifies the feast or famine effect. We demonstrate the applicability of our proposed pathogen GRM and our correction framework by testing for interaction effects between human SNPs and hepatitis C viral genetic variants on pre-treatment viral load in a cohort of HCV-infected patients from the BOSON clinical trial.
000012912 540__ $$a© 2024 Vasileios Katsianos
000012912 6531_ $$aStatistical Genetics
000012912 6531_ $$aGenome-Wide Association Study
000012912 6531_ $$aPopulation Structure
000012912 690__ $$aPhysical Sciences Division
000012912 691__ $$aStatistics
000012912 7001_ $$aKatsianos, Vasileios$$uUniversity of Chicago
000012912 72012 $$aMary Sara McPeek
000012912 72014 $$aMatthew Stephens
000012912 72014 $$aMark Abney
000012912 8564_ $$yDissertation$$97cb24931-b2ce-4bb4-9609-af10576c05fe$$s3043644$$uhttps://knowledge.uchicago.edu/record/12912/files/Dissertation.pdf$$ePublic
000012912 8564_ $$yApproval Form$$9bb5725a3-2d7d-45e6-a3eb-7334d10a808c$$s145181$$uhttps://knowledge.uchicago.edu/record/12912/files/Departmental%20Approval%20Form.pdf$$erestricted_admin
000012912 908__ $$aI agree
000012912 909CO $$ooai:uchicago.tind.io:12912$$pDissertations$$pGLOBAL_SET
000012912 983__ $$aDissertation