Published December 4, 2024 | Version v1
Journal article Open

TRIO RVEMVS: A Bayesian framework for rare variant association analysis with expectation-maximization variable selection using family trio data

  • 1. Medical College of Wisconsin
  • 2. Colorado State University
  • 3. University of Chicago
  • 4. Regeneron Pharmaceuticals, Inc.
  • 5. Merck & Co., Inc.
  • 6. University of Texas Health Science Center at Houston

Description

It is commonly reported that rare variants may be more functionally related to complex diseases than common variants. However, testing individual rare variants for association remains challenging because of their low minor allele frequency in the available samples. This paper proposes an expectation-maximization variable selection (EMVS) method to simultaneously detect common and rare variants at the individual variant level using family trio data. TRIO_RVEMVS was assessed on simulated datasets, both large (1,500 families) and small (350 families). Its performance was compared with gene-level kernel and burden association tests that use pedigree data (PedGene) and with rare-variant extensions of the transmission disequilibrium test (RV-TDT). At the region level, TRIO_RVEMVS outperformed PedGene and RV-TDT when common variants were included. When the analysis was restricted to rare variants only, TRIO_RVEMVS performed competitively with PedGene and outperformed RV-TDT. At the individual variant level, with 1,500 trios, the average true positive rate for individual rare variants that were polymorphic across 500 datasets was 12.20%, and the average false positive rate was 0.74%. In the datasets with 350 trios, the average true and false positive rates for individual rare variants were 13.10% and 1.30%, respectively. When applied to real data from the Gabriella Miller Kids First Pediatric Research Program, TRIO_RVEMVS identified 3 rare variants in q24.21 and q24.22 associated with the risk of orofacial clefts in the Kids First European population.
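The variant-level rates quoted above are averages over simulated datasets, counted only among variants that were polymorphic. The sketch below shows one way such averages could be computed. It is not the authors' code: this record does not state the exact averaging rule, so the sketch assumes a rate is computed within each dataset and then averaged across datasets. The function names, variant identifiers, and toy data are all hypothetical.

```python
from statistics import mean


def rates_for_dataset(selected, causal, polymorphic):
    """Return (TPR, FPR) for one simulated dataset.

    All three arguments are sets of variant identifiers. Only variants
    that are polymorphic in this dataset are counted. A rate is None
    when its denominator is zero.
    """
    causal_poly = causal & polymorphic   # causal variants that are observable
    null_poly = polymorphic - causal     # non-causal variants that are observable
    true_pos = len(selected & causal_poly)
    false_pos = len(selected & null_poly)
    tpr = true_pos / len(causal_poly) if causal_poly else None
    fpr = false_pos / len(null_poly) if null_poly else None
    return tpr, fpr


def average_rates(datasets):
    """Average per-dataset TPR and FPR, skipping undefined rates.

    Assumption: rates are averaged across datasets rather than pooled.
    """
    tprs, fprs = [], []
    for selected, causal, polymorphic in datasets:
        tpr, fpr = rates_for_dataset(selected, causal, polymorphic)
        if tpr is not None:
            tprs.append(tpr)
        if fpr is not None:
            fprs.append(fpr)
    avg_tpr = mean(tprs) if tprs else None
    avg_fpr = mean(fprs) if fprs else None
    return avg_tpr, avg_fpr


# Toy input (hypothetical): (selected, causal, polymorphic) per dataset.
toy_datasets = [
    ({"v1", "v5"}, {"v1", "v2"}, {"v1", "v2", "v3", "v4", "v5"}),
    ({"v2"}, {"v1", "v2"}, {"v2", "v3", "v4"}),
]
avg_tpr, avg_fpr = average_rates(toy_datasets)
print(avg_tpr, avg_fpr)
```

In the second toy dataset, the causal variant `v1` is monomorphic, so it drops out of the true positive denominator. Whether monomorphic variants are excluded per dataset or by some other rule is a design choice that changes the reported rates, and the paper should be consulted for the definition actually used.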

Data availability

The data underlying this study cannot be shared publicly as they are owned by the National Institutes of Health (NIH) and are available through the Gabriella Miller Kids First Pediatric Research Program (https://commonfund.nih.gov/kidsfirst/overview). Access to the data requires an application process, and users are prohibited from sharing the data with others.

Files

journal.pone.0314502.pdf

Files (3.3 MB)

Name Size
Article 2.1 MB
md5:534d38beae7f3165b50de47e90c8f049
Supporting information 1.2 MB
md5:8b92c53c53aee07c56989737158f8f9b

Additional details

Identifiers

DOI
10.1371/journal.pone.0314502
Other
oai:uchicago.tind.io:14215

Funding

Eunice Kennedy Shriver National Institute of Child Health and Human Development
R03HD083674

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Human Genetics