Published October 19, 2018 | Version v1
Journal article
Open
Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes
Description
Genome-wide association studies (GWAS) aim to identify genetic factors associated with phenotypes. Standard analyses test variants for associations individually. However, variant-level associations are hard to identify and can be difficult to interpret biologically. Enrichment analyses help address both problems by targeting sets of biologically related variants. Here we introduce a new model-based enrichment method that requires only GWAS summary statistics. Applying this method to interrogate 4,026 gene sets in 31 human phenotypes identifies many previously-unreported enrichments, including enrichments of endochondral ossification pathway for height, NFAT-dependent transcription pathway for rheumatoid arthritis, brain-related genes for coronary artery disease, and liver-related genes for Alzheimer's disease. A key feature of our method is that inferred enrichments automatically help identify new trait-associated genes. For example, accounting for enrichment in lipid transport genes highlights association between MTTP and low-density lipoprotein levels, whereas conventional analyses of the same data found no significant variants near this gene.
Data availability
Analysis results and all 4026 gene sets for the present study are publicly available at https://doi.org/10.5281/zenodo.1412872. The 4026 gene sets consist of 3913 biological pathways retrieved from the following four repositories: Pathway Commons (version 7, http://www.pathwaycommons.org/archives/PC2/v7), NCBI Biosystems (ftp://ftp.ncbi.nih.gov/pub/biosystems), PANTHER (version 3.3, ftp://ftp.pantherdb.org/pathway), BioCarta (used in ref.), and 113 tissue-based gene sets derived from GTEx transcriptome data (https://www.gtexportal.org/home/). Links to download GWAS summary statistics of 31 human phenotypes are provided in Supplementary Notes. The list of HapMap3 SNPs is available at https://data.broadinstitute.org/alkesgroup/LDSCORE/w_hm3.snplist.bz2. The 1000 Genomes Phase 3 data are available at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502. The Wellcome Trust Case Control Consortium data are available at the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/). The APO gene family is available at https://www.genenames.org/cgi-bin/genefamilies/set/405.
Large-scale-genome-wide-enrichment-analyses-identify-new-trait-associated-genes-and-pathways-across-31-human-phenotypes.pdf
Files
(9.0 MB)
| Name | MD5 | Size |
|---|---|---|
| Supplementary material | d50d74440e5c49479df34fc994fd96e9 | 7.6 MB |
| Article | 9177335f29af44e42107d3acec285d8b | 1.4 MB |
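The MD5 checksums listed for the two files can be used to confirm that a download is intact. The following is a minimal sketch, assuming the files were saved locally under the hypothetical names `supplementary_material.pdf` and `article.pdf`; the function names are illustrative and not part of the record.

```python
import hashlib
from pathlib import Path

# Checksums as listed for the two files in this record.
# The local file names are hypothetical; adjust them to match your downloads.
EXPECTED_MD5 = {
    "supplementary_material.pdf": "d50d74440e5c49479df34fc994fd96e9",
    "article.pdf": "9177335f29af44e42107d3acec285d8b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_MD5.items():
        if not Path(name).exists():
            print(f"{name}: not found, skipping")
            continue
        status = "OK" if verify_download(name, expected) else "MISMATCH"
        print(f"{name}: {status}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a guarantee against deliberate tampering.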
Additional details
Identifiers
- DOI: 10.1038/s41467-018-06805-x
- Other: oai:uchicago.tind.io:5742
Funding
- Gordon and Betty Moore Foundation: Grant GBMF
- National Institutes of Health: HG02585