000010309 001__ 10309
000010309 005__ 20250218124656.0
000010309 02470 $$ahttps://doi.org/10.1371/journal.pgen.1000147$$2doi
000010309 037__ $$aTEXTUAL
000010309 037__ $$bArticle
000010309 041__ $$aeng
000010309 245__ $$aLinkage Disequilibrium-Based Quality Control for Large-Scale Genetic Studies
000010309 269__ $$a2008-08-01
000010309 336__ $$aArticle
000010309 520__ $$a<p>Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of “problem” SNPs, which exhibit unusually high error rates. Because most large-scale studies of genetic variation are searching for phenomena that are rare (e.g., SNPs associated with a phenotype), even this small percentage of problem SNPs can cause important practical problems. Here we describe and illustrate how patterns of linkage disequilibrium (LD) can be used to improve QC in large-scale, population-based studies. This approach has the advantage over existing filters (e.g., HWE or call rate) that it can actually reduce genotyping error rates by automatically correcting some genotyping errors. Applying this LD-based QC procedure to data from The International HapMap Project, we identify over 1,500 SNPs that likely have high error rates in the CHB and JPT samples and estimate corrected genotypes. Our method is implemented in the software package fastPHASE, available from the Stephens Lab website (<a href="http://stephenslab.uchicago.edu/software.html">http://stephenslab.uchicago.edu/software.html</a>).</p>
000010309 536__ $$oNational Institutes of Health$$c1RO1HG/LM02585-01
000010309 536__ $$oNational Institutes of Health$$cHL084729-02
000010309 540__ $$a<p>© 2008 Scheet, Stephens.</p> <p>This is an open access article distributed under the terms of the <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">Creative Commons Attribution License</a>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</p>
000010309 542__ $$fCC BY
000010309 690__ $$aBiological Sciences Division
000010309 690__ $$aPhysical Sciences Division
000010309 691__ $$aHuman Genetics
000010309 691__ $$aStatistics
000010309 7001_ $$aScheet, Paul$$uUniversity of Michigan
000010309 7001_ $$aStephens, Matthew$$uUniversity of Chicago
000010309 773__ $$tPLOS Genetics
000010309 8564_ $$yArticle$$9608742d8-2e49-4daf-9557-4737a46e5d0b$$s416938$$uhttps://knowledge.uchicago.edu/record/10309/files/journal.pgen.1000147.pdf$$ePublic
000010309 8564_ $$ySupporting information$$92dd21452-8f27-43a3-8b17-88074a26effc$$s254053$$uhttps://knowledge.uchicago.edu/record/10309/files/journal.pgen.1000147.zip$$ePublic
000010309 908__ $$aI agree
000010309 909CO $$ooai:uchicago.tind.io:10309$$pGLOBAL_SET
000010309 983__ $$aArticle