000011171 001__ 11171
000011171 005__ 20250218124745.0
000011171 02470 $$ahttps://doi.org/10.1001/jamanetworkopen.2020.12734$$2doi
000011171 037__ $$aTEXTUAL
000011171 037__ $$bArticle
000011171 041__ $$aeng
000011171 245__ $$aValidation of a Machine Learning Model to Predict Childhood Lead Poisoning
000011171 269__ $$a2020-09-16
000011171 336__ $$aArticle
000011171 520__ $$a<p>Importance: Childhood lead poisoning causes irreversible neurobehavioral deficits, but current practice is secondary prevention.</p> <p>Objective: To validate a machine learning (random forest) prediction model of elevated blood lead levels (EBLLs) by comparison with a parsimonious logistic regression.</p> <p>Design, Setting, and Participants: This prognostic study for temporal validation of multivariable prediction models used data from the Women, Infants, and Children (WIC) program of the Chicago Department of Public Health. Participants included a development cohort of children born from January 1, 2007, to December 31, 2012, and a validation WIC cohort born from January 1 to December 31, 2013. Blood lead levels were measured until December 31, 2018. Data were analyzed from January 1 to October 31, 2019.</p> <p>Exposures: Blood lead level test results; lead investigation findings; housing characteristics, permits, and violations; and demographic variables.</p> <p>Main Outcomes and Measures: Incident EBLL (≥6 μg/dL). Models were assessed using the area under the receiver operating characteristic curve (AUC) and confusion matrix metrics (positive predictive value, sensitivity, and specificity) at various thresholds.</p> <p>Results: Among 6812 children in the WIC validation cohort, 3451 (50.7%) were female, 3057 (44.9%) were Hispanic, 2804 (41.2%) were non-Hispanic Black, 458 (6.7%) were non-Hispanic White, and 442 (6.5%) were Asian (mean [SD] age, 5.5 [0.3] years). The median year of housing construction was 1919 (interquartile range, 1903-1948). Random forest AUC was 0.69 compared with 0.64 for logistic regression (difference, 0.05; 95% CI, 0.02-0.08). When predicting the 5% of children at highest risk to have EBLLs, random forest and logistic regression models had positive predictive values of 15.5% and 7.8%, respectively (difference, 7.7%; 95% CI, 3.7%-11.3%), sensitivity of 16.2% and 8.1%, respectively (difference, 8.1%; 95% CI, 3.9%-11.7%), and specificity of 95.5% and 95.1% (difference, 0.4%; 95% CI, 0.0%-0.7%).</p> <p>Conclusions and Relevance: The machine learning model outperformed regression in predicting childhood lead poisoning, especially in identifying children at highest risk. Such a model could be used to target the allocation of lead poisoning prevention resources to these children.</p>
000011171 536__ $$oRobert Wood Johnson Foundation$$c73354
000011171 540__ $$a<p>© 2020 Potash E et al.</p> <p>This is an open access article distributed under the terms of the <a href="https://jamanetwork.com/pages/cc-by-license-permissions">CC-BY</a> License.</p>
000011171 542__ $$fCC BY
000011171 690__ $$aHarris School of Public Policy Studies
000011171 7001_ $$aPotash, Eric$$uUniversity of Chicago
000011171 7001_ $$aGhani, Rayid$$uCarnegie Mellon University
000011171 7001_ $$aWalsh, Joe$$uUniversity of Chicago
000011171 7001_ $$aJorgensen, Emile$$uChicago Department of Public Health
000011171 7001_ $$aLohff, Cortland$$uSouthern Nevada Health District
000011171 7001_ $$aPrachand, Nik$$uChicago Department of Public Health
000011171 7001_ $$aMansour, Raed$$uChicago Department of Public Health
000011171 773__ $$tJAMA Network Open
000011171 8564_ $$yArticle$$9289dd5d5-5544-40b1-806c-803ab8d1a35b$$s964129$$uhttps://knowledge.uchicago.edu/record/11171/files/potash_2020_oi_200483_1599661889.87241.pdf$$ePublic
000011171 8564_ $$ySupplemental files$$99d610739-e805-4b71-b8ab-81895df000c5$$s529700$$uhttps://knowledge.uchicago.edu/record/11171/files/zoi200483supp1_prod_1599661889.87241.pdf$$ePublic
000011171 908__ $$aI agree
000011171 909CO $$ooai:uchicago.tind.io:11171$$pGLOBAL_SET
000011171 983__ $$aArticle