Abstract
The modern medical data deluge accelerated when the vast amount of medical information gathered and stored by electronic sensors became widely available. Medical data are complex and heterogeneous, and they continue to accumulate rapidly in electronic databases; data-driven statistical learning techniques therefore have the potential to drastically improve clinical care by anticipating clinical complications and suggesting interventions. This dissertation investigates the application of an assortment of statistical learning techniques to extract instructive patterns from raw medical data. Chapter 1 provides a brief overview of current statistical learning methods and examines both the limitations of and the opportunities for state-of-the-art developments in medical forecasting. Chapter 2 introduces a project that began as a conjecture formulated by an endocrinologist and developed into a large-scale data analysis of the linked pathogenesis of pancreatitis and type 2 diabetes mellitus. Chapter 3 describes a study, conducted in collaboration with a gerontologist, on predicting cognitive decline in senior patients. In this study, we attempt such predictions by applying advanced machine learning methods to accelerometry data collected from Chicago's South Side community in order to predict patients' future clinical trajectories. In Chapter 4, we identify novel hip fracture risk factors and investigate whether statistical survival analysis can improve upon the accuracy of existing tools. In Chapter 5, we construct a state-of-the-art machine learning tool for fracture detection based on patients' broad prior disease history. Lastly, Chapter 6 summarizes the above projects, discusses their potential importance for statistical learning from complex medical data, suggests future directions for this work, and outlines the problems that remain open in the field.