Personalized and precision medicine is a global challenge that requires a clear understanding of the etiology of common human diseases and complex traits. The rise of data-intensive research and recent advances in genotyping, sequencing and other systems approaches have created an unprecedented opportunity to develop accurate medical diagnoses, efficient therapeutic interventions and cost-effective preventive care. However, the modernization of a disease classification system that properly integrates our understanding of causal molecular and environmental factors is still at an early stage. In Chapter 1 of this dissertation, I will briefly outline recent progress and challenges associated with our understanding of the etiology of common human diseases and complex traits. In Chapter 2, I will describe a series of studies that used the large diagnostic data set available through the health insurance claims of 150 million patients to discover genetic and, especially, environmental factors that have significant effects on multiple diseases. In Chapter 3, I will illustrate an accurate and reliable classification of complex diseases based on common genetic or environmental factors, using in-depth data on about half a million patients. The same analysis can also be used to quantify the genetic and environmental effects on hundreds of diseases. In Chapter 4, I will introduce a carefully designed formal ontology and a corpus consisting of two significant annotations of biomedical text, aimed at facilitating rich digital phenotyping. Finally, in Chapter 5, I will summarize these results and describe how the integration of diverse data in medicine could lead to accurate and precise disease classification and digital phenotyping systems that will transform medical research and patient care.


