Abstract

Artificial intelligence (AI) has become a driving force in medical imaging, with applications ranging from breast cancer screening to COVID-19. Within the field of breast cancer screening, AI systems using human-engineered radiomic features and deep-learning-extracted features have shown promising performance in breast imaging diagnosis, detection, and risk assessment. However, AI has not yet been applied to the investigation of a breast cancer field effect, in which histologically normal areas of the parenchyma show molecular similarity to the tumor. Identification of a cancer field effect in mammography has the potential to provide a novel approach to stratification of breast cancer risk in the general population. Furthermore, development of a temporal risk assessment model would expand upon the potential impact of utilizing AI-based tools to predict risk of future cancer from the breast parenchyma.

As a result of the explosion of machine intelligence algorithm development for understanding and characterizing a wide variety of diseases, including breast cancer and COVID-19, validation of algorithm performance and generalizability has become increasingly important. To ensure that AI systems are robust and generalizable, the data with which they are evaluated should be population-representative and independent of the data used for training. The development of novel algorithmic methods for the creation of a large, common sequestered dataset and for task-based sampling would enable robust evaluations of AI algorithms on representative datasets. A sequestered database for algorithm testing could also allow for expedited clinical implementation of algorithms developed for medical decision-making, if accepted by regulatory bodies.

Aim 1: Mammograms and mastectomy specimen radiographs of women with a malignant tumor were investigated using radiomic and deep-learning-based features to provide an initial characterization of a breast cancer field effect in imaging. Features were extracted from four regions: within the tumor, near the tumor, far from the tumor, and in the contralateral breast. The results showed statistically significant correlations between feature values and the region’s proximity to the tumor for intensity-based features and select structure-based features.

Aim 2: To improve upon conventional breast cancer risk assessment models, a method that analyzes prior mammography data to predict future occurrence of breast cancer was implemented. A long short-term memory (LSTM) network, which can incorporate AI-based features into a temporal model, was utilized and compared to classification using only a single time point. The resulting LSTM network was able to predict incidence of cancer in the subsequent year with performance significantly better than random guessing.

Aim 3: Data used in the development and evaluation of AI models play a significant role in the robustness and generalizability of model performance. To enable independent assessment of algorithms using a multi-institutional data commons, a first-of-its-kind sequestered commons was initiated using a newly developed method of multi-dimensional stratified sampling. To draw an independent sample for performance evaluation from the commons, a novel method of task-based distribution sampling was also developed. This aim was completed in collaboration with the Medical Imaging and Data Resource Center (MIDRC), a multi-institutional effort to accelerate machine intelligence research for COVID-19.
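The abstract does not describe how the multi-dimensional stratified sampling of Aim 3 works. As a purely illustrative sketch, and not the method developed in this work, one common way to stratify on several variables at once is to group records by the joint combination of those variables and then sequester the same fraction from every group. The function name `stratified_split`, the variables `sex` and `age_group`, and the 20% sequestration fraction are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(records, keys, sequester_fraction=0.2, seed=0):
    """Split records into (open_set, sequestered), stratifying jointly on `keys`.

    Hypothetical helper for illustration only; not the thesis method.
    """
    rng = random.Random(seed)

    # Group records by the joint value of all stratification variables.
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in keys)].append(rec)

    open_set, sequestered = [], []
    for stratum in strata.values():
        # Shuffle within the stratum, then sequester a fixed fraction of it.
        rng.shuffle(stratum)
        n_seq = round(len(stratum) * sequester_fraction)
        sequestered.extend(stratum[:n_seq])
        open_set.extend(stratum[n_seq:])
    return open_set, sequestered


# Synthetic demo data: 10 records for each (sex, age_group) combination.
records = [
    {"id": f"{sex}-{age}-{i}", "sex": sex, "age_group": age}
    for sex in ("F", "M")
    for age in ("<50", "50+")
    for i in range(10)
]

open_set, sequestered = stratified_split(records, keys=("sex", "age_group"))
```

Because every joint stratum contributes the same fraction, the sequestered portion in this sketch mirrors the joint distribution of the chosen variables in the full set, up to rounding within each stratum.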
