Machine Learning on Medical Imaging for Breast Cancer Risk Assessment

Robinson, Kayla Rae

doi:10.6082/uchicago.1805

2019

Abstract

Breast cancer was the most frequently diagnosed cancer in women in 2018, and this trend is expected to continue in years to come. Imaging-based cancer screening, including full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT), plays a significant role in the early detection and diagnosis of breast cancer, as well as in cancer risk assessment. Breast cancer risk assessment is important to breast cancer screening protocols as it aims to identify women at an elevated risk of breast cancer who may benefit from specialized screening. There is growing interest in the potential of computer-aided image evaluation, or radiomics, to provide information for identifying these populations. Radiomic features are quantitative, human-defined image descriptors. While their utility has been widely demonstrated on homogeneous image sets, inconsistent imaging conditions may introduce feature variations. However, radiomics studies do not typically consider the robustness of radiomic features. Additionally, there is no consensus in the field on methodology and metrics for characterizing feature robustness. To fill this need, this dissertation proposes novel metrics for characterizing radiomic feature robustness on image sets in which the population underwent imaging on mammography units from two different vendors. Having proposed metrics for characterizing robustness, this dissertation then incorporates these metrics into a two-stage feature selection scheme to identify a set of robust, non-redundant, and descriptive features. Stage one involves hierarchical clustering and robustness metric comparison to identify robust and non-redundant features as feature candidates in an unsupervised manner. Stage two involves classification evaluation of the feature candidates identified in stage one to select features that are also descriptive for the clinical task at hand.
This two-stage method was demonstrated on FFDM in the task of classifying the risk of breast cancer among patients with no abnormalities detected on mammographic screening exams. This dissertation then presents work investigating the use of deep learning in medical image analysis. Transfer learning was investigated for evaluation of FFDM and DBT images. Extracted features were used for classification in a conventional classifier, such as a support vector machine. In this dissertation, we applied transfer learning to the task of characterizing lesions as malignant or benign. Performance was compared between DBT and FFDM images as network inputs. This comparison is clinically relevant given the growing adoption of DBT at medical centers throughout the country. An additional area where machine learning may add value to medical image analysis is in the evaluation of temporal sequences of screening mammograms. It is standard clinical procedure to consider prior mammograms in the evaluation of current mammograms when that data is available. Long short-term memory (LSTM) networks were investigated for use in evaluating temporal sets of mammograms for classifying future lesions as malignant or benign. By incorporating temporal sequences of images, patterns over time may potentially lead to improvements in classification performance. This dissertation presents the following results. First, this investigation found that box counting fractal dimension, Minkowski fractal dimension, and power law beta features were relatively robust across vendors. Given the proposed robustness metrics, a two-stage feature selection method was used to predict the risk of breast cancer in patients with no detectable mammographic abnormalities. A monotonically decreasing trend was observed in classification performance as feature robustness restrictions were loosened, suggesting value in robustness considerations.
Investigations into transfer learning for lesion characterization found that, for mass and architectural distortion lesions, classification performance using a key slice from DBT was higher than classification performance using an FFDM image of the same lesion. Additionally, investigation into temporal mammography sequence analysis using deep learning revealed higher performance of LSTM networks over conventional evaluation of a single time point in the task of classifying future lesions as malignant or benign. This suggests that there is potential value in considering antecedent images in a deep learning framework when performing computerized image analysis for classifying breast cancer lesions. This work is clinically significant as medical imaging for breast cancer screening is a common practice across the world, and an improvement in the analysis of screening images has the potential to impact a large number of people. This work presents a method for the incorporation of robustness considerations into radiomics studies to improve the generalizability of these studies.
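
As a rough illustration of the two-stage feature selection scheme described in the abstract, the Python sketch below clusters correlated features, keeps the most robust member of each cluster, and then retains only candidates that separate the two classes. It is not the dissertation's implementation. The abstract does not specify the robustness metrics, the clustering distance, the linkage, or the classification criterion, so everything here is an assumption made for illustration: the precomputed `robustness` scores (higher means more robust), the `1 - |r|` distance, average linkage, the `max_dist` cut, the single-feature AUC threshold `min_auc`, and the synthetic data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def stage_one(X, robustness, max_dist=0.3):
    """Unsupervised step: cluster correlated features, keep the most robust per cluster."""
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)  # strongly correlated (redundant) features are "close"
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(tree, t=max_dist, criterion="distance")
    candidates = []
    for cluster_id in np.unique(labels):
        members = np.flatnonzero(labels == cluster_id)
        candidates.append(int(members[np.argmax(robustness[members])]))
    return sorted(candidates)


def auc(scores, y):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos, neg = scores[y == 1], scores[y == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def stage_two(X, y, candidates, min_auc=0.6):
    """Supervised step: keep candidates that separate the classes on their own."""
    kept = []
    for j in candidates:
        a = auc(X[:, j], y)
        if max(a, 1.0 - a) >= min_auc:  # the sign of the feature does not matter
            kept.append(j)
    return kept


# Synthetic stand-in data (hypothetical, not mammography features).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 500)
informative = rng.normal(size=y.size) + 1.5 * y
X = np.column_stack([
    informative,                                      # feature 0: informative
    informative + 0.05 * rng.normal(size=y.size),     # feature 1: redundant copy of 0
    rng.normal(size=y.size),                          # feature 2: noise
    rng.normal(size=y.size),                          # feature 3: noise
])
robustness = np.array([0.9, 0.5, 0.8, 0.7])           # hypothetical scores

candidates = stage_one(X, robustness)
selected = stage_two(X, y, candidates)
print("stage-one candidates:", candidates)
print("stage-two selection:", selected)
```

In this toy setup, stage one is expected to drop the redundant copy in favor of its more robust twin, and stage two is expected to drop the noise features. A real study would replace the placeholder scores with measured robustness metrics and would likely evaluate candidates with a cross-validated classifier instead of a single-feature AUC cutoff.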