Accelerating Protein Design with Deep Learning

McPartlon, Matthew Timothy

doi:10.6082/uchicago.7659

2023

Abstract

The human proteome comprises tens of thousands of proteins, each tailored for a specific function by the selective pressures of evolution. The field of protein design seeks to develop proteins with new or enhanced functions at will, ultimately bypassing the evolutionary clock. In this thesis, we introduce general machine-learning methods for accelerating protein design, with a particular focus on modeling protein structure. First, we propose an approach for fixed-backbone design (Chapter 2), the problem of designing primary sequence and side-chain rotamers for a given backbone conformation. Whereas classic approaches formulate sequence and rotamer design tasks separately, we offer an approach to solve both simultaneously. To realize this, we develop a deep neural network that effectively leverages backbone coordinates. By exploiting backbone geometry, we efficiently represent atomic microenvironments at the coordinate level and ultimately avoid discrete rotamer sampling. This results in more robust designs and accurate quality estimates for downstream tasks. Next, we introduce a framework for flexible protein-protein docking (Chapter 3), the task of determining the structure of a protein complex given the unbound structures of its constituent chains. Traditional docking methods are limited by their reliance on empirical physics-based scoring functions, inability to accommodate conformational flexibility, and failure to incorporate binding sites. To address these challenges, we propose an end-to-end approach that can model conformational changes and target specific interactions while significantly reducing computational time. Our method is among the pioneering deep learning approaches to this task; we uncover the key determinants underlying its success and provide important insights for future research. Finally, we highlight the generality of our approach by extending it to simultaneously dock and co-design the sequence and structure of antibody complementarity-determining regions targeting a specified epitope.