Published December 27, 2018 | Version v1
Journal article Open

Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours

  • 1. University of Chicago

Description

An ongoing challenge in protein chemistry is to identify the underlying interaction energies that capture protein dynamics. The traditional trade-off in biomolecular simulation between accuracy and computational efficiency is predicated on the assumption that detailed force fields are typically well-parameterized, obtaining a significant fraction of possible accuracy. We re-examine this trade-off in the more realistic regime in which parameterization is a greater source of error than the level of detail in the force field. To address parameterization of coarse-grained force fields, we use the contrastive divergence technique from machine learning to train from simulations of 450 proteins. In our procedure, the computational efficiency of the model enables high accuracy through the precise tuning of the Boltzmann ensemble. This method is applied to our recently developed Upside model, where the free energy for side chains is rapidly calculated at every time-step, allowing for a smooth energy landscape without steric rattling of the side chains. After this contrastive divergence training, the model is able to de novo fold proteins up to 100 residues on a single core in days. This improved Upside model provides a starting point both for investigation of folding dynamics and as an inexpensive Bayesian prior for protein physics that can be integrated with additional experimental or bioinformatic data.
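
For readers unfamiliar with contrastive divergence, the idea can be illustrated on a toy problem. The sketch below is not the Upside training code and makes no claim about how the authors implemented their procedure. It assumes a hypothetical one-parameter harmonic energy E(x; k) = k x^2 / 2 at unit temperature, and every name in it (`energy`, `metropolis_steps`, `contrastive_divergence`, `make_toy_data`) is invented for illustration. Short Metropolis chains are started from the data, and the parameter is moved so that the average energy derivative over the simulated ensemble matches the average over the data.

```python
import math
import random


def energy(x, k):
    """Toy harmonic energy (an assumption for illustration, not the Upside force field)."""
    return 0.5 * k * x * x


def metropolis_steps(x, k, n_steps, step_size, rng):
    """Run a short Metropolis chain at unit temperature, starting from x."""
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        delta = energy(proposal, k) - energy(x, k)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = proposal
    return x


def contrastive_divergence(data, k_init, n_epochs=200, lr=3.0,
                           n_steps=5, step_size=1.0, seed=0):
    """Fit k by contrastive divergence: short chains are initialized at the data."""
    rng = random.Random(seed)
    k = k_init
    for _ in range(n_epochs):
        # Simulated ensemble: a few Monte Carlo steps away from each data point.
        model = [metropolis_steps(x, k, n_steps, step_size, rng) for x in data]
        # dE/dk = x^2 / 2, averaged over the data and over the simulated ensemble.
        grad_data = sum(0.5 * x * x for x in data) / len(data)
        grad_model = sum(0.5 * x * x for x in model) / len(model)
        # Raise k if the simulated ensemble is too broad, lower it if too narrow.
        k += lr * (grad_model - grad_data)
        k = max(k, 1e-3)  # keep the toy energy confining
    return k


def make_toy_data(k_true=4.0, n=1000, seed=1):
    """Draw samples from the Boltzmann distribution of the toy energy with k = k_true."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(k_true)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


if __name__ == "__main__":
    data = make_toy_data()
    print(contrastive_divergence(data, k_init=1.0))
```

In this toy setting the update stops changing k once the short simulations no longer drift away from the data, which is the sense in which the training tunes the Boltzmann ensemble rather than individual structures. A realistic force field has many parameters and far more expensive sampling, so the step sizes and chain lengths here should not be read as recommendations.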

Data availability

All relevant data are within the paper and its Supporting Information files. Trajectories can be obtained from Dryad (https://doi.org/10.5061/dryad.h9f8sb7).

Files

journal.pcbi.1006578.pdf

Files (5.0 MB)

Article (4.8 MB)
md5:b655034302c4b3c369ee65b1f128619f
Supporting information (169.4 kB)
md5:a26373af4f6084ae6b02dd8e5abf8367

Additional details

Identifiers

DOI
10.1371/journal.pcbi.1006578
Other
oai:uchicago.tind.io:6368

Funding

National Institutes of Health General Medical Sciences
GM55694
National Institutes of Health General Medical Sciences
T32GM008720
National Science Foundation
CHE-1363012
National Science Foundation
MCB-1517221
Natural Sciences and Engineering Research Council of Canada
fellowship

UChicago Information

Division(s)
Biological Sciences Division, Physical Sciences Division
Department(s)
Biochemistry and Molecular Biology, Biophysical Sciences, Chemistry
Center(s) or Institute(s)
Institute for Biophysical Dynamics, James Franck Institute