Published March 19, 2025 | Version v1
Journal article Open

Unsupervised Learning of Progress Coordinates during Weighted Ensemble Simulations: Application to NTL9 Protein Folding

Description

A major challenge for many rare-event sampling strategies is the identification of progress coordinates that capture the slowest relevant motions. Machine-learning methods that can identify progress coordinates in an unsupervised manner have therefore been of great interest to the simulation community. Here, we developed a general method for identifying progress coordinates "on-the-fly" during weighted ensemble (WE) rare-event sampling via deep learning (DL) of outliers among sampled conformations. Our method identifies outliers in a latent space model of the system's sampled conformations that is periodically trained using a convolutional variational autoencoder. As a proof of principle, we applied our DL-enhanced WE method to simulate the NTL9 protein folding process. To enable rapid tests, our simulations propagated discrete-state synthetic molecular dynamics trajectories using a generative, fine-grained Markov state model. Results revealed that our on-the-fly DL of outliers enhanced the efficiency of WE by >3-fold in estimating the folding rate constant. Our efforts are a significant step forward in the unsupervised learning of slow coordinates during rare-event sampling.
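To illustrate how outlier scores could steer WE resampling, the following is a minimal sketch and not the authors' implementation. It assumes each trajectory (walker) already carries an outlier score, which in the method described above would come from the latent space model; here the scores are made-up numbers. The function name `resample`, the parameter `n_split`, and the specific policy (split the most outlying walkers in two, merge the least outlying walkers in pairs) are hypothetical choices for illustration. What the sketch does keep from standard WE resampling is that splitting and merging conserve the total statistical weight.

```python
import random


def resample(walkers, scores, n_split, rng=random):
    """Split the n_split most outlying walkers and merge the least outlying
    ones in pairs, keeping walker count and total weight unchanged.

    walkers: list of (state, weight) tuples.
    scores:  one outlier score per walker; higher means more outlying.
             (Hypothetical input; any outlier detector could supply it.)
    """
    if len(walkers) < 3 * n_split:
        raise ValueError("need at least 3 * n_split walkers")

    # Rank walker indices from most to least outlying.
    order = sorted(range(len(walkers)), key=lambda i: scores[i], reverse=True)
    to_split = order[:n_split]
    to_merge = order[len(order) - 2 * n_split:]
    untouched = order[n_split:len(order) - 2 * n_split]

    new_walkers = [walkers[i] for i in untouched]

    # Splitting: each copy inherits half of the parent's weight.
    for i in to_split:
        state, weight = walkers[i]
        new_walkers += [(state, weight / 2), (state, weight / 2)]

    # Merging: one walker of each pair survives, chosen with probability
    # proportional to its weight, and carries the combined weight.
    for a, b in zip(to_merge[0::2], to_merge[1::2]):
        (state_a, w_a), (state_b, w_b) = walkers[a], walkers[b]
        total = w_a + w_b
        survivor = state_a if rng.random() < w_a / total else state_b
        new_walkers.append((survivor, total))

    return new_walkers


# Usage with six equally weighted walkers and made-up outlier scores.
walkers = [(f"conf{i}", 1 / 6) for i in range(6)]
scores = [0.1, 0.9, 0.2, 0.05, 0.3, 0.15]
new = resample(walkers, scores, n_split=1, rng=random.Random(0))
```

In this example the walker with the highest score (`conf1`) is replicated, while the two lowest-scoring walkers are merged into one. Because weights are only halved or summed, the ensemble still represents the same probability distribution after resampling.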

Data availability

All input files and scripts needed to run and analyze the WE simulations in this study are provided in the GitHub repository: https://github.com/westpa/DL-enhancedWE and deposited on Zenodo under DOI: 10.5281/zenodo.13387514.

Files

leung-et-al-2025-unsupervised-learning-of-progress-coordinates-during-weighted-ensemble-simulations-application-to-ntl9.pdf

Files (3.1 MB)

Additional details

Identifiers

DOI
10.1021/acs.jctc.4c01136
Other
oai:uchicago.tind.io:14784

Funding

National Science Foundation
CHE-2136142
National Institutes of Health
P01AI165077
National Institutes of Health
R01 GM1151805
National Science Foundation
2139536

UChicago Information

Division(s)
Physical Sciences Division
Department(s)
Computer Science