Published April 9, 2025 | Version v1
Journal article Open

Chromatin structures from integrated AI and polymer physics model

  • 1. University of Chicago

Description

The physical organization of the genome in three-dimensional space regulates many biological processes, including gene expression and cell differentiation. Three-dimensional characterization of genome structure is critical to understanding these biological processes. Direct experimental measurements of genome structure are challenging; computational models of chromatin structure are therefore necessary. We develop an approach that combines a particle-based chromatin polymer model, molecular simulation, and machine learning to efficiently and accurately estimate chromatin structure from indirect measures of genome structure. Specifically, a graph neural network (GNN) extracts the interaction parameters of the polymer model from experimental Hi-C data. We train the GNN on simulated data from the underlying polymer model, avoiding the need for large quantities of experimental data. The resulting approach accurately estimates chromatin structures across all chromosomes and across several experimental cell lines despite being trained almost exclusively on simulated data. The proposed approach can be viewed as a general framework for combining physical modeling with machine learning, and it could be extended to integrate additional biological data modalities. Ultimately, we achieve accurate and high-throughput estimation of chromatin structure from Hi-C data, which will be necessary as experimental methodologies, such as single-cell Hi-C, improve.

Data availability

The source code for implementing, training, and applying our approach is available at the GitHub repository: https://github.com/ERSchultz/GNN_HiC_to_Structure.

Files

journal.pcbi.1012912.pdf

Files (25.0 MB)

Name                     Size      MD5
Article                  9.4 MB    7ff28e941f9d2c55246df93efa4dd0c1
Supporting information   15.5 MB   9c3db041b3a69860303c9e19cb2aad33
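
The MD5 checksums listed for the two files can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_record` are hypothetical, and the local file name in the usage note is an assumption about how the downloaded file was saved.

```python
import hashlib

# MD5 values copied from the file listing of this record.
EXPECTED_MD5 = {
    "Article": "7ff28e941f9d2c55246df93efa4dd0c1",
    "Supporting information": "9c3db041b3a69860303c9e19cb2aad33",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, entry):
    """Check a local file against the checksum listed for `entry`."""
    return md5_of_file(path) == EXPECTED_MD5[entry]
```

For example, assuming the article was saved locally as `journal.pcbi.1012912.pdf`, `matches_record("journal.pcbi.1012912.pdf", "Article")` would return `True` only if the local copy is byte-identical to the file in this record.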

Additional details

Identifiers

DOI: 10.1371/journal.pcbi.1012912
Other: oai:uchicago.tind.io:14901

Funding

National Science Foundation: DMS-2023109
National Science Foundation: DMS-AWD00000326
National Science Foundation: PHY2317138
Simons Foundation: MP-TMPS-00005320

UChicago Information

Division(s): Physical Sciences Division, Pritzker School of Molecular Engineering
Department(s): Computer Science, Statistics