A Triptych in Computation: Deep Learning for Molecular Mass Spectra, Sum-of-Squares Optimization, and Diffusion Generative Processes

Zhu, Richard

doi:10.6082/uchicago.13644

2024

Abstract

Mathematics, as Eugene Wigner once noted, has no inherent reason to be as effective in the natural sciences as it is. Yet those who seek to model the world have long used it to formulate powerful physical theories -- from explaining the motions of planets to the dynamics of electricity and even the quantum-mechanical behavior of particles too small to observe. While many of the big questions have been answered, countless others remain. Why are some problems so easily solved, while others remain stubbornly intractable? What governs the dynamics of complex systems? How do we distinguish real regularities in data from phantom fluctuations? Today's solutions look very different from yesterday's as our reliance on data and predictive power continues to grow. In such a world, efficiency and simplicity matter more than ever.

This thesis presents three seemingly unrelated ideas. First, we present a data-driven approach to structured prediction of mass spectra. Mass spectrometry is commonly used in analytical chemistry to characterize compounds: it counts and weighs the fragments produced by the high-energy breakdown of molecules. By combining supervision from the substructures generated in the fragmentation of small molecules with graph neural networks, we achieve state-of-the-art performance in the prediction of electron-ionization mass spectra.

Next, we explore a semidefinite programming perspective on parametric polynomial optimization. We demonstrate how a parameterized polynomial optimization problem can be bounded from below by the solution of an infinite-dimensional sum-of-squares optimization problem, and we show how semidefinite programming can be used to approximate that solution. We prove the convergence of the resulting hierarchy (a variant of the Lasserre SOS hierarchy) and present some practical applications.

Finally, we examine generative diffusion processes in depth. We discuss the connections between generative diffusion processes and physics, closely examine the structure of score-matching, and illustrate ways to uncover the structure of a problem from sparsity priors.

Each of these topics illuminates a distinct facet of computational science -- from the efficient use of structured data in deep learning, to the power and challenges of semidefinite programming, to the continued inspiration that physics offers for modern modeling approaches. May their synthesis be an ode to modern computational thinking and a tribute to the simple yet powerful ideas of the past, present, and future.