On Learning and Optimization in Inverse Problems with Group Structured Latent Variables

Paul, Sounak

doi:10.6082/uchicago.12967

2024

Abstract

Inverse problems are ubiquitous in science and engineering, arising whenever we seek to determine the underlying causes or parameters that give rise to observed data. These problems often involve latent variables, which, in many cases, follow a group structure. In this class of inverse problems, we aim to estimate an unknown function that has been distorted by a group action and observed via a known operator, with the observations typically contaminated by a non-trivial level of noise. Two such problems of particular interest in this thesis are multireference alignment (MRA) and single-particle reconstruction (SPR) in cryo-electron microscopy (cryo-EM). SPR is a widely used technique for estimating the 3-D volume of a single macromolecule (often referred to as the volume or signal) given several of its noisy 2-D projections taken at unknown viewing angles. In Chapter 1 we discuss the problem setting and mathematically formulate both MRA and cryo-EM.
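As a rough illustration of the observation model described above, the sketch below simulates MRA data under the common 1-D formulation in which the group action is a cyclic shift and the noise is additive Gaussian. That formulation, and the names `simulate_mra`, `shift_probs`, and `sigma`, are assumptions made for this example; the abstract does not spell out the model used in the thesis.

```python
import numpy as np

def simulate_mra(signal, n_obs, sigma, shift_probs=None, seed=0):
    """Draw n_obs noisy, cyclically shifted copies of `signal`.

    Assumed model (not taken from the thesis): y_i = R_{s_i} x + sigma * noise,
    where R_s is a cyclic shift by s and s_i is drawn from `shift_probs`
    (uniform when None).
    """
    rng = np.random.default_rng(seed)
    length = signal.shape[0]
    # Latent group elements: one cyclic shift per observation.
    shifts = rng.choice(length, size=n_obs, p=shift_probs)
    # Group action: np.roll applies the cyclic shift to the signal.
    shifted = np.stack([np.roll(signal, s) for s in shifts])
    noise = sigma * rng.standard_normal(shifted.shape)
    return shifted + noise, shifts

signal = np.array([1.0, 0.0, -2.0, 3.0, 0.5])
observations, shifts = simulate_mra(signal, n_obs=1000, sigma=1.0)
```

In the inverse problem only `observations` would be available; the shifts are the latent variables and are returned here purely for inspection.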

The method of moments (MoM) is a powerful technique used to suppress the noise and provide a low-resolution ab initio initialization for the 3-D structure in cryo-EM. Maximum likelihood estimation (MLE) based approaches such as Expectation Maximization (EM) or Empirical Risk Minimization (ERM) are widely used for iterative refinement of the ab initio structure to obtain high-resolution reconstructions. This thesis broadly deals with developing deep neural networks for solving inverse problems with group structured latent variables via MoM, and accelerating MLE-based methods using variance reduction techniques and second-order information.

In Chapter 2 we suggest using the method of moments approach for both problems while introducing deep neural network priors. In particular, given a set of datasets, each containing observations corresponding to a single signal and distribution, our neural networks should output the signals and the distribution of group elements, with moment pairs of each dataset being the input. For MRA, we demonstrate the advantage of using the trained network to accelerate the convergence of the reconstruction of signals from moments coming from an unknown dataset. Finally, we use our method to reconstruct simulated and biological volumes in the cryo-EM setting.
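To make the notion of a moment pair concrete, here is a minimal sketch for 1-D MRA with cyclic shifts. It assumes that the pair consists of the first and second moments and that the noise is Gaussian with known standard deviation `sigma`; both are assumptions for illustration, and the function names are hypothetical. The empirical moments are what would be computed from data, while the analytic moments show how they depend on the signal and the shift distribution.

```python
import numpy as np

def empirical_moments(observations, sigma):
    """First and (noise-debiased) second moments of the observations."""
    n_obs, length = observations.shape
    first = observations.mean(axis=0)
    # Subtract sigma^2 * I to remove the bias that the noise adds to the diagonal.
    second = observations.T @ observations / n_obs - sigma**2 * np.eye(length)
    return first, second

def analytic_moments(signal, shift_probs):
    """Moments implied by a signal and a distribution over cyclic shifts."""
    length = signal.shape[0]
    shifted = np.stack([np.roll(signal, s) for s in range(length)])
    first = shift_probs @ shifted                   # sum_s p_s R_s x
    second = (shifted.T * shift_probs) @ shifted    # sum_s p_s (R_s x)(R_s x)^T
    return first, second

# Illustrative data: a short signal, a non-uniform shift distribution, Gaussian noise.
rng = np.random.default_rng(0)
signal = np.array([1.0, 0.0, -2.0, 3.0, 0.5])
shift_probs = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
sigma = 0.5
shifts = rng.choice(5, size=200_000, p=shift_probs)
observations = np.stack([np.roll(signal, s) for s in shifts])
observations = observations + sigma * rng.standard_normal(observations.shape)

m1_hat, m2_hat = empirical_moments(observations, sigma)
m1, m2 = analytic_moments(signal, shift_probs)
```

With enough observations the empirical pair approaches the analytic one, which is why averaging suppresses the noise. Moment inversion, the map a network would be trained to approximate, goes in the opposite direction: from `(m1, m2)` back to the signal and the shift distribution.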

Chapter 3 is a direct extension of Chapter 2, in which we introduce MoM-net, a deep neural network for learning the moment inversion map for a more generalized cryo-EM setting where we assume the presence of small shifts in the projections. Our neural network is trained to output the spherical harmonic coefficients of the volumes along with the distribution of rotations and the shift variance, with moments from a set of datasets being the input. We also demonstrate the acceleration of convergence for the reconstruction using the trained neural network in this general cryo-EM setting, and use our method to reconstruct biological volumes.

In Chapter 4 we study the same problems using a different framework, namely maximum likelihood. Maximization of the likelihood function is usually carried out using first-order ERM and EM methods, which suffer from slow convergence rates, while their stochastic versions have high variance in parameter updates. Stochastic variance-reduced gradient (SVRG) methods have been proposed in the literature to improve convergence rates and stability by reducing the variance of the stochastic updates. This chapter thus explores the application of SVRG and stochastic variance-reduced EM (sEM-vr) methods, along with their second-order accelerated variants, in solving MRA and SPR. A second-order acceleration of sEM-vr is also proposed. We conduct extensive experiments on simulated datasets illustrating the applicability of variance-reduced methods for both of these problems.
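The variance-reduction idea behind SVRG can be sketched in a few lines. The example below applies the basic SVRG update to a toy least-squares objective, not to the MRA or SPR likelihood, and it does not include the second-order acceleration studied in the thesis. The problem data, step size, and function names are assumptions chosen for illustration.

```python
import numpy as np

def svrg(grad_i, x0, n_samples, step, n_epochs, inner_steps, seed=0):
    """Basic SVRG: stochastic steps corrected by a per-epoch full gradient."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_epochs):
        snapshot = x.copy()
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n_samples)], axis=0)
        for _ in range(inner_steps):
            i = rng.integers(n_samples)
            # Unbiased gradient estimate whose variance shrinks as x and the
            # snapshot both approach the minimizer.
            g = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - step * g
    return x

# Toy problem (an assumption for illustration): noiseless least squares,
# f(x) = (1 / 2n) * sum_i (a_i . x - b_i)^2.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true

def grad_i(x, i):
    # Gradient of the i-th term 0.5 * (a_i . x - b_i)^2.
    return (A[i] @ x - b[i]) * A[i]

x_hat = svrg(grad_i, np.zeros(5), n_samples=200, step=0.005,
             n_epochs=30, inner_steps=1000)
```

Plain stochastic gradient descent would use `grad_i(x, i)` alone. The correction term `- grad_i(snapshot, i) + full_grad` has zero mean over the sampled index, so the estimate stays unbiased while its variance is reduced.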

We end with Chapter 5, where we provide final thoughts on the overarching theme of this thesis, and discuss the strengths and drawbacks of our methods, along with potential future research steps.