Published June 2023 | Version v1
Dissertation Open

A Statistical Learning Theory for Models with High Complexity

Creators

  • 1. University of Chicago

Contributors

Advisor:

Committee members:

Description

Understanding why high-dimensional estimators can generalize beyond finite training samples is a fundamental problem in statistical learning theory. The traditional intuition, as suggested by Occam's razor, is that models with low complexity tend to generalize better. We can often find simple models that explain the training data well if the high-dimensional data distribution has some hidden low-dimensional structure (for example, sparse linear regression and low-rank matrix recovery). However, contrary to our traditional intuition, complex models which interpolate noisy training labels can also enjoy good generalization in some settings. This phenomenon, which we call "interpolation learning," has significantly challenged our theoretical foundation of statistical learning. In this thesis, we present a novel Moreau envelope generalization theory to establish the concentration of measure in high dimensions. Since our result can precisely quantify the role of model complexity in generalization error, we can establish strong consistency results even though the norm of the high-dimensional interpolants that we consider diverges. In addition to proving sharp non-asymptotic bounds for interpolants in various contexts, we also recover versions of classical results from the compressed sensing and high-dimensional statistics literature. Applications of our theory include kernel ridge regression, max-margin classification, phase retrieval, matrix sensing, and some simple neural networks.

Files

Zhou_uchicago_0330D_16952.pdf

Files (4.0 MB)

Name Size Download all
md5:ad9f8134c3ba517b4024ca3308236736
4.0 MB Preview Download

Additional details

Identifiers

Other
oai:uchicago.tind.io:6497

UChicago Information

Division(s)
Physical Sciences Division
Department(s)
Statistics