Published June 2019 | Version v1
Dissertation Open

Statistical Machine Learning Methods for Complex, Heterogeneous Data

  • 1. University of Chicago

Contributors

Advisor:

Description

This thesis develops statistical machine learning methodology for three distinct tasks. Each method blends classical statistical approaches with machine learning methods to provide principled solutions to problems with complex, heterogeneous datasets. The first contribution consists of two methods for high-dimensional shape-constrained regression and classification, which reshape pre-trained prediction rules to satisfy shape constraints such as monotonicity and convexity. The second method provides a nonparametric approach to the econometric analysis of discrete choice: a scalable algorithm for estimating utility functions with random forests, combined with random effects to model preference heterogeneity. The final method draws inspiration from early work in statistical machine translation to construct embeddings for variable-length objects such as mathematical equations.

Files

Bonakdarpour_uchicago_0330D_14677.pdf

Files (569.2 kB)

md5:aaccd594409bf352943e05e270f4a010

Additional details

Identifiers

Other
oai:uchicago.tind.io:1788

UChicago Information

Division(s)
Physical Sciences Division
Department(s)
Statistics