Abstract

High-dimensional optimal control problems for Reflected Brownian Motion (RBM), common in finance and operations, are challenging due to the curse of dimensionality in solving Hamilton-Jacobi-Bellman (HJB) equations. This thesis explores deep learning solutions for RBM drift control under discounted costs, leveraging Backward Stochastic Differential Equations (BSDEs) and Malliavin calculus to derive probabilistic representations for the gradient of the value function. Two deep learning algorithms built on these representations are compared: Method 1 directly minimizes a loss function based on an explicit Malliavin gradient formula using a single neural network; Method 2 employs a nested iterative structure, using Deep Picard Iteration (DPI) principles to solve a Malliavin-derived fixed-point equation for the gradient via simpler regression steps. Numerical experiments on a 1D benchmark, motivated by optimal liquidation problems in finance, show that both methods achieve high accuracy against analytical solutions. However, Method 2 proved more robust, converging reliably across different control bounds without the loss-function modifications and careful learning-rate tuning required to stabilize Method 1 under more challenging conditions. The findings suggest that the iterative DPI approach (Method 2) offers enhanced stability and practical advantages for this class of problems.
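To make the setting concrete, the state process in such problems is a Brownian motion with a (controlled) drift, reflected at the boundary of the domain. The sketch below simulates a one-dimensional RBM on [0, ∞) with a fixed drift using a projection Euler scheme; it is an illustrative toy, not the thesis's algorithm, and all function names and parameter values (drift `b`, volatility `sigma`, step size `dt`) are assumptions for the example.

```python
import numpy as np

def simulate_rbm(b, sigma=1.0, x0=1.0, dt=1e-3, n_steps=1000, n_paths=500, seed=0):
    """Euler scheme for 1D Reflected Brownian Motion on [0, inf) with drift b.

    Reflection at zero is handled by projection: X <- max(X + drift + noise, 0),
    a standard one-dimensional approximation of the Skorokhod map.
    Parameter values are illustrative only (not from the thesis).
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
        x = np.maximum(x + b * dt + sigma * dw, 0.0)     # step, then reflect at 0
    return x

# Terminal states of 500 paths under a constant (uncontrolled) negative drift:
terminal = simulate_rbm(b=-0.5)
```

In the drift-control problem, the constant `b` would be replaced by a feedback control evaluated at the current state, which is where the neural network approximations of the value-function gradient enter.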
