Abstract

For the dissertation, we propose studying how to efficiently learn the optimal dynamic mechanism when the agents' valuations can be characterized by a Markov Decision Process (MDP). In Chapter 1, we provide a high-level motivation for our problem setting, discussing why theoretical Reinforcement Learning (RL) and dynamic mechanism design are natural to combine. In subsequent chapters, we present three representative problems at the intersection of these fields. The chapters vary in the difficulty of designing the optimal mechanism, the generality of the function approximation, and the RL setup considered, providing a high-level overview of recent advances along this interdisciplinary research direction. In particular, in Chapter 2, we show how the revenue-maximizing, incentive compatible, and ex-post individually rational mechanism can be learned in a computationally and sample-efficient manner when the single buyer's type distribution is governed by a tabular MDP. In Chapter 3, we provide an online learning algorithm that learns the optimal second-price auction with reserve prices with $\tilde{O}(\sqrt{T})$ regret, even when the participating buyers behave strategically. In Chapter 4, we show that the welfare-maximizing, ex-ante incentive compatible, and ex-ante individually rational mechanism can be obtained under a general function approximation setting using only a pre-collected dataset, with no additional interactions with the environment.
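As a point of reference for the mechanism class learned in Chapter 3, the following is a minimal illustrative sketch (not taken from the dissertation) of the allocation and payment rule of a single-item second-price auction with a reserve price; the function name and interface are assumptions made purely for illustration.

```python
# Illustrative sketch, NOT the dissertation's algorithm: the static allocation
# and payment rule of a second-price auction with a reserve price. Chapter 3
# studies learning the optimal reserve online under strategic buyers.

def second_price_with_reserve(bids, reserve):
    """Return (winner_index, payment); winner_index is None if no sale.

    The highest bidder wins iff their bid meets the reserve, and pays the
    larger of the reserve and the second-highest bid; under this rule,
    truthful bidding is a dominant strategy.
    """
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    top = order[0]
    if bids[top] < reserve:
        return None, 0.0  # reserve not met: the item goes unsold
    runner_up = bids[order[1]] if len(bids) > 1 else 0.0
    return top, max(reserve, runner_up)

# Example: with bids (3.0, 5.0, 4.0) and reserve 4.5, bidder 1 wins
# and pays max(4.5, 4.0) = 4.5.
print(second_price_with_reserve([3.0, 5.0, 4.0], 4.5))
```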
