### Abstract

In recent years, extensive research has focused on $\ell_1$-penalized least squares (Lasso) estimators in high-dimensional regression, where the number of covariates $p$ is considerably larger than the sample size $n$. However, limited attention has been paid to the properties of these estimators when the errors and/or the covariates are serially dependent and/or heavy-tailed. This thesis concerns the theoretical properties of Lasso estimators for linear regression with random design and weak sparsity under serially dependent and/or non-sub-Gaussian errors and covariates. In contrast to the traditional setting, in which the errors are independent and identically distributed (i.i.d.) with finite exponential moments, we show that $p$ can be at most a power of $n$ if the errors have only finite polynomial moments. In addition, the rate of convergence becomes slower due to the serial dependence in the errors and the covariates. We also establish sign consistency for model selection via the Lasso when there are serial correlations in the errors, the covariates, or both. Adopting the framework of the functional dependence measure, we give a detailed description of how the rates of convergence and the selection consistency of the estimators depend on the dependence measures and moment conditions of the errors and the covariates. We apply these results for the Lasso method to now-casting with mixed-frequency data, for which serially correlated errors and a large number of covariates are common. The empirical results show the superiority of the Lasso procedure in both forecasting and now-casting. This thesis also proposes a new robust $M$-estimator for generalized linear models. We investigate the properties of the proposed robust procedure and the classical Lasso procedure both theoretically and numerically. As an extension, we also introduce a robust estimator for linear regression.
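The setting above can be illustrated with a small simulation. The sketch below (an assumption for illustration, not the thesis's code: the AR(1) coefficient, the $t$-distributed innovations, and the tuning parameter are all arbitrary choices) fits the Lasso to data with serially dependent covariates and heavy-tailed, serially dependent errors, the regime the theory addresses.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5            # n << p, sparse truth

# Covariates with AR(1) dependence across time (rows).
X = np.empty((n, p))
X[0] = rng.standard_normal(p)
for t in range(1, n):
    X[t] = 0.5 * X[t - 1] + rng.standard_normal(p)

beta = np.zeros(p)
beta[:s] = 1.0                   # true nonzero coefficients

# AR(1) errors driven by t(3) innovations: only finite polynomial
# moments, unlike the classical sub-Gaussian case.
eps = np.empty(n)
eps[0] = rng.standard_t(3)
for t in range(1, n):
    eps[t] = 0.5 * eps[t - 1] + rng.standard_t(3)

y = X @ beta + eps

# Tuning parameter of the order sqrt(log p / n), up to constants.
lam = np.sqrt(np.log(p) / n)
fit = Lasso(alpha=lam, max_iter=5000).fit(X, y)
err = np.linalg.norm(fit.coef_ - beta)
print(f"l2 estimation error: {err:.3f}")
```

Even with $p = 500 \gg n = 200$ and dependent, heavy-tailed noise, the Lasso substantially improves on the trivial zero estimator, consistent with the (slower) convergence rates described above.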
We show that the proposed robust estimator for the linear model achieves the optimal rate, the same as that for i.i.d. sub-Gaussian data. Simulation results show that the proposed method performs well numerically in the presence of heavy-tailed and serially dependent covariates and/or errors, and that it significantly outperforms the classical Lasso method. For applications, we demonstrate the regularized robust procedure by analyzing high-frequency trading data in finance. We also provide new Bousquet-type inequalities for high-dimensional time series, which can be quite useful in the empirical process theory of dependent data.
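A minimal sketch of a penalized robust regression in the spirit of the procedure above (an assumption, not the thesis's estimator: the Huber loss, the proximal-gradient solver, and all tuning constants here are illustrative stand-ins) replaces the squared loss with a Huber loss, which bounds the influence of heavy-tailed errors, and adds an $\ell_1$ penalty via soft-thresholding.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the l1 norm.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def robust_lasso(X, y, lam, delta=1.0, n_iter=500):
    # l1-penalized Huber regression via proximal gradient descent.
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1/L; Huber' is 1-Lipschitz
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = X @ beta - y
        # Gradient of the Huber loss clips large residuals at +-delta,
        # limiting the influence of heavy-tailed observations.
        grad = X.T @ np.clip(r, -delta, delta) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Demo on heavy-tailed t(3) errors.
rng = np.random.default_rng(1)
n, p, s = 200, 50, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 2.0
y = X @ beta_true + rng.standard_t(3, size=n)

beta_hat = robust_lasso(X, y, lam=0.1)
print(f"l2 estimation error: {np.linalg.norm(beta_hat - beta_true):.3f}")
```

The clipped gradient is what distinguishes this from the classical Lasso: a single outlying error contributes at most $\delta$ to the gradient, which is the mechanism behind the robustness to heavy tails discussed above.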