Variational inference (VI) has been a popular technique for approximating difficult-to-compute posterior distributions for decades. It has been used in many applications and tends to be faster than classical methods such as Markov chain Monte Carlo. However, its theoretical properties remain poorly understood. In this thesis, our goal is to establish statistical guarantees for variational inference under high-dimensional or nonparametric settings. We apply our theoretical results to develop a general variational Bayes (VB) algorithm for a class of high-dimensional linear structured models. At the end of this thesis, we point out the relation between variational Bayes and empirical Bayes and propose a general convergence result for empirical Bayes posterior distributions.

In Chapter 2, we develop a ``prior mass and testing'' framework to establish concentration results for the variational posterior distribution, and then apply these results to the Gaussian sequence model, the infinite-dimensional exponential family, and the piecewise constant model. We also establish convergence results for the variational posterior distribution under model selection. At the end of this chapter, we discuss some properties of variational inference.

In Chapter 3, we propose a general VB algorithm for a class of high-dimensional linear structured models. These models include, but are not limited to, the stochastic block model, the biclustering model, sparse linear regression, multiple regression with group sparsity, multi-task learning, and dictionary learning. Theoretically, we prove an oracle-type contraction result for the variational posterior distribution. Empirically, the VB algorithm outperforms the classical spectral method for the stochastic block model and the LASSO estimator for sparse linear regression when the signal-to-noise ratio is sufficiently large.
In Chapter 4, we demonstrate that the empirical Bayes procedure can be viewed as a variational Bayes procedure with a particular variational set. We then prove a concentration theorem for empirical Bayes posterior distributions in the case where the true parameters are unbounded. Finally, we apply this result to the sparse sequence model, sparse linear regression, and the general linear structured models discussed in Chapter 3.