Abstract
Statistical learning for tensors has gained increasing attention in recent years. We present methods for two major tensor-related statistical problems: tensor completion and tensor clustering. Tensor fibers that are missing not at random appear in many applications. Typical tensor completion methodology relies on entries being revealed uniformly at random and may underestimate the mean squared error when evaluating specific imputation mechanisms. We propose using propensity scores to remove selection bias for revealed fibers, and we derive the sample size required to obtain a consistent estimate of the true tensor from noisy, partially observed fibers. The second tensor completion problem we consider is recovering low-rank tensors from partially observed entries and auxiliary information. Using auxiliary information, or side information, such as feature covariates, is a plausible way to achieve compression and decomposition of high-dimensional low-rank tensors. We propose a method that exploits interactions among different modes as well as sparsity structure, and we prove exact recovery and ε-recovery both when the auxiliary information is measured perfectly and when it is corrupted. Moreover, we consider clustering of multi-mode tensor data and propose a fused version of the alternating least squares algorithm that performs tensor factorization and clustering simultaneously. We establish statistical convergence rates for recovery and clustering consistency.