Probabilistic Penalized Principal Component Analysis
Keywords
EM algorithm; Non-convex penalty; Oracle estimators; Penalized likelihood; Probability model; Variable selection
Abstract
A variable selection method based on probabilistic principal component analysis (PCA) with a penalized likelihood is proposed. The proposed method is a two-step variable reduction method. The first step uses the probabilistic principal component idea to identify principal components. A penalty function is then used to identify the important variables in each component. Because the proposed method achieves dimension reduction by identifying important observed variables, we can build a model on the original data space rather than on the rotated data space defined by latent variables (principal components). Consequently, the proposed method is of more practical use. With a proper choice of regularization parameters, the proposed estimators perform as well as the oracle procedure and are root-n consistent. The proposed method can be successfully applied to high-dimensional PCA problems in which a relatively large portion of the variables in the data set are irrelevant. It is straightforward to extend our likelihood method to handle problems with missing observations using EM algorithms, so it can also be applied effectively when some data vectors have one or more values missing at random.
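The two-step idea in the abstract can be illustrated with a minimal sketch. This record does not give the authors' penalty function or estimation algorithm, so everything below is an assumption: the first step uses the standard closed-form maximum-likelihood solution for probabilistic PCA, and the second step applies the SCAD thresholding rule (one common non-convex penalty) directly to the fitted loadings, which is a simplification and not a joint penalized-likelihood fit. The function names, the simulated data, and the value of `lam` are all hypothetical.

```python
import numpy as np


def ppca_ml(X, q):
    """Closed-form ML fit of probabilistic PCA with q latent components.

    Returns the loading matrix W (d x q, identified up to rotation/sign)
    and the isotropic noise variance sigma2.
    """
    Xc = X - X.mean(axis=0)
    n, d = Xc.shape
    S = Xc.T @ Xc / n                      # sample covariance (ML version)
    eigvals, eigvecs = np.linalg.eigh(S)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()            # average of discarded eigenvalues
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale
    return W, sigma2


def scad_threshold(z, lam, a=3.7):
    """Elementwise SCAD thresholding rule (assumed penalty, not from the source)."""
    z = np.asarray(z, dtype=float)
    absz = np.abs(z)
    soft = np.sign(z) * np.maximum(absz - lam, 0.0)
    middle = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return np.where(absz <= 2 * lam, soft, np.where(absz <= a * lam, middle, z))


# Simulated example: 10 observed variables, only the first 3 carry signal.
rng = np.random.default_rng(0)
n, d = 2000, 10
w_true = np.zeros(d)
w_true[:3] = [2.0, 1.5, -1.0]
z = rng.standard_normal((n, 1))
X = z * w_true + 0.5 * rng.standard_normal((n, d))

# Step 1: fit probabilistic PCA. Step 2: shrink small loadings to zero.
W, sigma2 = ppca_ml(X, q=1)
W_sparse = scad_threshold(W, lam=0.3)

# A variable is kept if it has a nonzero loading on at least one component.
selected = np.flatnonzero(np.any(W_sparse != 0.0, axis=1))
print("selected variable indices:", selected)
```

Variables whose loadings are all shrunk to zero are dropped, and a downstream model can then be built on the remaining original variables instead of on the latent components. In practice the regularization parameter would be tuned, for example by cross-validation or an information criterion, instead of being fixed as it is here.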
Publication Date
3-1-2017
Publication Title
Communications for Statistical Applications and Methods
Volume
24
Issue
2
Pages
143-154
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.5351/CSAM.2017.24.2.143
Copyright Status
Unknown
Scopus ID
85044057926 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85044057926
STARS Citation
Park, Chongsun; Wang, Morgan C.; and Mo, Eun Bi, "Probabilistic Penalized Principal Component Analysis" (2017). Scopus Export 2015-2019. 5259.
https://stars.library.ucf.edu/scopus2015/5259