Probabilistic Penalized Principal Component Analysis

Keywords

EM algorithm; Non-convex penalty; Oracle estimators; Penalized likelihood; Probability model; Variable selection

Abstract

A variable selection method based on probabilistic principal component analysis (PCA) with a penalized likelihood is proposed. The method reduces the number of variables in two steps. The first step uses the probabilistic principal component idea to identify principal components; the second uses a penalty function to identify the important variables in each component. A model is then built on the original data space rather than on the rotated data space of latent variables (principal components), because the method achieves dimension reduction by identifying important observed variables. Consequently, the proposed method is of more practical use. With a proper choice of regularization parameters, the proposed estimators perform as well as the oracle procedure and are root-n consistent. The method can be applied successfully to high-dimensional PCA problems in which a relatively large portion of the variables in the data set are irrelevant. The likelihood method extends straightforwardly, via EM algorithms, to problems with missing observations, and it can be applied effectively when some data vectors have one or more values missing at random.
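The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the closed-form maximum likelihood solution for the probabilistic PCA loadings and then applies the SCAD thresholding rule (one common non-convex penalty, with the conventional a = 3.7) to those loadings in a single pass, instead of maximizing a penalized likelihood. The function names, the choice of SCAD, and the tuning value `lam` are all illustrative assumptions.

```python
import numpy as np


def ppca_loadings(X, q):
    """Closed-form ML estimate of the probabilistic PCA loadings (up to rotation)."""
    Xc = X - X.mean(axis=0)                      # center each variable
    S = Xc.T @ Xc / X.shape[0]                   # ML sample covariance
    evals, evecs = np.linalg.eigh(S)             # eigenvalues in ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]   # reorder to descending
    sigma2 = evals[q:].mean()                    # noise variance: mean discarded eigenvalue
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return W, sigma2


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule applied elementwise (assumed penalty, a > 2)."""
    absz = np.abs(z)
    soft = np.sign(z) * np.maximum(absz - lam, 0.0)         # region |z| <= 2*lam
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)    # region 2*lam < |z| <= a*lam
    return np.where(absz <= 2 * lam, soft, np.where(absz <= a * lam, mid, z))


def select_variables(X, q, lam):
    """Keep the observed variables whose thresholded loadings are not all zero."""
    W, _ = ppca_loadings(X, q)
    W_sparse = scad_threshold(W, lam)
    selected = np.flatnonzero(np.any(W_sparse != 0, axis=1))
    return selected, W_sparse
```

Calling `select_variables(X, q, lam)` on an n-by-d data matrix returns the indices of the retained columns of `X`, so a later model can be fitted on those observed variables directly rather than on the latent components. Larger values of `lam` discard more variables.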

Publication Date

3-1-2017

Publication Title

Communications for Statistical Applications and Methods

Volume

24

Issue

2

Pages

143-154

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.5351/CSAM.2017.24.2.143

Scopus ID

85044057926 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85044057926
