Sparse Proximal Support Vector Machines For Feature Selection In High Dimensional Datasets
Keywords
Class-specific feature selection; Embedded feature selection; High dimensional datasets; Regularization; Sparsity
Abstract
Classification of High Dimension Low Sample Size (HDLSS) datasets is a challenging task in supervised learning. Such datasets are prevalent in various areas including biomedical applications and business analytics. In this paper, a new embedded feature selection method for HDLSS datasets is introduced by incorporating sparsity in Proximal Support Vector Machines (PSVMs). Our method, called Sparse Proximal Support Vector Machines (sPSVMs), learns a sparse representation of PSVMs by first casting it as an equivalent least squares problem and then introducing the ℓ1-norm for sparsity. An efficient algorithm based on alternating optimization techniques is proposed. sPSVMs removes more than 98% of features in many high dimensional datasets without compromising generalization performance. Stability in the feature selection process of sPSVMs is also studied and compared with univariate filter techniques. Additionally, sPSVMs offers the advantage of interpreting the selected features in the context of the classes by inducing class-specific local sparsity instead of global sparsity like other embedded methods. sPSVMs appears to be robust with respect to data dimensionality. Moreover, sPSVMs is able to perform feature selection and classification in one step, eliminating the need for dimensionality reduction on the data. To that end, sPSVMs can be used for preprocessing-free classification tasks.
Publication Date
12-15-2015
Publication Title
Expert Systems with Applications
Volume
42
Issue
23
Pages
9183-9191
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1016/j.eswa.2015.08.022
Copyright Status
Unknown
Scopus ID
84940885134 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84940885134
STARS Citation
Pappu, Vijay; Panagopoulos, Orestis P.; Xanthopoulos, Petros; and Pardalos, Panos M., "Sparse Proximal Support Vector Machines For Feature Selection In High Dimensional Datasets" (2015). Scopus Export 2015-2019. 1223.
https://stars.library.ucf.edu/scopus2015/1223