Variable Selection, High Dimensional Data, Lasso Regression, Elastic Net Regression, Overfitting


When a data set contains many predictors, it is often best to restrict attention to the essential ones. This process of identifying a small set of valuable predictors is known as variable (or feature) selection. Many methods have been proposed in the literature for variable selection, such as best subset selection, forward selection, backward selection, stepwise selection combining forward and backward steps, the lasso shrinkage method, and the elastic net shrinkage method. In addition, dimension reduction methods, such as principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), have commonly been used to address the same problem.
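As a minimal sketch of the two shrinkage methods named above, the snippet below fits lasso and elastic net with scikit-learn on simulated data (the data set, penalty strengths, and the choice of three informative predictors are assumptions for illustration) and reports which coefficients survive; lasso's L1 penalty sets uninformative coefficients exactly to zero, which is what performs the selection.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

# Hypothetical data: 100 samples, 10 predictors, only the first 3 informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=100)

# Lasso (pure L1 penalty) drives coefficients of weak predictors to exactly zero.
lasso = Lasso(alpha=0.1).fit(X, y)
print("Lasso keeps predictors:", np.flatnonzero(lasso.coef_))

# Elastic net mixes L1 and L2 penalties; l1_ratio controls the mix.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print("Elastic net keeps predictors:", np.flatnonzero(enet.coef_))
```

The nonzero entries of `coef_` are the selected variables; in practice `alpha` would be chosen by cross-validation (e.g. `LassoCV`) rather than fixed by hand.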


Variable selection is a key component of machine learning. It removes unwanted and redundant predictors from the model, which helps prevent overfitting: because the model retains only a few significant predictors, it is less likely to learn patterns from the noise. Furthermore, training time drops when only a few valuable variables remain.
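The overfitting point can be illustrated with a small simulation (the sample sizes, penalty value, and two-informative-predictor design below are assumptions, not from the original text): with far more predictors than training samples, plain least squares fits the noise and generalizes poorly, while the lasso's built-in selection recovers the signal.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import train_test_split

# Hypothetical data: 60 samples, 50 predictors, only 2 informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 50))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

# With 50 predictors and only 30 training rows, least squares can
# interpolate the training data, so its test performance collapses.
ols = LinearRegression().fit(X_tr, y_tr)

# Lasso discards the redundant predictors and generalizes much better.
lasso = Lasso(alpha=0.2).fit(X_tr, y_tr)

print("OLS   test R^2:", round(ols.score(X_te, y_te), 3))
print("Lasso test R^2:", round(lasso.score(X_te, y_te), 3))
```

Comparing the two held-out R^2 values makes the abstract's claim concrete: fewer, better-chosen predictors mean less opportunity to learn the noise.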


Spring 2023

Course Name

STA 5703 Data Mining 1

Instructor Name

Xie, Rui


College of Sciences

Included in

Data Science Commons