Keywords
Variable Selection, High Dimensional Data, Lasso Regression, Elastic Net Regression, Overfitting
Description
When a model includes many predictors, it is often preferable to retain only the essential ones. The process of identifying this informative subset of predictors is known as variable (or feature) selection. Many methods have been proposed in the literature for variable selection, including best subset selection, forward selection, backward selection, stepwise selection combining forward and backward steps, and shrinkage methods such as the lasso and the elastic net; a brief sketch of the two shrinkage methods follows below. Dimension reduction methods, such as principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), are also commonly used to address this problem.
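The following is a minimal sketch of lasso and elastic net variable selection using scikit-learn. The synthetic data and parameter values are illustrative assumptions and are not taken from the maize flowering-time analysis itself.

# Minimal sketch: lasso and elastic net variable selection (illustrative data).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, ElasticNetCV
from sklearn.preprocessing import StandardScaler

# Simulate a high-dimensional setting: many predictors, few informative ones.
X, y = make_regression(n_samples=200, n_features=1000, n_informative=10,
                       noise=5.0, random_state=0)
X = StandardScaler().fit_transform(X)

# Lasso: pure L1 penalty, shrinks many coefficients exactly to zero.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
print("Lasso keeps", np.sum(lasso.coef_ != 0), "of", X.shape[1], "predictors")

# Elastic net: mix of L1 and L2 penalties, useful when predictors are correlated.
enet = ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8, 1.0], cv=5, random_state=0).fit(X, y)
print("Elastic net keeps", np.sum(enet.coef_ != 0), "of", X.shape[1], "predictors")

The count of nonzero coefficients is what makes these shrinkage methods act as variable selectors: predictors whose coefficients are driven to zero are effectively dropped from the model.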
Abstract
Variable selection is a key component of machine learning. It removes unwanted and redundant predictors from the model, which helps prevent overfitting: because the model retains only a small set of significant predictors, it is less likely to learn patterns from noise. Training time is also reduced when only the most valuable variables are kept. A short sketch illustrating this effect follows below.
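The sketch below, again on assumed synthetic data, illustrates how a lasso penalty can improve out-of-sample error relative to unpenalized least squares when the number of predictors exceeds the number of observations.

# Minimal sketch: overfitting with OLS versus lasso when p > n (illustrative data).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, LassoCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=120, n_features=500, n_informative=8,
                       noise=10.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

# Unpenalized least squares can interpolate the training data and generalize poorly.
ols = LinearRegression().fit(X_tr, y_tr)
# The lasso penalty discards redundant predictors, reducing variance.
lasso = LassoCV(cv=5, random_state=1).fit(X_tr, y_tr)

print("OLS test MSE:  ", mean_squared_error(y_te, ols.predict(X_te)))
print("Lasso test MSE:", mean_squared_error(y_te, lasso.predict(X_te)))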
Semester
Spring 2023
Course Name
STA 5703 Data Mining 1
Instructor Name
Xie, Rui
College
College of Sciences
STARS Citation
Dhakal, Pradip, "Variable Selection Using Lasso and Elastic Net Regression on High Dimensional Genetic Architecture Data of Maize Flowering Time" (2023). Data Science and Data Mining. 10.
https://stars.library.ucf.edu/data-science-mining/10