Linear regression regularization, genome-wide association study (GWAS), LASSO


Accurate estimation of the male flowering period in maize crops is key for predicting crop fertility. Recent scientific investigations have shown that genetic single nucleotide polymorphisms (SNPs) can contribute in this regard. A genome-wide association study (GWAS) is employed to generate these SNP attributes. However, the resulting dataset is high-dimensional, with 4,981 observations and 7,389 SNP attributes. Hence, in this study, we used a penalized regression approach, the least absolute shrinkage and selection operator (LASSO), to reduce the number of attributes. We set the regularization parameter to 0.21. This yielded a set of 24 SNP markers for predicting days to anthesis (DtoA) in maize plants.
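As a rough illustration of how LASSO selects a small subset of attributes, the sketch below fits a penalized linear regression by coordinate descent on synthetic data. Everything in it is an assumption made for illustration: the data are randomly generated stand-ins (not the GWAS dataset), the 0/1/2 genotype coding and the function names are hypothetical, and the objective scaling (1/(2n) times the squared error plus the penalty times the L1 norm) may differ from the one the study used, so the value 0.21 here is not necessarily comparable to the study's setting.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (assumed scaling)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # constant column after centering, leave at zero
            # Correlation of column j with the residual that excludes column j
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta

# Synthetic stand-in data: 200 observations, 30 markers coded 0/1/2.
random.seed(0)
n_obs, n_snp = 200, 30
X = [[float(random.choice([0, 1, 2])) for _ in range(n_snp)] for _ in range(n_obs)]
# Only the first two markers influence the simulated response.
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 0.5) for row in X]

# Center columns and response so no intercept term is needed.
col_means = [sum(row[j] for row in X) / n_obs for j in range(n_snp)]
X = [[row[j] - col_means[j] for j in range(n_snp)] for row in X]
y_mean = sum(y) / n_obs
y = [value - y_mean for value in y]

beta = lasso_coordinate_descent(X, y, lam=0.21)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("markers kept:", selected)
```

Coefficients whose correlation with the residual stays below the penalty are set exactly to zero, which is what lets the method act as a selection step; a larger penalty keeps fewer markers.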

Course Name

STA 5703 Data Mining 1

Instructor Name

Rui Xie


College of Sciences

Included in

Data Science Commons