Keywords
Machine Learning, GWAS, Elastic Net, Principal Component Regression (PCR), Partial Least Squares (PLS), Variable Selection, Predictive Modeling in Genomics
Description
This study examines the prediction of male flowering time in maize using high-dimensional genomic data within a genome-wide association study framework. Penalized regression and latent-variable dimension reduction methods are compared to address challenges related to multicollinearity, dimensionality, and variable selection in genomic prediction. A standardized preprocessing and cross-validation strategy is applied to ensure robust model evaluation. The findings illustrate the complementary roles of regularization and dimension reduction techniques for modeling complex polygenic traits in plant genomics.
Abstract
This study compares three machine learning approaches for variable selection and prediction within a high-dimensional maize GWAS framework: Elastic Net (ENET), Principal Component Regression (PCR), and Partial Least Squares (PLS). The goal was to accurately predict time to male flowering, a complex polygenic trait, while managing the challenges posed by numerous, highly correlated genetic markers. The ENET model, which combines L1 (lasso) and L2 (ridge) penalties, delivered the highest predictive accuracy and identified a small subset of the most influential genetic variants. In contrast, PCR and PLS, both of which rely on dimension reduction, offered a significant advantage in computational speed and model stability. The findings indicate that while ENET provides the most precise genomic prediction, the latent-variable methods offer a highly efficient and competitive alternative for analyzing complex traits.
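To illustrate how combining L1 and L2 penalties can both shrink coefficients and select a subset of correlated markers, the following is a minimal sketch of Elastic Net fitted by coordinate descent. It is not the study's code: the data are synthetic, and the function names (`soft_threshold`, `standardize`, `elastic_net`), the parameter names (`lam`, `l1_ratio`), and the chosen penalty values are all assumptions made for illustration. The objective is assumed to be (1/(2n))·||y − Xb||² + lam·(l1_ratio·||b||₁ + (1 − l1_ratio)/2·||b||²), which matches the parametrization used by scikit-learn's `ElasticNet` (where `lam` is called `alpha`).

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t (the L1 effect)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def standardize(X):
    """Center each column and scale it to unit variance (constant columns become 0)."""
    n, p = len(X), len(X[0])
    out = [row[:] for row in X]
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        for i in range(n):
            out[i][j] = (X[i][j] - mean) / sd
    return out

def elastic_net(X, y, lam, l1_ratio, n_iter=200):
    """Cyclic coordinate descent for the Elastic Net objective stated above.

    The intercept is handled by centering y; X is assumed to be standardized.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]          # residuals for beta = 0
    beta = [0.0] * p
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:                    # skip constant columns
                continue
            # Covariance of column j with the partial residual (column j excluded)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam * l1_ratio) / (z[j] + lam * (1.0 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta

# Synthetic stand-in for marker data (hypothetical; not the maize dataset).
random.seed(0)
n, p = 60, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
for row in X:
    row[1] = row[0] + 0.1 * random.gauss(0, 1)   # marker 1 nearly duplicates marker 0
y = [2.0 * row[0] - 1.5 * row[5] + random.gauss(0, 0.5) for row in X]

Xs = standardize(X)
beta = elastic_net(Xs, y, lam=0.3, l1_ratio=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("markers with nonzero coefficients:", selected)
```

In this sketch the L1 term sets weak coefficients exactly to zero, while the L2 term stabilizes the estimates for the two nearly collinear markers. In practice the penalty strength and mixing ratio would be tuned by cross-validation rather than fixed by hand.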
Course Name
STA 5703 Data Mining 1
Instructor Name
Dr. Emil Agbemade
Rights

This work is licensed under a Creative Commons Attribution 4.0 International License.
College
College of Sciences
STARS Citation
Deb, Dipok, "Predicting Male Flowering Time in Maize Using Machine Learning Technique" (2026). Data Science and Data Mining. 51.
https://stars.library.ucf.edu/data-science-mining/51
Included in
Agriculture Commons, Data Science Commons, Food Biotechnology Commons, Genetics and Genomics Commons, Plant Sciences Commons