High Impact Practices Student Showcase Spring 2025

Predicting Milk Quality with Machine Learning
Course Code
STA
Course Number
4164
Faculty/Instructor
Simone
Faculty/Instructor Email
nathaniel.simone@ucf.edu
Abstract, Summary, or Creative Statement
Milk is a critical component of diets around the world, offering essential nutrients such as protein, calcium, and vitamin B12. However, in many low-income and rural regions, milk quality is often compromised due to poor storage conditions, lack of pasteurization, and even intentional adulteration. Low-quality milk poses serious health risks, including bacterial infections, toxin exposure, and long-term organ damage. These challenges highlight the need for reliable, accessible methods to assess milk quality without relying on costly laboratory testing.
In this study, we propose a data-driven approach to predict milk quality using ordered logistic regression. The dataset used, sourced from Kaggle, includes 1,059 samples with both numerical (pH, temperature, color) and categorical (odor, taste, fat, turbidity) predictors. Our target variable is milk grade, categorized as low, medium, or high quality. We perform thorough exploratory data analysis to understand feature distributions and correlations, implement baseline and penalized (lasso) logistic regression models, and conduct diagnostics to check key assumptions such as proportional odds and linearity in the logit.
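For readers unfamiliar with ordered logistic regression, the following is a minimal sketch of how a proportional odds model turns a linear predictor into probabilities for the three grades. It is not the authors' code: the coefficients, cutpoints, and the two-feature sample are hypothetical values chosen only for illustration, and the parameterization logit P(Y <= k) = cutpoint_k - beta . x is one common convention.

```python
import math

GRADES = ["low", "medium", "high"]


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(x, beta, cutpoints):
    """Class probabilities under a proportional odds model.

    Uses the convention logit P(Y <= k) = cutpoints[k] - beta . x,
    with a single shared coefficient vector for every threshold.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    # Differences of consecutive cumulative probabilities give P(Y = k).
    return [cumulative[0]] + [
        cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))
    ]


# Hypothetical values, not estimates from the study.
beta = [1.2, -0.8]        # one coefficient per (standardized) feature
cutpoints = [-0.5, 1.0]   # must be increasing: low|medium, medium|high
sample = [0.3, 1.1]       # e.g. standardized pH and temperature

probs = ordered_logit_probs(sample, beta, cutpoints)
for grade, p in zip(GRADES, probs):
    print(f"{grade}: {p:.3f}")
```

The key restriction is that `beta` is shared across both thresholds; only the cutpoints differ. That shared-slope restriction is what the proportional odds diagnostic tests.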
Our findings show that temperature, color, and odor are among the strongest predictors of milk grade, and that the proportional odds assumption does not hold, which justifies our use of multinomial logistic regression instead. This work demonstrates the feasibility of building an interpretable, low-cost milk quality classifier using simple sensory-based features. Such a tool can help improve food safety, especially in underserved regions, by providing actionable insights into milk quality before it reaches consumers.
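To illustrate why a multinomial model is the natural fallback when proportional odds fails, here is a minimal sketch of multinomial logistic (softmax) prediction, in which each non-reference grade gets its own coefficient vector rather than sharing one. It is not the authors' fitted model: the coefficients, intercepts, and sample below are hypothetical, and "low" is arbitrarily taken as the reference category.

```python
import math

GRADES = ["low", "medium", "high"]


def multinomial_probs(x, coefs, intercepts):
    """Softmax class probabilities with one coefficient vector per class."""
    scores = [
        b0 + sum(b * xi for b, xi in zip(bs, x))
        for bs, b0 in zip(coefs, intercepts)
    ]
    # Subtract the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values, not estimates from the study.
# The reference class "low" has all-zero coefficients.
coefs = [
    [0.0, 0.0],    # low (reference)
    [0.9, -0.4],   # medium vs. low
    [1.7, -1.1],   # high vs. low
]
intercepts = [0.0, 0.2, -0.3]
sample = [0.3, 1.1]  # e.g. standardized pH and temperature

probs = multinomial_probs(sample, coefs, intercepts)
predicted = GRADES[probs.index(max(probs))]
print(dict(zip(GRADES, (round(p, 3) for p in probs))), "->", predicted)
```

Because the "medium" and "high" rows need not be proportional to each other, a feature can affect the low-versus-medium contrast differently from the low-versus-high contrast, at the cost of estimating more parameters and ignoring the ordering of the grades.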
Keywords
Machine Learning; Artificial Intelligence; Healthcare; Safety; Wealth Inequality
Recommended Citation
Vidal, Franco and Nighting-Henderson, Matthew, "Predicting Milk Quality with Machine Learning" (2025). High Impact Practices Student Showcase Spring 2025. 26.
https://stars.library.ucf.edu/hip-2025spring/26
