High Impact Practices Student Showcase Spring 2025

Predicting Milk Quality with Machine Learning

Course Code

STA

Course Number

4164

Faculty/Instructor

Simone

Faculty/Instructor Email

nathaniel.simone@ucf.edu

About the Author

The authors of this project are Franco Vidal and Matthew Nighting-Henderson. We are both third-year data science majors who are eager to solve real-world problems.

Abstract, Summary, or Creative Statement

Milk is a critical component of diets around the world, offering essential nutrients such as protein, calcium, and vitamin B12. However, in many low-income and rural regions, milk quality is often compromised due to poor storage conditions, lack of pasteurization, and even intentional adulteration. Low-quality milk poses serious health risks, including bacterial infections, toxin exposure, and long-term organ damage. These challenges highlight the need for reliable, accessible methods to assess milk quality without relying on costly laboratory testing.

In this study, we propose a data-driven approach to predicting milk quality using ordered logistic regression. The dataset, sourced from Kaggle, contains 1,059 samples with both numerical (pH, temperature, color) and categorical (odor, taste, fat, turbidity) predictors. Our target variable is milk grade, categorized as low, medium, or high quality. We perform exploratory data analysis to understand feature distributions and correlations, fit baseline and penalized (lasso) logistic regression models, and run diagnostics to check key assumptions such as proportional odds and linearity in the logit.

Our findings show that temperature, color, and odor are among the strongest predictors of milk grade, and that the proportional odds assumption does not hold, which justifies our use of multinomial logistic regression in place of the ordered model. This work demonstrates the feasibility of building an interpretable, low-cost milk quality classifier using simple sensory-based features. Such a tool can help improve food safety, especially in underserved regions, by providing actionable insights into milk quality before it reaches consumers.
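The difference between the ordered and multinomial models named in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the sample values, coefficients, and cutpoints are made up, and the helper names (`ordered_logit_probs`, `multinomial_logit_probs`, `linear_score`) are hypothetical. It shows how each model turns predictors into probabilities for the three milk grades. Fitting the coefficients is omitted.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(features, weights, intercept=0.0):
    """Weighted sum of the features plus an intercept."""
    return intercept + sum(w * x for w, x in zip(weights, features))


def ordered_logit_probs(score, cutpoints):
    """Proportional-odds model: P(Y <= j) = sigmoid(cutpoint_j - score).

    One shared score is compared against increasing cutpoints, so every
    predictor has the same effect at each grade boundary.
    """
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


def multinomial_logit_probs(scores):
    """Multinomial model: softmax over one linear score per grade."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


GRADES = ["low", "medium", "high"]

# Hypothetical inputs: standardized pH, standardized temperature,
# odor indicator. These are NOT values from the Kaggle dataset.
sample = [0.2, -1.0, 1.0]

# Ordered model: a single coefficient vector and two cutpoints (made up).
shared_weights = [0.5, -1.2, 0.8]
cutpoints = [-0.5, 1.0]
ordered = ordered_logit_probs(linear_score(sample, shared_weights), cutpoints)

# Multinomial model: one coefficient vector per grade (made up), with
# "medium" as the all-zero reference class.
class_weights = [
    [-0.5, 1.0, -0.9],  # low
    [0.0, 0.0, 0.0],    # medium (reference)
    [0.4, -1.1, 0.7],   # high
]
multinomial = multinomial_logit_probs(
    [linear_score(sample, w) for w in class_weights]
)

for grade, p_ord, p_multi in zip(GRADES, ordered, multinomial):
    print(f"{grade:>6}: ordered={p_ord:.3f}  multinomial={p_multi:.3f}")
```

The ordered model forces each predictor to shift all grade boundaries by the same amount, which is the proportional odds assumption. The multinomial model gives each grade its own coefficients, so it remains usable when that assumption fails, at the cost of more parameters.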

Keywords

Machine Learning; Artificial Intelligence; Healthcare; Safety; Wealth Inequality
