High Impact Practices Student Showcase Spring 2025

An Analysis of Large Language Model Performance
Course Code
STA
Course Number
4164
Faculty/Instructor
Instructor Nathaniel Simone
Faculty/Instructor Email
nathaniel.simone@ucf.edu
Abstract, Summary, or Creative Statement
The development of Large Language Models (LLMs) represents a paradigm shift in the way we engage in learning. This study investigates the use of Multiple Linear Regression to predict LLM performance. Using 370 observations gathered from various LLM leaderboards, I built a model with MMLU (an LLM benchmark) as the response variable and every other variable as a predictor. During model selection, transformations were applied to a refined model, and tests were run to ensure the model was neither collinear nor overfit. The model is statistically significant, with a p-value below 2.2e-16, and its F-statistic of 145.8 suggests strong explanatory power. An adjusted R-squared of 0.8644 indicates the model explains most of the variation in the outcome.
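To make the modeling workflow concrete, the sketch below shows how such a regression could be fit in Python with pandas and statsmodels. This is a minimal illustration, not the author's actual code: the file name llm_leaderboard.csv is hypothetical, and it assumes every non-MMLU column is a numeric predictor.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical file; the study's data come from public LLM leaderboards
# (see the Kaggle dataset cited under Additional Resources).
df = pd.read_csv("llm_leaderboard.csv")

y = df["MMLU"]                 # response: MMLU benchmark score
X = df.drop(columns=["MMLU"])  # all remaining variables as predictors
X = sm.add_constant(X)         # add an intercept term

# Fit the multiple linear regression; summary() reports the
# F-statistic, its p-value, and the adjusted R-squared.
model = sm.OLS(y, X).fit()
print(model.summary())

# Collinearity check: variance inflation factors above ~10
# flag predictors that may need to be dropped or transformed.
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)
```

The summary output mirrors the statistics quoted in the abstract (F-statistic, model p-value, adjusted R-squared), and the VIF step corresponds to the collinearity check described in the model selection process.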
Additional Resources
Ashar, S. D. (2025). Large Language Models Comparison Dataset [Data set]. Kaggle. https://doi.org/10.34740/KAGGLE/DSV/10841276
Jurafsky, D., & Martin, J. H. (n.d.). Large Language Models. In Speech and Language Processing (3rd ed. draft). Retrieved April 6, 2025, from https://web.stanford.edu/~jurafsky/slp3/old_aug24/10.pdf
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Keywords
Statistics, Machine Learning, AI, Regression
Recommended Citation
Marshall, Lee Edward, "An Analysis of Large Language Model Performance" (2025). High Impact Practices Student Showcase Spring 2025. 7.
https://stars.library.ucf.edu/hip-2025spring/7
