High Impact Practices Student Showcase Spring 2025

An Analysis of Large Language Model Performance

Course Code

STA

Course Number

4164

Faculty/Instructor

Nathaniel Simone

Faculty/Instructor Email

nathaniel.simone@ucf.edu

About the Author

I am a student taking Statistical Methods III with an interest in AI.

Abstract, Summary, or Creative Statement

The development of Large Language Models (LLMs) represents a paradigm shift in the way we engage in learning. This study investigates the use of multiple linear regression to predict LLM performance. Using 370 observations gathered from various LLM leaderboards, I built a model with MMLU (an LLM benchmark) as the response variable and every other variable as a predictor. During model selection, transformations were applied to a refined model, and tests were run to ensure the model was neither affected by collinearity nor overfit. The model has a p-value of 2.2e-16, indicating that it is significant, and an F-statistic of 145.8, suggesting strong explanatory value. An adjusted R-squared of 0.8644 indicates the model is a very strong fit for explaining the outcome.
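As a rough illustration of the multiple linear regression workflow described above, the sketch below fits an ordinary least squares model with a response variable and two predictors. The synthetic data, variable names, and pure-Python normal-equations solver are illustrative assumptions only, not the study's actual leaderboard dataset, predictors, or code:

```python
# Illustrative sketch: fit a multiple linear regression by solving the
# normal equations (X^T X) b = X^T y. In the study, the response would be
# the MMLU benchmark score and the predictors the other leaderboard
# variables; here both are synthetic stand-ins.

def fit_ols(X, y):
    """Return [intercept, b1, b2, ...] for ordinary least squares."""
    # Prepend an intercept column of ones to the design matrix.
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Build X^T X and X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, k):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    # Back substitution.
    beta = [0.0] * k
    for r in reversed(range(k)):
        beta[r] = (xty[r] - sum(xtx[r][c] * beta[c]
                                for c in range(r + 1, k))) / xtx[r][r]
    return beta

# Synthetic data: the response is generated as y = 2 + 3*x1 - x2 with no
# noise, so OLS should recover exactly those coefficients.
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
y = [2 + 3 * x1 - x2 for x1, x2 in X]
beta = fit_ols(X, y)
```

In practice a statistics package (e.g., R's `lm` or Python's `statsmodels`) would also report the p-value, F-statistic, and adjusted R-squared cited in the abstract, and would support the collinearity and overfitting diagnostics used during model selection.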

Additional Resources

Samay Deepak Ashar. (2025). Large Language Models Comparison Dataset [Data set]. Kaggle. https://doi.org/10.34740/KAGGLE/DSV/10841276

Jurafsky, D., & Martin, J. H. (n.d.). Speech and Language Processing: Large Language Models. Retrieved April 6, 2025, from https://web.stanford.edu/~jurafsky/slp3/old_aug24/10.pdf

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Keywords

Statistics, Machine-Learning, AI, Regression
