High Impact Practices Student Showcase Spring 2026

Course Code

STA

Course Number

4164

Faculty/Instructor

Instructor Nathaniel Simone

Faculty/Instructor Email

nathaniel.simone@ucf.edu

About the Author

Hello, I'm Aiden Akbarov. I'm a student here at UCF majoring in Data Science. I've always felt that choosing a major is one of the biggest gambles we make as students, so I wanted to use my background in data to see if I could actually predict the outcome. My project, 'Predicting the Paycheck,' is my way of taking all those numbers from the Census and turning them into something we can actually use to plan our futures.

I really want to thank my professor, Nathaniel Simone, for helping me wrap my head around the statistics side of this and for showing me how to build a model that actually holds up under testing. Also, thank you to the Amy Zeh Impact Showcase for letting me share how data science can help people make more informed life decisions.

Abstract, Summary, or Creative Statement

The goal of my project, Predicting the Paycheck: What Factors Truly Influence Graduate Earnings?, was to see if we could actually predict a college graduate's starting salary using data instead of just guessing. I used a dataset of 172 majors from the American Community Survey to look at how a student's field of study, the gender balance of their major, and the current job market all impact their first paycheck. I wanted to create a tool that helps students understand the financial reality of their degree before they even graduate.

To get my results, I used a multiple linear regression model with a 70/30 train/test split. This means I used 70% of the data to fit my model and held back the other 30% to test it afterward. I had to be careful about which variables I included: I left out things like raw population counts because they largely repeat information already carried by other predictors, a problem statisticians call multicollinearity. I also had to fix a tricky error where the Interdisciplinary major category broke my code, which taught me that real-world data is rarely perfect.
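The 70/30 workflow described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's own code (the keywords indicate the project itself used R). The data are synthetic, the two numeric predictors (`share_women`, `unemployment`) and every coefficient used to generate them are assumptions made up for the example, and the categorical field-of-study predictor is left out to keep the sketch short.

```python
import random


def train_test_split(rows, train_frac=0.7, seed=42):
    """Shuffle a copy of `rows` and cut it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def fit_ols(X, y):
    """Least-squares fit of y on X (plus an intercept) via the normal equations."""
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(A[0])
    # Augmented matrix [X'X | X'y]
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(p)]
        + [sum(a[i] * yi for a, yi in zip(A, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coefs, X):
    """Apply fitted coefficients (intercept first) to each row of X."""
    return [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in X]


# Synthetic stand-in for the 172 majors; NOT the American Community Survey data.
rng = random.Random(0)
rows = []
for _ in range(172):
    share_women = rng.random()            # assumed to be a 0-1 proportion
    unemployment = rng.uniform(0.0, 0.1)  # assumed job-market predictor
    salary = 45000 - 16000 * share_women - 50000 * unemployment + rng.gauss(0, 3000)
    rows.append((share_women, unemployment, salary))

train, test = train_test_split(rows)  # 70% to fit, 30% held out
coefs = fit_ols([r[:2] for r in train], [r[2] for r in train])
test_pred = predict(coefs, [r[:2] for r in test])

print(len(train), len(test))
print([round(c) for c in coefs])
```

The model only ever sees the training rows; the held-out rows are used afterward to check how well its predictions generalize.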

I learned two major things from this project. First, the Share of Women in a major is a strong predictor: for every 10-percentage-point increase in female representation, the predicted salary was about $1,600 lower. Second, my model was surprisingly accurate, with a Mean Absolute Error of only $4,006, meaning its predictions missed the actual salary by about $4,000 on average. This suggests that while the job market is complex, we can use data science to find clear patterns that help people make better decisions about their futures.
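To make the two numbers above concrete, here is a small Python sketch of how a Mean Absolute Error is computed and how a slope translates into a predicted salary change. The salaries in the example are hypothetical, not the project's data. The slope of -16,000 dollars per unit share is only what the reported $1,600 per 10 points would imply if the Share of Women variable is stored as a 0-1 proportion, which is an assumption.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical salaries, for illustration only.
actual = [40000, 52000, 36000, 61000]
predicted = [43000, 50000, 39500, 56000]
mae = mean_absolute_error(actual, predicted)
print(f"MAE: ${mae:,.0f}")

# Assumed slope: -$1,600 per 10 percentage points, i.e. -16,000 per unit
# share, IF the share of women is recorded as a 0-1 proportion.
ASSUMED_SLOPE = -16000.0


def predicted_change(delta_share, slope=ASSUMED_SLOPE):
    """Change in predicted salary for a change in share, other predictors fixed."""
    return slope * delta_share


print(f"Change for +10 points: ${predicted_change(0.10):,.0f}")
print(f"Change for +25 points: ${predicted_change(0.25):,.0f}")
```

The slope describes how the model's prediction moves with the share while the other predictors are held fixed; it is an association in the data and does not by itself show what causes the difference.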

Keywords

Data Science; Multiple Linear Regression; Graduate Salaries; American Community Survey; Gender Pay Gap; Predictive Modeling; R Programming; Labor Market Analysis; Academic Major Selection

What Truly Drives Graduate Earnings?


Accessibility Statement

This item was created or digitized prior to April 24, 2026, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.