My group mates, John Mahlon Scott and Tyler Laurie, significantly strengthened this research through their teamwork and collaborative spirit. Their dedication to scraping the large dataset and their perseverance in running numerous models are commendable, and their curiosity and commitment to refining and tuning the final model drove this project's success. Each team member's effort, expertise, and contributions are deeply appreciated.


XGBoost, Steam Gaming, Regression Analysis


This project investigates game pricing strategies in the Steam market using an XGBoost model. Motivated by Professor Xie's lecture, it presents its findings through a density plot that delineates two primary pricing strategies. The free-to-play approach, indicated by a significant hot spot, is adopted by developers who focus on post-purchase revenue from DLC, aesthetic purchases, and in-game transactions. This selling strategy also includes community-centric developers who aim to distribute their games for player engagement rather than profit.

The project illustrates the effectiveness of advanced modeling techniques in handling complex datasets: the MSE fell from 0.3472 to 0.1397, a substantial gain in predictive accuracy. Feature selection and impact analysis identified critical pricing factors, including independent development, game style, and development stage. The study highlights how Steam's tagging system influences game visibility and recommendations, which in turn affects market positioning and price. The analysis also reveals a dual trend in pricing: while many games target a $20 price point, some are priced near $0 to attract a broad player base and earn revenue from subsequent in-game purchases.
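To put the reported MSE figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not the project's code: the helper names `mse` and `relative_reduction` are hypothetical, and the abstract does not state which model or tuning stage produced the starting value of 0.3472.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared prediction errors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(before, after):
    """Fraction by which an error metric dropped, relative to its starting value."""
    return (before - after) / before


# The two MSE values reported in the abstract (hypothetical helper names above).
reduction = relative_reduction(0.3472, 0.1397)
print(f"Relative MSE reduction: {reduction:.1%}")
```

Under these assumptions the drop from 0.3472 to 0.1397 corresponds to a reduction of roughly 60% in mean squared error.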


Spring 2024

Course Name

STA 6366 Data Science 1

Instructor Name

Xie, Rui


This work is licensed under a Creative Commons Attribution 3.0 License.


College of Sciences

Accessibility Status

PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker

Included in

Data Science Commons