High Impact Practices Student Showcase Spring 2026

Green Light: Using Logistic Regression to Predict When a Stolen Base Attempt Will Succeed

Course Code

STA

Course Number

4504

Faculty/Instructor

Instructor Nathaniel Simone

Faculty/Instructor Email

nathaniel.simone@ucf.edu

Abstract, Summary, or Creative Statement

This project uses logistic regression to predict the probability that an attempted steal of second base succeeded in the 2023 Major League Baseball (MLB) season. Using play-by-play event data from Retrosheet and player and pitch tracking data from Baseball Savant's Statcast system, this research examines 2,800 attempts to steal second base with a baserunner on first base only. The goal is to determine which factors matter most to a successful steal of second base, using characteristics such as sprint speed, pop time, lead distance, and pitch metrics. The objective of this research is to help MLB coaches and front offices evaluate baserunning ability in ways that go beyond traditional scouting methods. Additionally, given the game situation (for example, run differential and inning), we can identify when a stolen base attempt is most effective and necessary and when it is a poor choice. Model selection was performed using backward selection with the Akaike Information Criterion (AIC) as the selection criterion. We will evaluate the final model using classification metrics such as sensitivity and specificity, and by building a Receiver Operating Characteristic (ROC) curve and analyzing the Area Under the Curve (AUC) to assess predictive accuracy.
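The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the project's code: the outcome labels and predicted probabilities below are made-up toy values, the 0.5 classification threshold is an assumption, and the function names are hypothetical. AUC is computed here as the fraction of (successful, unsuccessful) attempt pairs in which the successful attempt received the higher predicted probability, which equals the area under the ROC curve.

```python
from typing import List, Tuple


def sensitivity_specificity(
    y_true: List[int], y_prob: List[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given probability threshold.

    y_true: 1 = successful steal, 0 = caught stealing.
    y_prob: predicted probability of success from a fitted model.
    """
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: List[int], y_prob: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only, not from the study).
y_true = [1, 1, 1, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

sens, spec = sensitivity_specificity(y_true, y_prob)
auc_value = auc(y_true, y_prob)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc_value:.3f}")
```

Sensitivity is the share of successful steals the model flags as successes, and specificity is the share of failed attempts it flags as failures. Both depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds.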

Keywords

Stolen Bases; MLB; Logistic Regression; Baseball

Accessibility Statement

This item was created or digitized prior to April 24, 2026, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.