ORCID

https://orcid.org/0009-0005-3108-5304

Keywords

Vulnerable Road Users, Pedestrian Safety, Computer Vision, Deep Learning, Intelligent Transportation Systems, Behavior Prediction

Abstract

Vulnerable Road Users (VRUs), such as pedestrians and cyclists, are among the most at-risk participants in traffic, making their safety a key priority for intelligent transportation systems (ITS). Accurate perception and understanding of VRU behavior are essential for proactive accident prevention and the design of human-centric mobility systems, and computer vision has become a core component of ITS for this purpose. This thesis introduces a human behavior-aware, transformer-based framework for VRU intention prediction and presents VRU-Accident, a large-scale benchmark for systematically assessing how well multimodal large language models (MLLMs) understand accident scenarios involving VRUs.

First, the proposed framework leverages multimodal cues, including 3D pose estimation and spatio-temporal trajectories, to predict VRU crossing intentions at intersections. By combining a geometric-invariant representation with temporal attention and pose- and context-aware embeddings, the model captures subtle indicators such as body orientation, motion patterns, and environmental cues that influence crossing decisions. The framework performs robustly across diverse intersection scenarios and maintains consistent predictive capability under varying conditions.

Second, to enable realistic and comprehensive evaluation, this thesis introduces VRU-Accident, the first vision-language benchmark for real-world accident scenarios involving VRUs. VRU-Accident contains 1,000 dashcam accident videos with over 6,000 safety-critical question-answer pairs and 1,000 dense scene descriptions. The dataset supports systematic evaluation of MLLMs on accident-related video question answering and dense scene captioning, providing a critical resource for assessing safety-critical AI systems.

Together, these contributions advance VRU safety research by offering both a behavior-aware prediction framework and a comprehensive benchmark for accident understanding. The proposed methods can enhance proactive safety interventions in ITS, support the development of autonomous driving systems, and inform policymaking for safer and more inclusive transportation networks.

Completion Date

2025

Semester

Fall

Committee Chair

Abdel-Aty, Mohamed

Degree

Master of Science in Civil Engineering (M.S.C.E.)

College

College of Engineering and Computer Science

Department

Civil, Environmental and Construction Engineering

Format

PDF

Identifier

DP0029799

Document Type

Thesis

Campus Location

Orlando (Main) Campus
