ORCID

0000-0001-5683-1821

Keywords

Computer Vision, Machine Learning, Generative AI, Large Language Models, Pedestrian Crossing Prediction, Vulnerable Road Users Safety

Abstract

Ensuring the safety of Vulnerable Road Users (VRUs), including pedestrians, bicyclists, and e-scooter riders, at intersections is crucial for sustainable and efficient urban mobility. Despite recent advancements in transportation technologies, accurately understanding and predicting human behavior at intersections remains challenging, resulting in unsafe interactions, crossing violations, and diminished trust in traffic systems. This dissertation presents an integrated approach comprising three novel frameworks: VRUCrossSafe, VRU-CIPI, and the Video-to-Text Pedestrian Monitoring (VTPM) system, collectively addressing critical gaps in intersection safety, VRU behavior prediction, and privacy-preserving traffic monitoring. First, VRUCrossSafe introduces a real-time intersection safety solution that uses computer vision and ensemble machine learning techniques to predict VRU crossing intentions. Evaluated on a dataset containing 589 VRUs under various visibility conditions, VRUCrossSafe achieved a prediction accuracy of 94.67% while processing at 33 frames per second, enabling automated pedestrian signal activation. Its real-world implementation with real-time processing is also addressed. Building upon these findings, the VRU Crossing Intention Prediction at Intersection (VRU-CIPI) model is developed to capture the complex temporal dynamics of VRU behavior. VRU-CIPI integrates Gated Recurrent Units (GRUs) with Transformer-based multi-head self-attention, achieving state-of-the-art crossing intention prediction accuracy of 96.45%. Moreover, it integrates cross-camera person re-identification methods to identify the same crossing VRUs across cameras and confirm their safe crossing, with a precision of 88.48%. To address pedestrian privacy and storage efficiency, the VTPM framework employs a lightweight Large Language Model (LLM) to transform pedestrian activity from video into concise, real-time textual narratives. VTPM efficiently detects crossing violations and vehicle-pedestrian conflicts with minimal latency (0.05 s per frame for monitoring, 0.33 s per report generated), significantly reducing data storage requirements while enhancing privacy and enabling rapid, reliable safety analyses. Collectively, the proposed frameworks advance intersection safety through accurate VRU behavior prediction, enhanced privacy protection, and efficient real-time safety analysis, supporting safer and more sustainable urban transportation systems.

Completion Date

2025

Semester

Summer

Committee Chair

Mohamed Abdel-Aty

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Civil Engineering

Format

PDF

Identifier

DP0029502

Document Type

Thesis

Campus Location

Orlando (Main) Campus
