Keywords
Fire Debris, Likelihood Ratio, Machine Learning, Simulation Study, Repeatability, Reproducibility
Abstract
The use of statistical evidence to inform the interpretation of forensic evidence has grown in importance over time. This is especially evident in forensic fire debris analysis, where the reliability and reproducibility of statistical results are crucial. This thesis investigates how the output of five common machine learning methods, applied to an in-silico fire debris dataset, can be reduced to a univariate forensic score used to create score-based likelihood ratios (SLRs), and how these ratios should be interpreted. Because SLRs are used to support or reject the prosecution's or the defense's hypothesis, their reliability and reproducibility are essential. Different likelihood ratio (LR) estimation methods are tested on the scores generated by the machine learning models: parametric estimation (PE), kernel density estimation (KDE), and logistic regression estimation (LRE). The properties of each are analyzed on an in-silico fire debris dataset meant to mimic real-world data. The second part of this thesis investigates the bias and variance inherent in these estimation methods. Simulations demonstrate that the LRE method is very sensitive to class imbalance. Additionally, we show how the variance of the PE method, derived via its relation to the receiver operating characteristic (ROC) curve, compares with the empirical variance observed in the data. The effect of violating the underlying assumptions on the stability of the variance is then explored. This work confirms previous theoretical results in an empirical study on in-silico fire debris data, demonstrating that known methodology extends beyond theory into a practical setting.
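As an illustration only, the three estimation approaches named in the abstract can be sketched on synthetic scores. This is not the thesis's implementation: the normal score distributions, the sample sizes, the Silverman rule-of-thumb bandwidth, the prior-odds correction for the logistic model, and every function name below are assumptions chosen to keep the example small and self-contained.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def parametric_lr(s, scores_h1, scores_h2):
    """PE sketch: ratio of normal densities fitted to each class's scores."""
    f1 = NormalDist(mean(scores_h1), stdev(scores_h1))
    f2 = NormalDist(mean(scores_h2), stdev(scores_h2))
    return f1.pdf(s) / f2.pdf(s)


def kde_pdf(s, sample):
    """Gaussian kernel density estimate (assumed Silverman-style bandwidth)."""
    n = len(sample)
    h = 1.06 * stdev(sample) * n ** (-1 / 5)
    kernel = NormalDist()
    return sum(kernel.pdf((s - x) / h) for x in sample) / (n * h)


def kde_lr(s, scores_h1, scores_h2):
    """KDE sketch: ratio of the two kernel density estimates at score s."""
    return kde_pdf(s, scores_h1) / kde_pdf(s, scores_h2)


def fit_logistic(scores_h1, scores_h2, steps=25):
    """Fit P(H1 | s) = sigmoid(a + b*s) with Newton's method."""
    data = [(x, 1.0) for x in scores_h1] + [(x, 0.0) for x in scores_h2]
    a = b = 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def logistic_lr(s, a, b, n1, n2):
    """LRE sketch: posterior odds divided by the training prior odds n1/n2."""
    return math.exp(a + b * s) / (n1 / n2)


# Synthetic scores (assumption): H1 ~ N(1, 1), H2 ~ N(-1, 1), balanced classes.
random.seed(0)
scores_h1 = [random.gauss(1.0, 1.0) for _ in range(500)]
scores_h2 = [random.gauss(-1.0, 1.0) for _ in range(500)]
a, b = fit_logistic(scores_h1, scores_h2)


def all_lrs(s):
    """Return the three LR estimates for one score."""
    return {
        "PE": parametric_lr(s, scores_h1, scores_h2),
        "KDE": kde_lr(s, scores_h1, scores_h2),
        "LRE": logistic_lr(s, a, b, len(scores_h1), len(scores_h2)),
    }


estimates = all_lrs(0.5)
for name, lr in estimates.items():
    print(f"{name}: LR = {lr:.2f}")
```

Under these assumed score distributions the true likelihood ratio at a score s is exp(2s), which gives a reference point for comparing the three estimates. The division by the training prior odds in `logistic_lr` has no effect for balanced classes; with unequal class sizes the fitted intercept absorbs the class proportions, which is one place where sensitivity to class imbalance can enter.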
Completion Date
2026
Semester
Spring
Committee Chair
Tang, Liansheng
Degree
Master of Science (M.S.)
College
College of Sciences
Department
Statistics and Data Science
Document Type
Thesis
Identifier
DP0053186
STARS Citation
Ezazi, Cameron, "Repeatability And Reproducibility Of Likelihood Ratio Estimation For Forensics Using Machine Learning: An Application To In-Silico Fire Debris Data" (2026). Graduate Studies Theses and Dissertations 2026. 63.
https://stars.library.ucf.edu/gradstudies_etd_2026/63
Accessibility Statement
This item was created or digitized prior to April 24, 2027, or is a reproduction of legacy media created before that date. It is preserved in its original, unmodified state specifically for research, reference, or historical recordkeeping. In accordance with the ADA Title II Final Rule, the University Libraries provides accessible versions of archival materials upon request. To request an accommodation for this item, please submit an accessibility request form.