Keywords
Clustering method, Classification method, Mixture of pollen grains, Flow cytometry, Fire debris, Total ion spectrum
Abstract
Finite mixture models have been widely used to cluster data consisting of homogeneous subpopulations. In forensic palynology, pollen is used as a proxy to link individuals or items to a crime scene. Pollen mixture data, including willow, mustard, and blank samples, were analyzed using flow cytometry. The willow and mustard clusters tend to follow multivariate normal distributions, while the background cluster follows a multivariate non-normal distribution. We propose a finite mixture model capable of handling pollen mixtures in both the univariate and multivariate settings. The proposed methods are applied to simulated datasets and to pollen mixture datasets.
Finite mixture models typically use an expectation–maximization (EM) algorithm to estimate parameters by maximum likelihood estimation (MLE). Because the M-step is itself a maximum likelihood problem, we also apply alternative optimization methods, such as gradient descent and Newton’s method, to estimate the parameters. To compare the performance of these optimization methods, we evaluate processing time, mislabeling rate (in percent), bias, and mean squared error (MSE).
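As a rough illustration of the EM procedure and the mislabeling rate mentioned above, the sketch below fits a two-component univariate Gaussian mixture to simulated data. It is not the model proposed in the thesis: the function names (`em_two_gaussians`, `mislabeling_rate`), the initialisation, the simulated means and sample sizes, and the use of closed-form M-step updates (rather than gradient descent or Newton's method) are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=200):
    """Fit a two-component univariate Gaussian mixture with EM.

    Returns (weights, means, variances, resp), where resp[i] is the
    posterior probability that data[i] belongs to component 0.
    """
    n = len(data)
    # Crude initialisation (an assumption of this sketch): start the two
    # means at the extremes and share the overall sample variance.
    overall_mean = sum(data) / n
    var0 = sum((x - overall_mean) ** 2 for x in data) / n
    means = [min(data), max(data)]
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for component 0.
        resp = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], variances[0])
            p1 = weights[1] * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1))
        # M-step: closed-form weighted maximum likelihood updates.
        n0 = sum(resp)
        n1 = n - n0
        means = [
            sum(r * x for r, x in zip(resp, data)) / n0,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n1,
        ]
        variances = [
            max(sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0, 1e-6),
            max(sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1, 1e-6),
        ]
        weights = [n0 / n, n1 / n]
    return weights, means, variances, resp


def mislabeling_rate(true_labels, resp):
    """Fraction of points assigned to the wrong component.

    Component labels are arbitrary, so the smaller of the two possible
    label matchings is reported.
    """
    predicted = [0 if r >= 0.5 else 1 for r in resp]
    wrong = sum(t != p for t, p in zip(true_labels, predicted))
    rate = wrong / len(true_labels)
    return min(rate, 1.0 - rate)


# Simulated example: two well-separated clusters (hypothetical values).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(300)] + \
         [random.gauss(6.0, 1.0) for _ in range(300)]
labels = [0] * 300 + [1] * 300

weights, means, variances, resp = em_two_gaussians(sample)
print("weights:", weights)
print("means:", means)
print("mislabeling rate:", mislabeling_rate(labels, resp))
```

Replacing the closed-form M-step with a numerical optimizer would leave the E-step unchanged, which is what makes comparisons of processing time and estimation error across optimizers possible within the same framework.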
While the first topic focuses on clustering methods, the second explores classification techniques applied to fire debris datasets. The datasets contain total ion chromatograms (TIC) and total ion spectra (TIS), which represent the chemical profiles of materials burned in a fire. In fire investigations, identifying ignitable liquid residues (substances like gasoline or alcohol that can easily catch fire) is crucial for detecting possible arson. Substrate components, on the other hand, are the original materials present at the scene, such as carpet, wood, or fabric, that burned during the fire.
We classified ignitable liquid residues and substrate components using machine learning methods on the TIS and TIC datasets. The predictive accuracy and area under the ROC curve (AUC) of the models were evaluated and compared on both an in-silico test dataset and an experimental fire debris dataset.
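For readers unfamiliar with the two evaluation metrics named above, the following minimal sketch computes accuracy and AUC from scratch. The label encoding (1 for a sample containing ignitable liquid residue, 0 for substrate only), the 0.5 decision threshold, and the toy scores are assumptions for illustration and are not taken from the thesis.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores from some classifier).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.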
Completion Date
2025
Semester
Fall
Committee Chair
Tang, Larry
Degree
Doctor of Philosophy (Ph.D.)
College
College of Sciences
Department
Statistics and Data Science
Format
Identifier
DP0029745
Document Type
Thesis
Campus Location
Orlando (Main) Campus
STARS Citation
Booppasiri, Slun, "Clustering and Classification Methods for Forensic Science Applications" (2025). Graduate Thesis and Dissertation post-2024. 428.
https://stars.library.ucf.edu/etd2024/428