Keywords

Fiber Typing, Clustering, Two-Stage Framework, Domain-Constrained

Abstract

This dissertation introduces a Two-Stage Clustering Framework for the automated identification of human skeletal muscle fiber types from fluorescence microscopy image intensity data. By combining two complementary clustering methods, the framework avoids the subjectivity and labor of traditional manual fiber typing and offers an objective, efficient alternative for large-scale muscle physiology research. The initial framework combined Density Peaks Clustering (DPC) with a Gaussian Mixture Model employing a t-distribution (GMM-t). This version was validated by comparing its clustering results with manual counts at the individual subject level, and we applied it in two published studies \cite{brennan2020,hinkley2023}, demonstrating its practical value in muscle physiology research. However, data generation methods varied across laboratory configurations, which led to data quality problems and required fine-tuning of the original model for such data. These issues motivated the development of a more robust model. To that end, a simulation study evaluated clustering algorithms for detecting linear patterns in fluorescence intensity data. The constrained Linear Grouping Algorithm (cLGA), which we developed by adding domain-specific angular constraints to the original Linear Grouping Algorithm (LGA), outperformed the other methods, particularly in noisy scenarios with overlapping clusters. Based on these findings, we developed the refined Two-Stage Clustering Framework. In the first stage, DPC isolates Type I fibers and noise, as in the original version. In the second stage, cLGA identifies Type IIa, IIx, and hybrid IIa+IIx fibers. The refined framework was validated against manual counts, achieving higher accuracy and consistency. Extensive testing confirmed its robustness across different preprocessing methods, including transformation, outlier removal, and subsampling. This work provides a robust statistical framework for identifying muscle fiber types, supporting muscle physiology research by reducing reliance on manual methods and enabling more objective studies.
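
To make the idea of grouping points along lines under angular constraints concrete, the following is a minimal sketch only, not the dissertation's cLGA. It assumes two-dimensional intensity pairs, lines that pass through the origin, and one allowed angle interval per line; the function name `constrained_line_grouping`, the `angle_bounds` parameter, and the clamping rule are all hypothetical stand-ins for the domain-specific constraints mentioned above. The DPC first stage is not shown.

```python
import numpy as np


def constrained_line_grouping(points, angle_bounds, n_iter=50):
    """Toy constrained line grouping in 2D (illustrative assumption only).

    points: (n, 2) array of intensity pairs.
    angle_bounds: one (low, high) pair in radians per line, each inside
        [0, pi). Every line passes through the origin and its angle is
        clamped to its own interval (a hypothetical constraint).
    Returns (labels, angles).
    """
    points = np.asarray(points, dtype=float)
    bounds = np.asarray(angle_bounds, dtype=float)
    # Start every line at the midpoint of its allowed interval.
    angles = bounds.mean(axis=1)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Unit normal of a line at angle a is (-sin a, cos a).
        normals = np.stack([-np.sin(angles), np.cos(angles)], axis=1)
        # Orthogonal distance from every point to every line.
        dist = np.abs(points @ normals.T)
        new_labels = dist.argmin(axis=1)
        new_angles = angles.copy()
        for k in range(len(angles)):
            members = points[new_labels == k]
            if len(members) < 2:
                continue  # too few points: keep the previous angle
            # The top eigenvector of the uncentered scatter matrix gives
            # the orthogonal-regression line through the origin.
            _, vecs = np.linalg.eigh(members.T @ members)
            direction = vecs[:, -1]
            theta = np.arctan2(direction[1], direction[0]) % np.pi
            # Enforce the angular constraint by clamping.
            new_angles[k] = np.clip(theta, bounds[k, 0], bounds[k, 1])
        if np.array_equal(new_labels, labels) and np.allclose(new_angles, angles):
            break  # assignments and angles have stopped changing
        labels, angles = new_labels, new_angles
    return labels, angles
```

Calling `constrained_line_grouping(points, [(0.05, 0.5), (0.5, 1.0), (1.0, 1.5)])` returns one label per point and one fitted angle per line. Clamping is only one simple way to impose an angular constraint, and intervals that touch 0 or pi would need wrap-around handling that this sketch omits.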

Completion Date

2025

Semester

Summer

Committee Chair

Liqiang Ni

Degree

Doctor of Philosophy (Ph.D.)

College

College of Sciences

Department

Department of Statistics and Data Science

Format

PDF

Identifier

DP0029630

Language

English

Document Type

Thesis

Campus Location

Orlando (Main) Campus
