Data Science and Data Mining

Optimizing AI with Advanced Data Structuring: A Comparative Analysis of K-means and GMM Clustering Techniques

Amir Alipour Yengejeh, University of Central Florida

Keywords

Cluster Analysis, K-means Clustering, Gaussian Mixture Models, AI-Driven Data Analysis, Adjusted Rand Index, Normalized Mutual Information, Dimensionality Reduction in Clustering, Pattern Recognition

Abstract

This study presents a detailed comparison of the K-means and Gaussian Mixture Model (GMM) clustering algorithms, illustrating their capabilities and limitations across a range of synthetic datasets. Using the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) as evaluation metrics, the research examines how each algorithm handles datasets of varying structure and complexity. Both K-means and GMM perform robustly on well-separated clusters, but GMM shows a distinct advantage when clusters overlap or the data distribution is unbalanced. K-means, in turn, excels at identifying clear, distinct groupings, which highlights its utility in simpler clustering contexts. The study contributes to a deeper understanding of the operational characteristics of these popular clustering algorithms and may guide the selection of appropriate methods for complex data analysis tasks in practice.
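The two metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not taken from the paper: the function names are hypothetical, and the NMI normalization by the arithmetic mean of the two entropies is an assumption, since the abstract does not state which variant was used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting agreement between two partitions, corrected for chance."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table entries
    rows = Counter(labels_true)                     # reference cluster sizes
    cols = Counter(labels_pred)                     # predicted cluster sizes
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)     # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:                       # degenerate partitions
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """Mutual information divided by the mean entropy (assumed normalization)."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    mi = sum((c / n) * log(c * n / (rows[u] * cols[v]))
             for (u, v), c in cells.items())
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    mean_entropy = 0.5 * (h_true + h_pred)
    return mi / mean_entropy if mean_entropy > 0 else 1.0


# Toy labels (illustrative only, not data from the study).
truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]   # identical grouping, different label names
unrelated = [0, 1, 0, 1]             # splits every true cluster in half

ari_same = adjusted_rand_index(truth, same_up_to_renaming)
ari_unrelated = adjusted_rand_index(truth, unrelated)
nmi_same = normalized_mutual_info(truth, same_up_to_renaming)
nmi_unrelated = normalized_mutual_info(truth, unrelated)
```

Both metrics compare a predicted partition against reference labels and ignore how the clusters are named, which is why they suit synthetic datasets where the generating labels are known. In practice, scikit-learn's `adjusted_rand_score` and `normalized_mutual_info_score` provide these metrics directly.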

Semester

Spring 2024

Course Name

STA 6367 Data Science 2

Instructor Name

Rui Xie

STARS Citation

Alipour Yengejeh, Amir, "Optimizing AI with Advanced Data Structuring: A Comparative Analysis of K-means and GMM Clustering Techniques" (2024). Data Science and Data Mining. 21.
https://stars.library.ucf.edu/data-science-mining/21

Accessibility Status

PDF accessibility verified using Adobe Acrobat Pro Accessibility Checker

Included in

Data Science Commons
