Experiments with K-Means, Fuzzy c-Means and Approaches to Choose K and C
The number of clusters, which an unsupervised clustering algorithm takes as a parameter, is often unknown. Different cluster validity indices proposed in the past have attempted to address this issue, and their performance is directly related to the accuracy of the clustering algorithm. The gap statistic proposed by Tibshirani (2001) was applied to k-means and hierarchical clustering algorithms for estimating the number of clusters and was shown to outperform other cluster validity measures, especially in the null model case. In our experiments, the gap statistic is applied to the Fuzzy c-Means (FCM) algorithm and compared to existing FCM cluster validity indices examined by Pal (1995). A comparison is also made between two initialization methods, in which centers are either randomly assigned to data points or initialized using the furthest first algorithm (Hochbaum, 1985). The gap statistic can be applied using the FCM algorithm as long as the fuzzy partition matrix can be employed in computing the gap statistic metric, W_k. Three new methodologies are examined for computing this metric in order to apply the gap statistic to the FCM algorithm. The fuzzy partition matrix generated by FCM can also be thresholded based upon the maximum membership to allow computation similar to that of the k-means algorithm. This is assumed to be the current method for employing the gap statistic with the FCM algorithm and is compared to the three proposed methods. In our results, the gap statistic outperformed the cluster validity indices for FCM, and one of the new methodologies introduced for computing the metric, based upon the FCM objective function, outperformed the threshold method for m=2.
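The two ways of computing W_k that the abstract names can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not define the three proposed methods, so the objective-based variant below simply takes W_k to be the standard FCM objective, and the function names, the uniform bounding-box reference distribution, and the `cluster_fn` interface (a callable returning a membership matrix U and centers V) are all hypothetical.

```python
import numpy as np

def wk_thresholded(X, U, V=None):
    """W_k after hardening U by maximum membership (k-means style).

    V is unused; it is accepted only so both metrics share a signature.
    """
    labels = np.argmax(U, axis=1)
    total = 0.0
    for j in np.unique(labels):
        pts = X[labels == j]
        # Within-cluster sum of squared distances to the cluster mean.
        total += np.sum((pts - pts.mean(axis=0)) ** 2)
    return float(total)

def wk_objective(X, U, V, m=2.0):
    """Assumed variant: W_k = sum_i sum_j u_ij^m * ||x_i - v_j||^2."""
    d2 = np.sum((X[:, None, :] - V[None, :, :]) ** 2, axis=2)  # shape (n, c)
    return float(np.sum((U ** m) * d2))

def gap(X, cluster_fn, wk_fn, n_refs=10, seed=0):
    """Gap = mean(log W_k over uniform reference sets) - log W_k on X.

    cluster_fn(data) must return (U, V) for a fixed number of clusters.
    """
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    log_ref = []
    for _ in range(n_refs):
        R = rng.uniform(lo, hi, size=X.shape)  # null-model sample
        U_ref, V_ref = cluster_fn(R)
        log_ref.append(np.log(wk_fn(R, U_ref, V_ref)))
    U, V = cluster_fn(X)
    return float(np.mean(log_ref) - np.log(wk_fn(X, U, V)))
```

When U is crisp (each row one-hot) and V holds the cluster means, the two metrics coincide, which makes a convenient sanity check. Selecting the number of clusters would additionally require evaluating the gap over a range of cluster counts and applying the standard-error rule from Tibshirani (2001), which this sketch omits.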
This item is only available in print in the UCF Libraries.
Bachelor of Science (B.S.)
College of Engineering and Computer Science
Dissertations, Academic -- Engineering; Engineering -- Dissertations, Academic
Honors in the Major Thesis
Hong, Sui, "Experiments with K-Means, Fuzzy c-Means and Approaches to Choose K and C" (2006). HIM 1990-2015. 571.