Title

Probabilistic Analysis Of The Rnn-Clink Clustering Algorithm

Abstract

Clustering is among the oldest techniques used in data mining applications. Typical implementations of the hierarchical agglomerative clustering methods (HACM) require an amount of O(N2)-space when there are N data objects, making such algorithms impractical for problems involving large datasets. The well-known clustering algorithm RNN-CLINK requires only O(N)-space but O(N3)-time in the worst case, although the average time appears to be O(N2 log N). We provide a probabilistic interpretation of the average-time complexity of the algorithm. We also report experimental results using the randomly generated bit vectors and using the NETNEWS articles as the input, to support our theoretical analysis.

Publication Date

1-1-1999

Publication Title

Proceedings of SPIE - The International Society for Optical Engineering

Volume

3695

Number of Pages

31-38

Document Type

Article

Personal Identifier

scopus

Socpus ID

0032633202 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/0032633202

This document is currently not available here.

Share

COinS