Title
Mining Distance-Based Outliers From Categorical Data
Keywords
Categorical data; Data mining; Distance-based outliers; Outlier detection; Similarity measure
Abstract
Distance-based outlier detection is an important data mining technique that finds abnormal data objects according to some distance function. However, when applying this technique to categorical data, a traditional simple matching dissimilarity measure is not an adequate model for high-dimensional categorical data. In this article, we employ a new common-neighbor-based distance function to measure the proximity between a pair of data points. Experiments show that better outlier mining results can be achieved when the new distance function is utilized instead of a conventional simple matching dissimilarity measure. © CGU 2007.
Publication Date
12-1-2007
Publication Title
DESRIST 2007 Conference Proceedings - 2nd International Conference on Design Science Research in Information Systems and Technology
Page Range
75-88
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
Copyright Status
Unknown
Scopus ID
84880166024 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/84880166024
STARS Citation
Li, Shuxin; Lee, Robert; and Lang, Sheau Dong, "Mining Distance-Based Outliers From Categorical Data" (2007). Scopus Export 2000s. 5945.
https://stars.library.ucf.edu/scopus2000/5945