Title
Detecting Outliers In High-Dimensional Datasets With Mixed Attributes
Keywords
High dimensional data; Large datasets; Mixed attribute datasets; Outlier detection
Abstract
Outlier detection has attracted substantial attention in many applications and research areas, such as the detection of network intrusions or credit card fraud. Many existing approaches are based on pairwise distances among all points in the dataset. These approaches do not extend easily to current datasets, which usually contain a mix of categorical and continuous attributes and may be scattered over large geographical areas. In addition, current datasets usually have a large number of dimensions. Such datasets tend to be sparse, and traditional concepts such as Euclidean distance or nearest neighbor become unsuitable. We propose ODMAD, a fast outlier detection strategy intended for datasets containing mixed attributes. ODMAD takes into consideration the sparseness of the dataset, and is experimentally shown to be highly scalable with the number of points and number of attributes in the dataset.
Publication Date
12-1-2008
Publication Title
Proceedings of the 2008 International Conference on Data Mining, DMIN 2008
Pages
427-433
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
Copyright Status
Unknown
Scopus ID
62649086136 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/62649086136
STARS Citation
Koufakou, A.; Georgiopoulos, M.; and Anagnostopoulos, G. C., "Detecting Outliers In High-Dimensional Datasets With Mixed Attributes" (2008). Scopus Export 2000s. 9673.
https://stars.library.ucf.edu/scopus2000/9673