Scopus Export 2010-2014

A Fast Outlier Detection Strategy For Distributed High-Dimensional Data Sets With Mixed Attributes

Keywords

Anomaly detection; Data mining; Distributed data sets; High-dimensional data sets; Mixed attribute data sets; Outlier detection

Abstract

Outlier detection has attracted substantial attention in many applications and research areas; some of the most prominent applications are network intrusion detection and credit card fraud detection. Many of the existing approaches are based on calculating distances among the points in the dataset. These approaches cannot easily adapt to current datasets that usually contain a mix of categorical and continuous attributes, and may be distributed among different geographical locations. In addition, current datasets usually have a large number of dimensions. These datasets tend to be sparse, and traditional concepts such as Euclidean distance or nearest neighbor become unsuitable. We propose a fast distributed outlier detection strategy intended for datasets containing mixed attributes. The proposed method takes into consideration the sparseness of the dataset, and is experimentally shown to be highly scalable with the number of points and the number of attributes in the dataset. Experimental results show that the proposed outlier detection method compares very favorably with other state-of-the-art outlier detection strategies proposed in the literature and that the speedup achieved by its distributed version is very close to linear.

Publication Date

3-1-2010

Publication Title

Data Mining and Knowledge Discovery

Volume

Issue

Pages

259-289

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1007/s10618-009-0148-z

Copyright Status

Unknown

Scopus ID

77649275031 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/77649275031

STARS Citation

Koufakou, Anna and Georgiopoulos, Michael, "A Fast Outlier Detection Strategy For Distributed High-Dimensional Data Sets With Mixed Attributes" (2010). Scopus Export 2010-2014. 1362.
https://stars.library.ucf.edu/scopus2010/1362

This document is currently not available here.
