Noise Reduction A Priori Synthetic Over-Sampling For Class Imbalanced Data Sets

Keywords

Class imbalance; Classification; NRAS; OUPS; SMOTE

Abstract

In real-world data sets, the underlying data distribution may be highly skewed. This makes it difficult to build accurate classifiers for predicting group membership, because the classifier tends to be biased towards the over-represented, or majority, group. This is referred to as the class imbalance problem. Re-sampling techniques that produce new samples by over-sampling aim to combat class imbalance by increasing the number of members that belong to the minority group. This paper introduces a new over-sampling technique that focuses on noise reduction and selective sampling of the minority group, which improves prediction of minority group membership. Experiments are conducted across a wide range of data sets, learners, and over-sampling methods. The results for the new method show improvement in Sensitivity and Gmean over the compared approaches.
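
The abstract reports results in terms of Sensitivity and Gmean. As a rough illustration of why these measures suit imbalanced data, the sketch below computes them from true and predicted labels using their standard definitions: sensitivity is the true positive rate on the minority group, and the G-mean is the geometric mean of sensitivity and specificity. The function name, the `minority_label` parameter, and the toy labels are assumptions made for this example and are not taken from the paper.

```python
import math


def sensitivity_and_gmean(y_true, y_pred, minority_label=1):
    """Return (sensitivity, specificity, g_mean), treating
    minority_label as the positive class (hypothetical helper)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == minority_label:
            if pred == minority_label:
                tp += 1  # minority sample correctly identified
            else:
                fn += 1  # minority sample missed
        else:
            if pred == minority_label:
                fp += 1  # majority sample flagged as minority
            else:
                tn += 1  # majority sample correctly identified
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean


# Toy data (illustrative only): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec, g = sensitivity_and_gmean(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} g_mean={g:.3f}")
```

In this toy case 8 of 10 predictions are correct, yet only half of the minority samples are found. Sensitivity and G-mean expose that gap, while plain accuracy hides it.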

Publication Date

10-1-2017

Publication Title

Information Sciences

Volume

408

Pages

146-161

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1016/j.ins.2017.04.046

Scopus ID

85018316823 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85018316823
