Noise Reduction A Priori Synthetic Over-Sampling For Class Imbalanced Data Sets
Keywords
Class imbalance; Classification; NRAS; OUPS; SMOTE
Abstract
In real-world data sets, the underlying data distribution may be highly skewed. This makes it difficult to build accurate classifiers for predicting group membership, because the classifier tends to be biased towards the over-represented, or majority, group. This is referred to as the class imbalance problem. Re-sampling techniques that generate new samples by over-sampling aim to combat class imbalance by increasing the number of members in the minority group. This paper introduces a new over-sampling technique that focuses on noise reduction and selective sampling of the minority group, which improves the prediction of minority group membership. Experiments are conducted across a wide range of data sets, learners, and over-sampling methods. The results for the new method show improvement in the Sensitivity and Gmean measures over the compared approaches.
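The abstract reports results in terms of Sensitivity and Gmean. As a minimal sketch, assuming the standard definitions (Sensitivity is the true positive rate on the minority class, and Gmean is the geometric mean of Sensitivity and Specificity), the two measures could be computed as below. The function name `sensitivity_and_gmean`, the `minority_label` parameter, and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def sensitivity_and_gmean(y_true, y_pred, minority_label=1):
    """Return (sensitivity, gmean) for a binary problem.

    Assumption: the minority class is treated as the positive class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == minority_label:
            if pred == minority_label:
                tp += 1  # minority sample correctly identified
            else:
                fn += 1  # minority sample missed
        else:
            if pred == minority_label:
                fp += 1  # majority sample flagged as minority
            else:
                tn += 1  # majority sample correctly identified

    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    gmean = math.sqrt(sensitivity * specificity)
    return sensitivity, gmean


# Toy imbalanced example: 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class is 80% accurate here,
# yet it never finds a minority sample, so both measures are zero.
always_majority = [0] * 10
print(sensitivity_and_gmean(y_true, always_majority))
```

The toy case illustrates why these measures suit imbalanced data: plain accuracy rewards the majority-only classifier, while Sensitivity and Gmean expose that it fails on the minority group.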
Publication Date
10-1-2017
Publication Title
Information Sciences
Volume
408
Pages
146-161
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1016/j.ins.2017.04.046
Copyright Status
Unknown
Scopus ID
85018316823 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85018316823
STARS Citation
Rivera, William A., "Noise Reduction A Priori Synthetic Over-Sampling For Class Imbalanced Data Sets" (2017). Scopus Export 2015-2019. 5271.
https://stars.library.ucf.edu/scopus2015/5271