Keywords
Classification, imbalanced data, cost sensitive learning, outliers, weighted support vector machine, relaxed support vector machines, control chart pattern recognition
Abstract
The analysis and predictive modeling of massive datasets is an important problem that arises in many practical applications. Predictive modeling becomes more challenging when data are imperfect or uncertain: real data are frequently affected by outliers, uncertain labels, and an uneven distribution of classes (imbalanced data). Such imperfections introduce bias, and most traditional classification approaches perform poorly on imperfect data. In the present work, we introduce a cost-sensitive learning (CSL) method to deal with the classification of imperfect data. We propose the use of CSL with the Support Vector Machine, a well-known data mining algorithm. The results show that the proposed algorithm produces more accurate classifiers and is more robust to imperfect data. Furthermore, we explore which performance measures are best suited to imperfect data and address real problems in quality control and business analytics.
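To make the idea of cost-sensitive learning with a Support Vector Machine concrete, the following is a minimal sketch and not the dissertation's own formulation, which the abstract does not spell out. It assumes a linear classifier, labels in {-1, +1}, and misclassification costs set inversely proportional to class frequency; the function names, the regularization constant `lam`, and the full-batch subgradient-descent trainer are all hypothetical choices made for illustration.

```python
from collections import Counter


def balanced_class_costs(labels):
    """Assumed cost scheme: cost of class c is n / (k * n_c),
    so the rarer class gets the larger misclassification cost."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_hinge_loss(w, b, X, y, costs, lam=0.01):
    """L2-regularized hinge loss in which each sample's hinge term
    is scaled by the cost of its class."""
    reg = 0.5 * lam * sum(wj * wj for wj in w)
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total += costs[yi] * max(0.0, 1.0 - margin)
    return reg + total / len(X)


def train_weighted_linear_svm(X, y, costs, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the weighted hinge loss.
    Illustrative only; a production solver would use a QP or SMO method."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                c = costs[yi] / n
                for j in range(d):
                    grad_w[j] -= c * yi * xi[j]
                grad_b -= c * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Toy imbalanced data: nine negatives and a single positive.
    X = [[-1.0]] * 9 + [[1.0]]
    y = [-1] * 9 + [1]
    costs = balanced_class_costs(y)
    w, b = train_weighted_linear_svm(X, y, costs)
    print(costs, w, b)
```

With these costs, the single minority sample contributes as much to the loss as all nine majority samples together, which is the mechanism by which a cost-sensitive SVM counteracts class imbalance.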
Graduation Date
2014
Semester
Spring
Advisor
Xanthopoulos, Petros
Degree
Doctor of Philosophy (Ph.D.)
College
College of Engineering and Computer Science
Department
Industrial Engineering and Management Systems
Degree Program
Industrial Engineering
Format
application/pdf
Identifier
CFE0005542
URL
http://purl.fcla.edu/fcla/etd/CFE0005542
Language
English
Release Date
November 2014
Length of Campus-only Access
None
Access Status
Doctoral Dissertation (Open Access)
Subjects
Dissertations, Academic -- Engineering and Computer Science; Engineering and Computer Science -- Dissertations, Academic
STARS Citation
Razzaghi, Talayeh, "Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications" (2014). Electronic Theses and Dissertations. 4574.
https://stars.library.ucf.edu/etd/4574