Abstract

Online learning is a growing branch of machine learning that allows traditional data mining techniques to be applied to a stream of data in real time. In this dissertation, we present three efficient algorithms for feature ranking in online classification problems. Each method is tailored to a different type of classification task and has its own advantages. We propose several algorithms because, as with other machine learning solutions, no single algorithm usually works well for all types of tasks. The first method is an online sensitivity-based feature ranking (SFR) that is updated incrementally and is designed for classification tasks with continuous features. We take advantage of the concept of global sensitivity and rank features based on their impact on the outcome of the classification model. For feature selection, we use a two-stage filtering method: the first stage eliminates highly correlated and redundant features, and the second stage eliminates irrelevant features. One important advantage of our algorithm is its generality: the method works for correlated feature spaces without preprocessing, and it can be implemented alongside any single-pass online classification method that uses a separating hyperplane, such as SVMs. In the second method, we use probability theory to propose an algorithm that measures the importance of each feature by observing the changes in label prediction when the feature's value is substituted. A non-parametric version of the proposed method is presented to eliminate assumptions about the distribution type. These methods are applicable to all data types, including mixed feature spaces. Finally, we present a class-based feature importance ranking method that evaluates the importance of each feature for each class; these sub-rankings are further exploited to train an ensemble of classifiers.
The proposed methods are tested on benchmark datasets, and the results are discussed in the last chapter.
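
As a rough illustration of the feature-substitution idea behind the second method, the sketch below scores each feature by how often a classifier's predicted label flips when that feature's value is replaced by a value observed in another record. It is not the dissertation's algorithm: it is a batch sketch rather than an online one, and the names `substitution_importance` and `toy_predict`, the parameters `n_draws` and `seed`, and the toy data are all assumptions made for this example. Substitute values are drawn from the observed column, which is one simple way to avoid assuming a distribution type.

```python
import random


def substitution_importance(predict, rows, n_draws=20, seed=0):
    """Score each feature by the fraction of predictions that flip
    when its value is replaced by one drawn from the observed column.

    predict: callable mapping one row (a list of feature values) to a label.
    rows:    list of rows; features may be numeric or categorical.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    base_labels = [predict(row) for row in rows]

    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]  # empirical values of feature j
        flips = 0
        for row, label in zip(rows, base_labels):
            for _ in range(n_draws):
                probe = list(row)
                probe[j] = rng.choice(column)  # substitute feature j only
                if predict(probe) != label:
                    flips += 1
        scores.append(flips / (len(rows) * n_draws))
    return scores


def toy_predict(row):
    # Hypothetical classifier that depends only on feature 0.
    return 1 if row[0] > 0.5 else 0


# Toy mixed feature space: two numeric features and one categorical feature.
rows = [[i / 10, (i * 7 % 10) / 10, "a" if i % 2 else "b"] for i in range(10)]

scores = substitution_importance(toy_predict, rows)
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
print(scores, ranking)
```

Because `toy_predict` ignores features 1 and 2, substituting them never changes a label, so they score zero and feature 0 ranks first. A streaming variant would need to maintain the flip counts and the pool of substitute values incrementally, which this sketch does not attempt.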

Graduation Date

2018

Semester

Summer

Advisor

Zheng, Qipeng

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Industrial Engineering and Management Systems

Degree Program

Industrial Engineering

Format

application/pdf

Identifier

CFE0007584

URL

http://purl.fcla.edu/fcla/etd/CFE0007584

Language

English

Release Date

February 2022

Length of Campus-only Access

3 years

Access Status

Doctoral Dissertation (Campus-only Access)
