Title

A k-norm pruning algorithm for decision tree classifiers based on error rate estimation

Authors

Authors

M. Zhong; M. Georgiopoulos;G. C. Anagnostopoulos

Comments

Authors: contact us about adding a copy of your work at STARS@ucf.edu

Abbreviated Journal Title

Mach. Learn.

Keywords

decision tree; pruning; law of succession; RADEMACHER PENALIZATION; Computer Science, Artificial Intelligence

Abstract

Decision trees are well-known and established models for classification and regression. In this paper, we focus on the estimation and the minimization of the misclassification rate of decision tree classifiers. We apply Lidstone's Law of Succession for the estimation of the class probabilities and error rates. In our work, we take into account not only the expected values of the error rate, which has been the norm in existing research, but also the corresponding reliability (measured by standard deviations) of the error rate. Based on this estimation, we propose an efficient pruning algorithm, called k-norm pruning, that has a clear theoretical interpretation, is easily implemented, and does not require a validation set. Our experiments show that our proposed pruning algorithm produces accurate trees quickly, and compares very favorably with two other well-known pruning algorithms, CCP of CART and EBP of C4.5.

Journal Title

Machine Learning

Volume

71

Issue/Number

1

Publication Date

1-1-2008

Document Type

Article

Language

English

First Page

55

Last Page

88

WOS Identifier

WOS:000253375600003

ISSN

0885-6125

Share

COinS