A k-norm pruning algorithm for decision tree classifiers based on error rate estimation

Authors

    Authors

    M. Zhong; M. Georgiopoulos;G. C. Anagnostopoulos

    Comments

    Authors: contact us about adding a copy of your work at STARS@ucf.edu

    Abbreviated Journal Title

    Mach. Learn.

    Keywords

    decision tree; pruning; law of succession; RADEMACHER PENALIZATION; Computer Science, Artificial Intelligence

    Abstract

    Decision trees are well-known and established models for classification and regression. In this paper, we focus on the estimation and the minimization of the misclassification rate of decision tree classifiers. We apply Lidstone's Law of Succession for the estimation of the class probabilities and error rates. In our work, we take into account not only the expected values of the error rate, which has been the norm in existing research, but also the corresponding reliability (measured by standard deviations) of the error rate. Based on this estimation, we propose an efficient pruning algorithm, called k-norm pruning, that has a clear theoretical interpretation, is easily implemented, and does not require a validation set. Our experiments show that our proposed pruning algorithm produces accurate trees quickly, and compares very favorably with two other well-known pruning algorithms, CCP of CART and EBP of C4.5.

    Journal Title

    Machine Learning

    Volume

    71

    Issue/Number

    1

    Publication Date

    1-1-2008

    Document Type

    Article

    Language

    English

    First Page

    55

    Last Page

    88

    WOS Identifier

    WOS:000253375600003

    ISSN

    0885-6125

    Share

    COinS