Implementation and Experimentation with C4.5 Decision Trees
C4.5 is a decision tree learning algorithm developed by Ross Quinlan as a successor to his earlier ID3 algorithm. It is one of the most popular algorithms for solving classification problems, which arise in a wide variety of disciplines. C4.5 is a supervised learning algorithm: it uses a set of training patterns to build a decision tree, analyzing the individual attributes of those patterns to partition the data. Its popularity stems from its ability to handle both continuous and categorical attributes and to deal with missing attribute values, while producing answers that are easy to interpret. This thesis has two objectives. The first is to implement C4.5 in C++ within a generic architecture that allows additional modules to be added. The second is to use this generic architecture to implement an innovative post-induction phase that adjusts splits to minimize the error of the C4.5 tree. The C4.5 code and the post-induction phase are compiled into a MEX DLL for use as functions within MATLAB. Experiments are performed in MATLAB to verify the advantages of this post-induction phase.
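To illustrate how attributes are analyzed to partition the pattern data, the following is a minimal C++ sketch of the gain ratio, the criterion C4.5 uses to rank candidate splits, for a single categorical attribute. It is not the thesis's code: the function names `entropy` and `gainRatio` and the string-based data layout are assumptions made for illustration, and the sketch leaves out continuous attributes, missing values, and the post-induction phase.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Shannon entropy (in bits) of a histogram of counts summing to `total`.
double entropy(const std::map<std::string, int>& counts, int total) {
    double h = 0.0;
    for (const auto& kv : counts) {
        if (kv.second == 0) continue;
        const double p = static_cast<double>(kv.second) / total;
        h -= p * std::log2(p);
    }
    return h;
}

// Gain ratio of splitting on one categorical attribute (hypothetical helper).
// values[i] is the attribute value of pattern i; labels[i] is its class.
double gainRatio(const std::vector<std::string>& values,
                 const std::vector<std::string>& labels) {
    assert(values.size() == labels.size());
    const int n = static_cast<int>(labels.size());
    if (n == 0) return 0.0;

    std::map<std::string, int> classCounts;   // class -> count
    std::map<std::string, int> valueCounts;   // attribute value -> count
    std::map<std::string, std::map<std::string, int>> perValue;  // value -> class histogram
    for (std::size_t i = 0; i < labels.size(); ++i) {
        ++classCounts[labels[i]];
        ++valueCounts[values[i]];
        ++perValue[values[i]][labels[i]];
    }

    // Information gain: entropy before the split minus the
    // size-weighted entropy of the partitions after it.
    double remainder = 0.0;
    for (const auto& kv : perValue) {
        const int size = valueCounts[kv.first];
        remainder += (static_cast<double>(size) / n) * entropy(kv.second, size);
    }
    const double gain = entropy(classCounts, n) - remainder;

    // Split information penalizes attributes with many distinct values.
    const double splitInfo = entropy(valueCounts, n);
    return splitInfo > 0.0 ? gain / splitInfo : 0.0;
}
```

In this sketch, an attribute whose values separate the classes perfectly scores highest, while an attribute that takes a single value for every pattern scores zero because it cannot partition the data.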
This item is only available in print in the UCF Libraries.
Bachelor of Science (B.S.)
College of Engineering and Computer Science
Dissertations, Academic -- Engineering and Computer Science; Engineering and Computer Science -- Dissertations, Academic
Honors in the Major Thesis
Beck, Jason, "Implementation and Experimentation with C4.5 Decision Trees" (2007). HIM 1990-2015. 670.