Keywords

Decision Tree, Income, Classification

Description

This project applies the decision tree methodology to the Adult Income dataset to predict an individual’s income level and to identify the factors that contribute most to a higher income.

Abstract

The decision tree is a commonly used data mining method for classification tasks. It is a tree-based supervised machine learning algorithm that classifies or predicts by following a path of questions, where the answer to each question determines the next one. The algorithm partitions the data into branch-like segments that form a tree with a root node, internal nodes, and leaves. This project explores the decision tree methodology and applies it to the Adult Income dataset from the UCI Machine Learning Repository to predict whether a person earns more than 50K per year and to identify the factors that contribute most to a higher income. The model was evaluated using standard classification metrics, and the results show good performance. Feature importance scores were also computed to identify which factors contribute most to an individual’s income.
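
To make the root, node, and leaf structure concrete, here is a minimal pure-Python sketch of how a classification tree can be grown. It is not the project's implementation: the Gini impurity criterion, the function names (gini, best_split, build, predict), the depth limit, and the eight toy rows are all assumptions chosen for illustration, and the rows are invented rather than taken from the Adult Income dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, impurity_decrease) for the best
    binary split of the form row[feature] <= threshold, or None."""
    parent = gini(labels)
    n = len(rows)
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows to both children
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if best is None or gain > best[2]:
                best = (f, t, gain)
    return best

def build(rows, labels, depth=0, max_depth=3):
    """Recursively grow the tree; each leaf stores the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None or split[2] <= 0:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(node, row):
    """Follow the path of questions from the root down to a leaf."""
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

# Hypothetical toy rows: (years_of_education, hours_per_week).
# These values are invented for illustration, not real Adult Income records.
X = [(9, 40), (10, 35), (12, 40), (13, 50), (14, 45), (16, 60), (16, 40), (11, 20)]
y = ["<=50K", "<=50K", "<=50K", ">50K", ">50K", ">50K", ">50K", "<=50K"]

tree = build(X, y)
predictions = [predict(tree, row) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(tree)
print("training accuracy:", accuracy)
```

On this toy data the root asks a single question about the first feature, and each answer leads straight to a leaf. A real analysis would more likely use an established library implementation, hold out a test set before computing classification metrics, and derive feature importance from the impurity decrease contributed by each feature.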

Semester

Spring 2023

Course Name

STA 6704 Data Mining 2

Instructor Name

Xie, Rui

College

College of Sciences
