Keywords
Machine learning algorithms, Heart disease prediction, Decision tree algorithms, UCI Machine Learning Repository, 5-fold cross-validation
Abstract
This paper presents a study on the use of machine learning algorithms to predict heart disease, the leading cause of death worldwide. The study focuses on decision tree algorithms, which have the advantage of considering a large number of risk factors. The heart disease data set was obtained from the UCI Machine Learning Repository and analyzed using a decision tree classifier. The data set had 6 missing data points; the affected instances were deleted, leaving 279 instances for analysis. One-hot encoding was applied to categorical variables with more than two categories. The decision tree classifier was tuned using 5-fold cross-validation to choose the best parameters. The results showed that the classifier correctly predicted 81% of the patients as having heart disease and likewise 82% as not having heart disease, which was higher than the results of other machine learning algorithms used in previous studies. This study demonstrates the potential of decision tree algorithms for predicting heart disease and highlights the importance of early identification of individuals at risk of developing cardiovascular disease.
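The workflow described in the abstract (deleting incomplete records, one-hot encoding multi-category variables, and tuning a decision tree with 5-fold cross-validation) could be sketched roughly as follows. This is a minimal illustration using pandas and scikit-learn, not the author's code: the data are synthetic, the column names (`age`, `sex`, `cp`, `thal`, `chol`, `target`), the parameter grid, the train/test split, and the reading of the per-class percentages as sensitivity and specificity are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_heart_data(n=300, seed=0):
    """Build a synthetic stand-in table (NOT the UCI data) with 6 missing values."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(29, 78, size=n).astype(float),
        "sex": rng.integers(0, 2, size=n),                     # binary: left as-is
        "cp": rng.integers(1, 5, size=n),                      # 4 categories
        "thal": rng.choice([3, 6, 7], size=n).astype(float),   # 3 categories
        "chol": rng.normal(245, 50, size=n),
    })
    # Hypothetical rule that makes the label depend on a few features.
    score = (0.04 * (df["age"] - 54) + 0.8 * (df["cp"] == 4)
             + 0.9 * (df["thal"] == 7) + rng.normal(0, 1, size=n))
    df["target"] = (score > 0.8).astype(int)
    # Blank out one value in each of 6 distinct rows.
    missing_rows = rng.choice(n, size=6, replace=False)
    df.loc[missing_rows, "thal"] = np.nan
    return df


def prepare(df, multi_category_cols=("cp", "thal")):
    """Delete rows with missing values, then one-hot encode multi-category columns."""
    clean = df.dropna().reset_index(drop=True)
    X = pd.get_dummies(clean.drop(columns="target"),
                       columns=list(multi_category_cols), dtype=int)
    return X, clean["target"]


def tune_tree(X_train, y_train, seed=0):
    """Choose tree hyperparameters by 5-fold cross-validated accuracy."""
    param_grid = {  # assumed grid; the abstract does not list the parameters
        "max_depth": [2, 3, 4, 5],
        "min_samples_leaf": [1, 5, 10],
        "criterion": ["gini", "entropy"],
    }
    search = GridSearchCV(DecisionTreeClassifier(random_state=seed),
                          param_grid, cv=5, scoring="accuracy")
    return search.fit(X_train, y_train)


data = make_synthetic_heart_data()
X, y = prepare(data)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
search = tune_tree(X_train, y_train)

# Per-class rates on held-out data: share of diseased patients predicted as
# diseased (sensitivity) and of healthy patients predicted as healthy (specificity).
tn, fp, fn, tp = confusion_matrix(
    y_test, search.predict(X_test), labels=[0, 1]).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print("best parameters:", search.best_params_)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Because the data here are randomly generated, the printed rates say nothing about the 81% and 82% reported in the abstract; the sketch only shows how the steps fit together.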
Semester
Spring 2023
Course Name
STA 5703 Data Mining 1
Instructor Name
Xie, Rui
College
College of Sciences
STARS Citation
Agbemade, Emil, "Predicting Heart Disease using Tree-based Model" (2023). Data Science and Data Mining. 1.
https://stars.library.ucf.edu/data-science-mining/1