Keywords

Machine Learning, Handwritten Digit Recognition, MNIST Dataset, Multiclass Classification, Image Classification.

Description

This paper compares Gaussian Naive Bayes (GNB) and Linear Discriminant Analysis (LDA) classifiers for handwritten digit recognition on the MNIST dataset. Both models classify grayscale images of the digits 0–9 under Gaussian assumptions about the distribution of the pixel features. LDA outperforms GNB in accuracy, precision, recall, and F1 score, making it the more effective of the two for this high-dimensional image classification task. The results offer insight into interpretable, efficient machine learning methods for pattern recognition applications.
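
The four metrics named above can all be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the paper's own evaluation code: the function name `per_class_metrics`, the toy 3-class matrix, and the use of per-class values (rather than a particular averaging scheme, which this page does not specify) are assumptions made for illustration.

```python
def per_class_metrics(cm):
    """Compute accuracy plus per-class precision, recall and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows = true labels, columns = predictions).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precision, recall, f1 = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(f)
    return accuracy, precision, recall, f1


# Hypothetical 3-class confusion matrix (illustrative numbers, not MNIST results).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
accuracy, precision, recall, f1 = per_class_metrics(toy_cm)
print(f"accuracy = {accuracy:.4f}")
for k in range(len(toy_cm)):
    print(f"class {k}: precision={precision[k]:.2f} "
          f"recall={recall[k]:.2f} f1={f1[k]:.2f}")
```

For a ten-class problem such as MNIST the same function applies to a 10×10 matrix; a single summary figure would then require choosing an averaging scheme (for example, an unweighted mean over classes).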

Abstract

Handwritten Digit Recognition (HDR) remains a fundamental benchmark in pattern recognition and machine learning because of its practical applications and the classification challenges posed by diverse handwriting styles. This study compares two classical statistical classifiers, Gaussian Naive Bayes (GNB) and Linear Discriminant Analysis (LDA), on recognizing digits from the MNIST dataset. Both models assume normally distributed features and are computationally efficient, which makes them practical for high-dimensional input such as image pixels. Using 60,000 training and 10,000 test samples, we evaluate model performance with accuracy, precision, recall, F1 score, and confusion matrices. GNB achieves only moderate accuracy (55.58%), whereas LDA substantially outperforms it with an accuracy of 87.30% and is better at distinguishing visually similar digits. Our analysis attributes the gap to GNB's assumption that pixels are conditionally independent given the class, whereas LDA models correlations between pixels through a covariance matrix shared across classes. These findings support LDA as a robust baseline for HDR tasks, especially when interpretability and computational simplicity are desired.
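
The comparison described in the abstract can be sketched with scikit-learn. This is an illustrative outline under stated assumptions, not the authors' code: it uses scikit-learn's small bundled 8×8 digits dataset (`load_digits`) as a stand-in for MNIST so that it runs without a download, and the 75/25 split, random seed, and default hyperparameters are arbitrary choices. Its accuracies will therefore differ from the figures reported above.

```python
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Stand-in data: 1,797 8x8 digit images flattened to 64 features.
# (Assumption: the paper itself uses the 28x28 MNIST images.)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# GNB fits an independent Gaussian per feature and class;
# LDA fits class means with one covariance matrix shared by all classes.
models = {
    "GNB": GaussianNB(),
    "LDA": LinearDiscriminantAnalysis(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.4f}")
```

Swapping in the full MNIST data would only change the loading step; the fit and evaluation loop stays the same.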

Course Name

STA 6366 Data Science 1

Instructor Name

Dr. Rui Xie

Rights

This work is licensed under a Creative Commons Attribution 4.0 International License.

College

College of Sciences

Included in

Data Science Commons
