As machine learning (ML) has grown in popularity, ML-based classifiers have been shown to be susceptible to adversarial examples, where a small modification in the input space may result in misclassification, and to concept drift. The ever-evolving nature of the data and the shifts in behavior and patterns over time have not only lessened trust in machine learning output but also created a barrier to its use in critical applications. This dissertation builds toward analyzing machine learning-based malware detection systems, including the detection and mitigation of adversarial malware examples. In particular, we first introduce two black-box adversarial attacks on control flow-based malware detectors, exposing the vulnerability of graph-based malware detection systems. Further, we propose DL-FHMC, a fine-grained hierarchical learning technique for robust malware detection that leverages graph mining techniques alongside pattern recognition to detect adversarial malware. Enabling machine learning in critical domains is not limited to detecting adversarial examples in laboratory settings; it also extends to exploring the existence of adversarial behavior in the wild. Toward this, we investigate the attack surface of malware detection systems, shedding light on the vulnerability of the underlying learning algorithms and of industry-standard machine learning malware detection systems to adversaries in both IoT and Windows environments. Toward robust malware detection, we investigate software pre-processing and monotonic machine learning. In addition, we explore the potential for exploitation created by actively retraining malware detection models. We uncover a previously unreported malicious-to-benign detection performance trade-off that causes malware to revive and be classified as benign or as a different malicious family. This behavior leads to family labeling inconsistencies, hindering efforts to understand malware families.
Overall, this dissertation builds toward robust malware detection by analyzing and detecting adversarial examples. We highlight the vulnerability of industry-standard applications in black-box adversarial settings and to the continuous evolution of malware over time.
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Doctoral Dissertation (Open Access)
Abusnaina, Ahmed, "Studying the Robustness of Machine Learning-based Malware Detection Models: Analysis, Design, and Implementation" (2022). Electronic Theses and Dissertations, 2020-. 1458.