Title
Detecting Trojans Using Data Mining Techniques
Keywords
Data Mining; Disassembly; Principal Component Analysis; Random Forest; Support Vector Machines; Trojan Detection
Abstract
A trojan horse is a program that surreptitiously performs its operation under the guise of a legitimate program. Traditional signature-based detection offers little protection against new and unseen samples whose signatures are not yet available. The focus of malware research is shifting from signature patterns to identifying the malicious behavior that malware displays. This paper presents the novel idea of extracting variable-length instruction sequences that can distinguish trojans from clean programs using data mining techniques. The analysis is facilitated by the program control flow information contained in the instruction sequences. Based on general statistics gathered from these instruction sequences, we formulated the problem as a binary classification problem and built random forest, bagging and support vector machine classifiers. Our approach showed a 94.0% detection rate on novel trojans whose data was not used in the model building process. © 2008 Springer-Verlag.
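The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a disassembler has already produced a flat list of instruction mnemonics, and that a sequence ends at each control-transfer instruction. The `CONTROL_FLOW` set, the function names and the sample input are all hypothetical.

```python
from collections import Counter

# Hypothetical set of control-transfer mnemonics (an assumption for this
# sketch, not a list taken from the paper).
CONTROL_FLOW = {"jmp", "je", "jne", "jz", "jnz", "call", "ret"}


def split_sequences(mnemonics):
    """Split a flat list of mnemonics into variable-length sequences.

    A sequence is closed right after each control-transfer instruction,
    so sequence length depends on the program's control flow.
    """
    sequences, current = [], []
    for op in mnemonics:
        current.append(op)
        if op in CONTROL_FLOW:
            sequences.append(tuple(current))
            current = []
    if current:  # trailing instructions with no closing branch
        sequences.append(tuple(current))
    return sequences


def sequence_counts(mnemonics):
    """Count how often each instruction sequence occurs in one program."""
    return Counter(split_sequences(mnemonics))


# Made-up disassembly output, for illustration only.
sample = ["push", "mov", "call", "mov", "cmp", "jne",
          "push", "mov", "call", "ret"]
counts = sequence_counts(sample)
```

Per-program counts like these could serve as input features for a binary classifier such as the random forest, bagging or support vector machine models named in the abstract. The sketch stops at feature extraction.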
Publication Date
1-1-2008
Publication Title
Communications in Computer and Information Science
Volume
20 CCIS
Pages
400-411
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1007/978-3-540-89853-5_43
Copyright Status
Unknown
Scopus ID
85099426372 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85099426372
STARS Citation
Siddiqui, Muazzam; Wang, Morgan C.; and Lee, Joohan, "Detecting Trojans Using Data Mining Techniques" (2008). Scopus Export 2000s. 10808.
https://stars.library.ucf.edu/scopus2000/10808