Title

Detecting Trojans Using Data Mining Techniques

Keywords

Data Mining; Disassembly; Principal Component Analysis; Random Forest; Support Vector Machines; Trojan Detection

Abstract

A trojan horse is a program that surreptitiously performs its operation under the guise of a legitimate program. Traditional signature-based detection poses little danger to new and unseen samples whose signatures are not yet available. The focus of malware research is shifting from matching signature patterns to identifying the malicious behavior that malware displays. This paper presents the novel idea of extracting variable-length instruction sequences that can distinguish trojans from clean programs using data mining techniques. The analysis is facilitated by the program control flow information contained in the instruction sequences. Based on general statistics gathered from these instruction sequences, we formulated the problem as a binary classification problem and built random forest, bagging, and support vector machine classifiers. Our approach showed a 94.0% detection rate on novel trojans whose data was not used in the model building process. © 2008 Springer-Verlag.

Publication Date

1-1-2008

Publication Title

Communications in Computer and Information Science

Volume

20 CCIS

Pages

400-411

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1007/978-3-540-89853-5_43

Scopus ID

85099426372 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85099426372

