Title

Detecting Trojans Using Data Mining Techniques

Keywords

Data Mining; Disassembly; Principal Component Analysis; Random Forest; Support Vector Machines; Trojan Detection

Abstract

A trojan horse is a program that surreptitiously performs its operation under the guise of a legitimate program. Traditional signature-based detection offers little protection against new and unseen samples whose signatures are not yet available. The focus of malware research is therefore shifting from matching signature patterns to identifying the malicious behavior that malware displays. This paper presents the novel idea of extracting variable-length instruction sequences that can distinguish trojans from clean programs using data mining techniques. The analysis is facilitated by the program control flow information contained in the instruction sequences. Based on general statistics gathered from these instruction sequences, we formulated the problem as a binary classification problem and built random forest, bagging, and support vector machine classifiers. Our approach showed a 94.0% detection rate on novel trojans whose data was not used in the model-building process. © Springer-Verlag Berlin Heidelberg 2008.
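
The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how sequences are delimited, so the sketch assumes a sequence ends at a control-flow instruction, and the mnemonic set `CONTROL_FLOW`, the helper names, and the sample opcode list are all hypothetical.

```python
from collections import Counter

# Assumption: these mnemonics close a sequence. The abstract does not list
# the delimiters actually used in the paper.
CONTROL_FLOW = {"jmp", "je", "jne", "jz", "jnz", "call", "ret"}


def split_sequences(opcodes):
    """Split a flat opcode list into variable-length sequences.

    Each sequence ends at (and includes) a control-flow instruction;
    any trailing instructions form a final sequence.
    """
    sequences, current = [], []
    for op in opcodes:
        current.append(op)
        if op in CONTROL_FLOW:
            sequences.append(tuple(current))
            current = []
    if current:
        sequences.append(tuple(current))
    return sequences


def sequence_counts(opcodes):
    """Count how often each instruction sequence occurs in one program."""
    return Counter(split_sequences(opcodes))


def to_vector(counts, vocabulary):
    """Turn sequence counts into a fixed-order feature vector."""
    return [counts.get(seq, 0) for seq in vocabulary]


# Hypothetical opcode stream, standing in for real disassembler output.
program = ["push", "mov", "call", "mov", "add", "jmp",
           "push", "mov", "call", "ret"]

counts = sequence_counts(program)
vocabulary = sorted(counts)          # in practice, built over the whole corpus
vector = to_vector(counts, vocabulary)
print(vector)
```

Vectors of this kind, one per program and labeled trojan or clean, are the sort of input a binary classifier such as a random forest or support vector machine could be trained on.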

Publication Date

12-1-2009

Publication Title

Communications in Computer and Information Science

Volume

20

Page Range

400-411

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

Scopus ID

78649845943 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/78649845943
