Keywords

data mining, virus detection, malicious software

Abstract

Traditional way to detect malicious software is based on signature matching. However, signature matching only detects known malicious software. In order to detect unknown malicious software, it is necessary to analyze the software for its impact on the system when the software is executed. In one approach, the software code can be statically analyzed for any malicious patterns. Another approach is to execute the program and determine the nature of the program dynamically. Since the execution of malicious code may have negative impact on the system, the code must be executed in a controlled environment. For that purpose, we have developed a sandbox to protect the system. Potential malicious behavior is intercepted by hooking Win32 system calls. Using the developed sandbox, we detect unknown virus using dynamic instruction sequences mining techniques. By collecting runtime instruction sequences in basic blocks, we extract instruction sequence patterns based on instruction associations. We build classification models with these patterns. By applying this classification model, we predict the nature of an unknown program. We compare our approach with several other approaches such as simple heuristics, NGram and static instruction sequences. We have also developed a method to identify a family of malicious software utilizing the system call trace. We construct a structural system call diagram from captured dynamic system call traces. We generate smart system call signature using profile hidden Markov model (PHMM) based on modularized system call block. Smart system call signature weakly identifies a family of malicious software.

Notes

If this is your thesis or dissertation, and want to learn how to access it or for more information about readership statistics, contact us at STARS@ucf.edu

Graduation Date

2009

Advisor

Guha, Ratan

Degree

Doctor of Philosophy (Ph.D.)

College

College of Engineering and Computer Science

Department

Electrical Engineering and Computer Science

Degree Program

Computer Science

Format

application/pdf

Identifier

CFE0002798

URL

http://purl.fcla.edu/fcla/etd/CFE0002798

Language

English

Release Date

September 2009

Length of Campus-only Access

None

Access Status

Doctoral Dissertation (Open Access)

Share

COinS