Talks and Poster Presentations (with Proceedings-Entry):

M. Neugschwandtner, P. Milani Comparetti, G. Jacob, C. Krügel:
"ForeCast - Skimming off the Malware Cream";
Talk: Annual Computer Security Applications Conference (ACSAC), Orlando, Florida; 12-05-2011 - 12-09-2011; in: "Proceedings of the 27th Annual Computer Security Applications Conference", ACM, New York (2011), ISBN: 978-1-4503-0672-0.



English abstract:
To handle the large number of malware samples appearing in the wild each day, security analysts and vendors employ automated tools to detect, classify and analyze malicious code. Because malware is typically resistant to static analysis, automated dynamic analysis is widely used for this purpose. Executing malicious software in a controlled environment while observing its behavior can provide rich information on a malware's capabilities. However, running each malware sample even for a few minutes is expensive. For this reason, malware analysis efforts need to select a subset of samples for analysis. To date, this selection has been performed either randomly or using techniques focused on avoiding re-analysis of polymorphic malware variants.

In this paper, we present a novel approach to sample selection that attempts to maximize the total value of the information obtained from analysis, according to an application-dependent scoring function. To this end, we leverage previous work on behavioral malware clustering [14] and introduce a machine-learning-based system that uses all statically-available information to predict into which behavioral class a sample will fall, before the sample is actually executed. We discuss scoring functions tailored to two practical applications of large-scale dynamic analysis: the compilation of network blacklists of command and control servers and the generation of remediation procedures for malware infections. We implement these techniques in a tool called ForeCast. Large-scale evaluation on over 600,000 malware samples shows that our prototype can increase the number of potential command and control servers detected by up to 137% over a random selection strategy and 54% over a selection strategy based on sample diversity.
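
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the ForeCast implementation: it assumes a classifier has already produced, for each unexecuted sample, a probability for each behavioral class, and that the application assigns a fixed value to each class. The class names, the values, and the function names `expected_score` and `select_samples` are hypothetical. Samples are ranked by expected score and the top ones are chosen within an analysis budget.

```python
from typing import Dict, List


def expected_score(class_probs: Dict[str, float],
                   class_value: Dict[str, float]) -> float:
    """Expected value of analyzing one sample.

    class_probs: predicted probability per behavioral class (assumed input).
    class_value: application-dependent value per class (assumed input);
                 classes missing from this mapping count as 0.
    """
    return sum(p * class_value.get(c, 0.0) for c, p in class_probs.items())


def select_samples(predictions: Dict[str, Dict[str, float]],
                   class_value: Dict[str, float],
                   budget: int) -> List[str]:
    """Return up to `budget` sample ids with the highest expected score."""
    ranked = sorted(
        predictions,
        key=lambda sample_id: expected_score(predictions[sample_id], class_value),
        reverse=True,
    )
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical values for a blacklist-building application: samples likely
    # to contact previously unseen servers are worth the most.
    class_value = {"new_c2": 1.0, "known_c2": 0.1, "no_network": 0.0}
    predictions = {
        "a": {"new_c2": 0.7, "known_c2": 0.2, "no_network": 0.1},
        "b": {"new_c2": 0.1, "known_c2": 0.8, "no_network": 0.1},
        "c": {"new_c2": 0.4, "known_c2": 0.1, "no_network": 0.5},
    }
    print(select_samples(predictions, class_value, budget=2))
```

This sketch ranks once with fixed class values. A scoring function whose values change as analysis results arrive would require re-ranking after each analyzed sample.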

Keywords:
Malware, Machine Learning


"Official" electronic version of the publication (accessed through its Digital Object Identifier - DOI)
http://dx.doi.org/10.1145/2076732.2076735

Electronic version of the publication:
http://publik.tuwien.ac.at/files/PubDat_206039.pdf


Created from the Publication Database of the Vienna University of Technology.