[Back]


Publications in Scientific Journals:

M. Wurzenberger, F. Skopik, G. Settanni, W. Scherrer:
"Complex Log File Synthesis for Rapid Sandbox-Benchmarking of Security- and Computer Network Analysis Tools";
Information Systems, 60 (2016), 13 - 33.



English abstract:
Today Information and Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations, performance issues or especially security-relevant cyber attacks. There exist various software tools such as security information and event management (SIEM) systems, log analysis tools and anomaly detection systems, which exploit log data to achieve this. While there are many products on the market, based on different approaches, the identification of the most efficient solution for a specific infrastructure, and the optimal configuration is still an unsolved problem. Today׳s general test environments do not sufficiently account for the specific properties of individual infrastructure setups. Thus, tests in these environments are usually not representative. However, testing on the real running productive systems exposes the network infrastructure to dangerous or unstable situations. The solution to this dilemma is the design and implementation of a highly realistic test environment, i.e. sandbox solution, that follows a different - novel - approach. The idea is to generate realistic network event sequence (NES) data that reflects the actual system behavior and which is then used to challenge network analysis software tools with varying configurations safely and realistically offline. In this paper we define a model, based on log line clustering and Markov chain simulation to create this synthetic log data. The presented model requires only a small set of real network data as an input to understand the complex real system behavior. Based on the input׳s characteristics highly realistic customer specified NES data is generated. To prove the applicability of the concept developed in this work, we conclude the paper with an illustrative example of evaluation and test of an existing anomaly detection system by using generated NES data.

Keywords:
Log line clustering; Markov chains; Log file analysis; Log data modeling; IDS deployment optimization


"Official" electronic version of the publication (accessed through its Digital Object Identifier - DOI)
http://dx.doi.org/10.1016/j.is.2016.02.006


Created from the Publication Database of the Vienna University of Technology.