Publications in Scientific Journals:

F. Iglesias Vazquez, T. Zseby:
"Analysis of network traffic features for anomaly detection";
Machine Learning, 101 (2015), 1; 59 - 84.

English abstract:
Anomaly detection in communication networks provides the basis for the uncovering of novel attacks, misconfigurations and network failures. Resource constraints for data storage, transmission and processing make it beneficial to restrict input data to features that are (a) highly relevant for the detection task and (b) easily derivable from network observations without expensive operations. Removing strong correlated, redundant and irrelevant features also improves the detection quality for many algorithms that are based on learning techniques. In this paper we address the feature selection problem for network traffic based anomaly detection. We propose a multi-stage feature selection method using filters and stepwise regression wrappers. Our analysis is based on 41 widely-adopted traffic features that are presented in several commonly used traffic data sets. With our combined feature selection method we could reduce the original feature vectors from 41 to only 16 features. We tested our results with five fundamentally different classifiers, observing no significant reduction of the detection performance. In order to quantify the practical benefits of our results, we analyzed the costs for generating individual features from standard IP Flow Information Export records, available at many routers. We show that we can eliminate 13 very costly features and thus reducing the computational effort for on-line feature generation from live traffic observations at network nodes.

Feature selection, Anomaly detection, Network security, Data preprocessing, Supervised classification

"Official" electronic version of the publication (accessed through its Digital Object Identifier - DOI)

Electronic version of the publication:

Created from the Publication Database of the Vienna University of Technology.