A Multi-Stage Network Anomaly Detection Method for Improving Efficiency and Accuracy
Yuji Waizumi, Hiroshi Tsunoda, Masashi Tsuji, Yoshiaki Nemoto
DOI: 10.4236/jis.2012.31003

Abstract

With the explosive growth of intrusions, the need for anomaly-based Intrusion Detection Systems (IDSs), which are capable of detecting novel attacks, is increasing. Among these systems, flow-based detection systems, which use the series of packets exchanged between two terminals as the unit of observation, have the advantage of being able to detect anomalies that appear in only a few specific sessions. However, in large-scale networks where a large number of communications take place, analyzing every flow is not practical. Timeslot-based detection systems, on the other hand, do not need to prepare a large number of buffers, but they have difficulty identifying which communications are anomalous. In this paper, we propose a multi-stage anomaly detection system that combines timeslot-based and flow-based detectors. The proposed system reduces the number of flows that must be subjected to flow-based analysis while still exhibiting high detection accuracy. Through experiments using a data set, we demonstrate the effectiveness of the proposed method.

Share and Cite:

Y. Waizumi, H. Tsunoda, M. Tsuji and Y. Nemoto, "A Multi-Stage Network Anomaly Detection Method for Improving Efficiency and Accuracy," Journal of Information Security, Vol. 3 No. 1, 2012, pp. 18-24. doi: 10.4236/jis.2012.31003.

1. Introduction

In recent years, intrusions such as worms and denial-of-service attacks have become a major threat to the Internet. In particular, novel intrusions such as new worms and zero-day attacks are increasing and are responsible for serious damage to the Internet. To detect intrusions, Network Intrusion Detection Systems (NIDSs) have gained attention. NIDSs are classified into misuse detection systems and anomaly detection systems.

In misuse detection systems such as Snort [1], intrusions are detected by matching traffic against signatures that are prepared manually in advance. These systems are highly popular in network security because, for known intrusions, they exhibit higher detection accuracy and generate fewer false positives than anomaly detection systems. However, developing signatures is a cumbersome and time-consuming task because security experts have to write them manually. Therefore, novel intrusions can cause significant damage to the Internet before signatures are developed.

On the other hand, anomaly detection systems such as NIDES [2] and ADAM [3] can detect unknown intrusions. This is because these methods detect intrusions based on deviation from normal behavior, and thus do not require prior knowledge of intrusions.

However, these methods tend to generate more false positives than signature-based IDSs. Although much research has been carried out to increase their detection accuracy, still higher accuracy is demanded. Therefore, we focus our research on anomaly detection systems.

In anomaly detection systems, network traffic is analyzed using observation units such as timeslots and flows. Timeslot-based detection has the advantage of detecting anomalous network states effectively. Flow-based analysis, on the other hand, can examine each communication in more detail. Our group has proposed a combination of timeslot-based and flow-based detection and shown its effectiveness [4]. However, flow-based analysis requires a large number of buffers. Analyzing all flows of network traffic is not realistic, and the buffer requirement can itself become a vulnerability to Denial of Service (DoS) attacks, because analyzing every flow can result in a buffer overflow.

In this paper, we propose a high-accuracy multi-stage anomaly detection system that reduces the number of flows that need to be analyzed. The proposed system consists of two detection stages. The first stage is a timeslot-based detector, which picks out the flows that need to be analyzed in detail by the flow-based detector. The second stage then inspects only these suspicious flows, so the computational load and buffer size required for flow analysis can be reduced.

The remainder of this paper is organized as follows. Section 2 explains timeslot-based and flow-based analyses and discusses issues in combining them. Section 3 proposes a multi-stage anomaly detection system. Section 4 presents an evaluation of the proposed system. Finally, Section 5 concludes this paper.

2. Combination of Timeslot-Based and Flow-Based Analyses

Anomaly detection systems generally analyze traffic in observation units such as timeslots and flows. In this section, we explain these units as used for intrusion detection and introduce a conventional method that combines the two detectors. We also present the issues with the conventional method.

2.1. Timeslot-Based Analysis

Anomaly detection often uses timeslot-based analysis [4-6]. In this method, the overall traffic is divided into timeslots of fixed length, and features, which are numerical values representing the network state, are extracted from the traffic in each timeslot. This analysis has the advantage of low buffer storage because it releases buffers after each timeslot. However, it is difficult for this method to identify which communication flows are anomalous.
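The timeslot idea can be illustrated with a minimal Python sketch. It assumes packets arrive as `(timestamp, size_bytes)` pairs and uses packet count and byte count as features; the function name, the 60-second default slot length, and the choice of features are illustrative assumptions, not the features used in the paper.

```python
from collections import defaultdict


def timeslot_features(packets, slot_len=60.0):
    """Group (timestamp, size_bytes) packets into fixed-length timeslots.

    Returns {slot_index: {"packets": count, "bytes": total_bytes}}.
    The two features are hypothetical stand-ins for the numerical
    values that describe the network state in each timeslot.
    """
    slots = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for ts, size in packets:
        idx = int(ts // slot_len)  # index of the timeslot this packet falls in
        slots[idx]["packets"] += 1
        slots[idx]["bytes"] += size
    return dict(slots)


if __name__ == "__main__":
    sample = [(0.5, 100), (59.9, 200), (60.0, 50)]
    print(timeslot_features(sample))
```

Only one small counter record per timeslot is kept, and it can be discarded once the slot has been evaluated, which is the low-buffer property noted above. The aggregate says nothing about which individual flow caused a change, which is the limitation noted above.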

2.2. Flow-Based Analysis

A flow is defined as a set of packets which have the same values for the following three header fields.

• Protocol (TCP/UDP)

• Source/Destination address pair

• Source/Destination port pair

A TCP flow ends with FIN or RST flags, and UDP flows are terminated by time-out.

Flows are often used in anomaly detection [4,7,8]. A flow-based analysis method can analyze each bidirectional communication in detail and can identify each anomalous communication. However, in this analysis, buffers must be prepared for every flow, so the number of buffers increases linearly with the number of flows. Thus, this method carries a risk of buffer overflow. Buffer storage is therefore a bottleneck in flow-based analysis and a vulnerability to DoS attacks.
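The flow definition above can be sketched in Python as a lookup key built from the three header fields. The sketch assumes packets are tuples of `(protocol, src_addr, src_port, dst_addr, dst_port, size)`; the names `FlowKey`, `flow_key`, and `group_into_flows` are hypothetical. Because the analysis is described as bidirectional, the key sorts the two endpoints so that both directions map to the same flow. Flow termination by FIN/RST or time-out is deliberately left out.

```python
from collections import defaultdict, namedtuple

# Hypothetical key type: protocol plus the unordered pair of (address, port) endpoints.
FlowKey = namedtuple("FlowKey", ["protocol", "endpoints"])


def flow_key(protocol, src_addr, src_port, dst_addr, dst_port):
    """Build a direction-independent key so both directions share one flow."""
    endpoints = tuple(sorted([(src_addr, src_port), (dst_addr, dst_port)]))
    return FlowKey(protocol, endpoints)


def group_into_flows(packets):
    """Collect packets into per-flow buffers (one list per flow key)."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(*pkt[:5])].append(pkt)
    return flows


if __name__ == "__main__":
    pkts = [
        ("TCP", "10.0.0.1", 1234, "10.0.0.2", 80, 100),
        ("TCP", "10.0.0.2", 80, "10.0.0.1", 1234, 500),  # reverse direction
        ("UDP", "10.0.0.1", 1234, "10.0.0.2", 53, 60),
    ]
    flows = group_into_flows(pkts)
    print(len(flows), "flow buffers allocated")
```

Each distinct key allocates its own buffer, so the size of `flows` grows with the number of concurrent flows. This is the linear growth in buffer storage described above.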

2.3. A Conventional Combination Method

Our research group has proposed a combined system that uses the timeslot-based and the flow-based analyses in parallel [4]. Figure 1(a) shows an overview of the conventional system, which we term a parallel system.

Network traffic is fed to both the timeslot-based and the flow-based detectors and is analyzed by each detector. Combining the two detectors allows intrusions to be detected effectively by exploiting the merits of both methods. The combined system is therefore highly accurate in anomaly detection, and [4] shows the effectiveness of the parallel system through experiments using the DARPA data set [9].

However, the problem of large buffer storage in flow-based analysis still needs to be addressed. To reduce the amount of data analyzed by flow-based analysis, packet sampling [10-12] and short timeouts [13] have been proposed. With packet sampling, however, it is difficult to observe flows that consist of only a few packets, so there is a high chance of missing important packets during detection. Since novel worms tend to consist of few packets in order to spread as fast as possible [14], such worms are unlikely to be sampled. With short timeouts, a long traffic flow is split up whenever the interval between packet arrivals exceeds the flow timeout, which increases the number of flows and degrades efficiency and accuracy [11]. As a result, we consider that these approaches suffer from a lack of information for detecting anomalous events and exhibit low detection accuracy.
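Both effects can be made concrete with a short Python sketch. It assumes each packet is sampled independently with probability `sampling_rate`, which is a simplification and not necessarily the scheme used in [10-12], and that a flow is split whenever the gap between consecutive packets exceeds `timeout`. The function names are hypothetical.

```python
def miss_probability(n_packets, sampling_rate):
    """Probability that no packet of an n-packet flow is sampled.

    Assumes independent per-packet sampling with the given rate.
    """
    return (1.0 - sampling_rate) ** n_packets


def count_flow_fragments(timestamps, timeout):
    """Number of flow records produced when gaps longer than `timeout` split a flow."""
    if not timestamps:
        return 0
    ordered = sorted(timestamps)
    gaps = zip(ordered, ordered[1:])
    return 1 + sum(1 for earlier, later in gaps if later - earlier > timeout)


if __name__ == "__main__":
    # A three-packet flow is almost always missed at a 1% sampling rate,
    # while a thousand-packet flow is almost always seen.
    print(miss_probability(3, 0.01), miss_probability(1000, 0.01))
    # The same packet arrivals yield more flow records under a shorter timeout.
    arrivals = [0, 1, 20, 21, 50]
    print(count_flow_fragments(arrivals, 60), count_flow_fragments(arrivals, 10))
```

Under these assumptions the miss probability is (1 - p)^n, so short flows are the ones most likely to go unobserved, and a shorter timeout can only keep or increase the number of flow records for a given packet sequence.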

Since packet sampling and short timeouts reduce the data of each flow without any regard to how anomalous it is, they may discard information needed to detect anomalous flows. To avoid this loss of information, the number of flows, rather than the data of each flow, should be reduced according to appropriate criteria.

3. A Multi-Stage Anomaly Detection System

In this section, we propose a multi-stage network anomaly detection system. It uses less buffer space, yet detects intrusions with high accuracy.
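The two-stage arrangement described in the introduction can be sketched roughly as follows. This is not the authors' algorithm: the rule used here to select flows (every flow with a packet in a timeslot judged anomalous) and the two detector callbacks `slot_is_anomalous` and `flow_is_anomalous` are assumptions made only for illustration. Packets are assumed to be `(timestamp, flow_id, size)` tuples.

```python
def multi_stage_detect(packets, slot_len, slot_is_anomalous, flow_is_anomalous):
    """Return the set of flow ids flagged by a hypothetical two-stage pipeline.

    Stage 1 judges whole timeslots; stage 2 inspects only the flows that
    appear in timeslots flagged by stage 1.
    """
    # Stage 1: bucket packets by timeslot.
    slots = {}
    for pkt in packets:
        idx = int(pkt[0] // slot_len)
        slots.setdefault(idx, []).append(pkt)

    # Assumed selection rule: keep packets of flows seen in anomalous timeslots.
    suspicious = {}
    for slot_pkts in slots.values():
        if slot_is_anomalous(slot_pkts):
            for pkt in slot_pkts:
                suspicious.setdefault(pkt[1], []).append(pkt)

    # Stage 2: per-flow inspection restricted to the suspicious flows.
    return {fid for fid, fpkts in suspicious.items() if flow_is_anomalous(fpkts)}


if __name__ == "__main__":
    traffic = [(1, "a", 100), (2, "b", 100), (61, "a", 100),
               (62, "c", 100), (63, "c", 100), (64, "c", 100)]
    flagged = multi_stage_detect(
        traffic,
        slot_len=60,
        slot_is_anomalous=lambda pkts: len(pkts) >= 4,  # toy threshold
        flow_is_anomalous=lambda pkts: len(pkts) >= 3,  # toy threshold
    )
    print(flagged)
```

In this sketch, flow "b" never reaches the second stage because it appears only in a timeslot judged normal, so no per-flow buffer is needed for it. For brevity the sketch buffers the packets of every timeslot, which a real first stage would avoid by keeping only aggregate features.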

3.1. Outline

Figure 1(b) shows the overview of the proposed multistage

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] M. Roesch, “Snort-Lightweight Intrusion Detection for Networks,” LISA’99 Proceedings of the 13th USENIX Conference on System Administration, USENIX Association, Berkeley, 7-12 November 1999.
[2] D. Anderson, T. F. Lunt, H. Javits, A. Tamaru and A. Baldes, “Detecting Unusual Program Behavior Using the Statistical Component of the Next-Generation Intrusion Detection Expert System (NIDES),” Computer Science Laboratory SRI-CSL 95-06, May 1995.
[3] R. Sekar, M. Bendre, D. Dhurjati and P. Bollineni, “A Fast Automaton-Based Method for Detecting Anomalous Program Behaviors,” Proceedings of the 2001 IEEE Symposium on Security and Privacy, Oakland, 2001.
[4] Y. Sato, Y. Waizumi and Y. Nemoto, “Improving Accuracy of Network-Based Anomaly Detection Using Multiple Detection Modules,” Proceedings of IEICE Technical Report, NS2004-144, 2004, pp. 45-48.
[5] P. Barford, J. Kline, D. Plonka and A. Ron, “A Signal Analysis of Network Traffic Anomalies,” Proceedings of ACM SIGCOMM Internet Measurement Workshop (IMW) 2002, Marseille, November 2002, pp. 71-82. doi:10.1145/637201.637210
[6] T. Oikawa, Y. Waizumi, K. Ohta, N. Kato and Y. Nemoto, “Network Anomaly Detection Using Statistical Clustering Method,” Proceedings of IEICE Technical Report, NS2002-143, IN2002-87, CS2002-98, October 2002, pp. 83-88.
[7] Y. Waizumi, D. Kudo, N. Kato and Y. Nemoto, “A New Network Anomaly Detection Technique Based on Per-Flow and Per-Service Statistics,” Proceedings of International Conference on Computational Intelligence and Security, Xi’an, 15-19 December 2005, pp. 252-259.
[8] A. Lakhina, M. Crovella and C. Diot, “Characterization of Network-Wide Anomalies in Traffic Flows,” Proceedings of the ACM/SIGCOMM Internet Measurement Conference, Taormina, 25-27 October 2004, pp. 201-206.
[9] “DARPA Intrusion Detection Evaluation,” MIT Lincoln Laboratory, Lincoln, 2011. http://www.ll.mit.edu/IST/ideval/index.html.
[10] InMon Corporation, “Flow Accuracy and Billing,” 2011. http://www.inmon.com/pdf/sFlowBillilng.pdf.
[11] N. Duffield, C. Lund and M. Thorup, “Properties and Prediction of Flow Statistics from Sampled Packet Streams,” Proceedings of ACM SIGCOMM Internet Measurement Workshop (IMW), Marseille, 6-8 November 2002. doi:10.1145/637201.637225
[12] N. Duffield, C. Lund and M. Thorup, “Flow Sampling under Hard Resource Constraints,” Proceedings of ACM SIGMETRICS, New York, 10-14 June 2004.
[13] “NetFlow,” 2011. http://www.cisco.com/warp/public/732/Tech/nmp/netflow/index.shtml.
[14] P. Akritidis, K. Anagnostakis and E. P. Markatos, “Efficient Content-Based Detection of Zero-Day Worms,” Proceedings of the International Conference on Communications (ICC 2005), Seoul, 16-20 May 2005.
[15] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba and K. Das, “The 1999 DARPA Off-Line Intrusion Detection Evaluation,” Computer Networks, Vol. 34, No. 4, 2000, pp. 579-595. doi:10.1016/S1389-1286(00)00139-0
[16] P. Neumann and P. Porras, “Experience with EMERALD to DATE,” Proceedings of 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, 9-12 April 1999, pp. 73-80.
[17] G. Vigna, S. T. Eckmann and R. A. Kemmerer, “The STAT Tool Suite,” Proceedings of the 2000 DARPA Information Survivability Conference and Exposition (DISCEX), Hilton Head, 25-27 January 2000.
[18] S. Jajodia, D. Barbara, B. Speegle and N. Wu, “Audit Data Analysis and Mining (ADAM),” 2000 http://www.isse.gmu.edu/dbarbara/adam.html
[19] M. Tyson, P. Berry, N. Williams, D. Moran and D. Blei, “DERBI: Diagnosis, Explanation and Recovery from Computer Break-Ins,” 2000.
[20] M. Mahoney, “Network Traffic Anomaly Detection Based on Packet Bytes,” Proceedings of ACM-SAC, Melbourne, 9-12 March 2003, pp. 346-350.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.