Data Stream Subspace Clustering for Anomalous Network Packet Detection


As the Internet has increased connectivity among people, it has also fallen prey to malicious users who exploit its resources to gain illegal access to critical information. To protect computer networks from external attacks, two types of Intrusion Detection Systems (IDSs) are commonly deployed. Signature-based IDSs detect intrusions efficiently by scanning network packets and comparing them with human-generated signatures that describe previously observed attacks. Anomaly-based IDSs can detect new attacks by modeling normal network traffic, without the need for a human expert. Despite this advantage, anomaly-based IDSs are limited by a high false-alarm rate and by difficulty detecting attacks that attempt to blend in with normal traffic. In this study, we propose StreamPreDeCon, an anomaly-based IDS. StreamPreDeCon is an extension of the preference subspace clustering algorithm PreDeCon, designed to resolve some of the challenges associated with anomalous packet detection. Using network packets extracted from the first week of the DARPA '99 intrusion detection evaluation dataset, combined with Generic HTTP, Shellcode, and CLET attacks, our IDS achieved 94.4% sensitivity and a 0.726% false positive rate in the best case. To measure the overall effectiveness of the IDS, the average sensitivity and false positive rate were calculated under two settings: maximum sensitivity and minimum false positive rate. At maximum sensitivity, the IDS averaged 80% sensitivity and 9% false positives. When the number of false positives was minimized, it averaged 63% sensitivity with a 0.4% false positive rate. These rates improve on the results of a previous study: sensitivity generally increased while the false positive rate decreased.
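For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and false positive rate are conventionally computed from packet-level detection counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of attack packets that the detector flagged as anomalous."""
    total_attacks = true_pos + false_neg
    return true_pos / total_attacks if total_attacks else 0.0


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Fraction of normal packets that the detector wrongly flagged."""
    total_normal = false_pos + true_neg
    return false_pos / total_normal if total_normal else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, fn = 45, 5     # attack packets: detected / missed
fp, tn = 3, 297    # normal packets: wrongly flagged / correctly passed

print(f"sensitivity         = {sensitivity(tp, fn):.1%}")
print(f"false positive rate = {false_positive_rate(fp, tn):.1%}")
```

An anomaly detector typically trades one metric against the other, which is why the abstract reports averages both at maximum sensitivity and at minimum false positive rate.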

Share and Cite:

Z. Miller and W. Hu, "Data Stream Subspace Clustering for Anomalous Network Packet Detection," Journal of Information Security, Vol. 3 No. 3, 2012, pp. 215-223. doi: 10.4236/jis.2012.33027.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.