Support Vector Machine and Random Forest Modeling for Intrusion Detection System (IDS)

Abstract

The success of any Intrusion Detection System (IDS) is a complicated problem due to its nonlinearity and the quantitative or qualitative network traffic data stream with many features. To get rid of this problem, several types of intrusion detection methods have been proposed and shown different levels of accuracy. This is why the choice of the effective and robust method for IDS is very important topic in information security. In this work, we have built two models for the classification purpose. One is based on Support Vector Machines (SVM) and the other is Random Forests (RF). Experimental results show that either classifier is effective. SVM is slightly more accurate, but more expensive in terms of time. RF produces similar accuracy in a much faster manner if given modeling parameters. These classifiers can contribute to an IDS system as one source of analysis and increase its accuracy. In this paper, KDD’99 Dataset is used and find out which one is the best intrusion detector for this dataset. Statistical analysis on KDD’99 dataset found important issues which highly affect the performance of evaluated systems and results in a very poor evaluation of anomaly detection approaches. The most important deficiency in the KDD’99 dataset is the huge number of redundant records. To solve these issues, we have developed a new dataset, KDD99Train+ and KDD99Test+, which does not include any redundant records in the train set as well as in the test set, so the classifiers will not be biased towards more frequent records. The numbers of records in the train and test sets are now reasonable, which make it affordable to run the experiments on the complete set without the need to randomly select a small portion. The findings of this paper will be very useful to use SVM and RF in a more meaningful way in order to maximize the performance rate and minimize the false negative rate.

Share and Cite:

Hasan, M. , Nasser, M. , Pal, B. and Ahmad, S. (2014) Support Vector Machine and Random Forest Modeling for Intrusion Detection System (IDS). Journal of Intelligent Learning Systems and Applications, 6, 45-52. doi: 10.4236/jilsa.2014.61005.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] H. G. Kayacik, A. N. Zincir-Heywood and M. I. Heywood, “Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD 99 Benchmark,” Proceedings of the PST 2005—International Conference on Privacy, Security, and Trust, 2005, pp. 85-89.
[2] A. A. Olusola., A. S. Oladele and D. O. Abosede, “Analysis of KDD ’99 Intrusion Detection Dataset for Selection of Relevance Features,” Proceedings of the World Congress on Engineering and Computer Science I, San Francisco, 20-22 October 2010.
[3] H. Altwaijry and S. Algarny, “Bayesian Based Intrusion Detection System,” Journal of King Saud University— Computer and Information Sciences, Vol. 24, No. 1, 2012, pp. 1-6.
[4] O. A. Adebayo, Z. Shi, Z. Shi and O. S. Adewale, “Network Anomalous Intrusion Detection using Fuzzy-Bayes,” IFIP International Federation for Information Processing, Vol. 228, 2007, pp. 525-530.
[5] B. Pal and M. A. M. Hasan, “Neural Network & Genetic Algorithm Based Approach to Network Intrusion Detection & Comparative Analysis of Performance,” Proceedings of the the 15th International Conference on Computer and Information Technology, Chittagong, Bangladesh, 2012.
[6] S. M. Bridges and R. B. Vaughn, “Fuzzy Data Mining and Genetic Algorithms Applied To Intrusion Detection,” Proceedings of the National Information Systems Security Conference (NISSC), Baltimore, October 2000, pp. 16-19.
[7] M. S. Abadeh and J. Habibi, “Computer Intrusion Detection Using an Iterative Fuzzy Rule Learning Approach,” Proceedings of the IEEE International Conference on Fuzzy Systems, London, 2007, pp. 1-6.
[8] B. Shanmugam and N. BashahIdris, “Improved Intrusion Detection System Using Fuzzy Logic for Detecting Anamoly and Misuse Type of Attacks,” Proceedings of the International Conference of Soft Computing and Pattern Recognition, 2009, pp. 212-217.
[9] J. T. Yao, S. L. Zhao and L. V. Saxton, “A study on Fuzzy Intrusion Detection,” Proceedings of Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security, 2005, pp. 23-30.
[10] Q. Wang and V. Megalooikonomou, “A Clustering Algorithm for Intrusion Detection,” Proceedings of the Conference on Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security, Vol. 5812, 2005, pp. 31-38.
[11] V. N. Vapnik, “Statistical Learning Theory,” 1st Edition, John Wiley and Sons, New York, 1998.
[12] L. Breiman, “Random Forests,” Machine Learning, Vol. 45, No. 1, 2001, pp. 5-32.
[13] MIT Lincoln Laboratory, “DARPA Intrusion Detection Evaluation,” 2010.
[14] “KDD’99 Dataset,” 2010. http://kdd.ics.uci.edu/databases
[15] M. Bahrololum, E. Salahi and M. Khaleghi, “Anomaly Intrusion Detection Design Using Hybrid of Unsupervised and Supervised Neural Network,” International Journal of Computer Networks & Communications (IJCNC), Vol. 1, No. 2, 2009, pp. 26-33.
[16] H. F. Eid, A. Darwish, A. E. Hassanien and A. Abraham, “Principle Components Analysis and Support Vector Machinebased Intrusion Detection System,” 10th International Conference on Intelligent Systems Design and Applications, Cairo, November 29 2010-December 1 2010, pp. 363-367.
[17] M. Tavallaee, E. Bagheri, W. Lu and A. A. Ghorbani, “A Detailed Analysis of the KDD CUP 99 Data Set,” Proceedings of the 2009 IEEE Symposium on Computational Intelligence in Security and Defense Applications, Ottawa, 8-10 July 2009, pp. 1-6.
[18] V. N. Vapnik, “The Nature of Statistical Learning Theory,” Second Edition, Springer, New York, 1999.
[19] B. Scholkopf and A. J. Smola, “Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond,” The MIT Press Cambridge, Massachusetts London, England, 2001.
[20] V. Jaiganesh and P. Sumathi, “Intrusion Detection Using Kernelized Support Vector Machine with Levenbergmarquardt Learning,” International Journal of Engineering Science and Technology (IJEST), Vol. 4 No. 03, 2012, pp. 1153-1160.
[21] A. Ben-Hur and J. Weston, “A User’s guide to Support Vector Machines,” In: O. Carugo and F. Eisenhaber, Eds., Biological Data Mining, Springer Protocols, 2009, pp. 223-239.
[22] A. Mewada, P. Gedam, S. Khan and M. U. Reddy, “Network Intrusion Detection Using Multiclass Support Vector Machine,” Special Issue of IJCCT, Vol. 1, No. 2-4, 2010, pp. 172-175.
[23] A. Liaw and M. Wiener, “Classification and Regression by Random Forest,” R News, Vol. 2/3, 2002.
[24] V. Svetnik, A. Liaw, C. Tong, J. C.Culberson, R. P. Sheridan and B. P. Feuston, “Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling,” Journal of Chemical Information and Computer Sciences, Vol. 43, No. 6, 2003, pp. 19471958.
[25] Y. Tang, S. Krasser, Y. He, W. Yang and D. Alperovitch, “Support Vector Machines and Random Forests Modeling for Spam Senders Behavior Analysis,” Proceedings of IEEE Global Communications Conference (IEEE GLOBECOM 2008), Computer and Communications Network Security Symposium, New Orleans, 2008, pp. 1-5.
[26] M. Tavallaee, E. Bagheri, W. Lu and A. A. Ghorbani, “A Detailed Analysis of the KDD CUP 99 Data Set,” Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications, Ottawa, 2009, pp. 53-58.
[27] F. Kuang, W. Xu, S. Zhang, Y. Wang and K. Liu, “A Novel Approach of KPCA and SVM for Intrusion Detection,” Journal of Computational Information Systems, Vol. 8, No. 8, 2012, pp. 3237-3244.
[28] S. Lakhina, S. Joseph and B. Verma, “Feature Reduction using Principal Component Analysis for Effective Anomaly-Based Intrusion Detection on NSL-KDD,” International Journal of Engineering Science and Technology, Vol. 2, No. 6, 2010, pp. 1790-1799.
[29] J. T. Yao, S. Zhao and L. Fan, “An Enhanced Support Vector Machine Model for Intrusion Detection,” RSKT’06 Proceedings of the First international conference on Rough Sets and Knowledge Technology, Vol. 4062, 2006, pp. 538-543.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.