Journal of Intelligent Learning Systems and Applications

Volume 7, Issue 2 (May 2015)

ISSN Print: 2150-8402   ISSN Online: 2150-8410

Google-based Impact Factor: 2.33

An Online Malicious Spam Email Detection System Using Resource Allocating Network with Locality Sensitive Hashing

PP. 42-57
DOI: 10.4236/jilsa.2015.72005

ABSTRACT

In this paper, we propose a new online system that quickly detects malicious spam emails and, by updating itself daily, adapts to changes in email contents and in the Uniform Resource Locator (URL) links that lead to malicious websites. We introduce an autonomous function that lets a server generate training examples: double-bounce emails are collected automatically, and their class labels are assigned by SPIKE, a crawler-type software tool that analyzes the maliciousness of websites. Because spammers generally use botnets to spread large numbers of malicious emails within a short time, such distributed spam emails often have the same or similar contents, so it is not necessary to learn all of them. To adapt quickly to new malicious campaigns, only new types of spam emails should be selected for learning, which can be realized by introducing an active learning scheme into the classifier model. For this purpose, we adopt the Resource Allocating Network with Locality Sensitive Hashing (RAN-LSH) as a classifier model with a data selection function. In RAN-LSH, spam emails that are the same as or similar to those already learned are quickly looked up in a Locality Sensitive Hashing (LSH) hash table, and matched emails that fall in “well-learned” regions are discarded without being used as training data. To analyze email contents, we adopt the Bag of Words (BoW) approach and generate feature vectors whose attributes are transformed using the normalized term frequency-inverse document frequency (TF-IDF). To evaluate the performance of the proposed system, we use a data set of double-bounce spam emails collected at the National Institute of Information and Communications Technology (NICT) in Japan from March 1, 2013 to May 10, 2013. The results confirm that the proposed system detects malicious spam emails with a high detection rate.
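
The data selection idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It builds L2-normalized TF-IDF vectors from tokenized emails and skips any email whose LSH bucket has already been seen. Several parts are assumptions made for illustration: the smoothed IDF formula, the use of random-hyperplane hashing as the LSH family (the abstract does not specify the hash function), the helper names `tfidf_vectors`, `make_hasher`, and `select_novel`, and the simplification that "already in a bucket" stands in for the paper's "well-learned" test, which in RAN-LSH also depends on the classifier.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF vector (term -> weight) per tokenized doc."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1, always positive.
        vec = {
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def make_hasher(vocab, n_bits=16, seed=0):
    """Build a random-hyperplane LSH function (an assumed LSH family)."""
    rng = random.Random(seed)
    terms = sorted(vocab)  # fixed order keeps the hash reproducible
    planes = [{t: rng.gauss(0.0, 1.0) for t in terms} for _ in range(n_bits)]

    def hash_vec(vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(
            int(sum(w * plane[t] for t, w in vec.items()) >= 0.0)
            for plane in planes
        )

    return hash_vec


def select_novel(vectors, hash_vec):
    """Return indices of vectors whose LSH bucket has not been seen before."""
    seen_buckets = set()
    selected = []
    for i, vec in enumerate(vectors):
        key = hash_vec(vec)
        if key in seen_buckets:
            continue  # treated as already learned in this simplified sketch
        seen_buckets.add(key)
        selected.append(i)
    return selected


# Hypothetical toy emails: the first two are identical, the third differs.
emails = [
    "click here to claim your prize now".split(),
    "click here to claim your prize now".split(),
    "meeting agenda attached for monday review".split(),
]
vectors = tfidf_vectors(emails)
vocab = {t for doc in emails for t in doc}
hash_vec = make_hasher(vocab)
selected = select_novel(vectors, hash_vec)
```

In this toy run the duplicate email hashes to the same bucket as the first one and is skipped. Emails that are similar but not identical collide only with some probability, which falls as the number of hash bits grows, so `n_bits` trades recall of near-duplicates against the risk of discarding genuinely new emails.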

Share and Cite:

Ali, S., Ozawa, S., Nakazato, J., Ban, T. and Shimamura, J. (2015) An Online Malicious Spam Email Detection System Using Resource Allocating Network with Locality Sensitive Hashing. Journal of Intelligent Learning Systems and Applications, 7, 42-57. doi: 10.4236/jilsa.2015.72005.

Cited by

[1] A Dynamic Locality Sensitive Hashing Algorithm for Efficient Security Applications
2023 International Conference …, 2023
[2] A Survey on Locality Sensitive Hashing Algorithms and their Applications
2021
[3] Cyber security threats, challenges and defence mechanisms in cloud computing
2020
[4] Analysis and Evaluation of E-mail Security
2018
[5] An Efficient Spam Classification System Using Ensemble Machine Learning Algorithm
JASC: Journal of Applied Science and Computations, 2018
[6] An RBF-NN Algorithm with Transfer Learning Capability and Its Application
2018
[7] A Survey of Email Service; Attacks, Security Methods and Protocols
International Journal of Computer Applications, 2017
[8] An approach for Malicious Spam Detection In Email with comparison of different classifiers
International Research Journal of Engineering and Technology, 2017
[9] A Proposed Model for Malicious Spam Detection in Email Systems of Educational Institutes
2016
[10] Incremental learning for large-scale stream data and its application to cybersecurity
2015

Copyright © 2025 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.