A Novel Approach to Disqualify Datasets Using Accumulative Statistical Spread Map with Neural Networks (ASSM-NN)

Abstract

A novel approach to detecting and filtering out an unhealthy dataset from a matrix of datasets is developed, tested, and validated. The technique employs a new type of self-organizing map, the Accumulative Statistical Spread Map (ASSM), to establish the destructive effect that a dataset would have on the rest of the matrix if it remained in that matrix. The ASSM is supported by training a neural network engine, which determines which dataset is responsible for the network's inability to learn, classify, and predict. The experiments showed that a neural system was unable to learn in the presence of an unhealthy dataset with deviating characteristics, even though that dataset was produced under the same conditions and through the same process as the other datasets in the matrix. Such a dataset should therefore be disqualified and either removed completely or transferred to another matrix. The approach is useful for recognizing datasets and features that do not belong to their source, and it could serve as an effective tool for detecting suspicious activity in secure filing, communication, and data storage.
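The abstract does not define how the ASSM is computed, so the following is only an illustrative sketch of the general idea of accumulating statistical spread per dataset and flagging the one that deviates. It is not the paper's algorithm. The function names, the point-wise median reference, the median-absolute-deviation score, and the 3.5 threshold are all assumptions made for this example, and the neural network stage is left out entirely.

```python
from statistics import median


def accumulated_spread(datasets):
    """Sum, per dataset, the absolute deviations from a point-wise median profile.

    Assumption: this running total stands in for one dataset's accumulated
    spread; the paper's actual ASSM construction is not given in the abstract.
    """
    n_points = len(datasets[0])
    # Reference profile: the median across all datasets at each sample index.
    reference = [median(d[i] for d in datasets) for i in range(n_points)]
    return [sum(abs(x - r) for x, r in zip(d, reference)) for d in datasets]


def flag_outlier_datasets(datasets, threshold=3.5):
    """Return indices of datasets whose accumulated spread is unusually high.

    Uses a one-sided modified z-score based on the median absolute deviation
    (MAD). The threshold of 3.5 is a common rule of thumb, not a value taken
    from the paper.
    """
    totals = accumulated_spread(datasets)
    center = median(totals)
    mad = median(abs(t - center) for t in totals)
    if mad == 0:
        # Degenerate case: most totals are identical, so flag anything above them.
        return [i for i, t in enumerate(totals) if t > center]
    scores = [0.6745 * (t - center) / mad for t in totals]
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical matrix: four similar datasets and one with deviating values.
datasets = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [0.9, 1.9, 3.1, 3.9],
    [1.0, 2.2, 3.0, 4.1],
    [3.0, 0.5, 6.0, 1.0],
]

print(flag_outlier_datasets(datasets))
```

In this toy matrix the fifth dataset (index 4) accumulates far more spread than the others and is the only one flagged. A fuller pipeline along the lines of the abstract would then confirm the effect by training a network with and without the flagged dataset.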

Cite as:

Iskandarani, M. (2015) A Novel Approach to Disqualify Datasets Using Accumulative Statistical Spread Map with Neural Networks (ASSM-NN). Intelligent Information Management, 7, 139-152. doi: 10.4236/iim.2015.73013.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.