
Classification and Novel Class Detection in Data Streams Using Strings

DOI: 10.4236/oalib.1101507

ABSTRACT

Data streams are continuous and evolve over time, which makes them difficult to handle with simple, static strategies. Data streams pose four main challenges to researchers: infinite length, concept-drift, concept-evolution and feature-evolution. Infinite length arises because the amount of data is unbounded. Concept-drift is due to gradual changes in the underlying concept of the stream. Concept-evolution occurs when previously unknown classes appear in the data. Feature-evolution occurs because new features keep appearing in the stream while older ones disappear. To perform any analysis on such data, we first need to convert it into a form from which knowledge can be extracted, and we also need to handle the above-mentioned challenges. Various strategies have been proposed to tackle these difficulties, but most of them focus only on infinite length and concept-drift. In this paper, we propose a string-based strategy to handle infinite length, concept-evolution and concept-drift.
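As a rough illustration of the terms defined above, the following Python sketch walks a small labeled stream chunk by chunk. It is not the string-based strategy proposed in the paper, whose details are not given in this abstract. The function name `summarize_stream`, the chunk and window sizes, and the toy records are all assumptions made for illustration. A fixed-size window stands in for the infinite-length constraint, a label unseen in the window stands in for concept-evolution, and features appearing or disappearing stand in for feature-evolution. Concept-drift is not modeled here.

```python
from collections import deque


def summarize_stream(stream, chunk_size=2, max_chunks=2):
    """Toy walk over a labeled stream; not the paper's string-based method."""
    window = deque(maxlen=max_chunks)  # bounded memory for an unbounded stream
    events = []
    chunk = []
    for features, label in stream:
        chunk.append((features, label))
        if len(chunk) < chunk_size:
            continue
        labels = {lab for _, lab in chunk}
        names = {name for feats, _ in chunk for name in feats}
        if window:
            # Compare the new chunk only against what the window still holds.
            seen_labels = set().union(*(c[0] for c in window))
            seen_names = set().union(*(c[1] for c in window))
            for lab in sorted(labels - seen_labels):
                events.append(("new_class", lab))        # concept-evolution
            for name in sorted(names - seen_names):
                events.append(("new_feature", name))     # feature-evolution
            for name in sorted(seen_names - names):
                events.append(("missing_feature", name))
        window.append((labels, names))  # the oldest chunk is dropped when full
        chunk = []
    return events, len(window)


# Hypothetical records: (feature dict, class label).
toy_stream = [
    ({"a": 1.0, "b": 0.2}, "normal"),
    ({"a": 0.9, "b": 0.1}, "normal"),
    ({"a": 0.8, "c": 0.7}, "normal"),
    ({"a": 0.1, "c": 0.9}, "attack"),
    ({"c": 0.5, "d": 0.3}, "attack"),
    ({"c": 0.4, "d": 0.2}, "normal"),
]
events, kept = summarize_stream(toy_stream)
```

Because only the last `max_chunks` chunks are retained, a class that has dropped out of the window would be reported as new again. A real detector would also have to work on unlabeled instances, which this sketch does not attempt.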

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

Singh, R. and Chandak, M. (2015) Classification and Novel Class Detection in Data Streams Using Strings. Open Access Library Journal, 2, 1-8. doi: 10.4236/oalib.1101507.


  

Copyright © 2019 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.