TITLE:
Knowledge Discovering in Corporate Securities Fraud by Using Grammar Based Genetic Programming
AUTHORS:
Hai-Bing Li, Man-Leung Wong
KEYWORDS:
Knowledge Discovering; Rule Induction; Token Competition; SMOTE; Corporate Securities Fraud Detection; Grammar-Based Genetic Programming
JOURNAL NAME:
Journal of Computer and Communications, Vol.2 No.4, March 18, 2014
ABSTRACT:
Securities fraud is a common worldwide problem that causes serious harm to
securities markets each year. Securities regulatory commissions in various
countries attach great importance to detecting and preventing securities
fraud. In China, securities fraud is also increasing because of the rapid
expansion of the securities market. Data mining techniques could help the
China Securities Regulatory Commission (CSRC) detect securities fraud. In this
paper, we investigate the usefulness of logistic regression, Neural Networks
(NNs), Sequential Minimal Optimization (SMO), Radial Basis Function (RBF)
networks, Bayesian networks, and Grammar-Based Genetic Programming (GBGP) for
classifying the real, large, and recent China Corporate Securities Fraud (CCSF)
database. The six data mining techniques are compared in terms of their
performance, and we find that GBGP outperforms the others. This paper
describes in detail how GBGP is applied to the CCSF problem. In addition, the
Synthetic Minority Over-sampling Technique (SMOTE) is applied to generate
synthetic minority class examples for the imbalanced CCSF dataset.
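
To illustrate the SMOTE step mentioned in the abstract, the following is a minimal sketch of how synthetic minority examples can be generated by interpolating between a minority sample and one of its nearest minority-class neighbours. It is not the authors' implementation: the function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration, all features are assumed to be numeric, and one interpolation factor is drawn per synthetic point.

```python
import math
import random


def smote(minority, per_sample=1, k=5, seed=0):
    """Return synthetic points built from a list of numeric minority samples.

    Hypothetical helper for illustration: for every minority sample,
    `per_sample` new points are placed on the line segment between that
    sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for i, base in enumerate(minority):
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbours = others[:k]
        for _ in range(per_sample):
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(
                [b + gap * (n - b) for b, n in zip(base, neighbour)]
            )
    return synthetic


# Toy minority-class feature vectors (made-up values, not CCSF data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
synthetic = smote(minority, per_sample=2, k=3, seed=42)
print(len(synthetic), "synthetic examples generated")
```

Because each synthetic point lies between two existing minority samples, the oversampled class stays inside the region the minority data already occupies, rather than simply duplicating records.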