Data Classification Using Combination of Five Machine Learning Techniques

Data clustering plays a vital role in object identification. In real life the concept is mainly used in biometric identification and object detection. In this paper we use Fuzzy Weighted Rules, Fuzzy Inference System (FIS), Fuzzy C-Means clustering (FCM), Support Vector Machine (SVM) and Artificial Neural Network (ANN) to distinguish three classes of Iris data: Iris-Setosa, Iris-Versicolor and Iris-Virginica. Each record in the data table is a four-dimensional vector whose components serve as the input variables: Sepal Length (SL), Sepal Width (SW), Petal Length (PL) and Petal Width (PW). The combination of the five machine learning methods provides above 98% accuracy of class identification.


Introduction
In this paper five widely used methods, namely Fuzzy weighted rule, FIS, FCM, SVM and ANN, are integrated for the classification of Iris data. Several works related to the paper are mentioned in this section. In [1] the authors use the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Fuzzy Inference System (FIS) for professional blogger classification, where FIS provides better results compared to Classification Based on Associations (CBA). The combination of Artificial Neural Network (ANN) and ANFIS gives better classification, whereas the proposed ANFIS of that paper shows the best result, which is 93%. The concept of FIS in data classification is also found in [2], where faults of an electrical transmission line are detected and classified.
In [3], fuzzy weighted rules are used to classify Iris data using seven membership functions (MFs). The average classification rate is found to be 96.48%, 96.06% and 96.7% for 7, 9 and 11 labels of MFs, respectively. The main drawback of that paper is that it deals with only a single method of classification; therefore there is scope for including other data segregation algorithms. Fuzzy rule-based classification is found in [4] for coronary artery disease data, where trapezoidal membership functions are used for the input variables. The classification rate varies with the weighting rules: the maximum value is found to be 92.8% and the minimum 71.8%. In this paper, we apply fuzzy c-means clustering to Iris data classification; a similar concept is used for MR brain image segmentation in [5]. There the entire algorithm of c-means clustering is shown, the performance of image classification is compared across seven different methods, and fuzzy c-means clustering provides a moderate result. An application of FCM in image classification is found in [6], where FCM is combined with a Convolutional Neural Network (CNN) to recognize tumors in the brain. The accuracy of detection claimed by the authors is 91%. Applications of FCM in image classification are also found in [7] [8]. The SVM is used for data classification in [9], where text-based automatic task classification is done. The authors claim a classification accuracy in the range of 82% to 99%. A similar concept is found in [10] for breast cancer diagnosis, where three different types of kernels are used and the accuracy is found to be above 90% in all cases.
In this paper we combine all five algorithms to classify Iris data, although the concept of the paper is applicable to any type of data or feature vector-based image classification. The main objective of the paper is to achieve high classification accuracy while avoiding deep learning techniques, so that the process time remains low. In particular, the inclusion of the Fuzzy weighted rule plays a vital role in data classification. Most of the previous works did not include the Fuzzy weighted rule; hence they had to include deep learning to acquire high classification accuracy, which needs a large process time. The combination of the five methods, done as in [11], is found to be more robust than previous works. We compare the result of the paper (using the same data set) with two previous works and find a better result, which is shown in the result section.
The rest of the paper is organized as follows: Section 2 provides the theoretical analysis of the five machine learning algorithms used in this paper for data classification, Section 3 provides results based on the analysis of Section 2, and Section 4 concludes the entire analysis.

Fuzzy Inference System (FIS)
A Fuzzy Inference System (FIS) consists of three building blocks: Fuzzification, Inference and De-fuzzification. The numerical data is converted to Fuzzy symbols using membership functions (MFs) consisting of several variables, where each variable has its own range of numerical values. This conversion technique is called Fuzzification. The Inference block applies rules in if-then form to relate input and output. Finally, the output symbols are converted to a numerical value by applying the De-fuzzification technique to the output MFs.

Fuzzy Weighted Rule
The detailed analysis of the Fuzzy weighted rule is shown in [3] with a numerical example. In this paper we show the steps of the algorithm in a different way, as below:
11. Repeat steps 9 and 10 for the rest of the input variables.
12. For each input record of N-tuple determine the weighted co-variance of each rule, where Xj is the jth input Fuzzy variable and i denotes the ith rule. The highest value of R corresponding to the kth rule indicates that the input tuple belongs to the output of the kth category.
In this subsection a few numerical examples are shown according to the steps of the Fuzzy weighted rule. First of all, we take a few records of Iris data under three categories, Iris-Setosa, Iris-Versicolor and Iris-Virginica, shown in Table 1. For each category the four inputs (SL, SW, PL and PW) and the corresponding output are taken as the initial data shown in Table 1. For better understanding by the reader, we chose the same initial data as [3] and we elaborate the initial data processing steps more explicitly than the previous paper.
For each input SL, SW, PL or PW we consider 7 trapezoidal membership functions named HN, MN, SN, Z, SP, MP and HP, as shown in Figures 1(a)-(d) for the four input variables. The MFs of the three output classes are shown in Figure 2.
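The fuzzification step with seven trapezoidal MFs can be sketched as below. This is a minimal illustration only: the evenly spaced breakpoints produced by `make_mfs` are an assumption made for the example and are not the breakpoints of Figures 1(a)-(d), and the function names are hypothetical.

```python
LABELS = ["HN", "MN", "SN", "Z", "SP", "MP", "HP"]


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership grade: 1 on [b, c], 0 outside (a, d), linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising slope
    return (d - x) / (d - c)       # falling slope


def make_mfs(lo, hi, labels=LABELS):
    """Spread overlapping trapezoids evenly over [lo, hi] (assumed layout)."""
    step = (hi - lo) / (len(labels) - 1)
    mfs = {}
    for k, name in enumerate(labels):
        centre = lo + k * step
        # Flat top of width step/2; neighbouring slopes cross at grade 0.5.
        mfs[name] = (centre - 0.75 * step, centre - 0.25 * step,
                     centre + 0.25 * step, centre + 0.75 * step)
    return mfs


def fuzzify(x, mfs):
    """Return the label with the highest grade and all grades for value x."""
    grades = {name: trapezoid(x, *params) for name, params in mfs.items()}
    best = max(grades, key=grades.get)
    return best, grades


# Example: fuzzify a sepal length, assuming SL ranges over 4.3 to 7.9 cm.
sl_mfs = make_mfs(4.3, 7.9)
label, grades = fuzzify(5.1, sl_mfs)
```

Each crisp input is thus replaced by a linguistic label (and its grade), which is the form in which the if-then rules consume the four input variables.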

Fuzzy c-Means Clustering
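The main objective of FCM is to minimize the objective function, which in its standard form is
$$J_m = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x^{(i)} - c_j \rVert^{2},$$
where m is a real number greater than 1 called the fuzzifier, $u_{ij}$ is the degree to which $x^{(i)}$ belongs to the cluster j with center $c_j$, $x^{(i)}$ is the ith data point, n is the number of data points and c is the number of clusters. The steps of the Fuzzy c-means clustering algorithm are given below, following [12] [13].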
1. First consider n data points, to be segregated into c clusters.
2. Take the initial value of the cluster centers $c_k$, where $k = 1, 2, 3, \ldots, c$.
3. Evaluate the grade (or degree) of membership $u_{ij}$, i.e. the degree to which $x^{(i)}$ belongs to the cluster with center $c_j$. The entire vector is expressed at the kth iteration as,
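The iteration outlined in the steps above can be sketched in plain Python as follows. This is a minimal sketch of the textbook FCM updates (membership update followed by center update), not necessarily the exact formulation of [12] [13]; the function names, the stopping tolerance and the toy points in the usage note are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def memberships(points, centers, m):
    """u[i][j]: degree to which points[i] belongs to the cluster with centers[j]."""
    u = []
    for x in points:
        # Clamp to avoid division by zero when a point sits exactly on a center.
        inv = [max(sq_dist(x, ck), 1e-12) ** (-1.0 / (m - 1.0)) for ck in centers]
        total = sum(inv)
        u.append([v / total for v in inv])   # each row sums to 1
    return u


def fcm(points, centers, m=2.0, max_iter=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]   # fuzzified weights u_ij^m
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * x[k] for wi, x in zip(w, points)) / total
                for k in range(dim)))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)
```

For example, calling `fcm(points, [(0.0, 0.0), (6.0, 6.0)])` on a list of two-dimensional tuples returns the final centers and the membership matrix; assigning each point to the cluster with the largest membership gives a hard classification.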

Support Vector Machine
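Journal of Computer and Communications
The SVM is a supervised learning algorithm used for data classification. In its standard form it separates two groups of points with a hyperplane $w \cdot x + b = 0$, where w is known as the weight vector and b as the bias. The SVM determines the constants w and b such that $w \cdot x + b \geq 1$ for one group of points and $w \cdot x + b \leq -1$ for another group of points. The SVM uses a kernel function to provide the best trajectory of the decision boundary.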
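To make the role of the weight vector w and the bias b concrete, the sketch below trains a linear soft-margin SVM by sub-gradient descent on the hinge loss. It is an illustration under stated assumptions: it uses no kernel (unlike the SVM of this paper), the hyperparameters `lam`, `lr` and `epochs` are arbitrary choices, and the training points are hypothetical centered toy data rather than Iris records.

```python
def decision(w, b, x):
    """Signed score w . x + b of point x."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=200):
    """Minimise lam/2 * |w|^2 + hinge loss with per-sample sub-gradient steps.

    X is a list of feature tuples, y a list of labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:
                # Inside the margin or misclassified: move the hyperplane.
                w = [wk - lr * (lam * wk - yi * xk) for wk, xk in zip(w, xi)]
                b += lr * yi
            else:
                # Outside the margin: only the regularisation term acts.
                w = [wk - lr * lam * wk for wk in w]
    return w, b


def predict(w, b, x):
    """Class label from the side of the hyperplane on which x falls."""
    return 1 if decision(w, b, x) >= 0 else -1


# Hypothetical, linearly separable toy data (two centered features).
X = [(-2.0, -1.0), (-1.5, -1.0), (-1.0, -2.0), (2.0, 1.0), (1.0, 2.0), (1.5, 1.5)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

A three-class problem such as Iris would need either a one-vs-rest arrangement of such binary classifiers or a library implementation with kernel support.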

Artificial Neural Network
In this paper we used a feed-forward ANN, where the signal only travels in one direction. The five machine learning methods will be combined using a Shannon entropy-based algorithm.
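One way an entropy-based combination could work is sketched below. This is an assumption made for illustration and is not the combining algorithm of [11], whose details are not given in this section: each classifier is assumed to output a probability vector over the three classes, and classifiers with low Shannon entropy (confident outputs) receive a larger weight. The function names and the example vectors are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def combine(prob_vectors):
    """Entropy-weighted average of class-probability vectors.

    Weight of a classifier = 1 - H(p) / log2(K), so a uniform (uninformative)
    output gets weight 0 and a one-hot output gets weight 1.
    """
    k = len(prob_vectors[0])
    h_max = math.log2(k)
    scores = [0.0] * k
    for p in prob_vectors:
        weight = 1.0 - shannon_entropy(p) / h_max
        for j, q in enumerate(p):
            scores[j] += weight * q
    total = sum(scores)
    if total == 0:
        # Every classifier was maximally uncertain: fall back to uniform.
        return [1.0 / k] * k
    return [s / total for s in scores]


# Hypothetical outputs of three classifiers for one Iris record.
outputs = [[0.90, 0.05, 0.05], [0.34, 0.33, 0.33], [0.10, 0.80, 0.10]]
combined = combine(outputs)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

Under this weighting the nearly uniform second vector contributes almost nothing, so the decision is driven by the classifiers that are most certain about the record.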

Result and Discussion
This section provides results based on the theoretical analysis of the previous section.
First of all, we apply FIS on the Iris data. The FIS used in this paper is shown in Table 2 at the end of this section. Next we apply Fuzzy c-means clustering on the entire dataset, taking two variables at a time. The scatterplots of the three output classes are shown in Figure 6. A few data points appear to cross out of their own region, i.e. they produce some recognition error. Here 50 records each of Iris-Setosa, Iris-Versicolor and Iris-Virginica are taken.
Finally, scatterplots of the data points for four combinations of the four input variables are shown in Figures 7(a)-(d) to identify the case of best separation. Here PW vs. PL shows the best separation, as found in Figure 6(b). The regional separation of data points using SVM is shown in Figures 8(a)-(d), where Figure 8(b) shows the best regional separation. In future we will apply multiple linear regression (MLR) on the four-dimensional input data to convert them into two-dimensional data, and then apply SVM to observe any improvement compared to the four cases of Figure 8.
Next, Iris data classification is done using a feed-forward ANN. The performance of the network, the error histogram and the confusion matrix are shown in Figures 9-11 for the cases of 10 and 20 hidden layers. Similar results are shown in Figure 12 and Figure 13 for a backpropagation ANN with 8 and 10 hidden layers. The performance is found to improve as the number of hidden layers increases, at the expense of process time.
Except for Weighted Fuzzy, no individual method provides high accuracy of recognition, as seen from Table 2. Weighted Fuzzy provides high accuracy at the expense of process time, but its process time is much smaller than that of deep learning techniques. We combined the five methods using the entropy-based combining algorithm of [11], which provides accuracy of recognition above 98% for all five experiments. Finally, we compared our results with NN + SVM of [18] and FCM + SVM of [19], using the same data, where the result of the first case is found to be 0.9417 and that of the second case 0.9445. Our model is the combination of five ML methods, which is more robust than previous works in data classification.

Conclusion
In this paper Iris data classification is done using FIS, Weighted Fuzzy rule,