Analysis of Cardiotocogram Data for Fetal Distress Determination by Decision Tree Based Adaptive Boosting Approach

Cardiotocography is one of the most widely used techniques for recording changes in fetal heart rate (FHR) and uterine contractions. Assessing cardiotocography is crucial in that it leads to identifying fetuses which suffer from lack of oxygen, i.e. hypoxia. This situation is defined as fetal distress and requires fetal intervention in order to prevent fetal death or other neurological disease caused by hypoxia. In this study, a computer-based approach is presented for analyzing cardiotocograms, including diagnostic features, to discriminate a pathologic fetus. To achieve this aim, an adaptive boosting ensemble of decision trees and various other machine learning algorithms are employed.


Introduction
Cardiotocography (also called electronic fetal monitoring, EFM) is a technique used worldwide for fetal monitoring. Two transducers measuring fetal heart rate (FHR) and uterine contractions are placed on the abdomen of the pregnant woman. Cardiotocogram (CTG) refers to the simultaneous recording of both FHR and uterine contractions.
Many typical findings are included in a CTG, and obstetricians make clinical decisions about the state of the fetus by considering these findings. However, the interpretation of the information provided by CTG is not standardized. Deficient interpretation of CTG has led to unnecessary surgical intervention, e.g. an increase in cesarean births [1]. Therefore, computer-based approaches have been presented recently. Huang and Hsu [2] applied discriminant analysis (DA), decision tree (DT), and artificial neural network (ANN) to evaluate fetal distress with the same CTG data used in this study. They reported accuracies of 82.1%, 86.36% and 97.78% for DA, DT and ANN respectively, with 80%, 10%, and the remaining 10% of the whole dataset randomly used for training, testing, and validation respectively. Sundar et al. [3] implemented a supervised ANN to classify the CTG data and evaluated the results with respect to Rand index, precision, recall and F-score. The same authors presented another related work in which a neural network based classification model was compared with the most commonly used unsupervised clustering methods, fuzzy c-means and k-means clustering [4]. Their results show that the supervised ANN approach significantly outperformed the unsupervised clustering methods. In another study, least squares support vector machine (LS-SVM) utilizing a binary decision tree is employed for classification of the same cardiotocogram data to determine the fetal state [5]. Particle swarm optimization (PSO) is used to optimize the parameters of LS-SVM, and a classification accuracy of 91.62% is reached.
In this study, CTG data is analyzed by an adaptive boosting (AdaBoost) ensemble approach. Each base classifier of the system is a decision tree which contributes to the final decision of the system, by which 95.01% accuracy is achieved. We also present a performance comparison of classification algorithms with and without incorporation of the AdaBoost ensemble. Thus, the contribution of AdaBoost to classification algorithms is analyzed with respect to CTG data.

Dataset Descriptions
The cardiotocography data set used in this study is publicly available at "The Data Mining Repository of University of California Irvine (UCI)" [6]. Using the 21 given attributes, the data can be classified according to either the FHR pattern class code or the fetal state class code. In this study, the fetal state class code is used as the target attribute instead of the FHR pattern class code, and each sample is classified into one of three groups: normal, suspicious or pathologic.

Decision Tree Based Adaptive Boosting (AdaBoost) Method
The adaptive boosting (AdaBoost) algorithm [7] is the most popular variant of the boosting ensemble method. In an ensemble system, more than one classifier is trained and each classifier contributes to the final decision of the system [8]. A decision tree is a prediction method which integrates easily with information technologies and can be used in clinical decision making; for example, C4.5, a type of decision tree, can be used to yield clinically useful predictive values [10].
Data classification is a two-phase operation in a decision tree. The first phase is the training phase, and the second is the classification phase. In the training phase, training data is used to construct the tree, and the rules of the tree are determined according to this training data. While constructing a tree, the C4.5 algorithm selects the attributes according to their entropy quantities.
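As a hedged illustration of entropy-based attribute selection, the sketch below computes the entropy of a set of class labels and the information gain of a candidate split. The function names and the toy labels are assumptions made for this example and do not come from the study. C4.5 additionally normalizes the gain by the split information (gain ratio); that step is omitted here for brevity.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`.

    `groups` is a list of label lists, one per branch of the candidate split.
    """
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Toy example (invented labels): 4 normal and 4 pathologic samples.
labels = ["normal"] * 4 + ["pathologic"] * 4

# A split that separates the classes perfectly removes all uncertainty.
pure_split = [["normal"] * 4, ["pathologic"] * 4]

# A split whose branches are as mixed as the parent gains nothing.
mixed_split = [["normal", "normal", "pathologic", "pathologic"]] * 2

gain_pure = information_gain(labels, pure_split)
gain_mixed = information_gain(labels, mixed_split)
```

An attribute whose split yields the larger gain would be preferred at that node; here the perfectly separating split is chosen over the uninformative one.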
In the classification phase, test data is used to validate the constructed tree. If the accuracy of the tree is acceptable, the tree is used for new data samples. The decision process in a tree runs from the root node through consecutive nodes until a leaf is reached. Each path from the root node to a leaf produces a decision rule of the tree. Decision rules resemble the conditional rules of programming languages. To classify a new sample, the process starts from the root and follows a top-down path of queries until a leaf is reached. The class stored in that leaf is assigned to the sample.
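The root-to-leaf decision process can be sketched as follows. This is a minimal illustration only: the tree, the attribute names `feature_a` and `feature_b`, and the thresholds are hypothetical and are not the tree learned in this study.

```python
# Hypothetical two-level tree; attribute names and thresholds are invented
# for illustration and do not correspond to the CTG attributes.
TREE = {
    "attribute": "feature_a",
    "threshold": 0.5,
    "left": {"label": "normal"},
    "right": {
        "attribute": "feature_b",
        "threshold": 10.0,
        "left": {"label": "suspicious"},
        "right": {"label": "pathologic"},
    },
}


def classify(node, sample):
    """Walk from the root to a leaf; return the leaf label and the rule used."""
    conditions = []
    while "label" not in node:
        attr, thr = node["attribute"], node["threshold"]
        if sample[attr] <= thr:
            conditions.append(f"{attr} <= {thr}")
            node = node["left"]
        else:
            conditions.append(f"{attr} > {thr}")
            node = node["right"]
    # The conjunction of the tests along the path is the decision rule.
    return node["label"], " AND ".join(conditions)


label, rule = classify(TREE, {"feature_a": 0.9, "feature_b": 12.0})
```

The returned rule string shows why a path resembles a conditional statement: it reads as an if-condition whose consequence is the leaf label.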

Experimental Results
Mean absolute error (MAE), kappa statistics and accuracy are used as model evaluation metrics for the experimental results. MAE is the mean of the absolute values of the individual classification errors over all samples. Equation (1) gives the calculation of MAE, where $y_i$ is the actual value, $p_i$ is the predicted value, and $n$ is the total number of samples in the data.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| p_i - y_i \right| \qquad (1)$$
The kappa statistic measures the agreement between the classifier predictions and the actual class values. It assesses how far the predictions are from results produced by chance, and it is expected to be as close to 1 as possible:

$$\kappa = \frac{pr_{classifier} - pr_{chance}}{1 - pr_{chance}}$$

Here $pr_{classifier}$ is the proportion of data samples for which the classifier predictions and the actual values agree, and $pr_{chance}$ is the proportion of agreement which may occur by chance. A kappa value of 0 indicates that the accuracy achieved by the classifier is no better than chance, and a kappa value of 1 indicates perfect agreement.
Accuracy measures the closeness of the classification results to the actual class labels of the samples, and is defined as the ratio of the number of correctly classified samples to the number of all samples.
In order to compare the performance of the classification algorithms without and with the AdaBoost ensemble technique, the WEKA data mining tool [11] is used, which is a collection of machine learning algorithms written in Java. The default parameters were used for each classification algorithm. 10-fold cross validation is utilized to validate the performance of each classifier: the data is separated into 10 subsets, and the hold-out operation is performed 10 times, in each of which one subset is used for testing and the other subsets are used for training. The eventual accuracy is therefore calculated by averaging the 10 resulting accuracies.
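The 10-fold procedure can be sketched in a few lines. This is a simplified illustration, not WEKA's implementation: the function names, the fixed seed, and the plain (unstratified) random assignment of samples to folds are assumptions of this sketch, and WEKA's own fold assignment may differ.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(n_samples, evaluate, k=10, seed=0):
    """Average the accuracy returned by `evaluate(train, test)` over k folds."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


# 2126 is the number of samples in the CTG data set used in this study.
splits = list(k_fold_indices(2126, k=10))
```

In practice `evaluate` would train a classifier on the `train` indices and return its accuracy on the `test` indices; it is left as a caller-supplied function here so that the sketch stays independent of any particular learner.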
According to Table 1, by employing the decision tree based AdaBoost ensemble method, 1622 + 236 + 162 = 2020 of 2126 samples are correctly predicted, a promising result. Of the samples whose actual class is "normal", 26 are predicted as "suspicious" and 7 as "pathologic". Of the samples whose actual class is "suspicious", 55 are classified as "normal" and 4 as "pathologic". Additionally, 7 samples classified as "normal" and 7 classified as "suspicious" have the actual class "pathologic".
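As a check on these figures, the sketch below recomputes accuracy and the kappa statistic from the counts quoted above, arranged as a confusion matrix with actual classes as rows and predicted classes as columns. The function names are assumptions of this example; kappa is computed as (observed agreement − chance agreement) / (1 − chance agreement).

```python
# Rows: actual class, columns: predicted class,
# in the order normal, suspicious, pathologic (counts taken from the text).
CONFUSION = [
    [1622, 26, 7],
    [55, 236, 4],
    [7, 7, 162],
]


def accuracy(matrix):
    """Proportion of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Agreement between predictions and actual classes, corrected for chance."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement: product of the marginal proportions, summed over classes.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - chance) / (1 - chance)


acc = accuracy(CONFUSION)
kap = kappa(CONFUSION)
print(f"accuracy = {acc:.5f}, kappa = {kap:.3f}")
```

The counts sum to 2126 samples, and the recomputed values agree with the 95.01% accuracy and 0.861 kappa reported for the decision tree based AdaBoost ensemble.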
Six classification models are evaluated with respect to the metrics of MAE, kappa statistics and accuracy. Table 2 and Table 3 present the classification results of the various algorithms without AdaBoost and with AdaBoost respectively.
Unlike Table 2, Table 3 includes the results produced when the algorithms are used as base classifiers in the AdaBoost ensemble method. The results in Table 3 are shown in bold wherever an improvement over Table 2 is achieved. That is, if there is a reduction in the error quantity, MAE, and an increase in kappa statistics and accuracy, the corresponding values of the classifier are in bold. Accordingly, the AdaBoost ensemble method improved four of the six models with respect to MAE, kappa statistics and accuracy.
In Table 2, the results of the neural network and the decision tree are close to each other, at approximately 92%. However, Table 3 shows that the AdaBoost ensemble contributes to the C4.5 decision tree, whose accuracy improves to 95.014%. Additionally, the largest improvement, approximately 6%, is achieved for both Naive Bayes and the Bayesian network. Even one percent is meaningful, as it corresponds to approximately 21 patients in the data. AdaBoost is not able to improve the support vector machine and neural network results, but no considerable adverse effect is observed in any of the three evaluation metrics either.
The analysis results are very promising when compared with the related works. As stated earlier, Huang and Hsu [2] analyzed the same data and reported accuracies of 82.1%, 86.36% and 97.78% for DA, DT and ANN respectively, with 80%, 10%, and the remaining 10% of the whole dataset randomly used for training, testing, and validation respectively. The ANN accuracy of 97.78% cannot be regarded as outperforming our decision tree based AdaBoost result of 95.01%, because that figure was obtained on a single randomly selected portion of the data, whereas in this study no single part of the data is selected for testing and 10-fold cross validation is used for stability. Another notable study [5] analyzed the same data by least squares support vector machine (LS-SVM) utilizing a binary decision tree, with the parameters optimized by PSO. It reached a classification accuracy of 91.62%, which is again outperformed by the AdaBoost ensemble with decision trees as base classifiers.

Conclusions
Computer-based studies in the medical area lead to great advances in clinical decision support systems. Progress in machine learning calls for a simultaneous contribution to the medical area with respect to quality and the prevention of human error. However, a computer-based solution that is very successful for one medical problem, or for a problem from another area, can fail for a different problem. Therefore, the search for a computer solution should be broadened, especially for a medical decision. For this reason, the results of prior studies are considered in our analysis of CTG. Determining the state of the fetus is especially important for early intervention in cases that require it, i.e. fetal distress, and for preventing unnecessary surgeries. In this study, the effect of using the AdaBoost ensemble on classifiers is investigated for accurate determination of fetal distress from CTG data. Figure 2 visually represents the promising experimental results on the contribution of the AdaBoost ensemble to machine learning classification algorithms, confirming that ensemble machine learning approaches often perform much better than the single classifiers that make them up [12]. The most prominent result belongs to the decision tree based AdaBoost algorithm, with 0.034 MAE, 0.861 kappa statistics and 95.01% accuracy, meaning that 2020 of 2126 samples are correctly predicted. These results are an improved next step following the related studies in the literature.

Figure 1. Training phase of the model used by employing decision tree based AdaBoost ensemble.

Figure 2. Representation of AdaBoost ensemble contribution to classifiers.
The classifiers that contribute to the decision of an ensemble are called base classifiers. Boosting produces base classifiers one after another. Each base classifier depends on the previous one, in that the training set chosen for a base classifier includes the instances incorrectly classified by the previous base classifier. Therefore, the ensemble is strengthened by each new base classifier that fixes previous errors. AdaBoost differs in that it assigns a weight value to each candidate training sample. A candidate training sample that is incorrectly classified by the previous classifiers has a greater weight [9]. The candidates are selected according to their weights for the training set of the next base classifier to be added. Therefore, AdaBoost concentrates on samples which are difficult to classify correctly. Base classifiers are added until a low error ratio is reached. Unlike the majority vote decision strategy of the boosting algorithm, AdaBoost decides with respect to weighted votes. Votes are weighted according to the training accuracies of the classifiers. In this study, decision trees are used as base classifiers, as depicted in Figure 1.
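The sample re-weighting and weighted voting of AdaBoost can be sketched as follows. This is a minimal illustration assuming the AdaBoost.M1 update rule (correctly classified samples are scaled by beta = error / (1 − error), and each classifier votes with weight ln(1 / beta)); the function names and the toy inputs are assumptions of this example, and the exact variant and settings used in the experiments are not restated here.

```python
import math


def adaboost_m1_round(weights, correct):
    """One re-weighting round of AdaBoost.M1.

    weights: current sample weights, summing to 1
    correct: booleans, True where the base classifier was right
    Returns (new_weights, vote_weight).
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0 < error < 0.5:
        raise ValueError("base classifier error must lie in (0, 0.5)")
    beta = error / (1.0 - error)
    # Shrink the weights of correctly classified samples, then renormalize,
    # so that misclassified samples gain relative weight.
    scaled = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(scaled)
    new_weights = [w / total for w in scaled]
    return new_weights, math.log(1.0 / beta)


def weighted_vote(predictions, vote_weights):
    """Return the label with the largest total vote weight."""
    scores = {}
    for label, alpha in zip(predictions, vote_weights):
        scores[label] = scores.get(label, 0.0) + alpha
    return max(scores, key=scores.get)


# Toy round: five equally weighted samples, the last one misclassified.
new_w, alpha = adaboost_m1_round([0.2] * 5, [True, True, True, True, False])

# Toy vote: one accurate classifier outvotes two weaker ones.
decision = weighted_vote(["normal", "suspicious", "suspicious"], [2.0, 0.5, 0.5])
```

After the round, the single misclassified sample carries half of the total weight, so the next base classifier is pushed to concentrate on it. In the vote, a single classifier with a high vote weight can override a numerical majority of weaker classifiers, which is the difference from a plain majority vote.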

Table 1. Confusion matrix of classification by decision tree based AdaBoost ensemble.

Table 3. Evaluation results with AdaBoost.