
Automated diagnosis of skin cancer is an important research area for which a variety of automated learning methods have been proposed. However, models trained on insufficient labeled data can adversely affect diagnostic results if the model has no advising and semi-supervising capability to add unlabeled data to the training set and gain sufficient information. This paper proposes a semi-advised support vector machine based classification algorithm that can be trained using labeled data together with abundant unlabeled data. An adaptive differential evolution based algorithm is used for feature selection. For experimental analysis, two types of skin cancer datasets are used: one based on digital dermoscopic images and the other on histopathological images. The proposed model provided quite convincing results on both datasets when compared with the respective state-of-the-art methods used in the feature selection and classification phases.

Malignant melanoma is one of the most dangerous forms of skin cancer, and melanoma cases have been recorded in large numbers over the last few decades [

Traditionally, in the skin cancer diagnosis process, dermatologists use dermoscopic images, while pathologists examine histopathological images of biopsy samples taken from patients under a microscope. However, all analysis and judgments depend on personal experience and expertise and often lead to considerable variability [

Due to the complex nature of skin cancer images, especially the histopathological images [

Searching for the optimal feature subset, which can yield the best training as well as testing performance, is a challenging task. Various studies show that DE has outperformed many other optimization algorithms in terms of robustness on common benchmark problems and real-world applications [

On the other hand, for the evaluation of selected feature sets, there are various classification/learning methods proposed in literature [

The paper is organized as follows: Section 2 details the adaptive differential evolution algorithm proposed for feature selection and the semi-advised support vector machine algorithm proposed for classification. Section 3 gives an overview of the experimental model based on the proposed algorithms and presents the experimental results. Finally, the conclusion is given in Section 4.

Differential evolution (DE) is a population-based optimization method that has attracted increased attention in the past few years. Although it has shown quite promising results in various applications, in complex applications the search performance depends heavily on the mutation strategy, the crossover operation, and the control factors, including the scale factor (F), crossover rate (Cr) and population size (NP) [

This paper proposes a DE-based feature selection technique with an adaptive approach that makes the feature selection process more dynamic, so it can be applied to complex pattern recognition applications like histopathological image analysis. It uses the advised support vector machine, explained in the following section, to evaluate the selected feature subsets. The steps of the feature selection procedure are as follows.

1) Initialize the population of NP individuals Pop_{G} = {X_{i,G}}, where each individual X_{i,G} = [x1_{i,G}, x2_{i,G}, x3_{i,G}, ..., xD_{i,G}], with i = 1, 2, ..., NP and D the number of candidate features.

2) Set the mutation scale factor (F) and the crossover control parameter (Cr) using the following equations

3) While the termination criterion (maximum number of iterations) is not satisfied

Do for i = 1 to NP //do for each individual

a. Perform mutation: A mutant vector V_{i,G} = {v1_{i,G}, ..., vD_{i,G}} is created corresponding to the i^{th} target vector X_{i,G}

b. Crossover operation: Employ binomial crossover on each of the D variables as follows to build the trial vector

Here j_{rand} ∈ {1, 2, ..., D} is a randomly chosen index that ensures the trial vector differs from the target vector in at least one component.

c. Evaluate the population with the objective function.

d. Perform selection: Evaluate the trial vector U_{i,G} with the objective function.

If the trial vector yields an objective value that is better than or equal to that of the target vector, set X_{i,G+1} = U_{i,G};

Else retain the target vector, i.e., X_{i,G+1} = X_{i,G}.

4) Repeat steps 2-3 until the maximum number of generations G_{max} is reached.
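The steps above can be sketched in Python. This is a minimal illustration only: the paper's exact adaptive update equations for F and Cr are not reproduced here, so a simple per-generation randomization stands in for them, and a toy fitness function replaces the advised-SVM evaluation.

```python
import random

def de_feature_select(fitness, dim, np_pop=20, gmax=50, seed=0):
    """Sketch of DE-based feature selection.

    Each individual is a real vector in [0, 1]^dim; components > 0.5
    select the corresponding feature. `fitness` scores a boolean mask
    (higher is better). The F/Cr randomization below is an assumed
    stand-in for the paper's adaptive parameter equations.
    """
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(np_pop)]
    mask = lambda v: [x > 0.5 for x in v]
    scores = [fitness(mask(v)) for v in pop]
    for _ in range(gmax):
        F = 0.4 + 0.5 * rng.random()    # assumed adaptive range
        Cr = 0.3 + 0.6 * rng.random()
        for i in range(np_pop):
            a, b, c = rng.sample([j for j in range(np_pop) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees trial != target
            trial = [pop[a][j] + F * (pop[b][j] - pop[c][j])
                     if (rng.random() < Cr or j == j_rand) else pop[i][j]
                     for j in range(dim)]
            trial = [min(max(x, 0.0), 1.0) for x in trial]
            s = fitness(mask(trial))
            if s >= scores[i]:           # greedy selection
                pop[i], scores[i] = trial, s
    best = max(range(np_pop), key=scores.__getitem__)
    return mask(pop[best]), scores[best]

# toy objective (hypothetical): prefer selecting the first 3 of 10 features
target = [True] * 3 + [False] * 7
score = lambda m: sum(a == b for a, b in zip(m, target))
sel, s = de_feature_select(score, dim=10)
```

In the proposed model the fitness of a mask would instead be the classification performance of the advised SVM trained on the selected features.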

Using automated learning techniques to improve performance in any area requires a proper choice of the learning algorithm and of its statistical validation. Classifier training with an insufficient number of labeled samples is a well-known hard problem [

In this paper, a semi-advising algorithm for SVM is proposed that extracts subsequent knowledge during the training phase using both labeled data and sets of unlabeled data added in batch mode. The effect of misclassified data during the training phase is controlled by generating advice weights [

The dataset is divided into two subsets: a labeled data set D_{L} = {(x_{i}, y_{i})} and an unlabeled data set D_{UL} = {x_{j}}.

Step 1: The unlabeled data set D_{UL} is equally divided into n subsets D_{UL1}, D_{UL2}, ..., D_{ULn}. Then D_{L} is taken as the initial training set T_{s}, and i is initialized to 1, where i denotes the i^{th} loop of the algorithm.

Step 2: The SVM classifier is trained using the labeled data, and the classifying hyperplane is found using the decision function

f(x) = sign(Σ_{i} α_{i} y_{i} K(x_{i}, x) + b)

where x_{i} is the input vector corresponding to the i^{th} sample, labeled by y_{i} depending on its class, b is a constant, and α_{i} is the nonnegative Lagrange multiplier, consistent with standard SVM training.

As the data comprises nonlinearly separable cases, a kernel-based SVM is used to produce nonlinear decision functions, and the radial basis function (RBF) kernel

K(x_{i}, x_{j}) = exp(−||x_{i} − x_{j}||^{2} / (2σ^{2}))

is used to carry out all necessary operations in the input space.
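As a small illustration, the standard RBF kernel above can be computed directly; the kernel width σ is an assumed hyperparameter here.

```python
import math

def rbf_kernel(x, z, sigma=1.0):
    """Standard RBF kernel: K(x, z) = exp(-||x - z||^2 / (2*sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])  # identical points give 1.0
k_far = rbf_kernel([0.0, 0.0], [3.0, 4.0])   # value decays with distance
```

The kernel equals 1 for identical inputs and decays toward 0 as the distance between inputs grows, which is what lets the SVM form nonlinear boundaries in the input space.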

Step 3: The set of misclassified data (MD) from the training phase is determined as the samples whose predicted label disagrees with their true label:

MD = {x_{i} : y_{i} f(x_{i}) < 0}

The MD set can be null, but most experiments showed that the occurrence of misclassified data in the training phase is common. It must also be noted that any method that tries to benefit from misclassified data must also exert some control over the impact of outlier data. We observed that when the misclassified data comprises resembling samples, using it actually improved the classification accuracy, as it can lead to the variations required in the final separating hyperplane.

If MD is null, go to the next step; otherwise, compute the neighbourhood length (NL) for each member of MD using the following mathematical relation, which is then used during advised weight calculation.

where x_{j}, j = 1, ..., and the distance between x_{i} and x_{j} is computed according to the following equation with reference to the related RBF kernel.

Step 4: The labels for data samples in D_{UL1} are estimated using the current classifier, and then the most confidently classified elements are determined according to the distance between the element and the separating boundary. The criterion is formulated as |x · w − b| ≥ Th, where the constant Th > 0 is the distance threshold. If the distance between an element and the separating boundary is larger than Th, we take it as a confident element. The most confidently classified elements are represented as set R and are added, with their predicted labels, to the training set T_{s}, i.e., T_{s} = T_{s} ∪ R. The remaining elements of D_{UL1} are denoted as the unlabeled query set UL_Q_{i}.

For each sample x_{k} from the unlabelled query set UL_Q_{i}, the advised weight AW(x_{k}) is computed using the following mathematical relationship. These AWs represent how close the data is to the misclassified data from the labelled set.

The absolute values of the SVM decision values for each x_{k} from the unlabeled query set are calculated and scaled to [0, 1]. For each x_{k} from the unlabelled query set, if AW(x_{k}) < decision value(x_{k}), then add UL_Q_{i} with predicted labels to T_{s}, that is, T_{s} = T_{s} ∪ UL_Q_{i}.

Step 5: i = i + 1

Step 6: If i equals n, terminate; otherwise, go back to Step 2.
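The batch-mode loop of Steps 1-6 can be sketched as follows. This is only an illustrative skeleton under simplifying assumptions: a trivial one-dimensional mean-threshold classifier stands in for the kernel SVM, and the advised-weight filtering of Step 4 is omitted, leaving only the confidence-threshold selection.

```python
def semi_advised_train(labeled, unlabeled_batches, th=0.5):
    """Sketch of the batch-mode semi-supervised loop (Steps 1-6).

    `labeled` is a list of (x, y) pairs with y in {-1, +1}; each batch
    of unlabeled points is labeled by the current model, confident
    points (|decision value| >= th) join the training set with their
    predicted labels, and the model is refit.
    """
    train = list(labeled)

    def fit(data):
        # stand-in classifier: midpoint between the class means;
        # the signed distance to it plays the role of the decision value
        pos = [x for x, y in data if y > 0]
        neg = [x for x, y in data if y < 0]
        mid = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
        return lambda x: x - mid

    decide = fit(train)
    for batch in unlabeled_batches:
        confident = [(x, 1 if decide(x) > 0 else -1)
                     for x in batch if abs(decide(x)) >= th]
        train.extend(confident)       # T_s = T_s U R
        decide = fit(train)           # retrain on the enlarged set
    return decide

# hypothetical 1-D data: negatives near -2, positives near +2
labeled = [(-2.0, -1), (-1.5, -1), (1.5, 1), (2.0, 1)]
batches = [[-3.0, 2.5, 0.1], [-2.2, 1.8]]
model = semi_advised_train(labeled, batches)
```

Note how the point 0.1 in the first batch falls inside the threshold band and is held back rather than added, which is the mechanism that keeps low-confidence pseudo-labels from corrupting the training set.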

The proposed model is presented in

Two datasets were used in the experiments: Dataset 1 is based on dermoscopic images, and Dataset 2 is based on histopathological images obtained from biopsy samples of skin cancer patients. Most of the images in the datasets came from the Sydney Melanoma Diagnostic Centre. Dataset 1 comprises 300 labeled and 500 unlabeled images, while Dataset 2 consists of 160 images, including 60 labeled and 100 unlabeled samples.

To test the effect of the feature selection method and the number of selected features on the overall performance of the model, the proposed feature selection method is also compared with ones based on the well-established binary genetic algorithm (BGA) [

This shows that if parameter tuning and feature selection are done simultaneously and the effect of misclassified data and outliers is minimized, the classification performance of learning models can improve. In addition, this helps minimize the use of redundant or irrelevant features in the final optimized model, which makes the system computationally less complex and also decreases the chances of overfitted models.

The 10-fold cross-validation rule is used to validate the performance of the overall model on both datasets. We also compared the classification performance of the proposed classification algorithm with SVM and T-SVM.
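For reference, a 10-fold split of the kind used here partitions the data into ten disjoint test folds, each evaluated against a model trained on the other nine; the shuffling seed below is an arbitrary choice for reproducibility.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(100, k=10))
```

The reported accuracy is then the average of the per-fold test accuracies, which gives a less optimistic estimate than a single train/test split.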

This paper presents a novel learning model with adaptive differential evolution based feature selection and semi-advised support vector machine based classification. The proposed feature selection method adaptively adjusts the tuning parameters of the differential evolution process while simultaneously performing feature selection for the corresponding dataset. The proposed semi-advised SVM, in turn, is trained using labeled data along with added sets of unlabeled data to deal with misclassified data elements and improve the generalization performance of the classifier through the increased amount of training data. Experimental analysis shows that the proposed learning model works well and provides an optimal feature set with a higher classification rate than some other popular methods from the literature. The efficient use of unlabeled data with the aid of the limited labeled dataset helped obtain better generalization of the model over the test data, achieving accuracies of around 94% for dermoscopic images and 86.5% for histopathological images.

Masood, A. and Al-Jumaily, A. (2015) Semi Advised SVM with Adaptive Differential Evolution Based Feature Selection for Skin Cancer Diagnosis. Journal of Computer and Communications, 3, 184-190. doi: 10.4236/jcc.2015.311029