A Novel Lung Cancer Detection Method Using Wavelet Decomposition and Convolutional Neural Network
1. INTRODUCTION
Cancer is a leading cause of death worldwide, and lung cancer is considered the most common type of cancer. For example, in 2018 cancer caused around 9.6 million deaths worldwide, and about 2.09 million cases were attributed to lung cancer. Breast cancer and colorectal cancer were the second and third most common types of cancer, respectively. Similarly, in 2013 lung cancer accounted for about 228,190 new cases and 159,480 deaths in the U.S. alone [1].
Lung cancer is a disease that causes cells in the lungs to divide uncontrollably. Lung cancer symptoms include persistent cough, chest pain, shortness of breath, loss of appetite, and feeling weak or tired [2 - 4]. Sometimes, lung cancer may show no symptoms at all, making its identification in the earliest stages an uncertain task. Lung cancer can be caused by several factors, including cigarette smoking, a family history of lung cancer, exposure to second-hand smoke, and exposure to radon gas [5,6]. Several treatment options can be implemented for lung cancer patients, including radiation therapy, surgery, chemotherapy, or a combination of these treatments [7]. Several factors dictate the treatment options for a patient, including the extent of the cancer, the person's overall health and lung function, and certain traits of the cancer itself. In many cases, more than one type of treatment is used [8,9].
Lung cancer that originates in the lung is called primary lung cancer, and lung cancer that spreads to the lungs from other parts of the body is called secondary lung cancer [10]. There are two main types of primary lung cancer: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) [11,12]. The most common type is NSCLC, which accounts for about 85% of all lung cancers [13,14]. NSCLC is any form of epithelial lung cancer and comprises three types of cancer: lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LSCC), and large cell carcinoma (LCC) [15]. One way to reduce the high mortality rate of lung cancer is through early detection and treatment [16].
Presented in this paper is a novel cancer classification system based on Wavelet decomposition and Convolutional Neural Networks (CNNs). Experimental results show that the proposed WCNN system outperforms commonly proposed systems such as the support vector machine (SVM), producing a high accuracy of 99.5%.
2. THE STATE OF THE ART IN LUNG CANCER CLASSIFICATION
Conventional cancer detection techniques base their judgment of tumor type on the tumor tissue [17,18]. However, many tumors do not reveal the distinct morphological characteristics that are essential for differential diagnosis. Hence, the subjective evaluation of histopathological and clinical information can result in an inaccurate diagnosis. Therefore, there is an increasing need for automatic methods that perform lung cancer detection.
Computer-aided detection (CAD) systems, especially those based on artificial intelligence (AI) and machine learning algorithms, have been widely implemented in cancer detection [19,20]. In fact, recent reports claim that AI has outperformed health-care professionals in many imaging domains of medicine [21 - 23]. CAD systems enjoy many advantages: in addition to offering a fast screening process, they present a low-cost option and improve the performance of radiologists [24].
Several lung cancer databases are commonly used by researchers, including The Cancer Imaging Archive (TCIA) database and the LUNA database; both are publicly available online. A widely used database, available from the TCIA site [25], is the Lung Image Database Consortium (LIDC). For benchmarking, quality assurance, and accessibility reasons, the LIDC dataset was used in this study. By far, the LIDC database has been the most commonly used database in recent lung cancer classification research.
As with other domains of medical imaging, computer-based lung cancer detection and classification systems are largely based on machine learning and AI algorithms. The majority of recently proposed methods use the CNN approach without any preprocessing of the lung images, and most of the proposed techniques operate on CT scans of the lung.
To present a clear comparison of some of the techniques recently proposed in the literature, Table 1 compares the reviewed systems using the accuracy measure.
3. MATERIALS AND METHODS
In this paper, we propose a Wavelet-based CNN (WCNN) system for lung cancer detection and classification. Due to their high resolution and low noise level, computerized tomography (CT) images are used in this study. A block diagram showing the main stages of the proposed system is depicted in Figure 1.
Table 1. Comparison of accuracy results.
The sequence of operations performed by the proposed system starts with the Wavelet decomposition of the input image I, which represents any image in the employed lung cancer dataset.
3.1. Image Dataset
The raw images used in this study were obtained from the Lung Image Database Consortium (LIDC), which can be freely downloaded from the TCIA site. As the LIDC is commonly accepted by researchers in this field, its adoption in this study allows for better benchmarking and quality assessment. As outlined in Table 2, the dataset used here is composed of 600 images comprising four classes of lung cancer: LSCC, LUAD, SCLC, and Benign (non-cancerous), with 150 images per class. The images are 16-bit gray-scale images with a spatial resolution of 512 × 512 pixels.
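For illustration only, the following minimal Python sketch shows one way such a dataset could be loaded and labeled; the directory layout, file names, and the use of the pydicom library are assumptions and are not specified in the original study.

# Minimal sketch: load the 600 LIDC CT slices into a labeled array.
# The directory layout (dataset/LSCC, dataset/LUAD, ...) is hypothetical.
import os
import numpy as np
import pydicom

CLASSES = ["LSCC", "LUAD", "SCLC", "Benign"]

def load_dataset(root="dataset"):
    images, labels = [], []
    for label, name in enumerate(CLASSES):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            ds = pydicom.dcmread(os.path.join(class_dir, fname))
            images.append(ds.pixel_array.astype(np.float32))  # 512 x 512, 16-bit slice
            labels.append(label)
    return np.stack(images), np.array(labels)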
Figure 2 shows sample CT scan images from the LIDC dataset used in this study.
The first stage of the proposed system is to use the Wavelet decomposition transform to obtain discriminative features from the input image.
3.2. Wavelet Transform
The Wavelet Transform, also known as Wavelet decomposition, is a frequency transform. The transform of an image gives another way of representing the image. It does not change the energy or information content of the image. The Wavelet decomposition tree, depicted in Figure 3, illustrates the operations of the Wavelet decomposition transform. The input image, at the first level of decomposition, produces two vectors of coefficients: approximation and detail coefficients. While the approximation coefficients represent the low frequency contents of the input signal, the detail coefficients represent the high-frequency contents. At the second level of decomposition, the approximation coefficients produce two sets of approximation and detail coefficients, whose lengths are equal to half of the length of the original approximation vector. The process of decomposition further splits the approximation coefficients into two new vectors for each subsequent level of decomposition.
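A small Python sketch of this decomposition tree, using the PyWavelets library on a one-dimensional signal, is given below; the random signal and the choice of three levels are illustrative assumptions.

import numpy as np
import pywt

signal = np.random.rand(512)                   # stand-in for one row of a 512 x 512 image
coeffs = pywt.wavedec(signal, "db1", level=3)  # returns [A3, D3, D2, D1]
for name, c in zip(["A3", "D3", "D2", "D1"], coeffs):
    print(name, len(c))                        # lengths 64, 64, 128, 256: halved at each level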
Figure 1. Block diagram of the proposed system.
Table 2. Dataset Labels and diseases.
Figure 3. Wavelet decomposition tree. Variables I, A1 and D1 represent the original image, approximation, and detail coefficients at level 1, respectively.
To demonstrate the decomposition operations employed by Wavelet decomposition, Figure 4 depicts the Wavelet decomposition details of a sample image using the Haar transform. The decomposed image was taken from the LSCC images and is shown in Figure 2(a). The Haar Wavelet, which is also known as the db1 Wavelet, is considered the first and simplest Wavelet; the db1 wavelet looks like a step function [31].
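A minimal sketch of a single-level 2-D Haar (db1) decomposition, as illustrated in Figure 4, is shown below using PyWavelets; the random array is only a stand-in for the actual LSCC slice.

import numpy as np
import pywt

image = np.random.rand(512, 512)              # stand-in for the LSCC slice of Figure 2(a)
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")   # approximation + horizontal/vertical/diagonal details
print(cA.shape)                               # (256, 256): each sub-band is half the size in each dimension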
The capability of the Wavelet transform to compact the image energy makes it suitable for image feature extraction [32,33]. To demonstrate the energy compaction property of the Wavelet decomposition, Figure 5 shows the histogram of the approximation coefficients. In this example, Wavelet decomposition was applied to a lung image from our lung cancer dataset, specifically using the Haar wavelet at decomposition level 6. As shown in Figure 5, only a few approximation coefficients contain most of the energy. Using the locations of these coefficients along with their magnitudes can produce a discriminative feature representing the input image. For this purpose, the other approximation and detail coefficients can be discarded [34].
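The energy compaction can be checked numerically with the sketch below; the random array is only a placeholder, and a real CT slice would concentrate far more of its energy in the approximation sub-band than white noise does.

import numpy as np
import pywt

image = np.random.rand(512, 512)                         # placeholder for a lung CT slice
coeffs = pywt.wavedec2(image, "haar", level=6)           # [A6, (H6, V6, D6), ..., (H1, V1, D1)]
approx = coeffs[0]
bands = [approx] + [band for detail in coeffs[1:] for band in detail]
total_energy = sum(float(np.sum(band ** 2)) for band in bands)
ratio = float(np.sum(approx ** 2)) / total_energy
print(f"level-6 approximation holds {ratio:.1%} of the coefficient energy")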
Finally, the feature vector (approximation coefficients) is presented to a CNN and an SVM for classification. The same sets of inputs and outputs are used to train the SVM and the proposed WCNN system. Samples of the Wavelet features (approximation coefficients) representing the four classes are depicted in Figure 6. Visual inspection of the features in Figure 6 indicates that the features are discriminative.
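As a hedged sketch, the feature extraction for the whole dataset could look as follows, assuming a decomposition level of two (which yields the 128 × 128 approximation matrices described in Section 3.3); variable and function names are illustrative.

import numpy as np
import pywt

def wavelet_features(images, wavelet="db1", level=2):
    """Return the level-`level` approximation matrix of every input image."""
    feats = []
    for img in images:
        coeffs = pywt.wavedec2(img, wavelet, level=level)
        feats.append(coeffs[0])          # keep only the approximation coefficients
    return np.stack(feats)               # shape (600, 128, 128) for 512 x 512 inputs at level 2

# features = wavelet_features(images)   # `images` as produced by the dataset-loading sketch above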
3.3. Convolutional Neural Networks
The Convolutional Neural Network, also known as CNN or ConvNet, is a special type of Artificial Neural Network (ANN). The original ANNs, such as the multilayer perceptron (MLP), have been very successful in pattern recognition applications [35 - 38], and they have inspired the creation of the CNN, a deep learning algorithm.
Deep Learning is a branch of Machine Learning that employs deep neural networks, i.e., neural networks with many layers. The CNN can be thought of as an ANN in which at least one layer applies a convolution operation before it passes its output to the next layer [39 - 41]. Commonly, the mean and max functions are used in the pooling operation, but other functions can also be used [42,43]. CNNs represent a quantum leap in the area of image classification and computer vision. A very famous CNN design is AlexNet [44], which has shown superior performance in general image recognition applications.
Figure 4. Wavelet decomposition of a LSCC image using the Haar Wavelet.
Figure 5. Histogram of the approximation coefficients at level 6.
Figure 6. Sample Wavelet features for: (a) LSCC, (b) LUAD, (c) SCLC, and (d) Benign.
The basic structure of a CNN consists of three components: the convolutional layer, the pooling layer, and the fully connected (output) layers. The convolutional layer scans the whole image, using a moving-window approach, to create a feature map. The pooling layer downsamples the output of the convolutional layer, which reduces the amount of data to be learned. The use of convolutional and pooling layers is often repeated several times. The fully connected input layer converts the outputs generated by the previous layers into a single vector to be applied to the next layer. The fully connected layer then produces a weighted sum of these inputs to predict an output label and thus determines the output class. A typical CNN is depicted in Figure 7.
As indicated by Figure 7, the typical input to a CNN is an image of size m × m × r, where r is the number of channels (r = 1 for gray-scale and r = 3 for RGB images). Normally, the CNN has the capability to perform image classification using raw images as direct inputs. However, the implementation of the CNN in the proposed WCNN system uses Wavelet features as inputs to the CNN. This greatly reduces the number of features and therefore makes the learning task of the CNN much easier. The architecture of the proposed classifier contains five layers: the input layer, the convolutional layer, the max pooling layer, the fully connected layer, and the output layer. The input layer has a size of 128 × 128, corresponding to the size of the approximation matrices, and the output layer has 4 neurons, corresponding to the number of classes. Next, we compare the performance of the proposed WCNN system to that of an SVM system.
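A hedged Keras sketch of such a five-layer classifier is shown below; the 128 × 128 input and the four output neurons follow the description above, while the filter count, kernel size, fully connected width, optimizer, and loss function are assumptions, since they are not specified in the text.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 1)),             # level-2 approximation matrix
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),  # convolutional layer (assumed 16 filters)
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),         # max pooling layer
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),           # fully connected layer (assumed width)
    tf.keras.layers.Dense(4, activation="softmax"),         # output layer: one neuron per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])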
3.4. Support Vector Machine Implementations
Like CNNs, SVMs are supervised learning algorithms that have been widely implemented in classification applications. SVMs were originally proposed by Cortes and Vapnik [45]. The SVM was originally designed as a binary, or two-class, classifier. However, SVMs have been extended to handle data composed of more than two classes [46]. SVMs have shown remarkable success in solving linear and non-linear classification problems. As depicted in Figure 8, an SVM classifies data by determining the best hyperplane that separates the data points of the two classes. In other words, an SVM tries to find the widest possible margin that separates the two classes with no interior data points.
The SVM algorithm implemented here uses the Gaussian kernel defined by:
K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)) (1)
where x_i and x_j are input feature vectors and σ is a user-defined variance parameter.
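In scikit-learn terms, this corresponds to an RBF-kernel SVC whose gamma parameter equals 1/(2σ²); the sketch below is illustrative, and the value of σ as well as the variable names are assumptions.

from sklearn.svm import SVC

sigma = 5.0                                     # user-defined variance parameter (assumed value)
clf = SVC(kernel="rbf", gamma=1.0 / (2.0 * sigma ** 2))
# clf.fit(train_features.reshape(len(train_features), -1), train_labels)   # flattened approximation matrices
# predictions = clf.predict(val_features.reshape(len(val_features), -1))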
4. DISCUSSION AND RESULTS
The implementation of the CNN training used 60% of the input matrices as training data and the remaining 40% as validation data. The dataset consists of four classes, with 150 images per class; the features representing each image form a matrix of Wavelet approximation coefficients, so the dataset consists of 600 matrices. Randomized splitting was used to avoid biasing the results. The traces of the accuracy of the proposed system are shown in Figure 9.
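A minimal sketch of such a randomized 60/40 split with scikit-learn is given below; stratifying by class to preserve the 150-image balance and the fixed seed are assumptions, since the text only states that the split was randomized.

from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(
    features, labels,            # 600 approximation matrices and their class labels
    train_size=0.6,              # 60% training, 40% validation
    random_state=0,              # fixed seed for reproducibility (assumed)
    stratify=labels)             # keep the four classes balanced (assumed)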
The maximum success rate (accuracy) of the proposed WCNN system in this experiment is 99.5%, indicating that the approximation coefficients carry highly distinctive information about the lung image. To show the validity of the proposed system, its accuracy is compared to that of the SVM classifier. When operating on the same feature matrices as the proposed WCNN system, the SVM produced an accuracy of 95.5%.
Several statistical measures are used to analyze the performance of the proposed WCNN system. Specifically, the performance of the proposed algorithm is evaluated by computing the percentages of sensitivity (SE), specificity (SP) and accuracy (ACC) as follows:
Sensitivity is the fraction of real events that are correctly detected among all real events and is given by:
SE = TP / (TP + FN) (2)
Specificity is defined as the fraction of nonevents that are correctly rejected and is given by:
SP = TN / (TN + FP) (3)
Accuracy is the fraction of the real events that are correctly detected and the non-events that are correctly rejected, among all events and non-events and is defined as:
ACC = (TP + TN) / (TP + TN + FP + FN) (4)
where,
FP: number of false positive specimens (predicts non-tumor as tumor).
TP: number of true positive specimens (predicts tumor as tumor).
FN: number of false negative specimens (predicts tumor as non-tumor).
TN: number of true negative specimens (predicts non-tumor as non-tumor).
The prevalence is determined from the sensitivity, specificity, and accuracy using the following equation:
Prevalence = (ACC − SP) / (SE − SP) (5)
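These four quantities can be computed directly from the confusion counts, as in the short sketch below (treating tumor as the positive class); the function name is illustrative.

def metrics(tp, tn, fp, fn):
    se = tp / (tp + fn)                        # sensitivity, Eq. (2)
    sp = tn / (tn + fp)                        # specificity, Eq. (3)
    acc = (tp + tn) / (tp + tn + fp + fn)      # accuracy, Eq. (4)
    prevalence = (acc - sp) / (se - sp)        # prevalence, Eq. (5)
    return se, sp, acc, prevalence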
The calculated SE, SP, ACC, and prevalence are given in Table 3.
Figure 9. Traces of accuracy during training at level 2 decomposition.
Table 3. Performance metrics of the proposed system.
Table 3 shows that the proposed system produces high sensitivity and specificity values, indicating that the system is robust and reliable.
5. CONCLUSION
In this paper, a novel approach to the classification of lung cancer using a deep neural network is presented and developed. Apart from segmentation, most systems currently proposed in the literature present the input lung image to the CNN classifier without any preprocessing. The proposed system preprocesses the lung image with Wavelet decomposition before applying the resulting coefficient matrix to the CNN classifier. Wavelet decomposition greatly reduces the dimensions of the input image, which, in turn, simplifies the work of the CNN classifier.
The proposed system classifies the input lung image into one of four classes: LSCC, LUAD, SCLC, and Benign (non-cancerous). The lung images are CT scans taken from the LIDC database. To show the validity and robustness of the proposed system, its performance is compared to that of the SVM classifier; both the proposed WCNN system and the SVM classifier received the same feature vectors as input.
Experimental tests on the LIDC database achieved a recognition accuracy of 99.5% using a decomposition level of two and the db1 wavelet. Simulation results show that the proposed system consistently produces higher success rates than the SVM system.