A Method of Using Information Entropy of an Image as an Effective Feature for Computer-Aided Diagnostic Applications

Computer-aided detection and diagnosis (CAD) systems are increasingly being used as an aid by clinicians for detection and interpretation of diseases. In general, a CAD system employs a classifier to detect or distinguish between abnormal and normal tissues on images. In the phase of classification, a set of image features and/or texture features extracted from the images are commonly used. In this article, we investigated the characteristic of the output entropy of an image and demonstrated the usefulness of the output entropy acting as a texture feature in CAD systems. In order to validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, i.e., mean and standard deviation were used for comparison. The database used in this study comprised 50 CT images obtained from 10 patients with pulmonary nodules, and 50 CT images obtained from 5 normal subjects. We used a support vector machine for classification. A leave-one-out method was employed for training and classification. Three combinations of texture features, i.e., mean and entropy, standard deviation and entropy, and standard deviation and mean were used as the inputs to the classifier. Three different regions of interest (ROI) sizes, i.e., 11 × 11, 9 × 9 and 7 × 7 pixels from the database were selected for computation of the feature values. Our experimental results show that the combination of entropy and standard deviation is significantly better than both the combination of mean and entropy and that of standard deviation and mean in the case of the ROI size of 11 × 11 pixels (p < 0.05). These results suggest that information entropy of an image can be used as an effective feature for CAD applications.


Introduction
Computer-aided detection and diagnosis (CAD) systems are increasingly being used as an aid by clinicians for detection and interpretation of diseases [1]. So far, a number of CAD systems have been approved for clinical use and many more are currently under development. In general, a CAD system employs a classifier to detect or distinguish between abnormal and normal tissues in images. In the classification phase, a set of image features and/or texture features extracted from the images is used. The performance of a CAD system largely depends on the determination and selection of the features relevant to a specific application. Therefore, feature determination and feature selection are very important tasks in the development of CAD [2].
Mutual information (MI) has been used as a similarity metric in image registration and template matching [3]-[5] and as a feature selection criterion in CAD [6] [7]. Recently, MI has also been used as a measure for the evaluation of medical imaging systems [8]-[10]. In image-quality assessment studies, MI expresses the amount of information that an output image Y contains about an input object X. It is obtained from MI(X; Y) = H(X) + H(Y) − H(X, Y), where H(X), H(Y), and H(X, Y) are the input entropy, output entropy, and joint entropy, respectively. The higher the MI value, the better the image quality [9]. Through our studies of MI, we noticed that the amount of output entropy greatly affects the MI value and might serve as an effective texture feature of an image. This finding motivated a further investigation of the behavior of the output entropy of an image.
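The relation MI(X; Y) = H(X) + H(Y) − H(X, Y) can be illustrated with a minimal sketch. The function names and the toy sequences below are assumptions introduced only for illustration; the entropies are estimated from simple value counts, in bits, which may differ from the estimation procedure used in [8]-[10].

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits from a collection of positive counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def mutual_information(x, y):
    """MI(X; Y) = H(X) + H(Y) - H(X, Y) for two equal-length discrete sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    h_x = entropy(Counter(x).values())           # input entropy
    h_y = entropy(Counter(y).values())           # output entropy
    h_xy = entropy(Counter(zip(x, y)).values())  # joint entropy
    return h_x + h_y - h_xy


# Hypothetical toy data: input values and the output values observed for them.
x = [0, 0, 1, 1, 2, 2, 3, 3]
y_identical = list(x)       # output reproduces the input exactly
y_constant = [5] * len(x)   # output carries no information about the input

print(mutual_information(x, y_identical))  # equals H(X), i.e. 2 bits
print(mutual_information(x, y_constant))   # 0 bits
```

In the second case the output entropy H(Y) is zero, so the MI is zero as well, which shows how directly the output entropy bounds the MI value.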
The purpose of this study is to investigate the characteristics of the output entropy of an image and to demonstrate the usefulness of the output entropy as a texture feature in CAD systems. The ultimate goal is to develop a CAD system that utilizes relevant and powerful texture features. In order to validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, i.e., mean and standard deviation, were used for comparison. In the present study, we employed a support vector machine as the classifier. A leave-one-out method was used for training and classification. Three combinations of texture features, i.e., mean and entropy, standard deviation and entropy, and standard deviation and mean, were used as the inputs to the classifier. We evaluated the performance of the three combinations in terms of sensitivity, specificity and accuracy.

Entropy of an Image
If a set of events $s_1, \dots, s_m$ has probabilities $\{p_1, p_2, \dots, p_m\}$, respectively, then the Shannon entropy $H$ can be expressed as

$$H = -\sum_{i=1}^{m} p_i \log_2 p_i \tag{1}$$

Consider an image in which every pixel has a unique output belonging to one of the various pixel values or gray levels. Then the frequency of each different pixel value can be obtained. We can rewrite Equation (1) as follows:

$$H = -\sum_{j} \frac{n_j}{n} \log_2 \frac{n_j}{n} = \log_2 n - \frac{1}{n} \sum_{j} n_j \log_2 n_j \tag{2}$$

where $n_j$ is the frequency of pixel value $j$ and $n$ is the total frequency (the total number of pixels in the image). For example, suppose the image size is M × N = 25 × 20 = 500, all the pixel values fall into one of seven bins, and the frequencies of the 7 bins are 20, 64, 118, 98, 94, 16 and 90, respectively. Then, the entropy of the image can be calculated using Equation (2) [11]:

$$H = \log_2 500 - \frac{1}{500}\left(20 \log_2 20 + 64 \log_2 64 + 118 \log_2 118 + 98 \log_2 98 + 94 \log_2 94 + 16 \log_2 16 + 90 \log_2 90\right) = 2.575 \tag{3}$$

It is obvious from Equation (3) that the entropy of an image can be estimated from the image histogram, that is, a graph showing the number of pixels in the image at each different pixel value found in that image. Therefore, the amount of entropy of an image largely depends on the distribution and the occurrence of pixel values. Because of this property, we consider that the entropy of an image offers some advantages over texture features such as the mean and standard deviation that are commonly used in CAD applications. In practice, when calculating the entropy of an image, only the diagnostically relevant region of interest in the image is used.
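The worked example can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms (the base that reproduces the stated value of 2.575); the function names are introduced here for illustration only. It evaluates both the probability form and the frequency form of the entropy for the seven bin frequencies given in the text.

```python
import math


def entropy_from_probabilities(counts):
    """H = -sum(p_j * log2(p_j)) with p_j = n_j / n."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)


def entropy_from_frequencies(counts):
    """Equivalent form: H = log2(n) - (1/n) * sum(n_j * log2(n_j))."""
    n = sum(counts)
    return math.log2(n) - sum(c * math.log2(c) for c in counts if c > 0) / n


# Bin frequencies from the 25 x 20 example in the text (they sum to 500).
bin_counts = [20, 64, 118, 98, 94, 16, 90]

h_prob = entropy_from_probabilities(bin_counts)
h_freq = entropy_from_frequencies(bin_counts)
print(round(h_prob, 3), round(h_freq, 3))  # both forms give about 2.575 bits
```

Empty bins are skipped because they contribute nothing to the sum, so the result depends only on how the occupied pixel values are distributed.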

Database and Texture Features
The database used in this study comprised 50 CT images obtained from 10 patients with pulmonary nodules, and 50 CT images obtained from 5 normal subjects. A physician detected the pulmonary nodules in the images of the abnormal cases. All images were acquired with a multidetector-row CT scanner (Hitachi Robusto, Japan). Scanning was performed with a matrix size of 512 × 512; the field of view was 320 mm, the tube voltage and tube current were 120 kV and 175 mAs, respectively, and the section thickness was 5.0 mm. Figure 1 shows a CT image with a pulmonary nodule and a CT image from a normal subject. Extracting effective features from an image is a key characteristic of a successful CAD system [12]. Typical features for medical images include area, shape, texture, gray level, and various statistics. Three texture features, i.e., the entropy, mean and standard deviation of an image, were used in this study for analysis and comparison. The entropy used in this study was obtained directly from the gray levels of an image rather than from a gray-level co-occurrence matrix calculated from the segmented image [13] [14]. The entropy used in this study is therefore a first-order texture feature, whereas the entropy obtained from a co-occurrence matrix is a second-order texture feature. Mean and standard deviation are commonly utilized texture features in CAD systems [12] [15] [16].
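The three first-order features can be sketched for a square ROI as follows. This is an illustrative sketch only: the image is represented as a plain list of rows, the helper names are hypothetical, the standard deviation is taken as the population standard deviation, and the entropy is computed over the distinct gray levels without binning. None of these choices is confirmed by the text.

```python
import math
import statistics
from collections import Counter


def extract_roi(image, top, left, size):
    """Return the pixels of a size x size square ROI as a flat list."""
    rows = image[top:top + size]
    if len(rows) != size or any(len(row) < left + size for row in rows):
        raise ValueError("ROI exceeds image bounds")
    return [value for row in rows for value in row[left:left + size]]


def roi_features(pixels):
    """Mean, standard deviation and first-order entropy (bits) of ROI pixels."""
    n = len(pixels)
    counts = Counter(pixels).values()  # gray-level histogram of the ROI
    entropy = -sum(c / n * math.log2(c / n) for c in counts)
    return {
        "mean": statistics.fmean(pixels),
        "std": statistics.pstdev(pixels),  # population form is an assumption
        "entropy": entropy,
    }


# Hypothetical 16 x 16 image with a repeating diagonal gray-level pattern.
image = [[100 + ((r + c) % 4) * 10 for c in range(16)] for r in range(16)]
roi = extract_roi(image, top=3, left=3, size=9)  # 9 x 9 ROI, as in Figure 1
print(roi_features(roi))
```

Unlike the mean and standard deviation, the entropy ignores the numerical values of the gray levels and depends only on how often each level occurs in the ROI.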

Classification and Evaluation
In this study we used a support vector machine (SVM) for classification. The SVM is a new-generation learning system that has many attractive modeling features [17]-[19]. We employed a Gaussian kernel, the kernel most commonly used in SVM classification. The parameter gamma, which controls the degree of non-linearity of the model, was empirically set at 0.01, and the regularization parameter C, which relates to the cost function for determination of an optimal hyperplane, was empirically set at 16.0. A leave-one-out method was used for training and classification. Three combinations of texture features, i.e., mean and entropy, standard deviation and entropy, and standard deviation and mean, were used for comparison of feature performance. The ROIs for the calculation of image texture features were manually selected by an experienced radiologist with a square box of predetermined size.
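The leave-one-out procedure and the Gaussian kernel can be sketched as below. Several points are assumptions: the kernel is written in the common form exp(−gamma‖a − b‖²), the feature values are invented toy data, and the classifier is a simple kernel-similarity stand-in, not an SVM, so that the sketch stays self-contained. In practice, an SVM implementation configured with a Gaussian kernel, gamma = 0.01 and C = 16.0 would take the place of the stand-in.

```python
import math

GAMMA = 0.01  # kernel parameter reported in the text


def gaussian_kernel(a, b, gamma=GAMMA):
    """exp(-gamma * ||a - b||^2); this parameterization is an assumption."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kernel_mean_classifier(train_x, train_y):
    """Stand-in for the SVM: choose the class with the larger mean similarity."""
    def predict(sample):
        scores = {}
        for label in set(train_y):
            members = [x for x, y in zip(train_x, train_y) if y == label]
            scores[label] = sum(gaussian_kernel(sample, m) for m in members) / len(members)
        return max(scores, key=scores.get)
    return predict


def leave_one_out(samples, labels, fit):
    """Hold out each sample once, train on the rest, and collect the predictions."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predictions.append(fit(train_x, train_y)(samples[i]))
    return predictions


# Hypothetical (standard deviation, entropy) pairs; 1 = nodule, 0 = normal.
samples = [(30.0, 4.8), (28.0, 4.6), (32.0, 5.0), (8.0, 2.1), (10.0, 2.4), (9.0, 2.0)]
labels = [1, 1, 1, 0, 0, 0]
predictions = leave_one_out(samples, labels, kernel_mean_classifier)
print(predictions)
```

With 100 images, the same loop trains 100 models, each on 99 images, and every image is classified exactly once by a model that never saw it.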

Performance Measures
We used sensitivity, specificity and accuracy as performance measures. Sensitivity is the proportion of actual positives (images containing abnormal tissue) that are correctly identified, i.e., the true positive rate. Specificity is the proportion of actual negatives (images in which abnormal tissue is not present) that are correctly identified, i.e., the true negative rate [20] [21]. Classification accuracy is a measure of the overall performance of a CAD system and is defined as the percentage of diagnostic decisions that proved to be correct [20].
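The three measures follow directly from the counts of true and false positives and negatives. The sketch below uses a hypothetical function name and invented labels (1 = abnormal, 0 = normal) purely for illustration.

```python
def performance_measures(actual, predicted, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical labels: 4 abnormal and 6 normal images.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sensitivity, specificity, accuracy = performance_measures(actual, predicted)
print(sensitivity, specificity, accuracy)
```

Here 3 of the 4 abnormal images and 4 of the 6 normal images are classified correctly, giving a sensitivity of 0.75, a specificity of about 0.67 and an accuracy of 0.70.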

Results
Figure 3(a), Figure 3(b) and Figure 3(c) are examples of the distributions of mean versus standard deviation, entropy versus standard deviation, and entropy versus mean, respectively, obtained from the database consisting of 100 images. The three features were obtained from ROIs of 9 × 9 pixels. Table 1 shows the classification performance (sensitivity, specificity and accuracy) of the 3 combinations of texture features for 3 different ROI sizes, i.e., 11 × 11, 9 × 9 and 7 × 7 pixels. The extended Fisher's exact test was performed to compare each pair of the three combinations, i.e., mean and entropy versus standard deviation and entropy, standard deviation and entropy versus standard deviation and mean, and mean and entropy versus standard deviation and mean, for a statistically significant difference. The results showed that the combination of standard deviation and entropy achieved the best classification performance on all three performance measures for each of the three ROI sizes. Table 2 summarizes the results of Fisher's exact test. The table shows that the combination of entropy and standard deviation is significantly better than both the combination of mean and entropy and that of standard deviation and mean for the ROI size of 11 × 11 pixels (p < 0.05). For the ROI sizes of 9 × 9 and 7 × 7 pixels, no significant differences among the three combinations were observed.
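A pairwise comparison of two feature combinations can be sketched with the ordinary two-sided Fisher's exact test on a 2 × 2 table of correct and incorrect classifications. This is a simplification: the text reports the extended Fisher's exact test, the table layout used by the authors is not stated, and the counts below are hypothetical rather than taken from Table 1.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value of Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )


# Hypothetical counts: [correct, incorrect] for two feature combinations.
combination_a = [92, 8]
combination_b = [80, 20]
p_value = fisher_exact_two_sided(*combination_a, *combination_b)
print(p_value, p_value < 0.05)
```

The small relative tolerance in the comparison guards against floating-point ties between tables that have the same probability as the observed table.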

Discussion
There is a limitation to this study. We used only CT images with and without pulmonary nodules for the investigation. Images obtained from other modalities, such as mammography and digital radiography, will be used in further studies.

Conclusion
We investigated the characteristics of the output entropy of an image and demonstrated the usefulness of the output entropy as a texture feature in CAD systems. In order to validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, i.e., mean and standard deviation, were used for comparison. We used a support vector machine for classification. A leave-one-out method was employed for training and classification. Our study showed that the combination of entropy and standard deviation was significantly better than both the combination of mean and entropy and that of standard deviation and mean for the ROI size of 11 × 11 pixels (p < 0.05). These results suggest that the information entropy of an image can be used as an effective feature for CAD applications.

Figure 1(a) and Figure 1(b) are the original images of the lung and the magnified left parts of the lung, respectively.

Figure 2 illustrates the histograms of the region of interest (ROI) images shown in the figure. The ROI of 9 × 9 pixels was extracted and is indicated by a square box in Figure 1(b).

Figure 1. Sample images from the database used in this study. (a) Original images of the lung; (b) left part of the lung extracted from (a). The size of the ROI is 9 × 9 pixels. The top row is an abnormal case (pulmonary nodule) and the bottom row is a normal case. The square boxes shown in (b) are the extracted regions of interest used for calculation of the texture features.

Figure 2. Histograms of the ROI images shown in the upper right corners. The size of the ROI images is 9 × 9 pixels. (a) Abnormal case (nodule) and (b) normal case.

Figure 3. Examples of the distributions of three features. (a) Mean versus standard deviation, (b) entropy versus standard deviation, and (c) entropy versus mean.

Figure 4 and Figure 5 show the characteristics of the output entropy obtained from an image. Figure 4(a) is an image of the hip joint of a human body phantom. The areas marked with squares indicate the ROIs selected for computation of the output entropy. The upper and lower square areas are portions of hard tissue (bone) and soft tissue (muscle), respectively. Figure 4(b) illustrates the relationship between radiation exposure level and output entropy. The graphs in the figure correspond to the two ROIs shown in Figure 4(a). Figure 4(b) indicates that the output entropy decreases as the radiation exposure level increases. It is well known that reducing the radiation exposure level increases the noise level in the image and thereby sacrifices diagnostic image quality. Thus, the output entropy is highly correlated with image quality in terms of image noise. Figure 5(a) and Figure 5(b) are a sharp image and a blurred image obtained from a hand phantom, respectively. Ten ROIs at different positions were selected in each of the sharp and blurred images for calculation of the output entropy. Figure 5(c) shows the output entropies at the corresponding 10 ROI positions for the sharp and blurred images. The figure indicates that the output entropy obtained from the blurred image is greater than that of the sharp image at each corresponding ROI position. The results demonstrate that the output entropy is correlated with image quality in terms of image resolution. These preliminary investigations of the characteristics of the output entropy indicate that the output entropy of an image can be used as an effective feature for CAD schemes.


Figure 4. (a) An image of the hip joint of the human body phantom; (b) relationship between radiation exposure level and output entropy obtained from different ROIs. The square boxes shown in (a) are the extracted regions of interest used for calculation of the texture features.

Figure 5. (a) A sharp image and (b) a blurred image obtained from a hand phantom. (c) Output entropies at the corresponding 10 different ROI positions for the sharp and blurred images.

Table 1. Classification performance of the three combinations of texture features.