Segmentation Method of Breast Masses on Ultrasonographic Images Using Level Set Method Based on Statistical Model
1. Introduction
It can be difficult for clinicians to determine whether a breast mass is malignant or benign because masses are often obscure at ultrasonography. The positive predictive value of ultrasonography, i.e., the ratio of the number of breast cancers found to the number of biopsies, is typically 10% to 20% [1] [2] [3] [4] [5] . To improve this positive predictive value, many investigators have developed computer-aided diagnosis (CADx) schemes [6] - [12] . In conventional CADx schemes, objective shape features such as shape irregularity, depth-width ratio, and degree of circularity are determined from the segmented mass region, and the likelihood of malignancy of the mass is then evaluated by analyzing those features. Inaccurate segmentation of the mass region therefore leads to inaccurate evaluation of the likelihood of malignancy. For this reason, some investigators have developed computerized segmentation methods for breast masses on ultrasonographic images. Park et al. [13] proposed a computerized segmentation method based on wavelet transformation for solid nodules. Shan et al. [14] developed a completely automatic segmentation method using a region growing technique. These methods analyzed, pixel by pixel, the likelihood that each pixel belonged to a breast mass on the ultrasonographic image. As a result, they occasionally generated holes and isolated points in ultrasonographic images with speckle-pattern noise. To reduce the influence of speckle-pattern noise, it is important to analyze not only local image information but also global information.
A level set method based on an active contour model [15] [16] [17] [18] is one of the region extraction methods that have been widely used for medical images. Chan and Vese proposed a region-based active contour without edges (ACWE) model, which analyzes global information such as the mean intensities of different regions and performs better than other models for regions with weak edges [17] . However, it would be difficult to apply the ACWE model to ultrasonographic images with inhomogeneous intensities because the model assumes that the image is statistically homogeneous.
A Gaussian mixture model (GMM) is known to be an effective statistical method for modeling a complex distribution of image data. In previous studies, the GMM was shown to be useful in color data modeling and human skin color modeling [19] [20] . Therefore, we considered that mass regions on ultrasonographic images could be segmented more accurately by introducing the concept of statistical modeling into a term of the energy function in the level set method.
The purpose of this study was to develop a novel level set method that introduces the concept of a statistical model. In the level set method, an energy function was defined with three energy terms: a region-based term, an edge-based term, and a regularizing term. The concept of a statistical model was introduced through the region-based term. The region of the breast mass was segmented so that the energy based on those terms was minimized. The segmentation performance was evaluated by applying our proposed method to our database, and was compared with that of two conventional segmentation methods.
2. Materials
Our database consisted of 151 ultrasonographic breast images obtained from 151 patients at Mie University Hospital, Tsu, Japan. It included 70 malignant and 81 benign masses. The pathology of each mass was confirmed by pathological diagnosis. The ultrasonographic images were acquired with an ultrasound diagnostic system (APLIO XG SSA-790A, Toshiba Medical Systems Corp.) with a 12-MHz linear-array transducer (PLT-1204AT). The diagnosis of benign cases was confirmed by fine needle aspiration, and the patients were examined again 6 to 12 months after the initial diagnosis. Each ultrasonographic image had a pixel size of 0.05 mm × 0.05 mm and 256 grey levels. The true breast mass regions were determined as the gold standard by the consensus of an experienced clinician and a breast surgeon. Informed consent was obtained from all patients. Institutional review board approval was obtained for this study at Mie University Hospital.
Our database was divided randomly into two datasets A and B for optimizing and evaluating our proposed methods. The dataset A consisted of 31 malignant and 44 benign masses, whereas dataset B consisted of 39 malignant and 37 benign masses.
3. Methods
Figure 1 shows a schematic diagram of our proposed method for the segmentation of the region of breast masses. The contrast of the mass region in the ultrasonographic image was first enhanced using a modified sigmoid function [21] . The region of the breast mass was then segmented by the novel level set method, whose energy function consisted of three energy terms: a region-based term, an edge-based term, and a regularizing term.
Figure 1. Schematic diagram of our proposed method for the segmentation of region of breast masses.
3.1. Contrast Enhancement
In ultrasonographic breast images, the contrast between the mass and the background tissue was often low. We first normalized the input ultrasonographic image (I) using the following equation.
$$NI(x, y) = \frac{I(x, y) - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} \tag{1}$$
Here, I(x, y) was the pixel value at each pixel (x, y). Min and Max were the minimum and maximum pixel values in the input image I. The value range of the normalized image NI was from 0.0 to 1.0. To enhance the contrast, we employed a modified sigmoid function [21] defined by
(2)
CE was the contrast-enhanced image. c was a contrast factor, whereas th was the threshold value. In the modified sigmoid function, the amount of lightening and darkening can be changed by adjusting c and th, which controls the overall contrast enhancement [21] . Kannan et al. reported that the optimal value of th was between 0.3 and 0.5 [21] . In this study, c and th were set to 6.5 and 0.4, respectively.
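The two-stage enhancement can be sketched as follows. This is a minimal illustration, not the authors' implementation: the normalization follows the min-max description above, but the mapping in `sigmoid_enhance` is an ordinary logistic curve used here as an assumed stand-in, since the exact modified sigmoid form of [21] is not reproduced in this text. The function names and the tiny sample image are hypothetical.

```python
import math


def normalize(image):
    """Min-max normalize a 2-D list of pixel values to the range [0.0, 1.0]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


def sigmoid_enhance(norm_image, c=6.5, th=0.4):
    """Assumed logistic mapping (not necessarily the form in [21]).

    Values above th are pushed toward 1.0 and values below th toward 0.0;
    a larger c gives a steeper transition around th.
    """
    return [[1.0 / (1.0 + math.exp(c * (th - v))) for v in row]
            for row in norm_image]


# Hypothetical 8-bit sample image (2 rows x 3 columns).
image = [[0, 64, 128], [192, 255, 102]]
ni = normalize(image)
ce = sigmoid_enhance(ni)  # c = 6.5 and th = 0.4 as in this study
```

With this assumed mapping, a normalized value equal to th maps to 0.5, so th acts as the pivot between darkened and lightened intensities.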
3.2. Energy Function for Level Set Method
The input image I (= CE) was considered as a real positive function defined on the image domain. We defined a closed curve C that partitioned the input image I into an inside area and an outside area. The curve was represented implicitly by a level set function, whose sign distinguished the inside area from the outside area. The energy function of our proposed method was defined by the following equation.
(3)
Here, the three energy terms were a region-based term, an edge-based term, and a regularizing term, and each term was multiplied by its own weight. The details of those energy terms are described in the following sections. An initial contour for the level set method was manually determined as a rectangular contour surrounding the breast mass.
3.2.1. Region-Based Term
By using the probability distributions PF and PB of the inside area and the outside area, respectively, the region-based term was defined as
(4)
Here, PF and PB were given by
(5)
(6)
The quantities appearing in Equations (5) and (6) were defined by the following equations.
(7)
(8)
Equations (7) and (8) involved prior probabilities (spatial probabilities) and likelihoods for the inside area and the outside area, evaluated at the pixel value of each pixel p. The prior probabilities were derived from the distance transform of the initial contour for the level set, so that simple shape information was used as the prior probability. The distance was normalized to the range from 0.0 to 1.0. The prior probabilities were defined by the following equations.
(9)
(10)
Here, the two distance terms were the normalized distances for the inside area and the outside area. The likelihoods were derived from the GMM. To determine the likelihoods, we extracted five features based on the intensities in the input image I: (1) mean value, (2) standard deviation, (3) median value, (4) minimum value, and (5) maximum value. These features are general statistics used in image analysis. The GMM with the five features was obtained by the following equations.
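$$p(\mathbf{x} \mid \Theta) = \sum_{i=1}^{K} \pi_i \, N(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i) \tag{11}$$
$$N(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i) = \frac{1}{(2\pi)^{d/2} \, |\boldsymbol{\Sigma}_i|^{1/2}} \exp\left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^{T} \boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) \right) \tag{12}$$
Here, $\mathbf{x}$ was the feature vector of dimension $d$ ($d = 5$ for the five features) and K was the number of components in the mixture model. $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_i$ were the mean and the covariance of the i-th Gaussian component, whereas $\pi_i$ was the proportion of the i-th normal density in the mixture such that $\sum_{i=1}^{K} \pi_i = 1$. An Expectation-Maximization (EM) algorithm [22] [23] [24] was employed to fit the GMM, i.e., to estimate the parameter set $\Theta = \{\pi_i, \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i\}_{i=1}^{K}$. When given a set of feature vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$, the maximum likelihood estimation of $\Theta$ was defined by
$$\hat{\Theta} = \arg\max_{\Theta} \sum_{n=1}^{N} \log p(\mathbf{x}_n \mid \Theta) \tag{13}$$
The EM algorithm was an iterative method to obtain $\hat{\Theta}$. When given the current estimation of the parameter set, each iteration of the EM algorithm re-estimated the parameter set according to an expectation step (E-step) and a maximization step (M-step) [22] [23] [24] .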
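The E-step and M-step can be illustrated with a small sketch. For brevity it fits a one-dimensional mixture to synthetic intensity-like samples, whereas the method described here uses a five-dimensional feature vector; the initialization scheme, the variance floor, the iteration count, and the synthetic data are all assumptions made for this illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, iterations=50):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    n = len(data)
    ordered = sorted(data)
    # Initialization (an assumption): spread the means over the sorted data.
    means = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[i] * gaussian_pdf(x, means[i], variances[i])
                     for i in range(k)]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step: re-estimate proportions, means, and variances.
        for i in range(k):
            nk = sum(r[i] for r in resp)
            weights[i] = nk / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / nk
            var = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / nk
            variances[i] = max(var, 1e-6)  # floor to keep the density finite
    return weights, means, variances


# Synthetic data: a darker and a brighter intensity population.
rng = random.Random(0)
samples = ([rng.gauss(0.2, 0.05) for _ in range(300)]
           + [rng.gauss(0.7, 0.08) for _ in range(200)])
weights, means, variances = fit_gmm_1d(samples, k=2)
```

The estimated proportions sum to one by construction, which corresponds to the constraint on the mixture proportions stated above.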
3.2.2. Edge-Based Term
The edge-based term was defined as
(14)
g was an edge indicator which was determined by
(15)
Here, PMD was the anisotropic diffusion kernel (Perona and Malik Diffusion) [25] . The PMD was defined as
(16)
In Equation (16), s denoted the pixel position, whereas p was a neighboring pixel. The update used the pixel value at pixel position s and iteration t (time step), the spatial neighborhood of pixel position s, the number of neighbors, and a scalar determining the rate of diffusion. The image gradient (magnitude) was determined by
(17)
An edge stopping function was also given as
(18)
Here, L was a positive constant. When compared with a Gaussian filter [26] , the anisotropic diffusion kernel could smooth the image while preserving its brightness discontinuities [25] [27] .
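The edge-preserving behavior of anisotropic diffusion can be sketched as follows. This is an illustrative 4-neighbor implementation under stated assumptions: the exponential edge-stopping function, the fixed update rate of 0.25, and the sample image are choices made for this sketch and may differ from Equations (16) to (18).

```python
import math


def edge_stopping(gradient, L):
    """Assumed edge-stopping function: near 1 for small gradients, near 0 for large ones."""
    return math.exp(-(gradient / L) ** 2)


def perona_malik(image, iterations=10, L=5.0, rate=0.25):
    """Smooth a 2-D list of pixel values with 4-neighbor anisotropic diffusion."""
    rows, cols = len(image), len(image[0])
    current = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                flux = 0.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        diff = current[nr][nc] - current[r][c]
                        # Large differences (edges) contribute almost no flux.
                        flux += edge_stopping(diff, L) * diff
                updated[r][c] = current[r][c] + rate * flux
        current = updated
    return current


# Hypothetical image: a noisy dark region (left) next to a noisy bright region (right).
noisy = [[10, 12, 10, 100, 100],
         [10, 10, 10, 100, 102],
         [12, 10, 10, 100, 100]]
smoothed = perona_malik(noisy, iterations=10, L=5.0)
```

In this sketch the small fluctuations inside each region are averaged out, while the large step between the two regions is left almost untouched because the edge-stopping function is close to zero there.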
3.2.3. Regularizing Term
The regularizing term was defined as
(19)
This term prevented the final contour from converging to a small area because of noise such as a speckle pattern. This term could also prevent over-segmentation [28] .
3.3. Segmentation of Mass
In order to minimize the energy function E(C) mentioned above, the level set function was introduced into the energy function E(C). The curve C was defined by the zero level set [28] .
(20)
H was a regularized Heaviside function which was given as
(21)
Here, the regularization parameter of H was a small positive constant [15] [16] [17] [18] . The derivative of H was given as [29]
(22)
which was a regularized Dirac delta function [15] [16] [17] [18] .
The gradient flow was derived as the following equations.
(23)
(24)
The evolution started from the initial level set function, which was defined by
(25)
Here, c0 was a constant, and the initial region was a subset of the image domain together with its boundary. The steps of our proposed method are summarized as follows:
1. Initialize the level set function using Equation (25).
2. Determine the gradient and the probability distributions of the inner and outer regions.
3. Update the level set function from its value at the previous iteration.
4. Check the convergence of the level set function; if it has not reached a steady state, return to step 2 and continue the evolution.
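The four steps above can be sketched as a generic iteration loop. This is only an illustration of the control flow: the sign convention of the initial function (+c0 inside, -c0 outside), the clipping of the level set function to [-c0, c0], the time step, and the simplified mean-based region force are all assumptions of this sketch. The region force stands in for the proposed energy, which combines the region-based, edge-based, and regularizing terms.

```python
def init_level_set(rows, cols, rect, c0=1.0):
    """Step 1: binary step function from a rectangle (sign convention assumed)."""
    top, left, bottom, right = rect  # inclusive pixel bounds
    return [[c0 if top <= r <= bottom and left <= c <= right else -c0
             for c in range(cols)] for r in range(rows)]


def region_force(image, phi):
    """Step 2 (simplified): positive where a pixel is closer to the inside mean."""
    inside = [v for irow, prow in zip(image, phi)
              for v, p in zip(irow, prow) if p > 0]
    outside = [v for irow, prow in zip(image, phi)
               for v, p in zip(irow, prow) if p <= 0]
    mean_in = sum(inside) / len(inside) if inside else 0.0
    mean_out = sum(outside) / len(outside) if outside else 0.0
    return [[(v - mean_out) ** 2 - (v - mean_in) ** 2 for v in row]
            for row in image]


def evolve(image, rect, c0=1.0, dt=10.0, tol=1e-9, max_iter=100):
    """Run steps 1 to 4 and return the binary mask and the iterations used."""
    rows, cols = len(image), len(image[0])
    phi = init_level_set(rows, cols, rect, c0)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        force = region_force(image, phi)
        # Step 3: explicit update, clipped so that a steady state exists.
        new_phi = [[max(-c0, min(c0, p + dt * f)) for p, f in zip(prow, frow)]
                   for prow, frow in zip(phi, force)]
        change = max(abs(a - b) for arow, brow in zip(new_phi, phi)
                     for a, b in zip(arow, brow))
        phi = new_phi
        if change < tol:  # Step 4: steady state reached
            break
    mask = [[1 if p > 0 else 0 for p in row] for row in phi]
    return mask, iteration


# Hypothetical 7 x 7 image: a dark 3 x 3 mass on a brighter background.
image = [[0.2 if 2 <= r <= 4 and 2 <= c <= 4 else 0.8 for c in range(7)]
         for r in range(7)]
mask, iterations = evolve(image, rect=(1, 1, 5, 5))
```

Starting from a rectangle that is slightly larger than the dark region, the contour in this toy example shrinks until the mask covers only the dark pixels.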
3.4. Evaluation of Shape Accuracy on Segmented Mass
To evaluate the segmentation performance of our proposed method, we measured three error metrics, which were a true positive (TP) ratio, a false positive (FP) ratio, and a Jaccard similarity (JS) [30] . The TP ratio, the FP ratio, and the JS were determined by
$$\mathrm{TP} = \frac{|A_s \cap A_g|}{|A_g|} \tag{26}$$
$$\mathrm{FP} = \frac{|A_s \setminus A_g|}{|A_s|} \tag{27}$$
and
$$\mathrm{JS} = \frac{|A_s \cap A_g|}{|A_s \cup A_g|} \tag{28}$$
Here, $A_s$ was the breast mass region segmented automatically by the algorithm, whereas $A_g$ was the true mass region determined manually as the gold standard. The TP ratio was defined as the ratio of the overlapping area between the segmented region and the gold standard region to the area of the gold standard region. On the other hand, the FP ratio was defined as the ratio of the overlapping area between the segmented region and the non-gold standard region to the area of the segmented region. The JS was given by the ratio of the overlapping area to the area of the union of the segmented region and the gold standard region. We also measured a Dice similarity coefficient (DSC) [31] to evaluate the accuracy of the segmentation method. The DSC was defined as
$$\mathrm{DSC} = \frac{2\,|A_s \cap A_g|}{|A_s| + |A_g|} \tag{29}$$
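The four measures can be computed directly from two pixel sets, as in the following sketch. It follows the verbal definitions given above; the function name and the small example regions are hypothetical.

```python
def segmentation_metrics(segmented, gold):
    """Return (TP ratio, FP ratio, Jaccard similarity, Dice coefficient).

    Both arguments are non-empty sets of (row, column) pixel coordinates.
    """
    overlap = len(segmented & gold)
    tp = overlap / len(gold)                        # overlap relative to gold standard
    fp = len(segmented - gold) / len(segmented)     # excess relative to segmented region
    js = overlap / len(segmented | gold)            # intersection over union
    dsc = 2.0 * overlap / (len(segmented) + len(gold))
    return tp, fp, js, dsc


# Hypothetical regions: a 4 x 4 gold standard and a result shifted by one column.
gold = {(r, c) for r in range(4) for c in range(4)}
segmented = {(r, c) for r in range(4) for c in range(1, 5)}
tp, fp, js, dsc = segmentation_metrics(segmented, gold)
```

For a single pair of regions, the two overlap measures are related by DSC = 2 JS / (1 + JS), which gives a quick consistency check on reported values.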
The segmentation performance of our proposed method with the parameters optimized for dataset A was evaluated on dataset B, whereas that with the parameters optimized for dataset B was evaluated on dataset A. The parameters for the level set method were α1 and α2 for the region-based term, β for the edge-based term, γ for the regularizing term, c0 for the level set function, and the number of iterations and L for the anisotropic diffusion kernel. Here, α1, α2, β, and γ were varied from 0.5 to 2.0, and c0 was varied from 1.0 to 3.0. The number of iterations for the anisotropic diffusion kernel was varied from 5 to 30, whereas L was varied from 5 to 10.
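A parameter search of this kind can be sketched as an exhaustive grid search. The sketch below is hypothetical in several respects: the text gives only the parameter ranges, so the step sizes in `grid` are placeholders, and `toy_score` is a stand-in for running the segmentation on the optimization dataset and averaging an accuracy measure, since the criterion that was optimized is not stated here.

```python
from itertools import product


def grid_search(score, grid):
    """Return the parameter dictionary that maximizes score(params) over the grid."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Placeholder grid: ranges follow the text, step sizes are assumptions.
grid = {
    "alpha1": [0.5, 1.0, 1.5, 2.0],
    "c0": [1.0, 2.0, 3.0],
}


def toy_score(params):
    """Hypothetical score that peaks at alpha1 = 0.5 and c0 = 1.0."""
    return -((params["alpha1"] - 0.5) ** 2 + (params["c0"] - 1.0) ** 2)


best_params, best_score = grid_search(toy_score, grid)
```

In the cross-evaluation scheme described above, the parameters selected on one dataset would then be fixed and applied unchanged to the other dataset.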
3.5. Results
Table 1 shows the optimized parameters for each dataset. The optimized parameters for dataset A were 0.5 for α1, 1.0 for α2, 0.5 for β, 1.0 for γ, 1.0 for c0, 10 for the number of iterations of the anisotropic diffusion kernel, and 5 for L. Those for dataset B were 1.0 for α1, 1.0 for α2, 1.0 for β, 0.5 for γ, 1.0 for c0, 10 for the number of iterations of the anisotropic diffusion kernel, and 5 for L.
Table 2 shows the segmentation accuracies of our proposed method when the parameters optimized for one dataset were applied to the other dataset. When the proposed method optimized for dataset A was applied to dataset B, the TP ratio, FP ratio, JS, and DSC were 92.2%, 8.9%, 84.4%, and 91.5%, respectively. When the proposed method optimized for dataset B was applied to dataset A, they were 92.1%, 9.4%, 83.9%, and 91.2%, respectively. There were no substantial differences in segmentation accuracy between datasets A and B. With our proposed method, the average TP ratio, average FP ratio, average JS, and average DSC over datasets A and B were 92.2%, 9.1%, 84.2%, and 91.3%, respectively. Figure 2 shows an example of a mass region segmented by our proposed method.
Table 1. Optimized parameters for each subset.
α1, α2: parameters of the region-based term; β: parameter of the edge-based term; γ: parameter of the regularizing term; c0: constant of the initial level set function; L: constant of the anisotropic diffusion kernel.
Table 2. Segmentation accuracies of our proposed method with the optimized parameters for another dataset.
Figure 2. Example of segmented mass region by our proposed method: (a) original ultrasonographic image; (b) enhanced contrast image; (c) initial contour; (d) segmented mass by our proposed method; (e) gold standard region.
4. Discussion
To investigate the usefulness of our proposed method, we compared its segmentation performance with that of the ACWE model [17] and that of a level set method based on a signed pressure force function model (SPF model) [32] (see Appendix).
In the same way as for our proposed method, the ACWE model and the SPF model were optimized for each of datasets A and B. The segmentation performance of each model was then evaluated by applying it to the other dataset, which was not used for optimization.
Figure 3 shows the mean values and the standard deviations of the TP ratio, FP ratio, JS, and DSC for our proposed method, the ACWE model, and the SPF model. P values were obtained with a t-test. The average TP ratio for our proposed method (92.2%) was significantly greater than that for the ACWE model (83.5%, P < 0.001). The average TP ratio for the SPF model (93.4%) was slightly higher than that for our proposed method (P = 0.011). However, the average FP ratio for our proposed method was 9.1%, a significant improvement over the ACWE model (36.1%, P < 0.001) and the SPF model (25.0%, P < 0.001). A higher average FP ratio indicates over-segmentation. The average JS was also greater with our proposed method (84.2%) than with the ACWE model (55.2%, P < 0.001) and the SPF model (71.0%, P < 0.001). Although the SPF model exhibited a significantly improved average DSC as compared with the ACWE model (82.2% vs. 65.8%, P < 0.001), our proposed method achieved a further improvement in average DSC (91.3%, P < 0.001 compared with the SPF model). These results suggest that our proposed method can segment masses more accurately than either the ACWE model or the SPF model.
Figure 4 shows the results of the segmented mass by our proposed method, the ACWE model, and the SPF model with the same initial contour. For a malignant breast mass with inhomogeneous intensity and an unclear boundary, the segmented regions with the ACWE model and the SPF model included a part of background tissue incorrectly as mass, as shown in Figure 4(c) and Figure 4(d).
Figure 3. Mean values and the standard deviations for average TP ratios, average FP ratios, average JSs, and average DSCs in each of our proposed method, the ACWE model, and the SPF model: (a) average True Positive (TP) ratios; (b) average False Positive (FP) ratios; (c) average Jaccard Similarities (JS); and (d) average Dice Similarity Coefficients (DSC). “*” means a statistical difference with a p-value less than 0.001.
Figure 4. Results of the segmented mass by our proposed method, the ACWE model, and the SPF model with the same initial contour: (a) original ultrasonographic image; (b) initial contour; (c) segmented image using the active contour without edges model (ACWE); (d) segmented image using the signed pressure force function model (SPF); (e) segmented image using our proposed method, and (f) gold standard region.
On the other hand, our proposed method could correctly extract the fine shape of the mass edge, as shown in Figure 4(e).
When compared with the two conventional models, our proposed method could accurately segment even a breast mass with an obscure boundary and inhomogeneous internal intensities. Our proposed method used not only local information such as edges but also global information such as image statistics to control the evolution of the closed curve. Therefore, we believe that our proposed method could reduce the influence of noise and inhomogeneous intensities. The energy function of our proposed method included not only the regularizing term but also the spatial probability obtained by the distance transform (in the region-based term). We consider that the spatial probability prevented inaccurate expansion of the segmented region for masses with an obscure boundary.
In this study, the initial contour for the level set method was set manually. This would be a limitation in clinical practice, because setting the initial contour manually can be tedious for clinicians. Therefore, in a further study we need to develop an automated algorithm that detects the mass region and sets the initial contour.
5. Conclusion
In this study, we developed a computerized segmentation method for breast mass on ultrasonographic image by introducing the concept of statistical model to a level set method. In our proposed level set method, the energy function consisted of a region-based term, edge-based term, and regularizing term. By using the novel energy function, our proposed method was shown to have higher segmentation accuracy than either of the conventional models.
Acknowledgements
This work was supported (in part) by JSPS Grant-in-Aid for Scientific Research on Innovative Areas (Multidisciplinary Computational Anatomy), JSPS KAKENHI Grant Number 15H01118.
Consideration
The views expressed in this article do not reflect the official position of Mizuho Information & Research Institute, Inc. Any errors in this article are attributable to the authors.
Appendix
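The level set evolution in the active contour without edges (ACWE) model [17] was defined as
$$\frac{\partial \phi}{\partial t} = \delta(\phi) \left[ \mu \, \nabla \cdot \left( \frac{\nabla \phi}{|\nabla \phi|} \right) - \nu - \lambda_1 \left( I(x) - c_1 \right)^2 + \lambda_2 \left( I(x) - c_2 \right)^2 \right] \tag{30}$$
where $\mu \geq 0$, $\nu \geq 0$, $\lambda_1 > 0$, and $\lambda_2 > 0$ were fixed parameters, and $I(x)$ was the grey level at pixel $x$. $\nabla$ was the gradient operator and $\delta(\phi)$ was the Dirac function. The mean values inside and outside the curve were defined by
$$c_1 = \frac{\int I(x) \, H(\phi) \, dx}{\int H(\phi) \, dx} \tag{31}$$
$$c_2 = \frac{\int I(x) \, \left( 1 - H(\phi) \right) dx}{\int \left( 1 - H(\phi) \right) dx} \tag{32}$$
where $H(\phi)$ was a Heaviside function.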
The level set evolution in a region based signed pressure force model (SPF) [32] also was defined as
(33)
(34)
where was the balloon force parameter.