Computerized Scheme for Histological Classification of Masses with Architectural Distortions in Ultrasonographic Images

Architectural distortion is an important ultrasonographic indicator of breast cancer. However, it is difficult for clinicians to determine whether a given lesion is malignant because such distortions can be subtle in ultrasonographic images. In this paper, we report on a study to develop a computerized scheme for the histological classification of masses with architectural distortions as a differential diagnosis aid. Our database consisted of 72 ultrasonographic images obtained from 47 patients whose masses had architectural distortions. This included 51 malignant (35 invasive and 16 noninvasive carcinomas) and 21 benign masses. In the proposed method, the location of the masses and the area occupied by them were first determined by an experienced clinician. Fourteen objective features concerning masses with architectural distortions were then extracted automatically by taking into account subjective features commonly used by experienced clinicians to describe such masses. The k-nearest neighbors (k-NN) rule was finally used to distinguish three histological classifications. The proposed method yielded classification accuracy values of 91.4% (32/35) for invasive carcinoma, 75.0% (12/16) for noninvasive carcinoma, and 85.7% (18/21) for benign mass, respectively. The sensitivity and specificity values were 92.2% (47/51) and 85.7% (18/21), respectively. The positive predictive values (PPV) were 88.9% (32/36) for invasive carcinoma and 85.7% (12/14) for noninvasive carcinoma whereas the negative predictive values (NPV) were 81.8% (18/22) for benign mass. Thus, the proposed method can help the differential diagnosis of masses with architectural distortions in ultrasonographic images.


Introduction
Breast cancer is a major health problem for women. In the United States, one in eight women develops breast cancer during her lifetime [1]. It is estimated that about 40,290 women will die of breast cancer in a year [2]. Therefore, early diagnosis and early treatment of breast cancer are very important for reducing the death toll.
Ultrasound is a convenient and safe diagnostic method for distinguishing benign breast lesions from malignant ones. However, ultrasonography is an operator-dependent modality that requires considerable experience. To reduce operator dependency and increase the rate of accurate diagnosis, computer-aided diagnosis (CADx) systems that provide the likelihood of malignancy for masses and calcifications have been developed [3]-[7]. Some investigators have reported that the use of CADx systems reduced operator dependence and improved clinicians' diagnostic accuracy [8] [9].
Architectural distortion, as well as masses and calcifications, is an important indicator of breast cancer in ultrasonographic images [10]-[12]. It is defined in the Breast Imaging Reporting and Data System (BI-RADS) as follows [10]: "The normal architecture of the breast is distorted with no definite mass visible. This includes spiculations radiating from a point and focal retraction or distortion at the edge of the parenchyma." It is difficult for clinicians to distinguish between benign and malignant architectural distortions in ultrasonography because they are often subtle in appearance. The development of CADx systems for architectural distortions in ultrasonography has therefore been desired in clinical practice. To our knowledge, however, no such CADx system has been developed.
A mass is often associated with architectural distortion. To evaluate a mass with architectural distortion in an ultrasonographic image, a feature extraction method is needed that analyzes the features of both the mass and the architectural distortion. In a past study [13], we developed a CADx system that could evaluate the likelihood of malignancy and the histological classification of masses in ultrasonographic images. However, that CADx system did not analyze objective features of architectural distortion. Therefore, it might be possible to improve the classification accuracy of our previous method by adding objective features for architectural distortion.
In this paper, we describe the development of feature extraction methods for architectural distortion in the service of a computerized scheme for the histological classification of masses with such distortion in ultrasonographic images. We then employed a k-nearest neighbors (k-NN) rule along with the extracted objective features to determine the histological classifications of masses with architectural distortions. The classification accuracies were evaluated by applying the proposed method to a test set of 72 masses with architectural distortions in ultrasonographic images.

Materials
Our database consisted of 72 two-dimensional ultrasonographic images obtained from 47 patients at Mie University Hospital. It included 51 malignant masses (35 invasive carcinomas and 16 noninvasive carcinomas) and 21 benign masses with architectural distortion.
The histological classifications of these lesions were made through pathologic diagnosis. The ultrasonographic images were acquired with an ultrasound diagnostic system (APLIO XG SSA-790A, Toshiba Medical Systems Corp.) with a 12-MHz linear-array transducer (PLT-1204AT). The pixel size of each ultrasonographic image was 0.05 mm × 0.05 mm, and each image was quantized using a 256-level grey scale. Figure 1 shows an example of masses with three histological classifications. The size of these images was 20 mm × 17 mm.

Methods
Figure 2 shows a schematic diagram of the proposed method for the histological classification of masses with architectural distortions. The location and shape of the mass were manually determined by an experienced clinician. We then extracted five objective features for architectural distortion and nine objective features for masses defined in our previous study [13]. We finally employed the k-NN rule using the extracted objective features to determine the histological classifications of the masses with architectural distortion.

Segmentation of Mass
For accurate extraction of image features, the locations and shapes of all masses were determined by an experienced clinician.

Extraction of Objective Features
Table 1 shows all 14 objective features that were extracted, consisting of five objective features for architectural distortion and nine objective features for mass [13]. The asterisk indicates that the features were newly defined in this study. Here, the five objective features for architectural distortion are described in detail, whereas the nine objective features for mass are described briefly. To quantify the architectural distortion, we newly defined extraction methods for the retraction (convergence) of a mammary gland (ACI1, ACI2, and ACI3), and extraction methods for spiculations (NumCorners, RatioPMPC). Spiculations are stellate distortions caused by the invasion of cancer into the surrounding tissue [14].
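Average convergence index (ACI1, ACI2, and ACI3)
To obtain the objective feature concerning the convergence of mammary glands, it is necessary to detect linear structures such as mammary glands. Therefore, an ultrasonographic image was first decomposed into several subimages at scales j from 1 to 3 by using a filter bank [15]. Here, assume that the ultrasonographic image $f(x, y)$ was decomposed into the horizontal subimages $W^{H}_{j} f(x, y)$ for the second difference in the vertical direction of the ultrasonographic image, the vertical subimages $W^{V}_{j} f(x, y)$ for the second difference in the horizontal direction of the ultrasonographic image, and the diagonal subimages $W^{D}_{j} f(x, y)$ for the first difference in the vertical direction followed by the first difference in the horizontal direction of the ultrasonographic image. The pixel values of these subimages corresponded to the elements of a Hessian matrix H, which was defined as

$$H = \begin{pmatrix} W^{V}_{j} f(x, y) & W^{D}_{j} f(x, y) \\ W^{D}_{j} f(x, y) & W^{H}_{j} f(x, y) \end{pmatrix}$$

For linear structures, the two eigenvalues $\lambda^{j}_{\mathrm{small}}$ and $\lambda^{j}_{\mathrm{large}}$ ($\lambda^{j}_{\mathrm{small}} < \lambda^{j}_{\mathrm{large}}$) of H must satisfy a condition that compares them with 0 [15]. The enhanced image for linear structures (ELS) was defined on the basis of this condition. Figure 3 shows an example of an image enhanced for linear structures by using the filter bank. The segmented image was then obtained by applying a local gray-level thresholding technique [16] to the ELS. A thinned image was obtained by applying a thinning algorithm [16] to the segmented image.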
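The eigenvalue analysis of the Hessian described above can be illustrated with a minimal sketch. It assumes plain central finite differences in place of the filter-bank subimages of [15], and the function name `hessian_eigenvalues` is hypothetical; it is not the authors' implementation and it does not reproduce the ELS definition.

```python
import math

def hessian_eigenvalues(img):
    """Return per-pixel (lam_small, lam_large) of a finite-difference Hessian.

    Assumption: central second differences stand in for the filter-bank
    subimages used in the paper. Border pixels are left at (0.0, 0.0).
    """
    h, w = len(img), len(img[0])
    out = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Second difference in the horizontal direction.
            fxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
            # Second difference in the vertical direction.
            fyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
            # Mixed difference (vertical followed by horizontal).
            fxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                   - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            # Closed-form eigenvalues of the symmetric matrix [[fxx, fxy], [fxy, fyy]].
            mean = (fxx + fyy) / 2.0
            root = math.sqrt(((fxx - fyy) / 2.0) ** 2 + fxy ** 2)
            out[y][x] = (mean - root, mean + root)
    return out

# A bright one-pixel-wide horizontal line on a dark background.
line_img = [[1.0 if row == 2 else 0.0 for _ in range(5)] for row in range(5)]
lam_small, lam_large = hessian_eigenvalues(line_img)[2][2]
```

On this toy line, one eigenvalue is strongly negative (across the line) and the other is zero (along the line), which is the kind of eigenvalue pattern that a line-enhancement condition can exploit.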
To quantify the concentration of the mammary gland, we computed the convergence index given by Equation (3), in which $\sum_{R}$ was the sum over all line primitives in the concentration mask from R1 to R8 (Figure 4 shows the concentration mask), dist represented the distance between O and Q, dx was the length of line primitive Q, and α referred to the orientation of Q with respect to line OQ. The maximum value of Equation (3) was 1.0 and the minimum value was 0.0. The equation was obtained by modifying Hasegawa's method [17]-[19] to include the value of the enhanced image for linear structures (ELS) and to use a rectangular mask instead of a circular mask. We divided mask R into eight regions Rk (k = 1 to 8) at 45-degree intervals, and computed the convergence index in each region Rk. Masses with architectural distortion in the ultrasonographic images varied in size. Therefore, we computed three values of the average convergence index (ACI1, ACI2, and ACI3) using concentration masks of three sizes: (length1 [pixel], length2 [pixel]) = (36, 180) for ACI1, (42, 210) for ACI2, and (48, 240) for ACI3. These values were empirically determined. ACIi (i = 1, 2, and 3) was defined as the average of the convergence indices of the eight regions Rk computed with the corresponding mask.
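The sketch below shows one form of convergence index that is consistent with the quantities defined above (dist, dx, α, an ELS weight, and a range from 0.0 to 1.0). It assumes a Hasegawa-style concentration ratio in which each line primitive is weighted by its ELS value. The exact form of Equation (3) is not recoverable from the text, so the formula, the primitive layout `(x, y, theta, dx, weight)`, and both function names are assumptions. The rectangular mask geometry is not modelled, and empty sectors are assumed to contribute 0.

```python
import math

def convergence_index(primitives, center):
    """Weighted concentration of line primitives toward `center` (point O).

    Each primitive is (x, y, theta, dx, weight): position, orientation in
    radians, length, and an ELS-like weight. This layout is hypothetical.
    """
    ox, oy = center
    num = den = 0.0
    for x, y, theta, dx, weight in primitives:
        dist = math.hypot(x - ox, y - oy)
        if dist == 0.0:
            continue  # a primitive at O has no defined direction toward O
        # alpha: angle between the primitive and the line joining it to O.
        alpha = theta - math.atan2(y - oy, x - ox)
        num += weight * dx * abs(math.cos(alpha)) / dist
        den += weight * dx / dist
    return num / den if den > 0.0 else 0.0

def average_convergence_index(primitives, center, n_sectors=8):
    """Average the convergence index over sectors of equal angular width."""
    ox, oy = center
    width = 2.0 * math.pi / n_sectors
    sectors = [[] for _ in range(n_sectors)]
    for p in primitives:
        angle = math.atan2(p[1] - oy, p[0] - ox) % (2.0 * math.pi)
        sectors[min(int(angle // width), n_sectors - 1)].append(p)
    return sum(convergence_index(s, center) for s in sectors) / n_sectors

# One radial primitive in the middle of each 45-degree sector.
angles = [math.radians(22.5 + 45.0 * k) for k in range(8)]
radial = [(10 * math.cos(a), 10 * math.sin(a), a, 1.0, 1.0) for a in angles]
aci_radial = average_convergence_index(radial, (0.0, 0.0))
```

Primitives that all point toward O give an index of 1.0, and primitives that are all perpendicular to the line OQ give 0.0, matching the stated maximum and minimum.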

Number of corners of the mass (NumCorners)
The number of corners of the mass (NumCorners) was determined by Chen's method [20]. We first detected edges to obtain a binary edge map and extract contours, as in the curvature scale space (CSS) method. The curvature was then calculated at a fixed low scale for each contour to retain the true corners. We regarded the local maxima of absolute curvature as the corner candidates, and adaptively calculated a threshold according to the mean curvature within a region of support. Round corners were removed by comparing the curvature of the corner candidates with the value of the adaptive threshold. Based on a dynamically recalculated region of support, we calculated the angles of the remaining corner candidates to eliminate false corners. Finally, we considered the end points of the open contours, and marked them as corners unless they were in the proximity of another corner. Figure 5 shows corners in a segmented mass.
Ratio of perimeter of segmented mass to that of a circle with the same area (RatioPMPC)
RatioPMPC was determined by the ratio of the perimeter of the segmented mass to that of a circle with the same area, where P_mass was the perimeter of the segmented mass, and P_circle was the perimeter of the circle with the same area as the segmented mass. Figure 6 shows an example of the segmented mass and the circle.
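As a minimal sketch of the quantities involved, the perimeter of a circle with the same area A as the mass is 2·sqrt(π·A), so the plain ratio P_mass / P_circle can be computed as below. The function name `ratio_pmpc` is hypothetical, and the exact normalization used in the paper is not recoverable from the text, so this plain ratio is an assumption.

```python
import math

def ratio_pmpc(perimeter_mass, area_mass):
    """Plain ratio of the mass perimeter to that of an equal-area circle."""
    # A circle of area A has radius sqrt(A / pi) and perimeter 2 * sqrt(pi * A).
    p_circle = 2.0 * math.sqrt(math.pi * area_mass)
    return perimeter_mass / p_circle

# A perfect circle of radius 5 gives a ratio of 1.
circle_ratio = ratio_pmpc(2.0 * math.pi * 5.0, math.pi * 5.0 ** 2)
# A 4 x 4 square has a longer perimeter than the equal-area circle.
square_ratio = ratio_pmpc(16.0, 16.0)
```

Because a circle has the shortest perimeter for a given area, this plain ratio is at least 1 and grows as the contour becomes more irregular or spiculated.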

Objective features of mass
In past work, we had proposed nine objective features for the histological classification of masses in ultrasonographic images [13]. These features reflected clinicians' subjective impressions based on experience. That method had achieved satisfactory classification performance. Therefore, we used the same objective features in this study: depth-width ratio (D/W), degree of indistinctness along the margin (IndisMargin), homogeneity in internal echoes (HomoEchoes), echo level of internal echoes (InEchoes), echo level of posterior echoes (PostEchoes), circularity measure in mass shape (Circularity), polygon measure in mass shape (Polygon), lobulated shape measure in mass shape (Lobulated), and irregularity measure in mass shape (Irregularity).
... was classified as belonging to the class with the highest voting power. A leave-one-out-by-patient test method was used to train and test the classifier. In this method, the data pertaining to one patient were first selected as the testing dataset, and the data from the remaining patients were used to train the classifier. This procedure was repeated until every patient in our database had been tested once.
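The leave-one-out-by-patient procedure can be sketched as follows. The sketch assumes a plain majority vote among the k nearest neighbours under Euclidean distance. The paper's "voting power" may refer to a weighted vote, which is not modelled here. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # Ties are resolved in favour of the class seen first, i.e. the nearer neighbour.
    return votes.most_common(1)[0][0]

def leave_one_patient_out(features, labels, patient_ids, k=3):
    """Predict each sample with every sample of the same patient held out."""
    predictions = []
    for i, query in enumerate(features):
        keep = [j for j in range(len(features)) if patient_ids[j] != patient_ids[i]]
        predictions.append(knn_predict([features[j] for j in keep],
                                       [labels[j] for j in keep], query, k))
    return predictions

# Toy one-dimensional data: two well-separated classes, several images per patient.
toy_x = [[0.0], [0.2], [0.4], [0.1], [10.0], [10.2], [10.4], [10.1]]
toy_y = ["benign"] * 4 + ["invasive"] * 4
toy_patients = [1, 1, 2, 3, 4, 4, 5, 6]
toy_pred = leave_one_patient_out(toy_x, toy_y, toy_patients, k=3)
```

Holding out all images of a patient at once, rather than a single image, prevents images of the same lesion from appearing in both the training and the testing data.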

Evaluation of Classification Performance
Sensitivity [23], specificity [23], positive predictive value (PPV) [23], and negative predictive value (NPV) [23] were defined as

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \quad \mathrm{Specificity} = \frac{TN}{TN + FP}, \quad \mathrm{PPV} = \frac{TP}{TP + FP}, \quad \mathrm{NPV} = \frac{TN}{TN + FN}$$

where TP (true positive) represented the number of malignant masses correctly identified, TN (true negative) was the number of benign masses correctly identified, FP (false positive) represented the number of benign masses incorrectly identified as malignant, and FN (false negative) was the number of malignant masses incorrectly identified as benign. Sensitivity refers to the ability of the test to identify correctly those patients who have the disease. Specificity refers to the ability of the test to identify correctly those patients who do not have the disease. PPV is the proportion of patients with a positive test result who actually have the disease. NPV is the proportion of patients with a negative test result who are actually free of the disease.
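These definitions can be checked against the counts reported in the abstract. Sensitivity of 47/51 implies TP = 47 and FN = 4, and specificity of 18/21 implies TN = 18 and FP = 3. The helper name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, PPV, and NPV from binary confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Malignant versus benign counts implied by the reported 47/51 and 18/21.
m = diagnostic_metrics(tp=47, tn=18, fp=3, fn=4)
percentages = {name: round(100.0 * value, 1) for name, value in m.items()}
```

The sensitivity, specificity, and NPV computed this way agree with the reported 92.2%, 85.7%, and 81.8%. The paper reports PPV separately for invasive and noninvasive carcinoma, so the pooled PPV returned here is not a figure from the source.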

Results
Figure 7 shows the distribution of 14 objective features obtained from all masses with architectural distortions in our database. These objective features were normalized by using the average value and the standard deviation of each feature obtained from all masses. NumCorners, RatioPMPC, and Irregularity for invasive carcinomas were larger than those for other lesions. On the other hand, IndisMargin and InEchoes for invasive carcinomas were lower than those for other lesions. ACI1, ACI2, and ACI3 for the invasive carcinoma and noninvasive carcinoma were larger than those for benign mass. Circularity for benign mass also was larger than that for invasive carcinoma.
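The normalization by the average value and standard deviation of each feature corresponds to a per-feature z-score, sketched below. The text does not say whether the population or the sample standard deviation was used. The sketch assumes the population form, and the function name `zscore_columns` is hypothetical.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Normalize each feature (column) to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]  # assumption: population standard deviation
    return [[(v - mu) / sd if sd > 0 else 0.0 for v, mu, sd in zip(row, mus, sds)]
            for row in rows]

# Three samples with two features on very different scales.
normalized = zscore_columns([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

Putting all features on a common scale matters for a distance-based classifier such as the k-NN rule, because otherwise features with large numeric ranges would dominate the Euclidean distance.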
Table 2 shows the results of tests for univariate equality of group means. The F-value [24] for NumCorners was larger than that for any other feature. Therefore, NumCorners made the largest contribution to determining the three histological classifications of masses with architectural distortions. The p values for ACI1, ACI2, ACI3, NumCorners, RatioPMPC, IndisMargin, InEchoes, Circularity, and Irregularity were below the significance level (p < 0.05). Therefore, these nine objective features were statistically significant for the histological classification of masses with architectural distortions.
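For a single feature, a test of equality of group means is commonly based on the one-way analysis-of-variance F statistic, the ratio of between-group to within-group mean squares. The sketch below assumes that this is the F-value of [24]. The function name `f_value` and the toy groups are hypothetical, and no p value is computed.

```python
def f_value(groups):
    """One-way ANOVA F statistic for one feature measured in several groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy feature values for three groups (not data from the paper).
f_toy = f_value([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

A larger F-value means that the group means differ more relative to the spread within each group, which is why the feature with the largest F-value is read as the strongest single discriminator.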
The k-NN rule was employed with the nine objective features to distinguish among the three histological classifications. Table 3 shows the results of the distinction of the three histological classifications by use of the classifier based on the k-NN rule with k = 3. The classification accuracy of the proposed method was 91.4% (32/35) for invasive carcinoma, 75.0% (12/16) for noninvasive carcinoma, and 85.7% (18/21) for benign mass. The sensitivity and specificity values were 92.2% (47/51) and 85.7% (18/21), respectively. The positive predictive values (PPV) were 88.9% (32/36) for invasive carcinoma and 85.7% (12/14) for noninvasive carcinoma, whereas the negative predictive value (NPV) was 81.8% (18/22) for benign mass.

Discussion
To investigate the usefulness of the proposed objective features on architectural distortion in terms of classification accuracy, we compared the proposed method with our previous method [13] to assess the histological classification of masses with architectural distortions. We employed the k-NN rule with our previous objective features (D/W, IndisMargin, HomoEchoes, InEchoes, PostEchoes, Circularity, Polygon, Lobulated, and Irregularity) [13]. The classification accuracy of our previous method was 85.7% (30/35) for invasive carcinoma, 31.3% (5/16) for noninvasive carcinoma, and 76.2% (16/21) for benign mass. Here, the value of k in the k-NN rule was 8. The proposed method yielded higher classification accuracy than our previous method. Therefore, the objective features for architectural distortion defined in this study were useful for the histological classification of masses with such distortions.
To investigate its usefulness in terms of classification accuracy, the k-NN rule was compared with the multiple discriminant method (MDM) [13] [21]. In past work [13], we had used the MDM for the histological classification of masses. Table 4 shows the classification accuracies of the k-NN rule and the MDM based on the leave-one-out-by-patient test method. We used k = 3 in the k-NN rule. For inputs to the k-NN rule and the MDM, we used the nine objective features from this paper. The classification accuracies obtained with the k-NN rule were higher than those obtained by the MDM. It is possible that the MDM did not accurately estimate the decision boundary [21] because the number of masses in each histological classification was small (in particular for noninvasive carcinoma). In contrast to the MDM, the k-NN rule does not construct an explicit decision boundary and is based on the distance measure (Euclidean distance) between the test data and the training data. Therefore, in this study, we believe that the k-NN rule was more appropriate than the MDM for the histological classification of masses with architectural distortions.
To investigate the adequacy of the shape of the mask in the convergence index, we compared the proposed method with a circular-mask method that used values for ACI1, ACI2, and ACI3 obtained with a circular convergence mask together with six objective features (NumCorners, RatioPMPC, IndisMargin, InEchoes, Circularity, and Irregularity, which were the same as in the proposed method). The classification accuracies of the circular-mask method were 91.4% (32/35) for invasive carcinoma, 43.8% (7/16) for noninvasive carcinoma, and 85.7% (18/21) for benign mass. The classification accuracy of the proposed method for noninvasive carcinoma (75.0%) was thus higher than that of the circular-mask method. In previous studies [17]-[19], the shape of lesions was approximated by a circle. Thus, it was possible to use the circular concentration mask to compute the convergence index. However, masses in ultrasonographic images vary in shape [11] [25] [26]. Therefore, in this study, we believe that a rectangular mask was more suitable than a circular mask to calculate the convergence index.
We also investigated the change in classification accuracy for the proposed method when k for the k-NN rule varied from 1 to 5. Table 5 shows the results for the three histological divisions in this case.With k = 3, the proposed method yielded the highest classification accuracy.
There are some limitations in our proposed method. The number of histological types used in this study was relatively small. Only three types of masses formed our database. Therefore, we need to expand the database by collecting other types of masses and re-evaluate our proposed method. Furthermore, the regions occupied by the masses were manually traced by an experienced clinician in this study. Manually tracing masses is time-consuming for clinicians in clinical practice.

Conclusion
In this study, we developed a computerized scheme for the histological classification of masses with architectural distortions in ultrasonographic images. Our proposed method was shown to yield high classification accuracy for histological classification, and could be useful as a diagnostic aid in the differential diagnosis of masses with architectural distortions. In future work, we plan to develop an automatic segmentation method for masses in ultrasonographic images.

Figure 2. Schematic diagram of the proposed method to determine the histological classification of masses with architectural distortions in ultrasonographic images.
... of perimeter of segmented mass to that of a circle with the same area
... mass shape
Lobulated: Lobulated shape measure in mass shape
Irregularity: Irregularity measure in mass shape
The asterisk * indicates that the features were newly defined in this study.

... for the second difference in the vertical direction of the ultrasonographic image, the vertical subimages ... for the second difference in the horizontal direction of the ultrasonographic image, and the diagonal subimages ... for the first difference in the vertical direction followed by the first difference in the horizontal direction of the ultrasonographic image. The pixel values of these subimages ...

Figure 3. Example of an image enhanced for linear structures by a filter bank. (a) Original image, (b) Enhanced image for linear structures.

Table 1. Definitions of features and feature codes.

Table 2. Tests for univariate equality of group means.

Table 3. Determination results of three histological classifications using the k-NN rule for k = 3.

Table 4. Comparison of the classification accuracies of the k-NN rule and the MDM.

Table 5. Results for the three histological divisions in this case.