Super-Resolution Imaging of Mammograms Based on the Super-Resolution Convolutional Neural Network
1. Introduction
Breast cancer is the most frequently diagnosed cancer and the second leading cause of cancer deaths among women in the United States [1]. An estimated 252,710 new cases of breast cancer in women were expected to be diagnosed in the United States in 2017 [1].
Despite the emergence of new imaging technologies, such as breast MRI, breast ultrasound, and 2-[18F] fluoro-2-deoxyglucose positron-emission mammography (FDG-PEM), mammography is still the recommended method for breast cancer screening by organizations such as the American Cancer Society [2], the United States Preventive Services Task Force [3], and the American College of Radiology [4]. Many studies have demonstrated that mammography is the only screening method shown to reduce breast cancer mortality [5] [6] [7] [8].
In breast imaging, resolution is an important factor for diagnosing lesions on digital or digitized mammograms [9] [10] [11] [12]. In recent years, 4K and 8K diagnostic displays have been introduced; in particular, high-resolution displays are currently favored for mammography [13]. Magnification mammography is commonly used to increase resolution and decrease noise [14]. However, it requires an additional exposure and delivers a higher radiation dose, because of the shorter distance between the breast and the X-ray source during the examination [15]. Therefore, it is desirable to obtain high-resolution mammograms without additional radiation exposure.
The simplest way to obtain a high-resolution image is to use a linear interpolation method, such as nearest neighbor or bilinear interpolation. Image interpolation methods aim to produce a high-resolution image by upsampling a low-resolution image. Such methods are commonly used because they are computationally fast [16]. However, conventional image interpolation methods often produce over-smoothed images, with artifacts such as aliasing, blur, and halos around the edges [17].
Learning-based super-resolution image processing has attracted much interest as a way of obtaining high-resolution images by post-processing. Such an approach can reduce the artifacts produced by image interpolation methods. Super-resolution is the process of estimating a high-resolution image from a low-resolution image, and it can increase image resolution without requiring alterations to the existing imaging hardware. Therefore, it is desirable to apply super-resolution imaging to medical imaging [18]. A recent study proposed that the application of super-resolution imaging to mammograms can generate high-resolution mammograms without the need for additional radiation exposure [19].
Deep convolutional neural networks (DCNNs), a form of deep learning, have revolutionized the field of computer vision by demonstrating state-of-the-art performance in many classification tasks. Moreover, DCNNs outperform conventional machine-learning-based approaches in medical image classification tasks [20] [21]. DCNNs have also been applied successfully to image enhancement tasks, such as denoising [22], deblurring [23], and super-resolution. The super-resolution convolutional neural network (SRCNN) [24] [25], one of the deep-learning-based super-resolution methods, was proposed in the field of computer vision. The SRCNN scheme achieves superior performance over the existing learning-based super-resolution methods [24] [25]. We previously demonstrated that the SRCNN scheme has potential as an effective and robust approach for improving magnified images in chest radiographs [26] [27]. However, its potential in mammograms has not yet been evaluated.
In this study, we applied the SRCNN scheme to mammograms and evaluated its ability to enhance image resolution. In addition, we investigated the relationship between super-resolution image quality and the clinical features of the breast or lesion.
2. Materials and Methods
2.1. Materials
A total of 711 mediolateral oblique (MLO) images were sampled from the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) [28] [29]. The DDSM [30] is a publicly available open-access database of digitized film-screen mammograms for use by the mammographic image analysis research community. The CBIS-DDSM database includes a subset of the DDSM dataset, selected and curated by a trained mammographer. The 711 sampled MLO images comprised 370 benign and 341 malignant cases with verified pathology information.
2.2. Super-Resolution Image Reconstruction
In this study, we used the SRCNN method [24] [25] as a super-resolution image reconstruction scheme. This deep-learning-based super-resolution scheme can directly learn end-to-end mapping between low-resolution and high-resolution images.
The architecture of the SRCNN scheme that we used in this study is shown in Figure 1. The first step is upscaling to the desired size using bicubic interpolation. The second step is patch extraction and representation, which extracts patches from the resulting bicubic-interpolated low-resolution image. This step is formulated as follows:

$$F_1(Y) = \max(0,\, W_1 * Y + B_1) \tag{1}$$

where $F_1$, $Y$, $W_1$, and $B_1$ represent the mapping function, the bicubic-interpolated low-resolution input image, the filters (with a size of 9 × 9), and the biases, respectively, and $*$ denotes convolution. The third step involves non-linear mapping, which maps the 64-dimensional feature vectors non-linearly to another set of 32-dimensional feature vectors, called the high-resolution features. The operation of the third step is formulated as follows:

$$F_2(Y) = \max(0,\, W_2 * F_1(Y) + B_2) \tag{2}$$

where $W_2$ denotes the filters (with a size of 1 × 1), and $B_2$ is the 32-dimensional bias vector. The final step involves reconstruction, which aggregates the high-resolution features to generate the final high-resolution image $F(Y)$. The operation of the last step is formulated as follows:

$$F(Y) = W_3 * F_2(Y) + B_3 \tag{3}$$

Figure 1. Overview of the super-resolution reconstruction scheme based on the super-resolution convolutional neural network.
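The three steps described above (patch extraction, non-linear mapping, and reconstruction) can be sketched as a forward pass in NumPy. This is a minimal illustration rather than the implementation used in the study: the helper names `conv2d` and `srcnn_forward` are hypothetical, the weights are random and untrained, edge padding is assumed so that the output keeps the input size, and the 5 × 5 reconstruction filter is an assumption because the text does not state its size.

```python
import numpy as np

def conv2d(x, weights, bias):
    """Same-size 2-D convolution (cross-correlation form, edge padding assumed).

    x: (C_in, H, W), weights: (C_out, C_in, k, k), bias: (C_out,)
    """
    c_out, c_in, k, _ = weights.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    _, h, w = x.shape
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + w]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", weights[:, :, i, j], patch)
    return out + bias[:, None, None]

def srcnn_forward(y, params):
    """Two ReLU layers followed by a linear reconstruction layer."""
    (w1, b1), (w2, b2), (w3, b3) = params
    f1 = np.maximum(0.0, conv2d(y, w1, b1))   # patch extraction: 64 feature maps
    f2 = np.maximum(0.0, conv2d(f1, w2, b2))  # non-linear mapping: 32 feature maps
    return conv2d(f2, w3, b3)                 # reconstruction: 1 output image

rng = np.random.default_rng(0)

def init(c_out, c_in, k):
    """Random untrained filters and zero biases, for illustration only."""
    return rng.normal(0.0, 1e-3, (c_out, c_in, k, k)), np.zeros(c_out)

# 9x9 and 1x1 follow the text; the 5x5 reconstruction filter is an assumption.
params = [init(64, 1, 9), init(32, 64, 1), init(1, 32, 5)]
y = rng.random((1, 32, 32))  # stand-in for a bicubic-upscaled ROI
sr = srcnn_forward(y, params)
print(sr.shape)  # (1, 32, 32)
```

Because the bicubic upscaling happens before the network, every layer operates at the target resolution, which is why the output has the same size as the input.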
The SRCNN scheme can be divided into a training phase and a testing phase. In the training phase, the mapping function $F$ requires the estimation of the network parameters $\Theta = \{W_1, W_2, W_3, B_1, B_2, B_3\}$. Let us define the reconstructed images as $F(Y; \Theta)$ and the ground-truth high-resolution image as $X$. The loss function $L(\Theta)$ is formulated as follows:

$$L(\Theta) = \frac{1}{n} \sum_{i=1}^{n} \left\| F(Y_i; \Theta) - X_i \right\|^2 \tag{4}$$

where $n$ is the number of training images, $\{X_i\}$ is a set of high-resolution images, and the set of their corresponding low-resolution images is $\{Y_i\}$. The loss function was minimized using a stochastic gradient descent algorithm. Finally, the trained SRCNN models, which were trained using the MLO image dataset, were obtained. In the testing phase, the super-resolution image was reconstructed from a low-resolution input image using the trained SRCNN model.
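To make the training objective concrete, the sketch below evaluates a mean-squared-error loss and minimizes it with stochastic gradient descent. It is only an illustration under strong assumptions: the network is replaced by a hypothetical one-parameter model F(Y; theta) = theta * Y, the image pairs are synthetic, and the learning rate and epoch count are arbitrary choices, not values from the study.

```python
import numpy as np

def loss(theta, lows, highs):
    """Mean over image pairs of the squared L2 distance, with toy F = theta * Y."""
    return sum(np.sum((theta * y - x) ** 2) for y, x in zip(lows, highs)) / len(lows)

def sgd_step(theta, y, x, lr):
    """One stochastic update using the gradient from a single (Y_i, X_i) pair."""
    grad = 2.0 * np.sum((theta * y - x) * y)
    return theta - lr * grad

rng = np.random.default_rng(0)
lows = [rng.random((8, 8)) for _ in range(20)]  # synthetic low-resolution inputs
highs = [1.5 * y for y in lows]                 # toy ground truth: ideal theta is 1.5

theta = 0.0
before = loss(theta, lows, highs)
for epoch in range(50):
    for i in rng.permutation(len(lows)):        # visit the pairs in random order
        theta = sgd_step(theta, lows[i], highs[i], lr=0.01)
after = loss(theta, lows, highs)
print(before > after, round(theta, 3))
```

In the actual SRCNN the same update is applied to all filters and biases at once, with gradients obtained by backpropagation.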
2.3. Experiments
The evaluation scheme used in this study is summarized in Figure 2. The 711 MLO images were divided randomly into two subsets of nearly equal size, i.e., 355 MLO images and 356 MLO images. We performed 2-fold cross-validation, in which each subset serves in turn as the training dataset for the SRCNN scheme while the other serves as the test dataset used to evaluate it. This method can evaluate the generalization performance accurately using an independent testing dataset [31].

Figure 2. Overview of the evaluation scheme with 2-fold cross-validation. Abbreviation: MLO, mediolateral oblique.
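The split described above can be sketched with the standard library alone. The function name `two_fold_split` and the fixed seed are illustrative assumptions; the study does not describe how the random division was implemented.

```python
import random

def two_fold_split(n_images, seed=0):
    """Shuffle image indices and cut them into two nearly equal folds."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    half = n_images // 2
    return indices[:half], indices[half:]

fold_a, fold_b = two_fold_split(711)

# Each fold is used once for training and once for testing.
rounds = [(fold_a, fold_b), (fold_b, fold_a)]
for train, test in rounds:
    print(len(train), len(test))
```

With 711 images the folds contain 355 and 356 images, and no image appears in both the training and the test set of the same round.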
Figure 3 illustrates an overview of the experimental procedures. The SRCNN estimates a high-resolution image from a low-resolution image; thus, the evaluation of super-resolution imaging is difficult, because it is uncertain whether the resulting super-resolution image is correct or not. In this study, we performed the following image-restoration experiments using the down-sampled original test image. Such experiments provide an effective method for assessing whether or not the resulting super-resolution image was correctly restored, in comparison with the original image.
A region of interest (ROI) centered on the lesion was cropped from each original MLO image, giving a total of 711 ROIs. We first generated a low-resolution image by 4× down-sampling of an ROI image. Next, we reconstructed a super-resolution image from the down-sampled low-resolution image using the trained SRCNN models to magnify the image 4×; thus, the resulting super-resolution image has the same size as the original ROI image. For comparison, we performed the same experiment using linear interpolation methods, i.e., the nearest neighbor and bilinear interpolations.
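The down-sample-then-restore procedure for the two baseline methods can be sketched as follows. This is an illustration under stated assumptions: the down-sampling kernel is not named in the text, so block averaging is assumed; the bilinear routine uses a half-pixel coordinate convention, which is one common choice; and the ROI is synthetic random data rather than a mammogram.

```python
import numpy as np

def downsample(img, scale):
    """Block-average down-sampling (an assumption; the text does not name the kernel)."""
    h, w = img.shape
    h, w = h - h % scale, w - w % scale  # crop to a multiple of scale
    blocks = img[:h, :w].reshape(h // scale, scale, w // scale, scale)
    return blocks.mean(axis=(1, 3))

def upsample_nearest(img, scale):
    """Nearest neighbor: copy each pixel into a scale x scale block."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def upsample_bilinear(img, scale):
    """Bilinear: distance-weighted average of the four nearest input pixels."""
    h, w = img.shape
    # Map output pixel centers to input coordinates (half-pixel convention, assumed).
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    img = img.astype(float)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(64, 64)).astype(float)  # synthetic stand-in for an ROI
low_res = downsample(roi, 4)                              # 16 x 16
restored_nearest = upsample_nearest(low_res, 4)           # 64 x 64
restored_bilinear = upsample_bilinear(low_res, 4)         # 64 x 64
print(low_res.shape, restored_nearest.shape, restored_bilinear.shape)
```

Because the restored images have the same size as the original ROI, they can be compared with it pixel by pixel.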
2.4. Image Quality Assessment
For quantitative evaluation of the resulting reconstructed super-resolution images, we measured two types of image quality metrics, i.e., peak signal-to-noise ratio (PSNR) [32] and structural similarity (SSIM) [33] . The PSNR is most commonly used as a measure of the quality of noisy images [34] . The PSNR is defined as follows:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right) \tag{5}$$
where MSE is the mean squared error, which is computed by averaging the squared intensity differences of distorted and reference image pixels.
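A short sketch of the PSNR computation may help. The function name `psnr` is illustrative, and the peak value of 255 is taken from the dynamic range stated for this study; other bit depths would need a different peak.

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a restored image."""
    diff = reference.astype(float) - restored.astype(float)
    mse = np.mean(diff ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.zeros((4, 4))
restored = np.full((4, 4), 5.0)   # every pixel is off by 5, so MSE = 25
print(psnr(reference, restored))  # about 34.15 dB
```

Halving the pixel error quarters the MSE and raises the PSNR by about 6 dB, so differences of around 1 dB between methods correspond to a noticeable change in mean squared error.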
The SSIM is a well-known quality metric used to measure the similarity between two images, and is considered to be correlated with the quality perception of the human visual system. The SSIM index is defined as follows:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{6}$$

where $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$, and $\sigma_{xy}$ are the local means, standard deviations, and cross-covariance for images $x$, $y$. $C_1 = (k_1 L)^2$ and $C_2 = (k_2 L)^2$, where $L$ is the dynamic range value. In this study, $k_1$, $k_2$, and $L$ are 0.01, 0.03, and 255, respectively.

Figure 3. Overview of the experimental procedures. Abbreviations: MLO, mediolateral oblique; ROI, region of interest; SRCNN, super-resolution convolutional neural network.
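The sketch below computes a simplified SSIM using the constants given above (0.01, 0.03, and a dynamic range of 255). It is a deliberate simplification: the statistics are taken over the whole image as a single window, whereas the standard SSIM averages the index over local windows, so the values will not match a full implementation. The function name `ssim_global` is hypothetical.

```python
import numpy as np

def ssim_global(x, y, k1=0.01, k2=0.03, dynamic_range=255.0):
    """Single-window SSIM: means, variances, and covariance over the whole image."""
    x = x.astype(float)
    y = y.astype(float)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32)).astype(float)  # synthetic test image
noisy = np.clip(image + rng.normal(0.0, 20.0, image.shape), 0, 255)

print(ssim_global(image, image))  # identical images give an index of 1
print(ssim_global(image, noisy))  # added noise lowers the index
```

The constants keep the ratio stable when the means or variances are close to zero, for example in uniform background regions.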
2.5. Comparison of Super-Resolution Image Quality by Mammographic Features and Pathology
To investigate the relationship between the image quality of super-resolution images reconstructed by the SRCNN scheme and the clinical features of mammographic lesions, we compared the image quality of the SRCNN scheme in three groups, categorized based on breast density, the Breast Imaging-Reporting and Data System (BI-RADS) [35] assessment, and the verified pathology information.
Table 1 shows the four BI-RADS breast composition categories (a, b, c, and d). In this study, we divided the images into two groups, the low-density breasts (categories a and b, n = 468), and the dense breasts (categories c and d, n = 243) groups.
Table 2 shows the BI-RADS assessment categories. There are seven BI-RADS assessment categories (0, 1, 2, 3, 4, 5, and 6). Category 0 represents incomplete studies requiring additional imaging evaluation [35] . The CBIS-DDSM dataset that we used in this study did not include categories 1 and 6. For this part of the study, we divided the images into two groups, the low-risk (the BI-RADS assessment categories 2 and 3, n = 198) and the high-risk (the BI-RADS assessment categories 4 and 5, n = 450) groups, and excluded the BI-RADS assessment category 0 (n = 63).
In terms of pathology information, we divided the images into two groups based on the verified pathology information, i.e., benign (n = 370) and malignant (n = 341) groups.
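The three groupings can be written down as simple lookup tables. The group definitions follow the text; the record fields (`composition`, `assessment`, `pathology`) and the three example records are hypothetical and do not reflect the actual CBIS-DDSM file layout.

```python
# Group definitions taken from the text; the record fields below are hypothetical.
DENSITY_GROUP = {"a": "low-density", "b": "low-density", "c": "dense", "d": "dense"}
RISK_GROUP = {2: "low-risk", 3: "low-risk", 4: "high-risk", 5: "high-risk"}

def assign_groups(record):
    """Map one image record to its density, risk, and pathology groups."""
    return {
        "density": DENSITY_GROUP[record["composition"]],
        "risk": RISK_GROUP.get(record["assessment"]),  # None: category 0, excluded
        "pathology": record["pathology"],
    }

records = [
    {"composition": "b", "assessment": 5, "pathology": "malignant"},
    {"composition": "d", "assessment": 2, "pathology": "benign"},
    {"composition": "c", "assessment": 0, "pathology": "benign"},
]
groups = [assign_groups(r) for r in records]

# The reported group sizes each account for all 711 images.
print(468 + 243, 198 + 450 + 63, 370 + 341)  # 711 711 711
```

Only the BI-RADS assessment comparison excludes images (the 63 category 0 cases); the density and pathology comparisons use all 711 images.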
Table 1. The BI-RADS breast composition categories in the BI-RADS 5th edition.
Abbreviation: BI-RADS, breast imaging-reporting and data system.
Table 2. The BI-RADS assessment categories in the BI-RADS 5th edition.
2.6. Statistical Analysis
The statistical significance of the differences in the image quality metrics between linear interpolations and the SRCNN scheme was analyzed using one-way analysis of variance and Tukey’s post-hoc test. The Mann-Whitney U test was applied for comparison of the image quality yielded by applying the SRCNN scheme between each group (the low-density breast vs. dense breast groups, the low-risk vs. high-risk BI-RADS assessment groups, and the pathology-verified benign vs. malignant groups). Statistical analyses were conducted using IBM SPSS Statistics version 22.0 (IBM Corp., Armonk, NY), and for all comparisons, p < 0.05 was considered to indicate a statistically significant difference. Data are presented as mean ± standard deviation (SD).
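For readers unfamiliar with the Mann-Whitney U test, the sketch below shows how the statistic and a two-sided p-value can be computed for two independent groups. It is a simplified, standard-library illustration and not the SPSS procedure used in the study: it uses the normal approximation without tie or continuity correction, and the PSNR values are made up for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation (no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    u = min(u_a, n1 * n2 - u_a)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p

# Hypothetical PSNR values (dB) for two groups; not data from the study.
dense = [35.1, 36.4, 34.8, 37.0, 35.9, 36.2]
low_density = [33.2, 34.0, 32.9, 33.7, 34.5, 33.1]
u_stat, p_value = mann_whitney_u(dense, low_density)
print(u_stat, p_value < 0.05)
```

The normal approximation is reasonable for group sizes like those in this study (all groups contain well over 100 images); for very small samples an exact test would be preferable.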
3. Results
3.1. Comparison of Image Quality between the SRCNN and the Linear Interpolations
Figure 4 shows the image quality obtained with the following three schemes: nearest neighbor interpolation, bilinear interpolation, and the SRCNN scheme. For PSNR, the mean ± SD of the SRCNN was 34.50 ± 3.44 dB, which was significantly higher than those of the nearest neighbor (33.12 ± 3.18 dB, p < 0.001) and bilinear interpolations (33.78 ± 3.34 dB, p < 0.001) (Figure 4(a)). For SSIM, the mean ± SD of the SRCNN was 0.785 ± 0.103, which was also significantly higher than those of the nearest neighbor (0.733 ± 0.113, p < 0.001) and bilinear interpolations (0.753 ± 0.117, p < 0.001) (Figure 4(b)).

Figure 4. Comparison of the image quality obtained with each method for 4× magnification: (a) peak signal-to-noise ratio (PSNR); (b) structural similarity (SSIM). Abbreviations: Nearest, nearest neighbor; SRCNN, super-resolution convolutional neural network.
3.2. Correlation between the Super-Resolution Image Quality and BI-RADS Breast Composition Categories
The image quality of the SRCNN scheme was compared between the low-density breast and dense breast groups. The means ± SDs of PSNR and SSIM in the dense breast group were 35.69 ± 3.06 dB and 0.822 ± 0.085, respectively, which were significantly higher than those of the low-density breast group (PSNR, 33.89 ± 3.47 dB, p < 0.001; SSIM, 0.766 ± 0.107, p < 0.001) (Table 3).
3.3. Correlation between the Super-Resolution Image Quality and BI-RADS Assessment Categories
The image quality of the SRCNN scheme was compared between the low-risk and high-risk BI-RADS assessment groups. The means ± SDs of PSNR and SSIM in the high-risk group were 34.92 ± 3.25 dB and 0.799 ± 0.093, respectively, which were significantly higher than those in the low-risk group (PSNR, 33.13 ± 3.60 dB, p < 0.001; SSIM, 0.737 ± 0.117, p < 0.001) (Table 4).
3.4. Correlation between the Super-Resolution Image Quality and Pathology Findings
The image quality of the SRCNN scheme was compared between the pathology-verified benign and malignant groups. The means ± SDs of PSNR and SSIM in the malignant cases were 34.92 ± 3.24 dB and 0.798 ± 0.094, respectively, which were significantly higher than those of the benign group (PSNR, 34.12 ± 3.58 dB, p = 0.005; SSIM, 0.774 ± 0.110, p = 0.009) (Table 5).
Table 3. Correlation between the BI-RADS breast composition categories and the image quality produced by the super-resolution scheme.
Abbreviations: BI-RADS, breast imaging-reporting and data system; PSNR, peak signal-to-noise ratio; SSIM, structural similarity; SRCNN, super-resolution convolutional neural network.
Table 4. Correlation between the BI-RADS assessment and the image quality obtained with the super-resolution scheme.
Abbreviations: BI-RADS, breast imaging-reporting and data system; PSNR, peak signal-to-noise ratio; SSIM, structural similarity; SRCNN, super-resolution convolutional neural network.
Table 5. Correlation between pathology findings and the image quality obtained with the super-resolution scheme.
Abbreviations: PSNR, peak signal-to-noise ratio; SSIM, structural similarity; SRCNN, super-resolution convolutional neural network.
3.5. Visual Examples
Figures 5-7 show examples of the super-resolution images obtained using the linear interpolation methods and the SRCNN scheme in the pathology-verified malignant cases. Figure 8 and Figure 9 show examples of the super-resolution images obtained using the linear interpolation methods and the SRCNN scheme in the pathology-verified benign cases. In all these cases, the super-resolution images reconstructed with the SRCNN scheme were closer to the original ROI images than those obtained using the linear interpolations.
Figure 5. An example of the reconstructed super-resolution images for 4× magnification in a malignant case (Breast composition category: b; Final assessment category: 5; Shape: architectural distortion): (a) original mediolateral oblique image; (b) original region of interest image (“gold-standard” image); (c) down-sampled low-resolution image; (d) nearest neighbor interpolation method; (e) bilinear interpolation method; and (f) super-resolution convolutional neural network method.
Figure 6. An example of the reconstructed super-resolution images for 4× magnification in a malignant case (Breast composition category: c; Final assessment category: 4; Shape: irregular): (a) original mediolateral oblique image; (b) original region of interest image (“gold-standard” image); (c) down-sampled low-resolution image; (d) nearest neighbor interpolation method; (e) bilinear interpolation method; and (f) super-resolution convolutional neural network method.
Figure 7. An example of the reconstructed super-resolution images for 4× magnification in a malignant case (Breast composition category: b; Final assessment category: 4; Shape: round): (a) original mediolateral oblique image; (b) original region of interest image (“gold-standard” image); (c) down-sampled low-resolution image; (d) nearest neighbor interpolation method; (e) bilinear interpolation method; and (f) super-resolution convolutional neural network method.
Figure 8. An example of the reconstructed super-resolution images for 4× magnification in a benign case (Breast composition category: b; Final assessment category: 3; Shape: lobulated): (a) original mediolateral oblique image; (b) original region of interest image (“gold-standard” image); (c) down-sampled low-resolution image; (d) nearest neighbor interpolation method; (e) bilinear interpolation method; and (f) super-resolution convolutional neural network method.
Figure 9. An example of the reconstructed super-resolution images for 4× magnification in a benign case (Breast composition category: d; Final assessment category: 2; Shape: architectural distortion): (a) original mediolateral oblique image; (b) original region of interest image (“gold-standard” image); (c) down-sampled low-resolution image; (d) nearest neighbor interpolation method; (e) bilinear interpolation method; and (f) super-resolution convolutional neural network method.
4. Discussion
The deep-learning-based super-resolution image reconstruction approach that we used in this study yielded significantly higher-quality magnified images in digital mammograms than the conventional linear interpolation methods. In Equation (5), as the MSE approaches zero, the PSNR, which is used for measuring the image restoration quality, approaches infinity; thus, a higher PSNR value implies a higher image restoration quality. The SSIM index takes values in [0, 1]: an index of 0 means that there is no correlation between the two images, while an index of 1 means perfect correlation; thus, a higher SSIM value means that the two images are perceived as more similar by the human visual system [36]. Our experimental results indicated that the super-resolution images obtained using the SRCNN scheme were restored closer to the original images than those obtained with the commonly used conventional linear interpolation methods. This suggested that the SRCNN scheme may be more suitable for enhancing image resolution in mammograms than the conventional interpolation methods.
Breast density is a risk factor for breast cancer, and it also has a masking effect on cancer detection [37]. Increasing breast density reduces the sensitivity and specificity of mammographic screening [38]. In our experiments, the PSNRs and SSIMs of the super-resolution images obtained using the SRCNN scheme in the dense breast group were significantly higher than those in the low-density breast group (both p < 0.001). These results indicated that super-resolution imaging using the SRCNN may be better suited to dense breasts than to low-density breasts. Identifying the factor that determines the difference in super-resolution image quality between low-density and dense breasts will require further study.
We also compared the super-resolution image quality between the two groups based on the BI-RADS assessment categories. BI-RADS assessment categories 4 and 5 are suspicious for malignancy and highly suggestive of malignancy, respectively. Category 4 covers a wide range of likelihood of malignancy, and category 5 has a high probability (≥95%) of malignancy [35]. The super-resolution image quality of the high-risk group (categories 4 and 5) was significantly higher than that of the low-risk group (both p < 0.001). These results suggested that super-resolution imaging by the SRCNN is more suitable for the high-risk group. The BI-RADS category is useful for predicting the presence of malignancy [39]. However, there are some cases in which benign lesions are assigned to these high-risk categories. The dataset that we used in this study included such cases (category 4 lesions, 1.8% [3 of 166 lesions]; category 5 lesions, 49.6% [141 of 284 lesions]). Therefore, we also compared the super-resolution image quality between the pathology-verified benign and malignant groups. In our experiments, the PSNRs and SSIMs of the malignant group were also significantly higher than those of the benign group (PSNR, p = 0.005; SSIM, p = 0.009). These results indicated that super-resolution imaging is more suitable for malignant lesions in mammograms. Future studies should investigate the reason for this difference in super-resolution image quality between the benign and malignant groups.
A limitation of this study concerns the objective, quantitative evaluation of super-resolution image quality, as there is not yet an established method for evaluating super-resolution imaging in medical imaging. Therefore, we used image quality metrics that are typically used in computer vision. However, for an appropriate evaluation of super-resolution in medical imaging, the results of this study need to be confirmed by subjective evaluation, because it is necessary to determine whether the super-resolution images indeed provide a more accurate diagnosis of mammographic lesions. This remains a topic for future studies.
5. Conclusion
In this study, we demonstrated that application of the SRCNN scheme to mammograms significantly outperformed conventional interpolation methods for enhancing image resolution. The results obtained with clinical mammograms revealed that the SRCNN can significantly improve the image quality of magnified mammograms, especially in dense breasts, high-risk BI-RADS assessment groups, and pathology-verified malignant cases.
Acknowledgements
This study was partly presented at the Radiological Society of North America 103rd Scientific Assembly and Annual Meeting (RSNA 2017) as a scientific poster presentation and published as a meeting abstract in Proceedings of RSNA 2017 (PH257-SD-THA3).