A Novel Method for Automated Lung Region Segmentation in Chest X-Ray Images

Detecting and segmenting the lung regions in chest X-ray images is an important part of artificial intelligence-based computer-aided diagnosis/detection (AI-CAD) systems for chest radiography. However, if the chest X-ray images themselves are used as training data for the AI-CAD system, the system might learn irrelevant image-based information, resulting in a decrease in the system’s performance. In this study, we propose a lung region segmentation method that can automatically remove the shoulder and scapula regions, mediastinum, and diaphragm regions in advance from various chest X-ray images to be used as learning data. The proposed method consists of three main steps. First, the simple linear iterative clustering algorithm, the lazy snapping technique and a local entropy filter are employed to generate an entropy map. Second, morphological operations are applied to the entropy map to obtain a lung mask. Third, automated segmentation of the lung field is performed using the obtained mask. A total of 30 images were used for the experiments. To verify the effectiveness of the proposed method, two other texture maps, namely the maps created by standard deviation filtering and range filtering, were used for comparison. As a result, the proposed method using the entropy map was able to appropriately remove the unnecessary regions. In addition, this method was able to remove the markers present in the image, whereas the other two methods could not. The experimental results indicate that our proposed method is a highly generalizable and useful algorithm. We believe that this method might play an important role in enhancing the performance of AI-CAD systems for chest X-ray images.


INTRODUCTION
Chest radiography (chest X-ray) is a commonly used medical imaging modality for lung diagnosis [1] and provides clues for detecting diseases such as lung cancer, pneumothorax, and emphysema. The great advantages of chest X-rays include their low cost, easy operation and fewer false positives; thus, they are widely used and are considered the best examination for diagnosing pneumonia [2]. However, these diagnoses are subjective and largely depend on the knowledge and experience of the radiologists. Therefore, the development and application of computer-aided diagnosis/detection (CAD) systems have been actively pursued since the 1980s to improve detection and diagnostic accuracy for chest diseases. Chest X-ray CAD systems have been shown to accurately characterize specific respiratory illnesses, reduce the workload of radiologists, and enable remote diagnosis [3].
In recent years, artificial intelligence (AI)-related technology has developed dramatically and has been widely utilized in various fields. In the field of diagnostic imaging support research, the development of AI-CAD systems using deep learning (DL), which is a subfield of AI, has progressed rapidly while inheriting the conventional concept of CAD [4,5]. Among these DL-based studies, some have focused on lung diseases [6-9]. These studies addressed pattern detection of interstitial lung disease on chest CT images [6,7] and pneumonia classification on chest radiographs [8], and they obtained highly accurate results. However, if the chest X-ray images themselves are used as training data [8], the DL-based network might learn irrelevant image-based information, resulting in a decrease in the system's performance.
Detecting and segmenting the lung regions in chest X-ray images is an important part of AI-CAD systems [10]. To avoid the problem of learning irrelevant image-based information, there have been some reports on pneumonia classification [9,10] that applied lung mask images prepared in advance by clinical experts. To segment the lung region, the most similar mask was automatically selected from a large number of these prepared lung mask images by using the scale-invariant feature transform (SIFT) algorithm. SIFT was presented by Lowe [11] and is a feature detection algorithm in computer vision that detects and describes local features in images. However, existing masks may have problems with reproducibility and generalization across various cases. To address this issue, many DL-based image segmentation methods have been reported in recent years [12,13]. However, using DL for pre-processing of the training data is highly costly for DL-based tasks. In the pre-processing of training data, it is desirable, in terms of time consumption and accuracy, to exclude only the areas that may cause the network to learn incorrectly and/or may mislead the final judgment (classification result). To the best of our knowledge, there are currently no comprehensive studies on automatically removing the unwanted parts from chest X-ray images before they are used for learning.
In this study, we propose a lung region segmentation method that can automatically remove the shoulder and scapula regions, mediastinum, and diaphragm regions in advance from various chest X-ray images to be used as learning data. The proposed method consists of three main steps. First, the simple linear iterative clustering (SLIC) algorithm [14,15], the Lazy Snapping technique [16] and a local entropy filter [17] are employed to generate an entropy map. Second, morphological operations are applied to the entropy map to obtain a lung mask. Third, automated segmentation of the lung region is performed using the obtained mask.
The remainder of this paper is organized as follows: In Section 2, we describe the details of each stage of the proposed method and the image data used. In Section 3, we present the experimental results. In Section 4, we discuss the results. In Section 5, we conclude this work.

METHODS
Removing the shoulder and scapula regions, mediastinum, and diaphragm regions in advance from the chest radiograph is very important for accurate learning and judgment/classification by AI-CAD systems. A chest X-ray image is a two-dimensional representation of three-dimensional structures (blood vessels, bronchi, inflamed parts, etc.) in the lung region. The pixel values of these structures are similar to those of the soft tissues around the mediastinum and thorax. Therefore, it is difficult and inefficient to extract only the lung region using conventional thresholding of pixel values. Figure 1 shows an example of thresholding using Otsu's method, a popular non-parametric method in medical image segmentation. Otsu's method was derived from the viewpoint of discriminant analysis to select a threshold automatically from a gray-level histogram. It directly deals with the problem of evaluating the goodness of thresholds. An optimal threshold is selected by the discriminant criterion, namely, by maximizing the discriminant measure [18].
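For reference, the threshold selection performed by Otsu's method can be sketched as follows. This is a minimal illustration, not the implementation used to produce Figure 1: it assumes 8-bit gray levels given as a flat list of integers, and the function name `otsu_threshold` is hypothetical. It scans every candidate threshold and keeps the one that maximizes the between-class variance, which is equivalent to maximizing the discriminant measure.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class; pixels with value > t the other.
    Assumes integer gray levels in the range [0, levels).
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(g * n for g, n in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of gray levels in the low class
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no separation is possible
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

Because the criterion depends only on the gray-level histogram, two regions with similar pixel values, such as the lung structures and the surrounding soft tissue, cannot be separated by this threshold alone.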
In the segmentation of a chest X-ray image, it is very important to differentiate the important part, as an object, from the other parts. In this study, we focused on the texture of the chest X-ray images and incorporated the SLIC algorithm. SLIC is one of the most prominent superpixel generation algorithms [14,15]. A superpixel can be defined as a group of pixels that share similar properties. SLIC generates superpixels by clustering pixels based on their color similarity and proximity in the image plane. The superpixels reflect the positional relationship of similarly colored pixels and can segment the image into a series of structurally meaningful sub-regions. Unlike a single pixel in a pixel grid, which carries very little perceptual meaning, the pixels that belong to a superpixel share some commonality, such as a similar color or texture distribution. As a result, superpixels make it possible to specify higher-quality regions than pixel-by-pixel thresholding.
We used Lazy Snapping to segment an image into foreground and background regions (regions to keep and to cut out, respectively) [16]. Lazy Snapping is an interactive image segmentation technique that separates coarse and fine processing, so that one can easily specify objects and make fine adjustments. Using this technique, the graph cut can be performed quickly. By specifying a region on a superpixelized image, it becomes easy to treat the background and the regions one wants to remove as a combined background region. Figure 2 shows an example of the original image, the superpixelized image and the image with designated regions specified by employing the Lazy Snapping technique.
Image entropy is generally considered a measure of the uncertainty of the spatial distribution of image gray levels [19,20] and is related to the complexity contained in a particular neighborhood. In chest X-ray images, the foreground region mainly consists of the lung field (blood vessels, bronchi, inflamed areas) and the mediastinum. Thus, the distribution of pixel values in this region is considerably complicated. In this study, we employed entropy filtering [21] for efficient extraction of the smooth portion of an image. The entropy filter calculates the entropy value of the neighborhood around the corresponding pixel in the input image. Basically, the entropy filter expresses the randomness of a pixel within its local neighborhood, and it is also used in texture characterization. Because the calculation takes the neighboring pixels into account, the degree of variation in pixel values in that region can be clearly indicated.
The following sections provide information on the image data used in the experiment, give brief overviews of the SLIC algorithm, the Lazy Snapping technique and entropy mapping using the entropy filter, and describe the architecture of our proposed method.

Dataset
The image data used in this study were chest X-ray images obtained from a database, Curated Dataset for COVID-19 Posterior-Anterior Chest Radiography Images (X-Rays) Version 2, published on the internet [22]. The combined curated database was obtained by collating 15 publicly available datasets. The present dataset contains 1281 COVID-19 X-ray images, 3270 Normal X-ray images, 1656 viral-pneumonia X-ray images, and 3001 bacterial-pneumonia X-ray images. Because the data are publicly available, ethics issues do not arise in this work and the requirement to obtain informed consent was waived. A total of 30 images were randomly selected from the above-described dataset for the experiments. The collected images varied in matrix size, ranging from 432 × 452 to 1610 × 1632.

Brief Overview of Simple Linear Iterative Clustering (SLIC) Algorithm
The SLIC is an efficient superpixel generation algorithm based on k-means clustering [14,15]. It consists of the following three main stages:
1) Initialization of superpixel centers: The seed centers of the superpixels are initialized in the following two sub-steps.
a) Place the seed centers of the superpixels at equal intervals: Determine the center locations of the superpixels at equal intervals, and initialize their parameters (center locations and color information).
b) Examine the surroundings of each center: The centers are moved to the seed locations corresponding to the lowest gradient position in a 3 × 3 neighborhood (in this study). This is to avoid centering a superpixel on an edge, and to reduce the chance of seeding a superpixel with a noisy pixel.
2) For each pixel in the image, determine which superpixel it belongs to: Based on the color and location information of each pixel, the most similar superpixel is determined. Here, to enhance processing efficiency, a rectangle of a certain size centered on each superpixel is defined, and comparison is performed only with pixels within that region. Specifically, only the pixels within a $2S \times 2S$ region around the superpixel center $C_i$ are examined. Then, it is determined whether or not each pixel belongs to $C_i$. Here, $S$ is an approximate value of the diameter of a superpixel ($S = \sqrt{N/K}$), $N$ is the number of pixels, and $K$ is the number of superpixels. The value of $S$ is regarded as the distance between adjacent seed points.
3) Update the parameters of each superpixel: The locations and colors of the pixels belonging to each superpixel are averaged. Finally, the current centers and the recalculated centers are compared, and if they all match (no center is updated), the process ends. If they do not match (a center is updated), stage 2 of the clustering is repeated until the algorithm converges. The detailed algorithm of the SLIC can be found in [14,15].
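The three stages above can be sketched for a grayscale image as follows. This is a simplified illustration under several assumptions and is not the implementation used in this study: gray intensity stands in for color, the gradient-based adjustment of the seed positions is omitted, the compactness weight `m` and the iteration cap `max_iter` are hypothetical parameters, and the distance combines intensity and spatial terms in the form given in the SLIC literature [14,15].

```python
import math

def slic_gray(img, k, m=10.0, max_iter=10):
    """Cluster a 2-D list of gray levels into roughly k superpixels.

    Returns a 2-D list of integer labels with the same shape as img.
    """
    h, w = len(img), len(img[0])
    s = max(1, int(round(math.sqrt(h * w / k))))  # grid interval S = sqrt(N/K)

    # Stage 1: seed centers [intensity, y, x] on a regular grid
    # (moving seeds to the lowest-gradient neighbor is omitted here).
    centers = [[float(img[y][x]), float(y), float(x)]
               for y in range(s // 2, h, s)
               for x in range(s // 2, w, s)]

    labels = [[-1] * w for _ in range(h)]
    for _ in range(max_iter):
        dist = [[math.inf] * w for _ in range(h)]
        # Stage 2: each center competes only for pixels in its 2S x 2S window.
        for idx, (cv, cy, cx) in enumerate(centers):
            y0, y1 = max(0, int(cy) - s), min(h, int(cy) + s + 1)
            x0, x1 = max(0, int(cx) - s), min(w, int(cx) + s + 1)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    dc2 = (img[y][x] - cv) ** 2            # intensity term
                    ds2 = (y - cy) ** 2 + (x - cx) ** 2    # spatial term
                    d = dc2 + ds2 * (m / s) ** 2           # squared distance
                    if d < dist[y][x]:
                        dist[y][x] = d
                        labels[y][x] = idx
        # Stage 3: move each center to the mean of its assigned pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for y in range(h):
            for x in range(w):
                acc = sums[labels[y][x]]
                acc[0] += img[y][x]
                acc[1] += y
                acc[2] += x
                acc[3] += 1
        new_centers = [[a[0] / a[3], a[1] / a[3], a[2] / a[3]] if a[3] else c
                       for a, c in zip(sums, centers)]
        if new_centers == centers:
            break  # no center moved: converged
        centers = new_centers
    return labels
```

Restricting the search to the window around each center is what keeps the cost roughly linear in the number of pixels, in contrast to ordinary k-means, which compares every pixel with every center.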

Brief Overview of Lazy Snapping Algorithm
The purpose of Lazy Snapping is to easily crop objects from an image. Lazy Snapping is based on graph cut, and uses interactively drawn lines to specify the regions to be preserved (called the foreground) and to be cut (called the background) in the image [16]. When a user draws foreground and background lines on certain regions of the image, the segmented regions are recomputed and displayed for each new line. With just a few lines, the user can successfully crop the correct region. Lazy Snapping consists of two main steps, object marking and boundary editing. Object marking works at a coarse scale. It uses a few marking lines to specify the object of interest. Boundary editing is performed at a finer scale or on the enlarged image, which allows users to edit the object boundary by simply clicking and dragging the vertices of a polygon.
The main task in the object marking step is to allow the user to conceptually group the foreground object against its background. The user can use lines and curves to specify the range of the object of interest without having to track the boundary of the object. The object marking step preserves the boundaries of the object as accurately as possible, but there are still some errors, especially around ambiguous and low-contrast edge boundaries. Therefore, a simple polygon editing user interface tool is prepared for the user to refine the object boundary. The detailed algorithm of Lazy Snapping can be found in [16].

Entropy Mapping Using Entropy Filter
The Shannon entropy represents the average rate at which information is produced by a stochastic data source:

$$H = -\sum_{k} p_k \log p_k$$

where $p_k$ is the probability of the $k$-th outcome. In image processing, local entropy (also called image entropy) is generally considered a measure of the uncertainty of the spatial distribution of image gray levels. According to Shannon entropy theory, the image entropy (IE) of an image region of size $M \times N$ (the sliding window) can be defined as:

$$IE = -\sum_{i=1}^{M}\sum_{j=1}^{N} p_{ij} \log p_{ij}, \qquad p_{ij} = \frac{I(i,j)}{\sum_{i=1}^{M}\sum_{j=1}^{N} I(i,j)}$$

where $I(i, j)$ represents the pixel value (or gray level) at image position $(i, j)$, $p_{ij}$ refers to the distribution probability of the image pixel value at $(i, j)$, and $IE$ indicates the image entropy value of the window. Basically, the image entropy is calculated inside a sliding window, i.e., the window slides through the image with a stride of 1 pixel and the image entropy is calculated for each center pixel, based on all the neighboring pixels inside the window. After calculating the image entropies over the whole image using the predefined window size (mask), we obtain a local entropy map. This is regarded as a filtering process. The image entropy can detect subtle changes in the local gray-level distribution. Therefore, the calculated entropy can be used to characterize the texture of the image. Image entropy reflects the degree of dispersion of the gray levels in the window. Since the gray-level distribution of a slowly changing region is relatively uniform, the image entropy of such a region is small, whereas the image entropy of a highly fluctuating (non-uniform) region is high. In this study, the entropy map was computed using a 9 × 9 window. In other words, the size of the entropy filter used was 9 × 9. The entropy filter is considered one of the pre-processing steps in image segmentation. A high-brightness region on the entropy map corresponds to a highly non-uniform neighborhood of the original image. The details of local entropy, the entropy image and the entropy filter can be found in [17,19-21].
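As an illustration of how a local entropy map is computed with a sliding window, the following sketch may be helpful. It rests on assumptions that the text does not confirm: the probabilities are estimated from the gray-level histogram of each window, the logarithm is taken to base 2 (entropy in bits), and windows are clipped at the image border instead of padded. The function name `local_entropy` is hypothetical.

```python
import math
from collections import Counter

def local_entropy(img, size=9):
    """Return a map of Shannon entropy (in bits) over a size x size window.

    img is a 2-D list of gray levels. Windows are clipped at the image
    border rather than padded (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - r), min(h, y + r + 1))
                      for xx in range(max(0, x - r), min(w, x + r + 1))]
            n = len(window)
            counts = Counter(window)  # gray-level histogram of the window
            out[y][x] = -sum((c / n) * math.log2(c / n)
                             for c in counts.values())
    return out
```

A window containing a single gray level yields an entropy of zero, while a window that mixes several gray levels yields a positive value, which is why uniform regions appear dark and non-uniform regions appear bright on the entropy map.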

Proposed Method
Our proposed method was specifically designed for chest X-ray images and consists of the following four steps. In the first step, superpixels of the input image were calculated so that the image was divided into 600 small regions. In the second step, in addition to the portion of the image where the subject (the object) is absent (generally referred to as the background), the shoulder/scapula area and the diaphragm region were also designated as background regions. In this step, Lazy Snapping-based processing was conducted. Generally, when applying the Lazy Snapping technique, it is necessary to manually designate the foreground and background regions for each input image. In the proposed method, we modified the Lazy Snapping technique so that the background and foreground regions are specified automatically, depending on the position information of the lung in each input image. Since the sizes of the images used in this study varied, the background and foreground regions were automatically specified in advance by taking the image size information into account. As a result, only the foreground regions could be automatically extracted.
The foreground-region of the chest X-ray image is mainly composed of the lung field and the mediastinum. The structures (blood vessels, bronchi, and inflamed parts) in the lung field are three-dimensionally overlapped and considerably complicated. Moreover, the pixel values of the mediastinum and the soft tissue around the thorax are similar. Therefore, the commonly used thresholding techniques are not appropriate (see Figure 1).
In the third step, to address the above-mentioned issue, we focused on the texture of the foreground-region image, and used the image entropy filter to detect and differentiate the uniformity and non-uniformity of image features. Then, thresholding was performed on the obtained entropy map. The threshold value was automatically adjusted according to the histogram of the entropy values. In the fourth step, a lung mask was generated by morphological operations and then segmentation of the original image was performed. The final segmentation was completed by examining the distribution of non-zero pixels and conducting thresholding again if necessary. It is worth mentioning here that the parameter adjustment was fully automatic according to the input image type. The flowchart of the proposed method is illustrated in Figure 3.

Figure 3. Flowchart of the proposed method. In the second image from the top on the right, the red frames and the green frames are automatically specified as the background regions and the foreground regions, respectively.
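The text does not state which morphological operations or structuring element were used to generate the lung mask. Purely as an illustration, the following sketch assumes a binary mask that has already been obtained by thresholding, a square structuring element, and a closing followed by an opening; all function names are hypothetical. The closing fills small holes, the opening removes small isolated specks, and the cleaned mask is then applied to the original image.

```python
def dilate(mask, r=1):
    """Binary dilation with a (2r+1) x (2r+1) square structuring element."""
    h, w = len(mask), len(mask[0])
    return [[int(any(mask[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]

def erode(mask, r=1):
    """Binary erosion; positions outside the image are ignored."""
    h, w = len(mask), len(mask[0])
    return [[int(all(mask[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]

def clean_mask(mask, r=1):
    """Closing (fill small holes) followed by opening (remove small specks)."""
    closed = erode(dilate(mask, r), r)
    return dilate(erode(closed, r), r)

def apply_mask(img, mask, fill=0):
    """Keep the pixels where mask is 1 and replace the others with fill."""
    return [[v if m else fill for v, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(img, mask)]
```

The radius `r` controls the size of the holes and specks that are affected; a larger structuring element removes larger artifacts but also smooths away finer details of the mask boundary.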

RESULTS
We constructed an automatic lung region segmentation algorithm that could handle various types of chest X-ray images. Figure 4 shows an example of the superpixels generated in the first step of the algorithm.
In this study, we focused on the texture of the image; thus, the entropy filter was employed. For comparison, Figure 5 shows an example of the texture maps obtained by using two different filters, i.e., the standard deviation filter and the range filter, together with the entropy filter. The standard deviation filter computes the local standard deviation of an image, and the range filter computes the local range (maximum value minus minimum value) of the image. Figure 6 is an example of the segmentation results obtained using the respective methods.
In the final step of the proposed method, whether or not the thresholding needed to be repeated was automatically determined depending on the pixel values and their distribution. Figure 7 shows some examples of the final results. Figure 7(a) shows the processing results for cases in which the algorithm automatically determined that iterative thresholding was not necessary. The dark blue regions shown in the figure are the areas segmented as the foreground, while the yellow regions are the areas removed. Figure 7(b) shows the processing results for cases in which the proposed method determined that iterative thresholding was necessary (the thin feedback arrow shown in Figure 3). Both the dark blue and light blue regions are the regions segmented after the first thresholding. The light blue regions are the areas removed after iterative thresholding. As a result, only the dark blue regions are left as the foreground.
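The two comparison filters can be sketched in the same sliding-window form. This is a minimal illustration with assumptions that the text does not specify: a 3 × 3 default window, the population standard deviation, and windows clipped at the image border. The function names are hypothetical.

```python
import statistics

def _window(img, y, x, r):
    """Collect the pixel values in the clipped window centered on (y, x)."""
    h, w = len(img), len(img[0])
    return [img[yy][xx]
            for yy in range(max(0, y - r), min(h, y + r + 1))
            for xx in range(max(0, x - r), min(w, x + r + 1))]

def local_filter(img, stat, size=3):
    """Apply the statistic `stat` to the window around every pixel."""
    r = size // 2
    return [[stat(_window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]

def local_std(img, size=3):
    """Standard deviation filter (population standard deviation)."""
    return local_filter(img, statistics.pstdev, size)

def local_range(img, size=3):
    """Range filter: maximum value minus minimum value in the window."""
    return local_filter(img, lambda win: max(win) - min(win), size)
```

Both statistics are zero in a uniform window and respond strongly to a single large intensity step, whereas the entropy depends on how the gray levels are distributed rather than on how far apart they are.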

DISCUSSION
In this paper, we have developed an automatic algorithm for the segmentation of lung fields that can handle various types of chest X-ray images. The superpixels shown in Figure 4 are small meaningful regions into which an image has been divided, reflecting the color and position of the pixels on the chest radiograph. When the number of small regions (number of segments) is large, different shadings on the same object can be separated into different regions. In general, over-segmentation is not suitable when the background (the part where the subject does not exist) and the structures around the thorax in the chest X-ray image are to be treated together as a background region. On the other hand, if the number of small regions is reduced, there is a risk that a part of the adjacent lung field will be regarded as a background region, which might make appropriate image processing difficult. In this study, an image was empirically divided into 600 small regions, and the foreground and background regions were specified based on the Lazy Snapping technique. By doing so, various types of chest X-ray images could be successfully handled.
Figure 5 shows the entropy map used in the proposed method as well as two other types of texture maps, namely, the standard deviation map and the range map. These maps represent texture features obtained from information about the local variation in pixel intensity. The maps compute the statistics of the respective neighboring pixels, and indicate the degree of fluctuation of the pixel values in the corresponding regions. Figure 6 illustrates the results of region extraction using the entropy map, the standard deviation map and the range map. The marker (the letter L) located on the right part of the original image could be removed by image processing using the entropy map (see Figure 6(b)). However, the marker was extracted as the foreground when using the other two methods. Moreover, when using the standard deviation map, an area at the apex of the lung was removed (see Figure 6(c)). Our proposed method is highly generalizable and can automatically adjust the parameters of the algorithm. The threshold value was determined from both the values obtained using the entropy map and their distribution. In contrast, the standard deviation values often did not work well for this purpose. The entropy map can represent subtle fluctuations; thus, it has the advantage of making the threshold value easy to determine.
Figure 7 shows the segmentation results of the proposed method. The shoulder and scapula, mediastinum, and diaphragm areas were removed from the chest X-ray image, and the lung field was extracted. However, the 1st to 3rd thoracic vertebrae still remained in the foreground, and the left and right lungs were often not separated. This is because the thoracic vertebrae were not specified as the background when the Lazy Snapping technique was applied for region designation. The thoracic vertebrae overlap the trachea, and the trachea connects to the bronchi (structures in the lung field). Thus, we considered that specifying the thoracic vertebrae as the background region was not appropriate. The thoracic vertebrae also overlap the mediastinum. Such a structurally complicated area could be handled by using the entropy map. As a result, separation of the left and right lungs could be achieved, depending on the type of X-ray image. Moreover, if the edges of the regions to be segmented are not smooth, we consider that adding a further processing technique might cope with this issue.
There are several limitations in this study. First, as shown in Figure 8, in the case of atelectasis, the affected region was removed by the image processing. The atelectasis region was difficult to extract as a lung field because the distribution of its entropy values is similar to that of the diaphragm and mediastinum. A solution to this issue needs to be explored. Second, we only conducted a qualitative comparison to evaluate the performance of the proposed method. A quantitative assessment will be made in our further studies. Third, we only used 30 chest X-ray images for the experiments. More images are needed to demonstrate the robustness of the proposed method. We will increase the number of images in subsequent studies.

CONCLUSION
The aim of this study is to improve the efficiency of generating AI-CAD training data for chest X-ray images. As a preliminary step, in this work, we have proposed a lung field segmentation method that can automatically remove the shoulder and scapula regions, mediastinum, and diaphragm regions in advance from various types of chest X-ray images used as learning data. In the proposed method, after applying superpixels to automatically specify the regions, the lung fields were extracted from the texture features using the entropy map technique. To verify the effectiveness of the proposed method, two other texture map techniques were used for comparison. As a result, the proposed method using the entropy map was able to appropriately remove the unnecessary regions. In addition, this method was able to remove the markers present in the image, whereas the other two methods could not. The experimental results indicate that our proposed method is a highly generalizable and useful algorithm. We believe that this method could play an important role in enhancing the performance of AI-CAD systems for chest X-ray images.