Improving CAD Hemorrhage Detection in Capsule Endoscopy

This study explores an automated framework to assist the recognition of hemorrhage traces and bleeding lesions in video streams of small bowel capsule endoscopy (SBCE). The proposed methodology aims to achieve fast image review (<10 minutes), save valuable physician time, and enable high-performance diagnosis. A specialized elimination algorithm excludes identical consecutive frames by utilizing the difference of gray-level pixel luminance. An image filtering algorithm is proposed, based on an experimentally calculated bleeding index and a blood-color chart, which inspects all remaining frames of the footage and identifies pixels whose color reflects active or potential hemorrhage. The bleeding index and blood-color chart are estimated from chromatic thresholds in the RGB and HSV color spaces, extracted after experimenting with more than 3200 training images derived from 99 videos of a pool of 138 patients. The dataset was provided by a team of expert gastroenterologist surgeons, who also evaluated the results. The proposed algorithms are tested on a set of more than 1000 selected frame samples from the 39 testing videos, at a prevalence of 50% pathologic frames (balanced dataset). The elimination of identical consecutive frames achieved a 36% reduction of total frames. The best statistical performance for the diagnosis of positive pathological frames from a video stream is achieved by utilizing masks in the HSV color model, with sensitivity up to 99%, precision of 94.41% at a prevalence of 50%, accuracy up to 96.1%, FNR of 1%, and FPR of 6.8%.


INTRODUCTION
Most endoscopic capsule designers offer some form of basic software upon purchase of the device, which is able to manage, detect, and present the most suspicious images to the examiner, in order to facilitate a more complete and successful diagnosis. To further assist the examiner, companies often include a database of commonly known cases of diagnosed diseases, extracted from previously captured and evaluated images of diagnosed patients. It can be argued that the development of software algorithms to recognize suspect images is not as sophisticated and innovative as the actual hardware technology of the capsule itself. These tools mainly proceed with: 1) the serial reading of each of the captured images and 2) the application of a selected mix of image analysis tools (image recognition filters and operators that each company implements with its own engineers). Although the software accomplishes these tasks with a certain degree of accuracy and assists the busy schedule of a trained doctor, it does not capture the art of diagnosis and its complexities as expressed by the ever-changing environment of medical dynamics, including the doctor's expertise, technological advancements, faster processing power, etc. The existing methods and processing techniques for detecting bleeding areas in images taken from the endoscopic capsule are shown in Table 1.
The best published results based on color algorithms are shown in Table 2. One can see that all these proposed methods are supported by Machine Learning (ML) classifiers, such as ANFIS, ANN, BPNN, SVM, or k-NN.
In 2008, Liu and X. Yuan [1] achieved bleeding detection in endoscopy images using support vector machines (sensitivity 94.50%, accuracy 98.06%, specificity 98.95%). In 2016, Y. Yuan, B. Li, and M. Q.-H. Meng [2] considered bleeding frame and region detection in wireless capsule endoscopy video (sensitivity 91.71%, accuracy 93.31%, specificity 94.05%). More recently, in 2018, T. Ghosh, S. A. Fattah, and K. A. Wahid [3] used a color histogram for bleeding detection in wireless capsule endoscopy video, achieving sensitivity 97.85%, accuracy 99.15%, and specificity 99.47%. Our approach considers a computationally less intensive method, as opposed to machine learning techniques. It attempts to mimic the way that experts evaluate the presence of hemorrhage based on color bounds. In the present study, the identification of hemorrhage regions, and of lesions predisposing to bleeding, from healthy ones is based on a blood-color chart extracted by image analysis applying mask color thresholds. These thresholds form suspicious-red indicators, experimentally estimated from a database of marked bleeding frames, serving as reference values for a nominal color of blood. The suggested bleeding reference color chart is extracted from the training dataset formed of 99 patients. Since the color gamut is not the same in different color spaces, and our research interest does not explicitly hinge on biological color perception, the same color information content is investigated in the RGB and HSV color spaces. It is generally accepted that RGB is the most commonly preferred color space for digital image storage. Moreover, working in RGB is usually faster than using more complex color spaces, especially since RGB makes no attempt to correct for the available light in the environment or the sensitivity of the camera.
Our research is focused on quantifying color differences regardless of human perception and, as such, it avoids the CIELab color space [4], because our dataset images are not taken under identical lighting conditions. The HSV space is used instead, because it tends to sum up the (pure) color perception in one dimension, where color evaluation can be fast and intuitive, like the experts' perception. In RGB space, colors are analyzed in an orthogonal additive space where all three components are highly correlated, so that they all carry significant information about the image synthesis. The HSV space introduces the artistic concept of color composed of pure color (Hue) and contamination level (Saturation), besides the Value (or brightness) component. The Hue can be measured as an angle in the range [0, 2π] relative to the red axis, with red at angle 0. For example, when the Hue is red and the Value is high, the color red looks bright. When the Value is low, the same red becomes darker. In this space, the abstract color perception can be represented by the Hue component alone. Figure 1 shows the 3-D histograms in RGB and HSV of the same pathogenic snapshot from an endoscopic capsule video. The red color is clearly more distinct along the Hue component than along RGB's red component.
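This collapse of blood-like colors onto a narrow hue band can be checked in a few lines of Python, here using the standard-library colorsys module rather than OpenCV; the pixel values below are illustrative examples, not entries from our blood-color chart:

```python
import colorsys

def hue_deg(r, g, b):
    """Hue angle in degrees for an 8-bit RGB pixel."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] * 360

# Two blood-like pixels whose R, G, B levels differ widely...
dark_red = (120, 40, 10)
bright_red = (230, 60, 40)

# ...still land in a narrow band near the red end of the hue axis.
for pixel in (dark_red, bright_red):
    print(f"RGB {pixel} -> hue {hue_deg(*pixel):.1f} deg")
```

Although the two pixels are far apart along RGB's red component, their hue angles fall within a few tens of degrees of red (0°), which is exactly the separability the segmentation relies on.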
In order to reduce the doctor's examination time and the algorithm's computational complexity, similar consecutive snapshots are detected and eliminated. The color thresholding framework parallels the decision boundaries examined by experts in the diagnosis procedure and, thus, it can be readily comprehended and adopted by medical doctors. Accordingly, all development phases of the proposed algorithmic scheme were carried out in collaboration with experienced doctors at Aretaieio General Hospital, using as input real data from the Pillcam™ SB2/3 (a commonly used capsule, made by Given Imaging). In order to assess the efficiency of our multilevel thresholding scheme in different color representations, our results are compared with those delivered by the team of [5] for the currently available commercial program SBI™, which is widely used in the medical community. More specifically, the development of the algorithmic methodologies is presented in Section 2, along with the description of the dataset. The evaluation of the proposed methodology is presented in Section 3, with the discussion and implications of the results following in Section 4.

Data Resources
The confidentiality protocol applied in our research is the one approved by the ethics committee of the Greek Universities, in accordance with national and international law.
A total of 138 endoscopic capsule Pillcam™ SB videos were collected from different randomized hospitalized patients. These patients were diagnosed as positive for bleeding, angiodysplasias, hemangiomas, and lesions predisposing to bleeding in the small intestine region, by the attending doctors of the University Gastroenterology Departments of the Medical Schools of the National and Kapodistrian University of Athens, Attikon Hospital, Laikon Hospital, and the Aristotle University of Thessaloniki. The feature extraction for the proposed algorithm is based on a training set formed from 99 videos of the total pool of 138. The training data, named Set 1 and described in Table 3, include more than 3200 images (frames), selected manually by two expert doctors from these 99 videos. The testing set, named Set 2, is formed from the remaining 39 videos of the pool. Set 2 is also described in Table 3 and includes more than 1000 images (frames).
Although the endoscopic capsule Pillcam™ SB captures 250 × 250 pixels, the exported images are 576 × 576 pixels through the camera's own extrapolation algorithm. All the initially diagnosed capsule videos were revised and tagged again by our team's gastroenterologists with experience in SBCE, using the capsule's own software diagnostic tool SBI™ (Suspected Blood Indicator) of the Rapid™ READER (Given Imaging, Yokneam, Israel), and were also manually diagnosed and segmented by our team's experienced doctor. In order to read and process snapshots from the endoscopic capsule video stream, and to mark down the active bleedings and the lesions predisposing to bleeding within a frame, code is written in Python, using NumPy and OpenCV to accommodate all needed functions. OpenCV is used to read images (in BGR format), to convert between color spaces, and to segment out color ranges; moreover, it has a gamut visualizer (which MATLAB does not).

Features Extraction and Algorithmic Development
In order to eliminate similar consecutive video snapshots, as shown in Figure 2, a threshold on the mean squared error (MSE) of the gray-scale luminance values between two consecutive images is used to define frame similarity. The MSE, which is expensive to compute [6], is only computed for frame pairs that satisfy the Cauchy-Schwarz inequality ⟨f, g⟩² ≤ ‖f‖² ‖g‖², where ‖f‖² and ‖g‖² denote the gray-scale luminance energies of the two frames over 255 bins (the luminance sample values range from 0 to 255, as 8 bits are used). The MSE between two consecutive frames is calculated as

MSE = (1 / (m × n)) Σᵢ Σⱼ (f(i, j) − g(i, j))²,

where f(i, j) and g(i, j) are the luminance values of the two frames, respectively, at position (i, j), and m × n is the spatial resolution of the video sequence; the information from each frame is described as a row vector of length m × n. The minimum value of the MSE is experimentally determined (extracted and validated by evaluating snapshots of Set 1 subsets) to be equal to 0.0005. The detection of the suspicious bleeding areas is based on mask estimation, named the "bleeding indicator". In order to extract chromatic thresholds of the active findings, each snapshot of Set 1 (a total of at least 2300 frames of Set 1 = {99 videos} diagnosed positively as suspicious red by our team's experienced doctors) is analyzed with image analysis software (Python and OpenCV) to determine the per-pixel bleeding characteristics in an experimental fashion, by applying different color models.
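The elimination step above can be sketched in Python/NumPy, assuming frames arrive as 8-bit grayscale arrays; the function names are ours, and only the 0.0005 MSE threshold comes from the text:

```python
import numpy as np

def mse(f, g):
    """Mean squared error between two grayscale frames given in [0, 1]."""
    diff = f.astype(np.float64) - g.astype(np.float64)
    return float(np.mean(diff ** 2))

def eliminate_similar(frames, threshold=0.0005):
    """Keep a frame only if it differs from the last kept frame by more
    than the MSE threshold (0.0005, determined experimentally on Set 1).
    `frames` is a list of 8-bit grayscale NumPy arrays."""
    if not frames:
        return []
    kept = [frames[0]]
    for frame in frames[1:]:
        # Normalize luminance to [0, 1] before comparing, so the
        # 0.0005 threshold is resolution- and bit-depth-independent.
        if mse(kept[-1] / 255.0, frame / 255.0) > threshold:
            kept.append(frame)
    return kept
```

A production version would also apply the Cauchy-Schwarz pre-check described above to skip the full MSE computation for obviously dissimilar pairs.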
The bleeding indicator mask is calculated from the lower and upper thresholds of our blood-chroma chart of color values, collected as shown in Figure 3 from the snapshots of our video database. The "bleeding indicator" is estimated using the function cv2.inRange() to threshold the suspicious red region. This function takes three parameters: the image, the lower range, and the higher range. Our thresholds in the RGB color model are estimated as RED = 120 to 255, GREEN = 40 to 65, BLUE = 10 to 50; in the HSV color model they are estimated as Hue = 12.6° to 22.6°, Saturation = 69 to 92, Value = 23 to 58. Figure 4 shows the result of applying our bleeding mask indicator to a pathogenic snapshot. In order to enhance the proposed mask (bleeding indicator) for the localization of bleeding findings within a frame, two morphological operators are used, i.e. the Minimum and the Median within a structural element moving along the image; see visual examples in Figure 5 and Figure 6. After experiments, the 3 × 3 structural element is selected and validated on the Set 1 subsets. Moreover, the Opening operator is applied to eliminate false bleeding spots appearing as small areas within a frame; see a visual example in Figure 7. We determined as efficient the Opening operator with Area ≤ 100 pixels (validated on all Set 1 subsets).
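The masking and small-region cleanup can be sketched as follows, using the RGB thresholds quoted above; to keep the sketch dependency-free, a plain NumPy range test and a flood-fill stand in for cv2.inRange() and the OpenCV Opening operator, so this is an illustrative equivalent, not the exact implementation (it also assumes the image is already in R, G, B channel order, whereas cv2.imread delivers BGR):

```python
import numpy as np

# Chromatic thresholds of the blood-color chart (RGB model, from Set 1).
RGB_LO = np.array([120, 40, 10])
RGB_HI = np.array([255, 65, 50])

def bleeding_mask(rgb_image):
    """Binary mask (0/255) of pixels inside the blood-color chart bounds,
    equivalent to cv2.inRange(rgb_image, RGB_LO, RGB_HI)."""
    in_range = np.all((rgb_image >= RGB_LO) & (rgb_image <= RGB_HI), axis=-1)
    return in_range.astype(np.uint8) * 255

def remove_small_regions(mask, min_area=100):
    """Drop connected suspicious-red spots of area <= min_area pixels
    (4-connected flood fill; stands in for the Opening-based cleanup)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros_like(mask)
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                stack, region = [(y, x)], []
                seen[y, x] = True
                while stack:
                    cy, cx = stack.pop()
                    region.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                if len(region) > min_area:  # keep only regions > 100 px
                    for cy, cx in region:
                        out[cy, cx] = 255
    return out
```

In practice cv2.inRange() plus cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel) with a 3 × 3 kernel performs the same role far more efficiently.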
In order for a frame to be tagged as healthy or as having hemorrhage findings or lesions predisposing to bleeding, a statistical "bleeding index" (marker) is estimated, considering the minimum number of suspicious-red pixels per frame. This index flags a pathological frame when it exceeds 800 pixels. Each snapshot from the endoscopic capsule is analyzed with image analysis software (Python, OpenCV, and/or the scikit-image toolbox for SciPy) in its color components, and its color histogram is calculated according to the color model, in order to count the pixels of each snapshot that show the color of the bleeding indicator. The snapshots whose number of "bleeding" pixels exceeds the appropriate threshold are characterized as pathological. The above process is presented to the expert doctor via a visual GUI developed in Python. A simple GUI (in MATLAB©) was also assembled, in order to facilitate easier input/output with our data and to provide feedback to a person not accustomed to programming. Screenshots of the GUI are shown in Figures 8-11.
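The per-frame decision then reduces to a pixel count against the bleeding index; a minimal sketch (the function name is ours, and only the 800-pixel threshold comes from the text):

```python
import numpy as np

def classify_frame(mask, bleeding_index=800):
    """Tag a frame as pathological when the number of suspicious-red
    pixels in its bleeding-indicator mask exceeds the experimentally
    chosen bleeding index (800 pixels)."""
    n_bleeding = int(np.count_nonzero(mask))
    return "pathological" if n_bleeding > bleeding_index else "healthy"
```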

VERIFICATION METRICS AND EVALUATION RESULTS
The performance of the proposed algorithmic pipeline is tested using two groups of samples (testing data) that were not used during algorithm development, selected from the entire testing Set 2 = {39 videos}; see Table 3. To ensure that precision does not depend on prevalence, the ratio of the number of cases in the disease group to the number of cases in the healthy control group (used to establish the Negative Predictive Value (NPV) and the Positive Predictive Value (PPV), or Precision) is set equal to the prevalence of the disease in the studied population. One testing group consists of 500 healthy frames and the other of 500 disease frames (1000 frames in total). The precision is thus normalized to a prevalence of 50%.
The "bleeding index" threshold and the blood-chroma chart (or "bleeding indicator") were verified using the data of Set 2. The testing was carried out in collaboration with our experienced doctor at Aretaieion University Hospital, by applying the proposed algorithm on the data of Set 2. Moreover, the performance of the proposed algorithm was compared with the results given by utilizing the official software offered by capsule's provider.
The indicators used to measure the performance of the proposed algorithm are based on the statistical metrics of accuracy, specificity, sensitivity, precision, False Positive Ratio (FPR), and False Negative Ratio (FNR). The probability of correctly recognizing frames within a video as True Positive (TP) or True Negative (TN) is the accuracy (ACC):

ACC = (TP + TN) / (TP + TN + FP + FN),

where FP stands for False Positive and FN for False Negative.
The higher the value of each indicator, the better. The most important and desired indicator, considering an effective medical diagnosis, is the Sensitivity, which needs to be as high as possible.
The Positive predictive value (PPV), or named as Precision, is computed as: ( )

TP TP Precision or Positive Predictive Value
Pr edictive condition positive TP FP = = + ∑ ∑ ∑ ∑ Low precision indicates that many positive results from the testing procedure are false positives, the higher the value is better.
The FPR (False Positive Ratio), also called the false alarm rate, is the probability of identifying healthy frames as abnormal:

FPR = FP / (FP + TN).

The FNR (False Negative Ratio), also called the miss rate, is the likelihood of pathogenic frames being wrongly identified as healthy:

FNR = FN / (FN + TP).

Smaller values of FPR and FNR indicate better algorithmic performance.
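The metrics above can be collected in one small helper; a sketch using the standard confusion-matrix counts, with the symbols as defined in the text (the illustrative counts in the usage note are not the paper's results):

```python
def metrics(tp, tn, fp, fn):
    """Statistical verification metrics from raw confusion-matrix counts."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),          # TPR, recall
        "specificity": tn / (tn + fp),          # TNR
        "precision":   tp / (tp + fp),          # PPV
        "FPR":         fp / (fp + tn),          # false alarm rate
        "FNR":         fn / (fn + tp),          # miss rate
    }
```

For example, metrics(90, 80, 20, 10) gives an accuracy of 0.85, a sensitivity of 0.90, an FPR of 0.20, and an FNR of 0.10.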
To obtain the necessary statistical accuracy of the proposed algorithm for localizing a bleeding region in a picture diagnosed as a hemorrhage snapshot, the doctor performed manual segmentation on the frames of interest. There are two true states: TP for pixels within true bleeding regions and TN for pixels within truly negative (healthy) regions. There are also two false states: FN for pixels within bleeding regions that are not detected and FP for pixels in healthy regions that are identified as bleeding. Precision, in the evaluation of localization, is the ability to correctly identify the segmented region within the frame as TP or TN in accordance with the doctor's segmentation.

Evaluation Results
In this section, we evaluate and compare the performances of the nine proposed algorithms, using our data of Set 2 (see Table 3). The verification measured the probability of correctly: A) detecting an abnormal snapshot in a video stream of images, and B) localizing the abnormal regions within a snapshot tagged as abnormal.
The samples in Table 4 reflect how many of the 1000 samples of the entire testing Set 2 are diagnosed as TP and FN when applying image analysis in the different color spaces.
The calculated statistical metrics using the numbers of Table 4 are presented in Table 5.
The testing data for the localization of hemorrhage regions include 50 true positive frames from the pool of Set 2. The testing frames were selected by the doctor with the criterion of including only one "small" bleeding area (as "small" we define a suspicious-red continuous surface of 800 to 1000 pixels). In each frame, every correctly detected suspicious red region counts as a TP (true positive). If there is no other bleeding region within the same frame, then it also counts as a TN (true negative). A pixel in a control area counts as an FP (false positive) if the algorithm marked it as a suspicious red region. Respectively, if the algorithm misses a bleeding area, it counts as an FN (false negative). The counting of TN and FN is important for the estimation of FNR and TNR (specificity). This was implemented to compare the proposed algorithm applied to different color models. Furthermore, the proposed algorithms are compared with their competing counterparts from the literature in Table 6, regarding the performance in detecting abnormal snapshots in a video stream.
The calculated statistical metrics for the localization of hemorrhage are compared with related published work in Table 7. Notice that the proposed algorithm is evaluated only on small-scale bleeding areas, due to the nature of the selected images, whereas other algorithms are tested on a wider extent of hemorrhagic regions. This explains the small FPR values derived by our approach. The performance of the currently available commercial software SBI™ (Suspected Blood Indicator), often used by Gastroenterology Departments and offered in the Rapid™ Reader software suite accompanying the Pillcam™ SB capsule (Given Imaging, Israel), was evaluated by the research team of [4]. This team (Hospital Senhora da Oliveira and the Life and Health Sciences Research Institute (ICVS), School of Health Sciences, University of Minho, Portugal) applied the SBI™ to endoscopic videos from 281 patients. Their conclusions showed that SBI™ achieves low Sensitivity for the automatic detection of potentially bleeding lesions, but effectively detects active small bowel bleeding with very high Sensitivity and Negative Predictive Value. The results in [4] for the SBI™ showed a 96.6% Sensitivity for active small bowel bleeding, with a 97.7% Negative Predictive Value. Our proposed algorithm is relatively superior to other algorithms, including that of the study [4], on small-scale bleeding.

DISCUSSION
In the best statistical verification of the proposed method, for the Sensitivity and Precision of abnormal (hemorrhage) snapshot diagnosis from a video stream, the HSV color model proved superior, owing to its simultaneously greater Specificity of 94.2%, with Sensitivity up to 99%, Precision of 94.41% at a prevalence of 50%, Accuracy up to 96.1%, FNR of 1%, and FPR of 6.8%. The Sensitivity of our proposed HSV color model is at least 0.85% higher, comparing very well with the program that uses a machine-learning classifier [3]. The FNR of our HSV analysis (1%) improves on the lowest recently published FNR of about 2.1% [1] [2] [3]. The fact that the same color information applied to different color spaces yields different algorithm performance can be explained by the shape of the gamut in the different color spaces.
Visualizing a tagged frame in RGB color space, the suspicious red regions of the image span almost the entire range of red, green, and blue values. In HSV space, by contrast, the majority of suspicious red regions are localized and visually separable. The saturation and value of the red do vary, but the pixels are mostly located within a small range along the hue axis. This is the key point that can be leveraged for segmentation.

CONCLUSIONS AND FUTURE DIRECTIONS
HSV color space segmentation proves useful for image analysis in endoscopy, showing that simple methods can still be powerful. It is our belief that our framework can be implemented professionally in a more efficient manner, in terms of speed and ease of use, in the near future, given some additional capabilities.
First, it should utilize the faster processing power of modern processors, by expertly running through thousands of images and cleverly optimizing the reading, or by isolating the real malefactor more efficiently. Secondly, it should provide the ability to change the image recognition parameters and patterns, given that experience from doctors' training can account for the cumulative knowledge and experience of applying these variations. Thirdly, it should meet the demand of having more than one model to apply and, moreover, of comparing the computed results among various methods/parameters to trigger potentially different diagnostic results. Finally, it can be powered by open-source tools of proven speed and capabilities, so as to be easily updated at no additional cost, subject to continuous improvement and free of copyright limitations. Offering this type of competitive and flexible diagnosis, a doctor should feel more confident in his/her ability to control the tool itself. Comparative results will be of great benefit, as much to the diagnostic community as to the companies that will claim better results for their offered tools.