Multi-Threshold Algorithm Based on Havrda and Charvat Entropy for Edge Detection in Satellite Grayscale Images

Automatic edge detection extracts crucial information from an image and can be performed by detectors based on different techniques. It is a main tool in pattern recognition, image segmentation, and scene analysis. This paper introduces an edge-detection algorithm that generates multi-threshold values. It is based on non-Shannon measures such as Havrda & Charvat's entropy, which is commonly used in gray level image analysis for many types of images, such as satellite grayscale images. The performance of the proposed edge detector is compared with that of classic methods such as the Roberts, Prewitt, and Sobel methods. Numerical results underline the robustness of the presented approach, and different applications are shown.


Introduction
Edge detection is a very important tool used in many applications of image processing to obtain information from images as a preparatory step to feature extraction and object segmentation. This phase detects the outlines of an object and the boundaries between objects and the background in the image [1]. The detection results benefit applications such as optical character recognition [2], infrared gait recognition [3,4], automatic target recognition [5], detection of video changes [6], and medical image applications [7].
Edge detection concerns the localization of abrupt changes in the gray level of an image [8]. An edge can be defined as the boundary between two regions with relatively distinct gray level properties [9]. The dissimilarity between regions may be due to factors such as the geometry of the scene, the radiometric characteristics of the surface, the illumination, and so on [10]. An effective edge detector discards a large amount of data but still keeps most of the important features of the image. Edge detection refers to the process of locating sharp discontinuities in an image [11,12].
Many operators have been introduced in the literature, for example, Roberts, Sobel and Prewitt [13][14][15][16][17]. Edges are mostly detected using either the first derivatives, called the gradient, or the second derivatives, called the Laplacian. The Laplacian is more sensitive to noise because of the nature of the second derivatives.
Most of the classical methods for edge detection are based on derivatives of the pixel values of the original image: gradient operators, the Laplacian, and Laplacian of Gaussian (LoG) operators [10]. Gradient-based edge detection methods, such as Roberts, Sobel and Prewitt, use two linear filters to process vertical edges and horizontal edges separately and so approximate the first-order derivative of the pixel values of the image. Marr and Hildreth achieved this by using the Laplacian of a Gaussian (LoG) function as a filter [18]. To solve these problems, this study proposes a novel approach based on information theory, namely entropy-based thresholding. The proposed method aims to decrease the computation time compared with the Canny and LoG methods. The results were very good compared with those of the well-known Roberts, Prewitt, and Sobel gradient operators.
The outline of the paper is as follows. Section 2 presents the classical edge detection methods related to this paper. Image thresholding based on Havrda & Charvat's entropy is presented in Section 3. Section 4 describes edge detection based on entropy. Section 5 illustrates the multi-threshold algorithm based on Havrda and Charvat entropy for edge detection. Section 6 presents the effectiveness of the proposed algorithm on satellite grayscale images and compares its results with several leading edge detection methods, such as the Roberts, Prewitt, and Sobel methods. Conclusions are presented in Section 7.

Classical Edge Detection Methods
Five of the most frequently used edge detection methods are used for comparison. These are the gradient operators (Roberts, Prewitt, Sobel), the Laplacian of Gaussian (LoG or Marr-Hildreth), and the Gradient of Gaussian (Canny) edge detectors [19,20]. Readers interested in this subject are referred to [21][22][23] for evaluation studies of edge detection algorithms according to different criteria. The details of the methods are as follows.

Roberts Edge Detector
The Roberts Cross operator performs a simple, quick to compute, 2-D spatial gradient measurement on an image, as shown in Figure 1. It thus highlights regions of high spatial frequency, which often correspond to edges. In its most common usage, the input to the operator is a grayscale image, as is the output. Pixel values at each point in the output represent the estimated absolute magnitude of the spatial gradient of the input image at that point [20].
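As an illustration of the operator described above, the following Python sketch applies the two 2 × 2 Roberts Cross kernels to a small image. The kernel layout and the use of |Gx| + |Gy| as the magnitude estimate are assumptions based on the common textbook form, since Figure 1 is not reproduced here; the function name `roberts_cross` is hypothetical.

```python
def roberts_cross(img):
    """Estimate the gradient magnitude with the 2x2 Roberts Cross kernels.

    img is a list of rows of gray values. The last row and column are left
    at 0 because the 2x2 kernels do not fit there (an assumed border rule).
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows - 1):
        for x in range(cols - 1):
            gx = img[y][x] - img[y + 1][x + 1]   # kernel [[1, 0], [0, -1]]
            gy = img[y][x + 1] - img[y + 1][x]   # kernel [[0, 1], [-1, 0]]
            out[y][x] = abs(gx) + abs(gy)        # approximate magnitude
    return out


# A vertical step from gray level 0 to gray level 9.
step = [[0, 0, 9, 9] for _ in range(4)]
edges = roberts_cross(step)
```

For this step image, the response is nonzero only in the column where the gray level changes.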

Prewitt Edge Detector
The Prewitt edge detector is an appropriate way to estimate the magnitude and orientation of an edge. Although differential gradient edge detection needs a rather time-consuming calculation to estimate the orientation from the magnitudes in the x and y directions, compass edge detection obtains the orientation directly from the kernel with the maximum response. The Prewitt operator is limited to 8 possible orientations; however, experience shows that most direct orientation estimates are not much more accurate. This gradient-based edge detector is estimated in the 3 × 3 neighbourhood for eight directions, as shown in Figure 2. All eight convolution masks are calculated. One convolution mask is then selected, namely the one with the largest modulus [20].
Figure 1. Roberts gradient estimation operator.

Sobel Edge Detector
The operator consists of a pair of 3 × 3 convolution kernels, as shown in Figure 3. One kernel is simply the other rotated by 90°. These kernels are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, one kernel for each of the two perpendicular orientations. The kernels can be applied separately to the input image to produce separate measurements of the gradient component in each orientation (call these $G_x$ and $G_y$). These can then be combined to find the absolute magnitude of the gradient at each point and the orientation of that gradient [20]. The gradient magnitude is given by:
$$|G| = \sqrt{G_x^2 + G_y^2}$$
Typically, an approximate magnitude is computed using:
$$|G| = |G_x| + |G_y|$$
which is much faster to compute. The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by:
$$\theta = \arctan\left(G_y / G_x\right)$$
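The Sobel computation can be sketched in Python as follows. The kernel coefficients and sign conventions are the commonly used ones and are an assumption here, since Figure 3 is not reproduced; `sobel_at` is a hypothetical helper name. The sketch uses `math.atan2` instead of a plain arctangent of the ratio so that a zero x-gradient does not cause a division by zero.

```python
import math

# Commonly used Sobel kernels (assumed, Figure 3 is not reproduced here).
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
KY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]    # KX rotated by 90 degrees


def sobel_at(img, y, x):
    """Return (gx, gy, magnitude, approx_magnitude, angle_deg) at interior pixel (y, x)."""
    gx = sum(KX[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    gy = sum(KY[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    magnitude = math.sqrt(gx * gx + gy * gy)     # exact gradient magnitude
    approx = abs(gx) + abs(gy)                   # faster approximation
    angle = math.degrees(math.atan2(gy, gx))     # orientation, safe when gx == 0
    return gx, gy, magnitude, approx, angle


# A vertical step edge: only the x-kernel responds.
step = [[0, 0, 9], [0, 0, 9], [0, 0, 9]]
gx, gy, magnitude, approx, angle = sobel_at(step, 1, 1)
```

On a purely vertical step the exact and approximate magnitudes coincide; they differ for diagonal edges, where the approximation overestimates the magnitude.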

Canny Edge Detector
This edge detector is due to J.F. Canny [19] (a recursive implementation of this algorithm was presented in [24]).
In his work, Canny specified three main criteria for the performance of edge detectors. The first criterion is a low error rate, that is, a minimum number of false negatives and false positives. The second criterion is good localization: edges should be reported at the correct position. In other words, the distance between the edge pixels found by the detector and the actual edge should be minimal. The third criterion is to have only one response to a single edge. To implement the Canny edge detector, a series of steps must be followed.
Step 1: The first step is to filter out any noise in the original image before trying to locate and detect any edges. The raw image is convolved with a Gaussian (bell curve) filter. The result is a slightly blurred version of the original that is not affected by a single noisy pixel to any significant degree. The Gaussian mask used in this implementation is shown in Figure 4, with σ = 1.4.
Step 2: After smoothing the image and eliminating the noise, the next step is to find the edge strength by taking the gradient of the image. The Sobel operator, which uses a pair of 3 × 3 convolution masks, estimates the gradient in the x-direction ($G_x$) and in the y-direction ($G_y$). The approximate gradient magnitude is given by $|G| = |G_x| + |G_y|$.
Step 3: Finding the edge direction is trivial once the gradients in the x and y directions are known. However, the formula produces a division by zero whenever $G_x$ is equal to zero, so the code has to handle this case explicitly. Whenever the gradient in the x-direction is equal to zero, the edge direction has to be equal to 90 degrees or 0 degrees, depending on the value of the gradient in the y-direction. If $G_y$ has a value of zero, the edge direction will equal 0 degrees. Otherwise the edge direction will equal 90 degrees. The formula for finding the edge direction is $\theta = \arctan(G_y / G_x)$.
Step 4: Once the edge direction is known, the next step is to relate the edge direction to a direction that can be traced in an image. Suppose the pixels of a 5 × 5 image are aligned as shown in Figure 5.
Then, looking at pixel "a", it can be seen that there are only four possible directions when describing the surrounding pixels: 0 degrees (in the horizontal direction), 45 degrees (along the positive diagonal), 90 degrees (in the vertical direction), or 135 degrees (along the negative diagonal). The edge orientation therefore has to be resolved into one of these four directions, depending on which direction it is closest to (e.g., if the orientation angle is found to be 3 degrees, make it 0 degrees). Think of this as taking a semicircle and dividing it into 5 regions, as shown in Figure 6.
Therefore, any edge direction falling within the range (0 to 22.5 and 157.5 to 180 degrees) is set to 0 degrees. Any edge direction falling in the range (22.5 to 67.5 degrees) is set to 45 degrees. Any edge direction falling in the range (67.5 to 112.5 degrees) is set to 90 degrees. Finally, any edge direction falling within the range (112.5 to 157.5 degrees) is set to 135 degrees.
Step 5: After the edge directions are known, non-maximum suppression has to be applied. Non-maximum suppression traces along the edge in the edge direction and suppresses any pixel value (sets it equal to 0) that is not considered to be an edge. This gives a thin line in the output image.
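The direction handling of Steps 3 and 4 can be sketched in Python as follows. This is only an illustration: the function name `quantize_direction` is hypothetical, and the assignment of the exact boundary angles (22.5, 67.5, 112.5, and 157.5 degrees) to one sector or the other is an assumption, since the text does not specify it.

```python
import math


def quantize_direction(gx, gy):
    """Map a gradient (gx, gy) to one of the traceable directions 0, 45, 90, 135."""
    if gx == 0:
        # Special case from Step 3: avoid dividing by zero.
        return 0 if gy == 0 else 90
    # atan yields (-90, 90) degrees; fold it onto the semicircle [0, 180).
    angle = math.degrees(math.atan(gy / gx)) % 180.0
    if angle < 22.5 or angle >= 157.5:
        return 0
    if angle < 67.5:
        return 45
    if angle < 112.5:
        return 90
    return 135


directions = [quantize_direction(gx, gy)
              for gx, gy in [(10, 1), (1, 1), (0, 5), (0, 0), (-1, 1), (1, 10)]]
```

The first gradient has an angle of about 5.7 degrees and is therefore snapped to 0 degrees, in the same way as the 3-degree example in the text.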
Step 6: Finally, hysteresis is used as a means of eliminating streaking. Streaking is the breaking up of an edge contour caused by the operator output fluctuating above and below the threshold. If a single threshold $T_1$ is applied to an image, and an edge has an average strength equal to $T_1$, then, due to noise, there will be instances where the edge dips below the threshold. Equally, it will also extend above the threshold, making an edge look like a dashed line. To avoid this, hysteresis uses two thresholds, a high threshold $T_1$ and a low threshold $T_2$. Any pixel in the image that has a value greater than $T_1$ is presumed to be an edge pixel and is marked as such immediately. Then, any pixels that are connected to this edge pixel and that have a value greater than $T_2$ are also selected as edge pixels. In terms of following an edge, a gradient above $T_1$ is needed to start, but tracing does not stop until the gradient falls below $T_2$.
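A minimal Python sketch of hysteresis thresholding is given below. It assumes that "connected" means 8-connectivity and that the high threshold is the one that starts an edge; both the function name `hysteresis` and the breadth-first traversal are illustrative choices rather than part of the original description.

```python
from collections import deque


def hysteresis(strength, t_high, t_low):
    """Mark pixels above t_high, then grow edges through neighbours above t_low."""
    rows, cols = len(strength), len(strength[0])
    edge = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed with the strong pixels, which are edge pixels immediately.
    for y in range(rows):
        for x in range(cols):
            if strength[y][x] > t_high:
                edge[y][x] = True
                queue.append((y, x))
    # Follow each edge through 8-connected pixels that exceed the low threshold.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < rows and 0 <= nx < cols
                        and not edge[ny][nx] and strength[ny][nx] > t_low):
                    edge[ny][nx] = True
                    queue.append((ny, nx))
    return edge


# The last pixel exceeds the low threshold but is not connected to a strong pixel.
result = hysteresis([[9, 5, 5, 1, 5]], t_high=8, t_low=4)
```

The weak pixels next to the strong one are kept, while the isolated weak pixel at the end is discarded, which is what prevents both streaking and spurious responses.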

Havrda & Charvat's Entropy
Regarding the statistical approach for describing texture, one of the simplest computational approaches is to use statistical moments of the gray level histogram of the image. The image histogram carries important information about the content of an image and can be used for discriminating the abnormal tissue from the local healthy background. Consider the gray level histogram $\{n_i\}$, $i = 0, 1, \ldots, N_g - 1$, where $N_g$ is the number of distinct gray levels in the ROI (region of interest) and $n_i$ is the number of pixels with gray level $i$. If $n$ is the total number of pixels in the region, then the normalized histogram of the ROI is the set $\{H_i\}$ with $H_i = n_i / n$. The source symbol probabilities are $\{H_0, H_1, \ldots, H_{N_g - 1}\}$. This set of probabilities must satisfy the conditions $\sum_i H_i = 1$ and $0 \le H_i \le 1$. The average information per source output, denoted S(H) [25], is the Shannon entropy, which may be described as:
$$S(H) = -\sum_{i=0}^{N_g - 1} H_i \ln H_i$$
If we consider that a system can be decomposed into two statistically independent subsystems A and B, the Shannon entropy has the extensive property (additivity)
$$S(A + B) = S(A) + S(B).$$
This formalism has been shown to be restricted to the Boltzmann-Gibbs-Shannon (BGS) statistics.
However, for non-extensive systems, some kind of extension appears to become necessary. Havrda and Charvat [26,27] proposed a generalization of the BGS statistics which is useful for describing the thermostatistical properties of non-extensive systems. It is based on a generalized entropic form,
$$HC_\alpha = \frac{1}{\alpha - 1}\left(1 - \sum_{i=1}^{k} p_i^{\alpha}\right),$$
where $k$ is the number of states, $p_i$ is the probability of state $i$, and the real number α is an entropic index that characterizes the degree of non-extensivity. This expression reduces to the BGS entropy in the limit $\alpha \to 1$. Havrda & Charvat's entropy has a non-extensive (pseudo-additive) property for statistically independent systems, defined by the following rule [28]:
$$HC_\alpha(A + B) = HC_\alpha(A) + HC_\alpha(B) + (1 - \alpha)\, HC_\alpha(A)\, HC_\alpha(B).$$
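The pseudo-additive rule can be checked numerically. The short Python sketch below assumes the entropic form HC_α = (1 − Σ p_i^α) / (α − 1), which is the form consistent with the limit α → 1 and with the (1 − α) cross term; the distributions `a` and `b` and the value of α are arbitrary illustrative choices.

```python
import math


def hc_entropy(p, alpha):
    """Havrda-Charvat entropy of a probability list p (assumed form, alpha != 1)."""
    return (1.0 - sum(pi ** alpha for pi in p)) / (alpha - 1.0)


alpha = 0.7
a = [0.2, 0.8]          # subsystem A
b = [0.5, 0.3, 0.2]     # subsystem B
# Joint distribution of two statistically independent subsystems.
joint = [pa * pb for pa in a for pb in b]

lhs = hc_entropy(joint, alpha)
rhs = (hc_entropy(a, alpha) + hc_entropy(b, alpha)
       + (1.0 - alpha) * hc_entropy(a, alpha) * hc_entropy(b, alpha))

# Near alpha = 1 the value approaches the Shannon entropy (natural logarithm).
shannon = -sum(pa * math.log(pa) for pa in a)
near_one = hc_entropy(a, 1.000001)
```

Both sides of the rule agree up to rounding error, and the value near α = 1 is close to the Shannon entropy of the same distribution.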
Similarities between the Boltzmann-Gibbs and Shannon entropy forms provide a basis for generalizing Shannon's entropy within information theory. This generalization can be extended to image processing, specifically to image segmentation, by applying Havrda & Charvat's entropy to threshold images that have non-additive information content. Considering $HC_\alpha \ge 0$ in the pseudo-additive formalism of Equation (4), three different entropies can be defined with regard to different values of α.
Let f(x, y) be the gray value of the pixel located at the point (x, y), that is, the amplitude (brightness) of the digital image at the coordinate position (x, y). For the sake of convenience, we denote the set of all gray levels $\{0, 1, 2, \ldots, 255\}$ by G. Global threshold selection methods usually use the gray level histogram of the image. The optimal threshold t* is determined by optimizing a suitable criterion function obtained from the gray level distribution of the image and some other features of the image.
Let t be a threshold value and B = {b_0, b_1} be a pair of binary gray levels with $b_0, b_1 \in G$. Typically b_0 and b_1 are taken to be 0 and 1, respectively. The result of thresholding an image function f(x, y) at gray level t is a binary function $f_t(x, y)$ such that $f_t(x, y) = b_0$ if $f(x, y) \le t$ and $f_t(x, y) = b_1$ otherwise. In general, a thresholding method determines the value t* of t based on a certain criterion function. If t* is determined solely from the gray level of each pixel, the thresholding method is point dependent [25].
Let $h_1, h_2, \ldots, h_k$ be the probability distribution for an image with k gray levels. From this distribution, we derive two probability distributions, one for the object (class A) and the other for the background (class B), given by:
$$p_A: \frac{h_1}{P_A}, \frac{h_2}{P_A}, \ldots, \frac{h_t}{P_A}$$
and
$$p_B: \frac{h_{t+1}}{P_B}, \frac{h_{t+2}}{P_B}, \ldots, \frac{h_k}{P_B},$$
where
$$P_A = \sum_{i=1}^{t} h_i, \qquad P_B = \sum_{i=t+1}^{k} h_i.$$
The Havrda & Charvat entropy of order α for each distribution is defined as:
$$HC_\alpha^A(t) = \frac{1}{\alpha - 1}\left(1 - \sum_{i=1}^{t} \left(\frac{h_i}{P_A}\right)^{\alpha}\right), \qquad HC_\alpha^B(t) = \frac{1}{\alpha - 1}\left(1 - \sum_{i=t+1}^{k} \left(\frac{h_i}{P_B}\right)^{\alpha}\right).$$
The Havrda & Charvat entropy $HC_\alpha$ is parametrically dependent upon the threshold value t for the foreground and background. It is formulated as the sum of the two entropies, allowing for the pseudo-additive property defined in Equation (3):
$$HC_\alpha(t) = HC_\alpha^A(t) + HC_\alpha^B(t) + (1 - \alpha)\, HC_\alpha^A(t)\, HC_\alpha^B(t).$$
We try to maximize the information measure between the two classes (object and background). The luminance level t that maximizes $HC_\alpha(t)$ is considered to be the optimum threshold value:
$$t^*(\alpha) = \arg\max_{t \in G} HC_\alpha(t).$$
In the proposed scheme, a binary image is first created by choosing a suitable threshold value using Havrda & Charvat's entropy. The technique consists of treating each pixel of the original image and creating a new image $f_t$ such that $f_t(x, y) = 0$ if $f(x, y) \le t^*(\alpha)$ and $f_t(x, y) = 1$ otherwise. When $\alpha \to 1$, the threshold value in Equation (3) equals the value found by Shannon's method. Thus the proposed method includes Shannon's method as a special case. The following expression can be used as a criterion function to obtain the optimal threshold at $\alpha \to 1$:
$$t^*(1) = \arg\max_{t \in G}\left[S^A(t) + S^B(t)\right],$$
where $S^A(t)$ and $S^B(t)$ are the Shannon entropies of the distributions $p_A$ and $p_B$.
The Havrda_Charvat_T procedure to select a suitable threshold value t* with α for a grayscale image f can now be described as follows:
Procedure Havrda_Charvat_T
Input: An image f of size r × c, and α > 0.
Output: The optimal threshold t* of f.
Begin
1. Let f(x, y) be the original gray value of the pixel at the point (x, y), x = 1..r, y = 1..c.
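The threshold selection can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes the entropic form HC_α = (1 − Σ p_i^α) / (α − 1), takes a histogram of counts as input instead of an image, indexes gray levels from 0, assigns levels up to and including t to class A, and returns the first maximizer when several thresholds tie. The name `havrda_charvat_t` simply mirrors the procedure name in the text.

```python
def hc_entropy(p, alpha):
    """Havrda-Charvat entropy of a probability list p (alpha > 0, alpha != 1)."""
    return (1.0 - sum(pi ** alpha for pi in p)) / (alpha - 1.0)


def havrda_charvat_t(hist, alpha):
    """Return the gray level t that maximizes the pseudo-additive criterion."""
    total = float(sum(hist))
    h = [count / total for count in hist]          # normalized histogram
    best_t, best_value = None, float("-inf")
    for t in range(len(h) - 1):
        p_a, p_b = sum(h[:t + 1]), sum(h[t + 1:])  # class probabilities
        if p_a == 0.0 or p_b == 0.0:
            continue                               # one class would be empty
        hc_a = hc_entropy([x / p_a for x in h[:t + 1]], alpha)
        hc_b = hc_entropy([x / p_b for x in h[t + 1:]], alpha)
        value = hc_a + hc_b + (1.0 - alpha) * hc_a * hc_b
        if value > best_value:
            best_t, best_value = t, value
    return best_t


# A bimodal histogram with an empty valley between gray levels 2 and 6.
hist = [10, 10, 10, 0, 0, 0, 10, 10, 10]
t_star = havrda_charvat_t(hist, alpha=0.5)
```

For this histogram every threshold inside the empty valley gives the same criterion value, so the selected threshold separates the two modes. The loop recomputes the class sums for every t, which is quadratic in the number of gray levels but negligible for 256 levels.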

Detecting of the Edges
We will use the usual masks for detecting the edges [29].
A spatial filter mask may be defined as a matrix w of size m × n. Assume that m = 2μ + 1 and n = 2ρ + 1, where μ and ρ are positive integers. The smallest meaningful size of the mask is therefore 3 × 3. The mask coefficients and their coordinate arrangement are shown in Figure 7(a).
The image region under the above mask is shown in Figure 7(b).
Edge detection first classifies all pixels that satisfy the criterion of homogeneity and then detects all pixels on the borders between different homogeneous areas. In the proposed scheme, a binary image is first created by choosing a suitable threshold value using Havrda & Charvat's entropy. A window is then applied to the binary image. All window coefficients are set equal to 1 except the centre, which is marked × as shown in Figure 8.
Move the window over the whole binary image and find the probability of each central pixel of the image under the window. Then the entropy of each central pixel under the window is calculated as $S(CPix) = -p_c \ln(p_c)$, where $p_c$ is the probability of the central pixel CPix of the binary image under the window. When the probability of the central pixel is $p_c = 1$, the entropy of this pixel is zero. Thus, if the gray levels of all pixels under the window are homogeneous, $p_c = 1$ and S = 0. In this case, the central pixel is not an edge pixel. Other possibilities for the entropy of the central pixel under the window are shown in Table 1. In the cases $p_c = 8/9$ and $p_c = 7/9$, the diversity of the gray levels of the pixels under the window is low, so the central pixel is not an edge pixel. In the remaining cases, $p_c \le 6/9$, the diversity of the gray levels of the pixels under the window is high, so the central pixel is taken to be an edge pixel. The complete algorithm can now be described as follows:
Algorithm HCEdgeDetection
Input: A grayscale image A (M × N).
Output: The edge detection image g.
Begin
1. Select a suitable t* for the chosen α using the Havrda_Charvat_T procedure.
2. Create a binary image:
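The window test described above can be sketched in Python as follows. Several points are assumptions made for illustration: p_c is taken to be the fraction of the nine pixels under the window that share the value of the central pixel, border pixels are left unmarked because the window does not fit there, and the decision is made directly on p_c ≤ 6/9 as stated in the text. The function names are hypothetical.

```python
import math


def central_entropy(p_c):
    """Entropy S = -p_c ln(p_c) of the central pixel (0 when p_c = 1)."""
    return 0.0 if p_c == 1 else -p_c * math.log(p_c)


def hc_edge_detection(binary):
    """Mark interior pixels whose 3x3 window is diverse enough (p_c <= 6/9)."""
    rows, cols = len(binary), len(binary[0])
    edges = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            centre = binary[y][x]
            # Number of pixels under the window equal to the central pixel.
            same = sum(binary[y + dy][x + dx] == centre
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            if same <= 6:          # equivalent to p_c <= 6/9
                edges[y][x] = 1
    return edges


# Binary image with a vertical boundary between columns 1 and 2.
binary = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = hc_edge_detection(binary)

# Values of (count, S) for p_c = 1/9 ... 9/9, in the spirit of Table 1.
table = [(k, round(central_entropy(k / 9), 4)) for k in range(1, 10)]
```

Comparing the integer count with 6 avoids floating-point comparisons against 6/9. Note that S is not monotonic in p_c, so the decision is made on p_c itself and not on a single entropy cut-off.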

Proposed Algorithm
Here, the algorithm produces three different threshold values t_1, t_2 and t_3. We use the Havrda_Charvat_T procedure to find the threshold value t_1 over the entire image. Then we split the image by t_1 into two grayscale parts, the object and the background. We apply Equation (11) to find the local threshold values t_2 and t_3 of the object and the background, respectively. We then apply the HCEdgeDetection procedure independently with the threshold values t_1, t_2 and t_3, and merge the resulting edge images to obtain the reconstructed edge image.
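The three-threshold scheme can be sketched in Python on the histogram alone. This is an illustrative sketch under stated assumptions, not the authors' MATLAB program: the entropic form is HC_α = (1 − Σ p_i^α) / (α − 1), each threshold is reported as the first gray level of the upper class so that comparisons of the form "value >= threshold" apply directly, and the bitmap rule is 1 when t_2 ≤ I < t_1 or I ≥ t_3. The names `split_level` and `multi_threshold_bitmap` are hypothetical.

```python
def hc_entropy(p, alpha):
    """Havrda-Charvat entropy of a probability list p (alpha > 0, alpha != 1)."""
    return (1.0 - sum(pi ** alpha for pi in p)) / (alpha - 1.0)


def split_level(hist, alpha, offset=0):
    """Return the first gray level of the upper class for the best split of hist."""
    best_t, best_value = None, float("-inf")
    for t in range(len(hist) - 1):
        low, high = hist[:t + 1], hist[t + 1:]
        n_low, n_high = sum(low), sum(high)
        if n_low == 0 or n_high == 0:
            continue                                  # one class would be empty
        hc_a = hc_entropy([c / n_low for c in low], alpha)
        hc_b = hc_entropy([c / n_high for c in high], alpha)
        value = hc_a + hc_b + (1.0 - alpha) * hc_a * hc_b
        if value > best_value:
            best_t, best_value = t, value
    if best_t is None:
        raise ValueError("histogram has fewer than two populated gray levels")
    return offset + best_t + 1


def multi_threshold_bitmap(img, alpha, levels=256):
    """Compute (t1, t2, t3) from histogram parts and build the merged bitmap."""
    hist = [0] * levels
    for row in img:
        for value in row:
            hist[value] += 1
    t1 = split_level(hist, alpha)                     # global threshold
    t2 = split_level(hist[:t1], alpha)                # local threshold, lower part
    t3 = split_level(hist[t1:], alpha, offset=t1)     # local threshold, upper part
    bitmap = [[1 if (t2 <= v < t1) or v >= t3 else 0 for v in row] for row in img]
    return (t1, t2, t3), bitmap


# Toy image with four well-separated gray levels on a 16-level scale.
img = [[1, 5, 9, 13], [13, 9, 5, 1]]
(t1, t2, t3), bitmap = multi_threshold_bitmap(img, alpha=0.5, levels=16)
```

Working on the two histogram parts keeps each threshold search independent of the image size, since only the initial histogram pass touches every pixel. The resulting bitmap would then be passed to the edge detection step.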
To minimize the execution time, we work with histogram vectors rather than with the image itself.
8- Call the procedure HCEdgeDetection(f) to find the edge image g.
9- Display the image g, imshow(g);
The above procedures can be combined into a single MATLAB program.

Results and Discussion
In order to test the proposed method and compare it with other edge detectors, common gray level test images with different resolutions and sizes are processed by the Canny, LoG, Roberts, Prewitt, and Sobel detectors and by the proposed method. The performance of the proposed scheme is evaluated through simulation results obtained using MATLAB. No pre-processing was done on the tested images prior to the application of the algorithm (Figure 9).
The algorithm has two main phases: a global and local enhancement phase for the threshold values, and a detection phase. We present the results of the implementation on these images separately. Here, in addition to the original gray level function f(x, y), we have used a function g(x, y) that is the average gray level value in a 3 × 3 neighborhood around the pixel (x, y). We use MATLAB to calculate the average run time of each method for different image sizes by repeating each run 10 times for each type of image. Figure 10 charts the test images and the average run time for the classical methods and the proposed scheme. It has been observed that the proposed edge detector works effectively for different grayscale digital images when its run time is compared with that of the classical methods. Some selected edge detection results for these test images using the classical methods and the proposed scheme are shown in Figure 11 and Table 2.

Conclusion
The proposed method is also easy to implement. The significance of this study lies in decreasing the computation time while producing edge detection of suitable quality. It is already pointed out in the introduction

Figure 5. The pixel "a" and the possible directions.

Figure 6. The range of edge direction in the five regions.

Figure 9. Samples of test images.

Figure 10. Comparison of run time between some classical methods and the proposed method on the same datasets.
This paper presents a new algorithm based on Havrda & Charvat's entropy for edge detection, using a split and merge technique on the histogram of a grayscale image. The objective is to find the best edge representation and minimize the computation time. A set of experiments in the domain of edge detection is presented on a sample of test images (see Figure 9). The edge detection performance is compared with that of classic methods such as Canny, LoG, and Sobel. Analysis shows that the proposed method performs better than those methods in execution time.

Figure 11. Edge detections of test images using the LoG method, Roberts method, Sobel method and proposed method, respectively.

Table 1. p_c and S of the central pixel under the window.
The two histogram vectors cover the gray levels 0, 1, ..., t_1 and t_1 + 1, ..., 255 of the object and background parts, respectively, and are used in place of the matrices of those parts. Divide H into two parts, H_Low and H_High, using t_1.
5- Call Havrda_Charvat_T(H_Low, α) again to find the optimal threshold value t_2;
6- Call Havrda_Charvat_T(H_High, α) again to find the optimal threshold value t_3;
7- Now we have three threshold values, t_2 < t_1 < t_3. Reconstruct the bitmap image f such that: if (I_{x,y} < t_1 and I_{x,y} >= t_2) or (I_{x,y} >= t_3) then f_{x,y} = 1; end;