Fully automatic identification and discrimination of sperm’s parts in microscopic images of stained human semen smear
1. INTRODUCTION
Infertility is a common clinical problem that causes considerable morbidity, including stress, depression and sexual dysfunction, in affected couples [1]. The main cause of infertility is an anomaly of the reproductive system. A high percentage of these problems originate in the male, and finding ways to resolve them will help physicians treat couples better and faster. To assess infertility, physical characteristics of the seminal plasma (such as smell, viscosity, pH and aspect) and spermatozoon parameters (such as concentration, motility and morphology) are analyzed [2]. Visual assessment of sperm by experts and CASA (Computer Aided Sperm Analysis) systems are the classical ways to determine the potential fertility of men. Manual methods are subjective and have led to widely varying results because of factors such as different staining procedures, the experience of technicians and human error. Manual procedures are therefore inexact, subjective, not repeatable and difficult to teach [3]. These disadvantages make it difficult to interpret the data accurately and highlight the need for objective, precise and repeatable techniques to study sperm. Given the complexity of sperm quality estimation, computerized techniques are essential tools. The majority of these computer methods were developed to analyze human sperm morphology and were afterwards adapted to other species [4]. The development of new methodologies is an ongoing research activity [5,6]. This research has enriched the available knowledge on sperm cells [7]; furthermore, digital image analysis has made it possible to classify subpopulations [8] and to describe shape abnormalities [6]. Most of these approaches use CASA systems [9,10] that deploy image processing techniques, or propose new description and classification methods [11-14].
Along these lines, Sánchez et al. [4,15] proposed a technique to compute the fraction of boar spermatozoid heads that present an intracellular density distribution pattern hypothesized as normal by veterinary experts. They extracted a model distribution from a training set of heads assumed to be normal by veterinary experts, defined a measure of deviation from the model intensity distribution, and computed this deviation for each head image (normal and non-normal). Finally, they chose an optimal value of a decision criterion for single-cell classification. As a preprocessing step for segmentation of the sperm head, they applied morphological closing, filled holes in the contours of the heads and removed the spermatozoid tails. In the subsequent segmentation stage, the heads were separated from the background using a threshold found with Otsu’s method [16]. Bieh et al. [17] applied Learning Vector Quantization (LVQ) to automated boar semen quality assessment. The classification of single boar sperm heads into healthy (normal) and non-normal ones was based on grey-scale microscopic images only. They used the segmentation method proposed by Sánchez et al. [15] for the sperm head. Alegre et al. [18], also using LVQ, suggested an automatic method to classify single sperm cells as acrosome-intact (class 1) or acrosome-damaged (class 2) in images from an optical phase-contrast microscope. As a preprocessing step, sperm head images were cropped manually from the boar semen sample images. In each sperm head image, the head was segmented automatically by binarization with Otsu’s method [16] followed by several morphological operations (dilations and erosions). Nowshiravan et al. [19] introduced a multi-step algorithm for sperm segmentation in microscopic images.
First, the operator clicked on one chosen sperm, and their software defined a square larger than the sperm; this square contained both the tail object and the separation threshold. From this they obtained a histogram image and imposed it on the threshold image, so that in the end all objects were obtained with the same light surface. They used image enhancement methods to improve their pictures; these steps removed the sperm tail and the previously unclear middle part, but showed the head more clearly than before. Nafisi et al. [20] proposed a segmentation algorithm based on a threshold level for finding sperms in low-contrast images. First, an image enhancement algorithm was applied to remove extra particles from the image. Then, the foreground particles (including sperms and round cells) were segmented from the background. Finally, based on certain features and criteria, sperms were separated from other cells. Park et al. [21] proposed a method based on the Hough transform for the quantitative estimation of the morphological characteristics of the sperm. Images of the sperms were digitized using an optical microscope, a CCD camera and a frame grabber. For each sperm in the image, the region of interest for segmentation of the sperm head was selected using the density difference between the sperm head and the background. The boundary of the sperm head was approximated with an ellipse, which was then used to estimate the morphological characteristics of the sperm. Carrillo et al. [22,23] introduced an approach called nth-fusion for segmentation of the sperm’s Acrosome, Nucleus and Mid-piece in a computer-aided tool for the objective analysis of human sperm morphology, commonly known as an Automated Sperm Morphology Analyzer (ASMA). After enclosing individual sperms (head and Mid-piece) in bounding boxes, they applied the nth-fusion method, which is based on nth-level thresholding of an image followed by intersection with n special masks. To obtain the desired segmentation results, an a priori morphological model of the objects, based on a feature-level information fusion technique, was used. Abbiramy et al. [24] presented a method for segmenting objects in microscopic images into their constituent parts based on morphological operators and edge detection.
In this paper, we propose a fully automatic method for identification and discrimination of the sperm’s Acrosome, Nucleus, Mid-piece and tail that requires neither training nor an atlas. The proposed method includes two major modules:
• Segmentation of sperm’s Acrosome, Nucleus and Midpiece;
• Localized identification and discrimination of sperm’s tail.
First, an improved hybrid method [25] is used to remove noise from the sperm image (the R component of the RGB color image). Then, a simple threshold is applied to build a primary mask containing the sperm’s Acrosome, Nucleus and Mid-piece, as well as small objects in the seminal plasma. The small objects are eliminated and, to build the final localized mask, the minimum-area bounding box of each individual region is computed with the Rotating Calipers method [26,27]. Using the bounding boxes increases both the detection rate and the speed. Pixels inside the bounding boxes are taken as samples, and their intensities are modeled with a Gaussian mixture model consisting of three kernels: Background, Nucleus, and a joint class of Acrosome and Mid-piece. This step starts with a single kernel and uses an entropy-based EM algorithm to estimate the three kernels automatically, without needing initial values for parameter estimation. Once the model has been estimated, the pixels can be classified if the a priori probabilities of the classes are known. The next step is therefore to obtain these a priori probabilities without training or an atlas: an MRF model and the EM algorithm are applied to update the a priori probabilities and the means and variances of each class. Finally, the samples in the bounding boxes are classified using Bayesian classification [28,29].
After localized segmentation of the sperm’s Acrosome, Nucleus and Mid-piece, the pixel at the distal point of the Mid-piece is taken as an initial point. The proposed method uses a structural similarity index [30] and Rényi entropy [31] in an iterative scheme to estimate the tail accurately as a set of points placed along it [32]. These estimated points can be used to analyze characteristics of the tail such as length and shape. The following sections explain the details of the procedure, including segmentation and identification of the sperm’s Acrosome, Nucleus, Mid-piece and tail. Early and encouraging experiments with these methods were presented in [32].
2. MATERIALS AND METHODS
2.1. Image Acquisition Technique
Sample images were acquired from modified Papanicolaou stained sperm smears. Fresh sperm samples were incubated for 30 to 60 minutes at 37 °C. The smear was prepared after complete liquefaction, and the slides were air-dried before staining with the modified Papanicolaou method. The images were captured with a 560 TV-line CCD camera mounted on the third eyepiece of a trinocular direct microscope (Proway BK5000) at a total magnification of 1000× using Plan Achromatic Infinity objective lenses, with a resolution of 576 × 764 pixels in RGB color space. From each slide, 10 to 25 images of different fields were captured. In total, 100 slides were analyzed (each slide consists of 1 to 5 sperms).
2.2. Preprocessing
To create a primary mask containing the sperm’s Acrosome, Nucleus and Mid-piece, the Red component of the RGB color image is used. The Red component contains most of the information associated with the darkest color, which dominates the head. The images first had to be scaled: the range between zero intensity and the maximum intensity, $M$, in the original 12-bit data ($I$) was mapped to a new intensity ($I_s$) between 0 and 255 (8-bit), obtained by $I_s = (I/M) \times 255$.
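The scaling step can be illustrated with a short NumPy sketch. The function names are hypothetical, and taking $M$ from the image itself when no value is supplied is an assumption of this example rather than something stated in the text.

```python
import numpy as np

def red_channel(rgb):
    """Return the R component of an RGB image stored as (rows, cols, 3)."""
    return np.asarray(rgb)[..., 0]

def scale_to_8bit(image, max_intensity=None):
    """Linearly map intensities from [0, M] to [0, 255]: Is = I / M * 255."""
    image = np.asarray(image, dtype=np.float64)
    # Assumption: if M is not given, use the largest value present in the image.
    m = float(image.max()) if max_intensity is None else float(max_intensity)
    if m <= 0:
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = image / m * 255.0
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
```

For 12-bit data one would typically pass `max_intensity=4095`, so that the same mapping is used for every image.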
2.2.1. Remove Noise from R Component
To remove noise, an improved hybrid method [25] is applied to the sperm image (the R component of the RGB color image). This method consists of two stages: the first is a fourth order partial differential equation (PDE), and the second is a relaxed median filter that processes the output of the fourth order PDE. The model therefore combines the benefits of the nonlinear fourth order PDE and the relaxed median filter. A relaxed median filter preserves more image detail than the standard median filter, so the method preserves fine details, sharp corners, thin lines and curved structures to a large extent. The L2-curvature gradient flow method of You et al. [33] is used in this model:
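$$\frac{\partial u}{\partial t} = -\nabla^2\left[c\left(\left|\nabla^2 u\right|\right)\nabla^2 u\right] \qquad (1)$$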
where $\nabla^2 u$ is the Laplacian of the image $u$. Since the Laplacian of an image at a pixel is zero if the image is planar in its neighborhood, the PDE attempts to remove noise and preserve edges by approximating an observed image with a piecewise planar image. The desirable diffusion coefficient $c(\cdot)$ should be such that Eq.1 diffuses more in smooth areas and less around intensity transitions, so that small variations in image intensity such as noise and unwanted texture are smoothed and edges are preserved. Another objective for the selection of $c(\cdot)$ is to incur backward diffusion around intensity transitions so that edges are sharpened, and to assure forward diffusion in smooth areas for noise removal. The Perona-Malik diffusivity function [34] is used in the implementation as below:
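$$c(s) = \frac{1}{1 + (s/k)^2} \qquad (2)$$
where $k$ is a positive constant that sets the edge threshold.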
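The discrete form of the non-linear fourth order PDE described in Eq.1 is as follows:
$$u^{n+1}_{i,j} = u^{n}_{i,j} - \Delta t\, \nabla^2 g^{n}_{i,j} \qquad (3)$$
where
$$\nabla^2 g^{n}_{i,j} = \frac{g^{n}_{i+1,j} + g^{n}_{i-1,j} + g^{n}_{i,j+1} + g^{n}_{i,j-1} - 4 g^{n}_{i,j}}{h^2} \qquad (4)$$
$$g^{n}_{i,j} = c\left(\left|\nabla^2 u^{n}_{i,j}\right|\right) \nabla^2 u^{n}_{i,j} \qquad (5)$$
$$\nabla^2 u^{n}_{i,j} = \frac{u^{n}_{i+1,j} + u^{n}_{i-1,j} + u^{n}_{i,j+1} + u^{n}_{i,j-1} - 4 u^{n}_{i,j}}{h^2} \qquad (6)$$
$\Delta t$ is the time step size and $h$ is the space grid size. The relaxed median filter [35,36] is used in combination with Eq.1 to remove large spike noise. The hybrid method proposed by Rajan et al. [25] is defined as follows: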
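$$u^{n+1}_{i,j} = RM_{\alpha\omega}\left(u^{n}_{i,j} - \Delta t\, \nabla^2 g^{n}_{i,j}\right) \qquad (7)$$
where RM is the relaxed median filter with lower bound $\alpha$ and upper bound $\omega$. If $Y_i$ is the output of a relaxed median filter, then $Y_i$ can be written as
$$Y_i = RM_{\alpha\omega}\{W_i\} = \begin{cases} X_i, & X_i \in \left[[W_i]_{(\alpha)},\, [W_i]_{(\omega)}\right] \\ [W_i]_{(m)}, & \text{otherwise} \end{cases} \qquad (8)$$
where $[W_i]_{(m)}$ is the median value of the $2N+1$ samples inside the window and $[W_i]_{(r)}$ denotes their $r$-th order statistic. The sliding window is defined as
$$W_i = \{X_{i+r} : r \in W\} \qquad (9)$$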
to be the window located at position $i$. The lower and upper bounds for the relaxed median used in the experiments are 3 and 5, respectively. Then, a simple threshold is applied to build a primary mask that contains the sperm’s Acrosome, Nucleus and Mid-piece, as well as small objects in the seminal plasma. The threshold value, thr, is calculated from the mean and standard deviation of the noise-removed image. The primary mask is defined as
(10)
The small objects are then eliminated, and this mask, containing the sperm’s Acrosome, Nucleus and Mid-piece, is used to build the final mask in the next step. Note that thr is not accurate enough to detect the Acrosome, Nucleus and Mid-piece areas in all sperm images (especially around the distal points of the Mid-piece). The thresholded image is therefore used only as a primary mask from which the final mask, containing all candidate pixels that may belong to the sperm’s head and Mid-piece, is built.
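The two-stage denoising described in this subsection can be sketched as follows: one explicit update of the fourth order PDE with the Perona-Malik diffusivity, followed by a relaxed median filter. This is an illustrative sketch, not the implementation of [25]. The parameter values `dt`, `k` and `n_iter`, the 3 × 3 window and the replicated borders are assumptions of this example; only the bounds 3 and 5 come from the text.

```python
import numpy as np

def laplacian(u, h=1.0):
    """Five-point discrete Laplacian with replicated borders."""
    p = np.pad(u, 1, mode="edge")
    return (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4.0 * u) / h**2

def pde_step(u, dt=0.1, k=1.0, h=1.0):
    """One explicit step: u <- u - dt * lap(c(|lap u|) * lap u)."""
    lap_u = laplacian(u, h)
    c = 1.0 / (1.0 + (np.abs(lap_u) / k) ** 2)  # Perona-Malik diffusivity
    return u - dt * laplacian(c * lap_u, h)

def relaxed_median(u, lower=3, upper=5, size=3):
    """Keep a pixel if it lies between the lower-th and upper-th order
    statistics of its window; otherwise replace it with the window median."""
    u = np.asarray(u, dtype=np.float64)
    r = size // 2
    p = np.pad(u, r, mode="edge")
    rows, cols = u.shape
    # Collect every window sample along a new last axis, then sort it.
    windows = np.stack(
        [p[i:i + rows, j:j + cols] for i in range(size) for j in range(size)],
        axis=-1,
    )
    windows.sort(axis=-1)
    low = windows[..., lower - 1]
    high = windows[..., upper - 1]
    median = windows[..., (size * size) // 2]
    keep = (u >= low) & (u <= high)
    return np.where(keep, u, median)

def hybrid_denoise(image, n_iter=10, dt=0.1, k=1.0):
    """Alternate a PDE step and a relaxed median pass for n_iter iterations."""
    u = np.asarray(image, dtype=np.float64)
    for _ in range(n_iter):
        u = relaxed_median(pde_step(u, dt=dt, k=k))
    return u
```

A small time step is used because explicit schemes for fourth order PDEs become unstable when `dt` is too large relative to `h`.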
2.2.2. Finding the Best-Fitted Rectangle for Each Region
To build the final mask, the minimum-area bounding box of each individual region is computed with the Rotating Calipers method [26,27], which computes the minimum-area enclosing rectangle in linear time. To apply the method, the two-dimensional convex hull of all visible points of each region is first computed using the monotone chain algorithm [37]. This algorithm is linear in the number of input points, O(n), assuming that the input points are sorted by increasing x and increasing y coordinates. The minimum rectangle enclosing a convex polygon P has at least one side collinear with an edge of P [26]. Using this property, a brute-force approach would construct an enclosing rectangle for each edge of P; this has a complexity of O(n^2), since the minima and maxima have to be found for each edge separately. The rotating calipers algorithm instead rotates two sets of parallel lines (calipers) around the polygon and incrementally updates the extreme values, thus requiring only linear time to find the optimal bounding rectangle. Figure 1 [38] illustrates one step of this algorithm: the support lines are rotated (clockwise) until a line coincides with an edge of P. If the area of the new bounding rectangle is less than that of the stored minimum-area rectangle, this bounding rectangle becomes the new minimum. The procedure is repeated until the accumulated rotation angle exceeds 90 degrees.
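As an illustration of the geometry, the sketch below builds the convex hull with the monotone chain algorithm and then finds the minimum-area rectangle with the simpler brute-force variant mentioned above, testing one candidate rectangle per hull edge. It is therefore quadratic in the number of hull vertices and is not the rotating calipers procedure itself, although it returns the same rectangle. All function names are hypothetical.

```python
import math

def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone chain convex hull, returned in counter-clockwise order."""
    pts = sorted(set((float(x), float(y)) for x, y in points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def min_area_rect(points):
    """Return (area, corners) of the smallest enclosing rectangle.

    Relies on the property that the optimal rectangle has one side
    collinear with a hull edge, so each edge yields one candidate.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    n = len(hull)
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length  # edge direction
        vx, vy = -uy, ux                                 # edge normal
        s = [x * ux + y * uy for x, y in hull]           # extent along the edge
        t = [x * vx + y * vy for x, y in hull]           # extent across the edge
        s_min, s_max, t_min, t_max = min(s), max(s), min(t), max(t)
        area = (s_max - s_min) * (t_max - t_min)
        if best is None or area < best[0]:
            corners = [
                (a * ux + b * vx, a * uy + b * vy)
                for a, b in ((s_min, t_min), (s_max, t_min),
                             (s_max, t_max), (s_min, t_max))
            ]
            best = (area, corners)
    return best
```

For the small regions produced by the primary mask the quadratic cost is usually negligible; the linear-time calipers update matters mainly for hulls with many vertices.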
Figure 2 shows the results obtained at each step when the proposed method is applied to a typical sperm image.
Figure 2. Result of applying an improved hybrid method and Rotating Calipers algorithm to a typical sperm image: (a) The original R component of RGB color image; (b) Noise removed image; (c) Selected objects after applying threshold; (d) Primary mask without small objects; (e) Final mask which is created using Rotating Calipers method; (f) Sperm image overlaid on its bounding boxes.
The proposed method for creating a mask containing the sperm’s Acrosome, Nucleus and Mid-piece is summarized below:
• Selection of the R component of RGB color image.
• Removing noise from R component using an improved hybrid method.
• Building a primary mask containing sperm’s Acrosome, Nucleus, Mid-piece, and also small objects in seminal plasma, by applying a simple threshold to the noise removed image.
• Elimination of small objects.
• Finding the best-fitted rectangle for each region using rotating calipers.
2.3. Problem Formulation
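Suppose that $x$ is an n-dimensional random variable drawn from a Gaussian mixture model of $M > 1$ components. Then the probability density function of the Gaussian mixture model can be expressed as follows:
$$p(x) = \sum_{i=1}^{M} \pi_i\, p(x \mid i) \qquad (11)$$
where $\pi_i$ corresponds to the weight of each component, which satisfies $\pi_i \ge 0$ and $\sum_{i=1}^{M} \pi_i = 1$. For the Gaussian mixture model, each component density $p(x \mid i)$ is a Gaussian probability density given by
$$p(x \mid i) = \frac{1}{(2\pi)^{n/2} \left|\Sigma_i\right|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right) \qquad (12)$$
where $T$ denotes the transpose operation, $\mu_i$ is the mean vector and $\Sigma_i$ is the covariance matrix, which is assumed positive definite. Here we encapsulate these parameters into a parameter vector, writing the parameters of each component as $\theta_i = (\mu_i, \Sigma_i)$, to get $\Theta = (\pi_1, \ldots, \pi_M, \theta_1, \ldots, \theta_M)$. Eq.11 can be rewritten as
$$p(x \mid \Theta) = \sum_{i=1}^{M} \pi_i\, p(x \mid \theta_i) \qquad (13)$$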
If we knew the component from which x came, then it would be simple to determine the parameters. Similarly, if we knew the parameters, we could determine the component that would be most likely to have produced x. The difficulty is that we know neither.
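The standard EM algorithm resolves this circular dependency by alternating between the two estimates. The sketch below fits a one-dimensional Gaussian mixture to pixel intensities with plain EM. It is not the entropy-based variant used in this paper: it needs a fixed number of components and caller-supplied initial values, both of which the entropy-based algorithm avoids. The function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Univariate normal density, evaluated with broadcasting."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_gmm_1d(x, means, variances, weights, n_iter=100):
    """Plain EM for a 1-D Gaussian mixture with M components.

    means, variances and weights are initial guesses of length M.
    Returns the fitted (weights, means, variances).
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    means = np.array(means, dtype=np.float64)
    variances = np.array(variances, dtype=np.float64)
    weights = np.array(weights, dtype=np.float64)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        joint = weights * gaussian_pdf(x[:, None], means, variances)  # (N, M)
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0) + 1e-300
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)  # guard against collapse
    return weights, means, variances
```

In the setting of this paper, `x` would hold the intensities of the pixels inside the bounding boxes and M would be 3.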
2.4. Bayesian Classification
Bayesian classification is a probabilistic pattern recognition technique based on the principle of Bayes decision theory [39], given in Eq.14 below
$$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{p(x)} \qquad (14)$$
where $x$ is a given feature vector, $\omega_j$ denotes a class, or state of nature, $P(\omega_j)$ is the prior probability of class $\omega_j$, $p(x)$ is the prior probability of the feature vector $x$, $P(\omega_j \mid x)$ is the a posteriori probability that a feature vector $x$ should be classified as belonging to class $\omega_j$, and $p(x \mid \omega_j)$ is the conditional probability that a feature vector $x$ occurs in a given class $\omega_j$. For the approach here, the feature $x$ consists of one component, the pixel intensity. The quantity $p(x)$ is known as the evidence and serves only as a scale factor, so that the quantity in Eq.14 is indeed a true probability, with values between zero and one. So, the maximum a posteriori (MAP) estimate of Eq.14 is used as below
$$P(\omega_j \mid x) \propto p(x \mid \omega_j)\, P(\omega_j) \qquad (15)$$
According to Bayesian theory [40], the feature vector $x$ is assigned to the class $\omega_i$ whose a posteriori probability given $x$ is the largest among the classes:
$$x \in \omega_i \quad \text{if} \quad P(\omega_i \mid x) = \max_{j} P(\omega_j \mid x) \qquad (16)$$
The Bayes decision rule is optimal in the sense that it minimizes the probability of error. Such an ideal Bayesian solution can be used only if the distributions $p(x \mid \omega_j)$ and the a priori probabilities $P(\omega_j)$ are known. In the context of classifying the sperm’s parts, the probability models are not known and must therefore be approximated. The performance of the Bayesian classifier is directly related to how well these distributions can be modeled.
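The decision rule can be illustrated with a minimal sketch that assumes univariate Gaussian class-conditional densities and given priors. The function name and the numeric values in the usage note are hypothetical and are not taken from the experiments. The evidence is omitted because it does not change which class attains the maximum.

```python
import numpy as np

def map_classify(x, means, variances, priors):
    """Assign each intensity in x to the class with the largest posterior.

    Works in the log domain: log p(x | w_j) + log P(w_j).
    Returns an array of class indices with the same shape as x.
    """
    x = np.asarray(x, dtype=np.float64)[..., None]  # add a class axis
    means = np.asarray(means, dtype=np.float64)
    variances = np.asarray(variances, dtype=np.float64)
    priors = np.asarray(priors, dtype=np.float64)
    log_likelihood = (-0.5 * np.log(2.0 * np.pi * variances)
                      - 0.5 * (x - means) ** 2 / variances)
    log_posterior = log_likelihood + np.log(priors)
    return np.argmax(log_posterior, axis=-1)
```

For example, with class means 30, 120 and 200, equal variances and equal priors, an intensity of 76 falls in the second class; raising the prior of the first class to 0.9 moves the same intensity into the first class, which shows how the priors shift the decision boundaries.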
2.5. Segmentation of Sperm’s Acrosome, Nucleus and Mid-Piece
The sperm’s Acrosome, Nucleus and Mid-piece are segmented using a fully automatic method based on an entropy-based EM algorithm and a Markov random field model [29]. This method estimates a Gaussian mixture model with three kernels: Background, Nucleus, and a joint class of Acrosome and Mid-piece. To estimate this model, an automatic entropy-based EM algorithm [28] was used to find the best-fitting model. Then, a Markov random field (MRF) model and the EM algorithm were used to obtain and update the class-conditional probability density function and the a priori probability of each class. After estimation of the model parameters and a priori probabilities, the samples in the bounding boxes were classified using Bayesian classification.
Based on the explanations above, the block diagram of our method for segmentation of the sperm’s Acrosome, Nucleus and Mid-piece is shown in Figure 3, and the method is summarized below.
Figure 3. Block diagram of the proposed approach for fully automatic segmentation of sperm’s Acrosome, Nucleus and Midpiece.
1) Smoothing the R component of RGB color image using a Gaussian filter.
2) Selection of pixels inside the bounding boxes as input samples.
3) Estimation of input samples distribution using Entropy based EM algorithm and Markov Random Field Model [29] with three kernels as Background, Nucleus and a class of Acrosome and Mid-piece.
4) Bayesian classification of the input image (only pixels inside the bounding boxes) using the obtained Gaussian Mixture Model.
5) Postprocessing: Acrosome and Mid-piece are initially assigned to the same class (i.e., two separate areas in one class). These two areas are split into two separate classes (Acrosome and Mid-piece) using their positions with respect to the Nucleus and the bounding box corners. First, the distance between each corner of the bounding box and the center of the Nucleus is computed. The corner with the smallest distance is taken as the origin. The Acrosome is the region that lies closer to the origin; the other region is the Mid-piece.
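The postprocessing rule in step 5 can be sketched as below. Representing each of the two regions by its centroid is an assumption of this example, since the text does not say how the distance from a region to the origin is measured; the function name and labels are likewise hypothetical.

```python
import numpy as np

def label_acrosome_midpiece(box_corners, nucleus_center, region_a, region_b):
    """Decide which of two regions is the Acrosome and which the Mid-piece.

    box_corners: four (x, y) corners of the bounding box.
    nucleus_center, region_a, region_b: (x, y) centroids (assumed representation).
    Returns the labels for (region_a, region_b).
    """
    corners = np.asarray(box_corners, dtype=np.float64)
    nucleus = np.asarray(nucleus_center, dtype=np.float64)
    # Origin: the bounding box corner closest to the Nucleus center.
    origin = corners[np.argmin(np.linalg.norm(corners - nucleus, axis=1))]
    dist_a = np.linalg.norm(np.asarray(region_a, dtype=np.float64) - origin)
    dist_b = np.linalg.norm(np.asarray(region_b, dtype=np.float64) - origin)
    # The region closer to the origin is taken to be the Acrosome.
    if dist_a < dist_b:
        return ("acrosome", "midpiece")
    return ("midpiece", "acrosome")
```

The rule works because the Nucleus sits at the head end of the bounding box, so the corner nearest to it is also nearer to the Acrosome than to the Mid-piece.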
2.6. Simulation Results: Segmentation of Sperm’s Acrosome, Nucleus and Mid-Piece
Figure 4 shows the results of the proposed algorithm for a typical sperm image, including the estimated distribution obtained through the entropy-based EM algorithm and MRF model, overlaid on the distribution of the samples (inside the bounding boxes). Figure 5 shows the results of applying the proposed algorithm to other sperm images.