Automatic Classification for Various Images Collections Using Two Stages Clustering Method

In this paper, we propose an automatic classification method for various image collections using a two-stage clustering method based on global and local image features. First, we review the types of feature vectors that are suitable for representing the local and global properties of images, and the similarity measures that can represent the affinity between these images. Second, we consider a clustering method for image collections. We first build a coarser clustering by partitioning the images into several clusters using the flexible Mean shift algorithm and the K-means clustering algorithm. We then construct a dense clustering of the image collection by optimizing a Gaussian Dirichlet process mixture model, taking the coarser clustering as the initial clusters. Finally, we conduct comparative experiments between our method and existing methods on various image datasets. Our approach has significant advantages over existing techniques: besides integrating temporal and image content information, it can cluster photographs automatically without assuming the number of clusters or requiring a priori information about the initial clusters, and it generalizes better to different image collections.


Introduction
The general goal of image clustering is to classify different image objects or patterns in such a way that samples of the same cluster are more similar to one another than to samples belonging to different clusters. However, clustering is a difficult problem because of the many assumptions involved, the different contexts and the variety of input data [1]. In the last few decades, there has been growing interest in developing effective and fast methods for classifying input images into different clusters. These methods are mainly divided into two types of clustering algorithms: supervised and unsupervised methods. In supervised image clustering algorithms, the researchers incorporate a priori knowledge, such as the number of image clusters. Huang et al. [2] propose a hierarchical classification tree that is generated via supervised learning, using a training set of images with known class labels. The tree is then used to categorize new images entered into the database. Carson et al. [3] use a naive Bayes algorithm to learn image categories in a supervised learning scheme. The images are represented by a set of regions that are homogeneous in color and texture feature space, based on the "Blobworld" image representation. Yang et al. [4] propose a clustering algorithm, referred to as local discriminant models and global integration (LDMGI), which utilizes both manifold information and discriminant information for data clustering. They prove theoretically that K-means and DisKmeans are both special cases of LDMGI. They also show that LDMGI is a type of spectral clustering algorithm. Thus, they provide a new perspective for discovering and understanding the relationships between K-means (or DisKmeans) and other spectral clustering algorithms. Sleit et al. [5] propose a Content Based Image Retrieval (CBIR) scheme that extracts the color, texture and shape features of images and then groups similar images together using K-means clustering. They use the color histogram, Gabor filters and Fourier descriptors for the color, texture and shape features, respectively. The main restriction of supervised image clustering is that human intervention is required. Unsupervised methods, on the other hand, aim at providing the correct number of image clusters without any a priori information. Goldberger et al. [6] combine discrete and continuous image models with information-theoretic criteria for unsupervised hierarchical image-set clustering. The continuous image modeling is based on a mixture of Gaussian densities, and the unsupervised image-set clustering is conducted using the information bottleneck principle. Krinidis et al. [7] present an unsupervised image clustering approach based on the image histogram, which is processed by empirical mode decomposition (EMD). The Ensemble Empirical Mode Decomposition (EEMD), which provides noise resistance and assistance to data analysis, decomposes the image histogram into a number of Intrinsic Mode Functions (IMFs). The local maxima of the summation of the IMFs provide the desired number of image clusters, and a combination of them is used as a criterion for image clustering. In this paper, we present an unsupervised clustering method for large image datasets that uses two statistical clustering methods based on local and global invariant features. First, we consider various types of feature vectors that are suitable for representing the local and global properties of images, and similarity measures that can represent the affinity between these images. Next, we consider a clustering method for image collections. We first build a coarser clustering by partitioning the images into several clusters using the Mean shift and K-means clustering algorithms. We then construct a dense clustering of the image collection by optimizing a Gaussian Dirichlet process mixture model, taking the derived coarser clustering as the initial clusters. Finally, we conduct comparative experiments between our method and existing methods on various image datasets.

Global Image Feature
First, we consider the global properties of a color image. We use three kinds of feature information for clustering the given images: color, texture and shape features. First, we take the color histogram [8] as the color feature. It indicates the frequency of occurrence of every color in an image and can be defined as a mass function. Our work is based on HSV color histogram feature extraction. Second, we take the Discrete Wavelet Transform (DWT) [9] as the texture feature. Texture refers to visual patterns with properties of homogeneity that do not result from the presence of only a single color, such as clouds and water. Texture features typically consist of contrast, uniformity, coarseness and density. There are two basic classes of texture descriptors: statistical model-based and transform-based. The former explores the gray-level spatial dependence of textures and then extracts some statistical features as the texture representation. The latter is based on a transform such as the DWT. Third, we take the moment invariants of the image as the shape feature vector. Moment invariants have frequently been used as features for representing the shape of an object. They are computed from the information provided by both the shape boundary and its interior region.
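The color feature described above can be illustrated with a minimal sketch. It assumes a per-channel HSV histogram with a fixed number of bins per channel; the function name `hsv_histogram`, the pixel format and the default of 8 bins are illustrative choices, not details confirmed by the text.

```python
import colorsys

def hsv_histogram(pixels, bins=8):
    """Normalized per-channel HSV histogram of (r, g, b) pixels in 0..255.

    Hypothetical helper: the binning scheme is an assumption made for
    illustration, not the exact feature extraction used in the paper.
    """
    hist = [[0] * bins for _ in range(3)]  # one row each for H, S and V
    count = 0
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Values lie in [0, 1]; the value 1.0 is clamped into the last bin.
            index = min(int(value * bins), bins - 1)
            hist[channel][index] += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Concatenate the three channel histograms and divide by the pixel count,
    # so that each channel's part sums to 1 (a probability mass function).
    return [c / count for row in hist for c in row]

# A tiny made-up "image": two pure red pixels and one pure blue pixel.
feature = hsv_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 255)])
```

The result is a vector of length 3 × bins that can be compared between images with any vector similarity measure.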

Similarity Estimation
To verify how well the global feature vectors discriminate between images, we compute a similarity (or dissimilarity) measure based on the exponential of the cosine distance between two vectors. This measure is defined for each pair of feature vectors f_i and f_j. Figure 1 shows the similarity matrix between the color feature vectors for 10 groups of images, each consisting of 11 images with the same colors. In Figure 1, the blocks on the main diagonal represent the similarities between images with the same colors, whereas the off-diagonal areas represent the similarities between images with different colors. Pairs of images with the same colors are shown in pure black, and all other pairs are shown in white.
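As a rough illustration of such a measure, the sketch below assumes the form s_ij = exp(-d_cos(f_i, f_j)), where d_cos is one minus the cosine of the angle between the two vectors. This specific form and the function names are assumptions; the text only states that the measure is based on the exponential of the cosine distance.

```python
import math

def cosine_distance(f_i, f_j):
    """Cosine distance 1 - cos(angle) between two feature vectors."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_i * norm_j)

def similarity_matrix(features):
    """Pairwise similarities exp(-d_cos); the exact form is an assumption."""
    return [[math.exp(-cosine_distance(f_i, f_j)) for f_j in features]
            for f_i in features]

# Made-up feature vectors: the first two are parallel, the third is orthogonal.
S = similarity_matrix([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
```

Under this assumed form, vectors pointing in the same direction get a similarity of 1 and orthogonal vectors get exp(-1), so groups of similar images appear as uniform blocks along the main diagonal of the matrix.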

Local Image Feature
Here, we introduce an image representation that uses the bag of visual words model based on local features. We first discuss various detectors and descriptors that describe image characteristics that are locally invariant to image rotation, scale transformation and illumination changes. We then describe a local feature histogram that is built from a bag of visual words using numerous local descriptors. A salient region in an image is a connected part of the image that shows a significant and interesting image property. It is usually determined by applying a region of interest detector to the image. If a region detector returns only an exact position within the image, we also refer to it as an interest point detector. The most important information that an ideal region detector gives us is the location of the features, but other characteristics, such as the shape (scale) and orientation of a region of interest, must also be delivered. Next, we discuss interest feature descriptors and their characteristics. A descriptor is a process that takes information about the features and the image and produces descriptive information, that is, the features' description, usually presented in the form of feature vectors. The descriptions are then used to match a feature to one in another image. A descriptor must satisfy two important properties: it must be discriminative and invariant.

Automatic Images Clustering
We automatically cluster image collections using a two-stage clustering method.

Figure 2. Block diagram of automatic images clustering
The first step builds an initial coarser clustering by considering the contents of color images using the Mean shift and K-means clustering algorithms. The second step constructs an accurate dense clustering by simultaneously considering the global and local features of color images using a Gaussian Dirichlet process mixture model. Figure 2 shows the block diagram of the two-stage clustering method using global and local features.

Dense Clustering
Here, we first introduce the statistical theory of the Gaussian Dirichlet process mixture model. We then construct an accurate dense clustering by simultaneously considering the global and local features of color images using this model. The Dirichlet process, denoted as DP(α, G_0), is a random measure on measures and is parameterized by the innovation parameter α and a base distribution G_0 [13]. One of the most important applications of the Dirichlet process is as a nonparametric prior distribution for a mixture model. We want to model the data by means of a nonparametric Bayesian formulation, the Gaussian Dirichlet process mixture model. Since the number of mixture components is unknown, we have to consider a mixture model with countably infinitely many components. Therefore, we use a Dirichlet process mixture model as the prior distribution over the number of components generating the data, and we assume that the observations follow a multivariate Gaussian distribution. We take the coarser clustering model of the given image sets, obtained by Mean shift clustering or the K-means algorithm, as the initial clustering model for the Gaussian Dirichlet process mixture model. Applying the variational Bayesian inference principle to the Gaussian Dirichlet process mixture yields approximate likelihoods and posterior distributions q(·) for all model parameters and latent clustering variables. First, the posterior distributions over the DP parameters are obtained; their expectations involve the digamma function ψ(·). Second, the posteriors over the likelihood parameters are obtained; they are expressed in terms of statistics of the observations y, including the matrix S. Finally, the posteriors over the latent clustering variables that generate the clustering model are obtained. As a last step, after all posterior distributions and likelihood parameters have been updated at each iteration of the variational Bayesian inference algorithm, the estimates Z of the latent clustering variables must also be updated. We then obtain the cluster membership of each image by maximizing the posterior distribution over k. Assigning each image to the cluster given by this membership yields the final clustering result for the given image dataset.
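For orientation, the standard stick-breaking form of a Dirichlet process Gaussian mixture and the usual truncated variational posterior over the stick lengths can be written as follows. This is the textbook formulation, given here as a reference sketch; the symbols v_k, π_k, γ_{k,1}, γ_{k,2} and φ_{nk} = q(z_n = k) are notation assumed for this illustration and may differ from the notation of the original formulas.

```latex
\begin{align}
G &\sim \mathrm{DP}(\alpha, G_0), \\
v_k &\sim \mathrm{Beta}(1, \alpha), \qquad
\pi_k = v_k \prod_{j<k} (1 - v_j), \\
z_n &\sim \mathrm{Categorical}(\pi), \qquad
(\mu_k, \Sigma_k) \sim G_0, \qquad
y_n \mid z_n = k \sim \mathcal{N}(\mu_k, \Sigma_k), \\
q(v_k) &= \mathrm{Beta}(\gamma_{k,1}, \gamma_{k,2}), \qquad
\gamma_{k,1} = 1 + \sum_{n} \phi_{nk}, \qquad
\gamma_{k,2} = \alpha + \sum_{n} \sum_{j>k} \phi_{nj}, \\
\mathbb{E}_q[\log v_k] &= \psi(\gamma_{k,1}) - \psi(\gamma_{k,1} + \gamma_{k,2}), \qquad
\mathbb{E}_q[\log (1 - v_k)] = \psi(\gamma_{k,2}) - \psi(\gamma_{k,1} + \gamma_{k,2}), \\
\hat{z}_n &= \arg\max_{k} \phi_{nk}.
\end{align}
```

The last line corresponds to the hard assignment described above: each image is placed in the component with the largest posterior membership probability. Initializing φ_{nk} from the coarser clustering amounts to setting φ_{nk} = 1 for the cluster k that the first stage assigned to image n.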

Conclusion
In this paper, we presented methods for clustering images using the Mean shift algorithm and the Gaussian Dirichlet process mixture model. Our approach has significant advantages over existing techniques: besides integrating temporal and image content information, it can cluster photographs automatically without assuming the number of clusters or requiring a priori information about the initial clusters, and it generalizes better to different image collections.

Figure 1. Similarity matrix between global feature vectors extracted for 110 images.

For image classification, we use the Bag-of-Visual-Words approach, in which images are represented as histograms of visual words. The visual words denote local features extracted from the images, and the vocabulary is learnt task-specifically from a training database [10-11]. The histogram features of visual words are constructed from the images as follows. First, we extract local feature descriptors from image patches around features found by detectors that are invariant to scale or rotation changes, and apply a PCA transformation to these descriptors to reduce their dimensionality. Second, to create an efficient codebook of visual words, we partition the local descriptor space into several informative regions using a clustering method such as K-means clustering or a GMM clustering model. We create a bag of visual words by choosing the center of each cluster as a visual word. Third, the bag of visual words is used as the codebook to build an image histogram of local features.
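The third step can be sketched as follows, assuming the codebook has already been built and every local descriptor is assigned to its nearest visual word by Euclidean distance. The function names and the toy two-dimensional descriptors are hypothetical; real SIFT descriptors have 128 dimensions before any PCA reduction.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bovw_histogram(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one image.

    Hypothetical helper: each descriptor votes for its nearest codebook
    entry (hard assignment), which is one common choice among several.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)  # image without any local features
    return [c / total for c in counts]

# Toy example: two visual words and four descriptors from one image.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [1.0, 1.0]]
histogram = bovw_histogram(descriptors, codebook)
```

Normalizing by the number of descriptors makes histograms comparable between images that yield different numbers of local features.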

First, we briefly review the Mean shift and K-means clustering algorithms. We then build an initial coarser clustering by considering the contents of color images using these two algorithms. We begin with the traditional Mean shift clustering procedure [12]. This procedure is guaranteed to converge to a point where the gradient of the density function is zero. The key point of the Mean shift clustering procedure is how to choose the bandwidth size. We use a flexible bandwidth size obtained through multiple iterations of the implementation. Second, we briefly review the K-means clustering algorithm. The most common algorithm uses an iterative refinement technique and is deemed to have converged when the assignments no longer change. We construct the initial coarser clustering for the experimental image collections using the Mean shift clustering algorithm with the flexible bandwidth size and the K-means clustering algorithm, based on global and local feature vectors, respectively. The global feature consists of color, texture and shape features: the histogram of the RGB color space, the histogram of coefficients generated by a 2-step discrete wavelet transform, and invariant moments. The local feature is the histogram of visual word occurrences obtained by applying K-means to SIFT descriptors.
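Both procedures can be sketched in a few lines. The version below is a simplified illustration: it assumes a Gaussian kernel with a single fixed bandwidth (not the flexible bandwidth described above), and the function names, tolerances and mode-merging rule are hypothetical choices.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_shift_mode(start, points, bandwidth, tol=1e-6, max_iter=500):
    """Move a point uphill until the mean shift step is negligible."""
    x = list(start)
    for _ in range(max_iter):
        weights = [math.exp(-sq_dist(x, p) / (2.0 * bandwidth ** 2))
                   for p in points]
        total = sum(weights)
        shifted = [sum(w * p[d] for w, p in zip(weights, points)) / total
                   for d in range(len(x))]
        if sq_dist(shifted, x) < tol ** 2:
            return shifted  # step is ~0, i.e. the density gradient is ~0
        x = shifted
    return x

def mean_shift(points, bandwidth):
    """Label each point by the mode it converges to (fixed bandwidth)."""
    modes, labels = [], []
    for p in points:
        m = mean_shift_mode(p, points, bandwidth)
        for idx, existing in enumerate(modes):
            if sq_dist(m, existing) < (bandwidth / 2.0) ** 2:
                labels.append(idx)  # merge with a mode found earlier
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes

def k_means(points, centers, max_iter=100):
    """Lloyd's iteration; stops when the assignments no longer change."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda k: sq_dist(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(m[d] for m in members) / len(members)
                              for d in range(len(members[0]))]
    return labels, centers

# Toy feature vectors forming two well-separated groups.
points = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1],
          [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
labels_ms, modes = mean_shift(points, bandwidth=1.0)
labels_km, centers = k_means(points, modes)
```

In this toy run the number of clusters is not supplied in advance: Mean shift discovers the modes, and those modes seed K-means.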

Figure 4. Dense clustering results using Mean shift + VDPGMM and K-means + VDPGMM based on color histogram with 8 bins for each channel: (a) Mean shift + VDPGMM (8 bins), (b) K-means + VDPGMM (8 bins), (c) Mean shift + VDPGMM (4 bins), (d) K-means + VDPGMM

Real Image Data
We conducted experiments using Matlab on an image database collected from a public CBIR data set on the internet. Our image database consists of 10 different groups, namely Festival, Bus, Horse, Rose, Beach, Dinosaurs, Food, Scene, Elephant and The Parthenon temple, as shown in Figure 5. Each group contains 10 similar images. The spatial resolution of each image is 128 × 128 pixels. We first test our clustering algorithm using global features (color, texture and shape features) with the K-means and GDPM clustering methods. In the coarser clustering step, we build the initial clusters from the global features of the images using the K-means clustering method. In the dense clustering step, we then construct fine clusters using a Gaussian Dirichlet process mixture model, taking the K-means clustering results as the initial clustering model. Our experimental results show that the example image collection is clustered properly into 10 groups. The clustering result is shown in Figure 5. Second, we test our clustering algorithm using local features generated by SIFT, visual words and the local image histogram, again with the K-means and GDPM clustering methods. In the coarser clustering step, we build the initial clusters from the local features of the images using the K-means clustering method. In the dense clustering step, we then construct fine clusters using a Gaussian Dirichlet process mixture model, taking the K-means clustering results as the initial clustering model. Our experimental results show that the example image collection is again clustered properly into 10 groups.

Figure 5. Dense and coarser clustering results