Classification of Multi-User Chirp Modulation Signals Using Wavelet Higher-Order-Statistics Features and Artificial Intelligence Techniques

Higher-order statistical features have recently been shown to be very efficient in classifying wideband communications and radar signals with great accuracy. In addition, the denoising properties of the wavelet transform (WT) make it an efficient signal processing tool in noisy environments. A novel technique for the classification of multi-user chirp modulation signals is presented in this paper. A combination of the higher-order moments and cumulants of the wavelet coefficients, as well as the peaks of the bispectrum and their bi-frequencies, is proposed as effective features. Different types of artificial-intelligence-based classifiers and clustering techniques are used to identify the chirp signals of the different users. In particular, neural network (NN), maximum likelihood (ML), k-nearest neighbor (KNN) and support vector machine (SVM) classifiers, as well as fuzzy c-means (FCM) and fuzzy k-means (FKM) clustering techniques, are tested. The simulation results show that the proposed technique is able to efficiently classify the different chirp signals in additive white Gaussian noise (AWGN) channels with high accuracy. It is shown that the NN classifier outperforms the other classifiers. The simulations also show that classification based on features extracted from the wavelet transform is more accurate than classification using features extracted directly from the chirp signals, especially at low signal-to-noise ratios.


Introduction
Automatic signal classification plays an important role in various applications. For example, in military applications, it can be employed for electronic surveillance and monitoring. In civil applications, it can be used for spectrum management, network traffic administration, signal confirmation, cognitive radio, software radios, and intelligent modems [1]. Early research concentrated on analog signals [2] and has recently been extended to the digital signal types used in modern communication systems [3][4][5]. In this paper, we present an automatic digital signal type classifier for multi-user chirp signals in additive white Gaussian noise channels. Chirp modulation has been considered for many applications, such as beacons, aircraft ground data links via satellite repeaters, and low-rate data transmission in the high frequency (HF) band. It is commonly used in sonar and radar, but it has other applications; for example, it can be used in multi-user spread-spectrum and ultra-wideband (UWB) communications.
Higher-order statistical (HOS) features have recently been shown to be very efficient in classifying wideband communications, radar and biomedical signals with great accuracy [6][7][8][9]. For example, an automatic classifier of different digital modulation signals in additive white Gaussian noise channels was suggested using a combination of the higher-order moments and higher-order cumulants up to the eighth order as features, with a multilayer perceptron neural network (NN) in [3] and with a hierarchical support vector machine (SVM) based classifier in [4] and [5]. Bispectrum features were used to classify mental tasks from EEG signals in [6] and to classify heart rate signals in [7]. Arrhythmias were classified using K-means clustering in [8]. Emotion classification using fuzzy C-means (FCM) and fuzzy K-means (FKM) was introduced in [9]. A combination of fuzzy clustering and hierarchical clustering for symbol-based modulation classification was described in [10]. The FCM algorithm was suggested for texture-based segmentation in [11]. An M-ary shift keying modulation scheme identification algorithm using the wavelet transform and higher-order statistical moments was presented in [12], and automatic modulation recognition in wireless systems using cepstral analysis and neural networks, with features extracted from discrete transforms, was considered in [13].
A preliminary investigation of the classification of multi-user chirp modulation signals using higher-order moments and cumulants, with four artificial intelligence classifier types along with FCM and FKM clustering, was carried out by the authors in [14] and [15]. Bispectrum features were also considered by the authors in [16]. In this paper, we also consider using the wavelet transform (WT) for efficient feature extraction. The wavelet transform has a variable time-frequency resolution, which leads to locality in both the time and frequency domains [17]. The locality of the transform of a signal is important for pattern recognition in two ways. Firstly, different parts of the signal may convey different amounts of information. Secondly, when the signal is corrupted by noise that is local in the time and/or frequency domain, the noise affects only a few coefficients if the coefficients represent local information in the time and frequency domains. The wavelet transform is used to divide a given modulated signal into subbands of different scales so that each scale can be studied separately. The idea of the discrete wavelet transform (DWT) is to represent a signal as a series of approximations (low-pass versions) and details (high-pass versions) at different resolutions. The signal is low-pass filtered and downsampled to give an approximation signal of half its length, and high-pass filtered and downsampled to give a details signal of half its length. Both can be used to model the signal. The simplest type of wavelet is the Haar wavelet. Haar wavelets are related to a mathematical operation called the Haar transform in its discrete form. The Haar transform serves as a prototype for all other wavelet transforms.
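The approximation/details split described above can be sketched with the Haar wavelet, the simplest case. This is only a minimal illustration using NumPy; the function names `haar_dwt` and `haar_details` are hypothetical, and the simulations in this paper use the db2 wavelet, which has longer filters than the two-tap Haar filters shown here.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT of an even-length signal."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass branch, half the length
    detail = (even - odd) / np.sqrt(2.0)  # high-pass branch, half the length
    return approx, detail

def haar_details(x, levels):
    """Repeat the split on the approximation; collect details per level."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return approx, details

x = np.array([1.0, 1.0, 2.0, 2.0, 4.0, 0.0, 3.0, 5.0])
approx, detail = haar_dwt(x)
final_approx, all_details = haar_details(x, levels=2)
```

Each level halves the length, and because the transform is orthonormal the energy of the signal is split between the approximation and the details without loss.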
In general, automatic digital signal classification is divided into two main steps: feature extraction and classification. In this classifier, the input signals corrupted by additive white Gaussian noise (AWGN) are normalized to have zero mean and unit variance, and the normalized signals are passed to the feature extraction step. In this paper, features are extracted using three methods. The first is a selected combination of the higher-order moments and higher-order cumulants up to the eighth order, computed from the signal itself. The second is a selected combination of the higher-order moments and higher-order cumulants up to the eighth order, computed from the DWT of the signal. The third is the selected peaks of the bispectrum of the signal itself and their bi-frequencies. Different classification techniques then use these features to classify the input signals and determine the signal type: a maximum likelihood classifier, a k-nearest neighbor classifier, a support vector machine classifier and a neural network classifier, as well as FKM and FCM clustering.
This paper is organized as follows. Section 2 describes higher-order statistics, and Section 3 describes multi-user chirp modulation signals. Section 4 describes feature extraction and Section 5 describes classification techniques. Section 6 shows simulation results and, finally, Section 7 concludes the paper.

Higher Order Statistics
The auto-moment of a random variable may be defined as follows [18] and [19]:

$$M_{pq} = E\left[s^{\,p-q}\,(s^{*})^{q}\right]$$

where p is the moment order and s* is the complex conjugate of s. For a discrete signal s, the expectation is estimated by averaging over the available samples. Assuming a zero-mean discrete baseband signal sequence of the form s = a + jb, the moments of each order follow by expanding this definition. The p-th order cumulant is defined as:

$$\mathrm{Cum}[s_1, \ldots, s_p] = \sum_{\forall v} (-1)^{q-1}\,(q-1)!\; E\Big[\prod_{j \in v_1} s_j\Big] \cdots E\Big[\prod_{j \in v_q} s_j\Big]$$

where the summation is performed over all partitions v = (v_1, ..., v_q) of the index set (1, 2, ..., p), and q here denotes the number of blocks in the partition.

Higher-order statistics are able to suppress additive colored Gaussian noise of unknown power spectrum, to identify non-minimum-phase systems or reconstruct non-minimum-phase signals, and to extract information due to deviations from Gaussianity. A non-Gaussian signal can be decomposed into its higher-order cumulant functions, each of which may contain different information about the signal. This can be very useful in signal classification problems, where distinct classification features can be extracted from the higher-order spectrum domain.
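As a small worked example of how moments and cumulants relate, the sketch below estimates sample moments and converts them to fourth-order cumulants. It assumes the conjugate-power moment convention M_pq = E[s^(p-q) (s*)^q] and the standard moment-to-cumulant relations for a zero-mean sequence; the function names are hypothetical, and the sixth- and eighth-order expressions of [18] are not reproduced here.

```python
import numpy as np

def moment(s, p, q):
    """Sample estimate of M_pq = E[s^(p-q) * conj(s)^q]."""
    s = np.asarray(s, dtype=complex)
    return np.mean(s ** (p - q) * np.conj(s) ** q)

def fourth_order_cumulants(s):
    """C40, C41, C42 of a zero-mean sequence, computed from its moments."""
    m20, m21 = moment(s, 2, 0), moment(s, 2, 1)
    m40, m41, m42 = moment(s, 4, 0), moment(s, 4, 1), moment(s, 4, 2)
    c40 = m40 - 3 * m20 ** 2
    c41 = m41 - 3 * m20 * m21
    c42 = m42 - abs(m20) ** 2 - 2 * m21 ** 2
    return c40, c41, c42

# Zero-mean, unit-variance antipodal sequence used purely as a check case.
bpsk = np.array([1, -1, 1, -1, -1, 1, 1, -1], dtype=float)
c40, c41, c42 = fourth_order_cumulants(bpsk)
```

For this antipodal sequence every even moment equals 1, so all three fourth-order cumulants evaluate to -2, whereas for a Gaussian sequence they would tend to zero. This difference is what makes cumulants useful as classification features.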

Multi-User Chirp Modulation Signals
Chirp modulation has been considered for many applications, such as beacons, aircraft ground data links via satellite repeaters and low-rate data transmission in the high frequency (HF) band, as well as imaging radars, test signals, optical imaging, instrumentation and silicon yield enhancement. It is commonly used in sonar and radar, but has other applications, such as in spread-spectrum communications. In spread-spectrum usage, surface acoustic wave (SAW) devices are often used to generate and demodulate the chirped signals. In optics, ultra-short laser pulses also exhibit chirp due to the dispersion of the materials they propagate through. Chirp signals are categorized as spread-spectrum signals and offer good interference rejection. The use of matched chirp modulation (MCM) for efficient digital signaling in dispersive communication channels has also been considered by El-Khamy et al. in [20] and [21]. Chirp modulation has also been considered for multi-user applications, in which the linear frequency sweeps of the multi-user chirp signals are characterized by the same bandwidth. A novel form of multi-user chirp signals with the same power as well as the same bandwidth was introduced by El-Khamy et al. [22][23][24]. Each signal is characterized by two different slopes, one for each of the two halves of the signal duration. In the general expression for these multi-user chirp-modulated (M-CM) signals, K is the user number, K = 1, 2, ..., M; M is the total number of users; E is the signal energy in the whole bit duration T; ω_c = 2πf_c is the carrier angular frequency; Δf is the frequency separation between successive users at t = T/2; α_K is the slope within the first half of the signal duration, i.e. 0 ≤ t ≤ T/2; and the complementary slope applies within the second half of the signal duration, i.e. T/2 < t ≤ T.
The bandwidth of the different M-CM signals is the same and is given by B = MΔf, and their time-bandwidth product is ζ = BT = MΔfT. In this paper, we use eight chirp signals (Sig1, Sig2, Sig3, Sig4, Sig5, Sig6, Sig7, and Sig8), generated from the general M-CM expression and its slope definitions with M = 8. We assume T = 1 s, f_c = 1 kHz and a time-bandwidth product ζ = 1500. Plots of the instantaneous frequencies of these eight chirp signals are shown in Figure 1.
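To make the two-slope idea concrete, the following sketch generates a chirp whose instantaneous frequency is piecewise linear with a breakpoint at t = T/2. The exact M-CM expression and slope formulas of [22][23][24] are not reproduced in this text, so the frequency law below is purely an assumption for illustration: every user is assumed to sweep from -B/2 to +B/2 around the carrier, with user-dependent breakpoint frequencies spaced Δf apart at t = T/2. The function name `two_slope_chirp` and its parameter names are hypothetical.

```python
import numpy as np

def two_slope_chirp(K, M=8, T=1.0, fc=1000.0, tbp=1500.0, n=4096):
    """Illustrative two-slope chirp for user K (ASSUMED frequency law)."""
    B = tbp / T                      # common bandwidth, B = zeta / T
    df = B / M                       # separation between users at t = T/2
    t = np.arange(n) * (T / n)
    f_mid = -B / 2 + (K - 0.5) * df  # ASSUMED breakpoint frequency at T/2
    first = -B / 2 + (f_mid + B / 2) * t / (T / 2)            # first slope
    second = f_mid + (B / 2 - f_mid) * (t - T / 2) / (T / 2)  # second slope
    f_off = np.where(t <= T / 2, first, second)
    # Phase is the running integral of the instantaneous frequency.
    phase = 2 * np.pi * np.cumsum(fc + f_off) * (T / n)
    signal = np.sqrt(2.0 / T) * np.cos(phase)  # unit energy assumed (E = 1)
    return t, f_off, signal

t, f_user1, sig1 = two_slope_chirp(1)
_, f_user2, sig2 = two_slope_chirp(2)
```

With T = 1 s, f_c = 1 kHz and ζ = 1500, the sweep occupies 250 Hz to 1750 Hz, which stays below the Nyquist frequency for 4096 samples per second.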

Features Extraction
In this paper, features are extracted using three methods. The first is a combination of higher-order moments and cumulants computed from the signal itself. The second is the higher-order moments and cumulants computed from the discrete wavelet transform coefficients of the signal. The third is the peaks of the bispectrum of the signal itself and their bi-frequencies.

Higher Order Moments and Cumulants
We use six features for classification: the even-order moments and cumulants up to the eighth order, whose expressions are given in [18]. We compare the performance of this feature set with three reduced sets: the two fourth-order features (moment and cumulant) only; the two features with the highest standard deviation (STD) across the signals; and the four sixth- and eighth-order features only. The selected features are those that show significant differences between the different chirp signals.

Features from Discrete Wavelet Transform
The features are the even-order moments and cumulants up to the eighth order, extracted from the wavelet transform coefficients, the approximation coefficients and the details coefficients of the eight chirp signals. We use six features for classification and compare the performance with that obtained using the features extracted from the signal itself.

Bispectrum Features
For a zero-mean signal x(k), the third-order cumulant generating function is called the tricorrelation:

$$C_3(\tau_1, \tau_2) = E\left[x(k)\,x(k+\tau_1)\,x(k+\tau_2)\right]$$

The Fourier transform of the tricorrelation is a function of two frequencies and is called the bispectrum, or the third-order polyspectrum, given in Equation (8) [25] and [26]:

$$B(f_1, f_2) = \sum_{\tau_1=-\infty}^{\infty}\,\sum_{\tau_2=-\infty}^{\infty} C_3(\tau_1, \tau_2)\, e^{-j2\pi\left(f_1\tau_1 + f_2\tau_2\right)} \qquad (8)$$

The bispectrum is the easiest polyspectrum to compute and hence the most popular, and routines for estimating it are provided in the Higher Order Spectral Analysis (HOSA) Matlab toolbox [26]. The features are the highest peaks of the bispectrum and the corresponding two frequency components. The selected features are those that show significant differences between the different chirp signals.
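The peak-based features can be illustrated with a short NumPy sketch. Note that this is not the indirect (cumulant-based) estimator of the HOSA toolbox: it swaps in the simpler direct FFT-based estimator, averaging X(f1) X(f2) X*(f1 + f2) over non-overlapping segments. The function names are hypothetical, and restricting the peak search to the lower triangle is only an assumed stand-in for a symmetry region such as R2.

```python
import numpy as np

def bispectrum_direct(x, nseg=32):
    """Direct bispectrum estimate averaged over non-overlapping segments."""
    x = np.asarray(x, dtype=float)
    nblocks = x.size // nseg
    blocks = x[:nblocks * nseg].reshape(nblocks, nseg)
    blocks = blocks - blocks.mean(axis=1, keepdims=True)  # zero-mean segments
    X = np.fft.fft(blocks, axis=1)
    k = np.arange(nseg)
    ksum = (k[:, None] + k[None, :]) % nseg  # FFT bin index of f1 + f2
    triple = X[:, :, None] * X[:, None, :] * np.conj(X[:, ksum])
    return triple.mean(axis=0)               # shape (nseg, nseg)

def top_peaks(B, npeaks=2):
    """Largest magnitudes and their bin indices (k1, k2) with k2 <= k1."""
    mag = np.tril(np.abs(B))  # drop mirror images from B(f1,f2) = B(f2,f1)
    order = np.argsort(mag, axis=None)[::-1][:npeaks]
    rows, cols = np.unravel_index(order, mag.shape)
    return [(int(r), int(c), float(mag[r, c])) for r, c in zip(rows, cols)]

rng = np.random.default_rng(0)
x = rng.standard_normal(4096) ** 2  # skewed test signal, non-zero bispectrum
B = bispectrum_direct(x, nseg=32)
peaks = top_peaks(B, npeaks=2)      # [(k1, k2, P1), (k1', k2', P2)]
```

Each returned tuple holds the two bin indices of a peak and its magnitude, which corresponds to a feature triplet of two bi-frequencies and one peak value.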

Maximum Likelihood Classifier
In the maximum likelihood (ML) approach, classification is viewed as a multiple hypothesis testing problem, where a hypothesis H_i is arbitrarily assigned to the i-th modulation type of m possible types. The ML classification is based on the conditional probability density function [27].
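A minimal sketch of the hypothesis test is given below, under the assumption that the feature vector of each class follows a multivariate Gaussian density. The source does not specify the conditional density, so this model, the small regularization term and the function names are all illustrative choices.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate mean and covariance of the features for each hypothesis H_i."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # Small ridge keeps the covariance invertible (assumed regularizer).
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        params[int(label)] = (mean, np.linalg.inv(cov),
                              np.linalg.slogdet(cov)[1])
    return params

def ml_classify(x, params):
    """Choose the hypothesis with the largest Gaussian log-likelihood."""
    def loglik(p):
        mean, inv_cov, logdet = p
        d = x - mean
        return -0.5 * (d @ inv_cov @ d + logdet)
    return max(params, key=lambda label: loglik(params[label]))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
params = fit_gaussian_classes(X, y)
label_a = ml_classify(np.array([0.2, -0.1]), params)
label_b = ml_classify(np.array([4.8, 5.3]), params)
```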

K-Nearest Neighbor Classifier
The K-nearest neighbor (KNN) algorithm is one of the simplest yet most widely used machine learning algorithms. An object is classified by its "distance" from its neighbors, with the object being assigned to the class most common among its k nearest neighbors. If k = 1, the algorithm reduces to the nearest neighbor algorithm and the object is assigned to the class of its nearest neighbor [28].
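The voting rule can be sketched in a few lines. Euclidean distance is assumed here, since the text does not name the distance measure, and the function name and toy data are hypothetical.

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=3):
    """Assign x to the most common class among its k nearest neighbors."""
    dists = np.linalg.norm(train_X - x, axis=1)  # Euclidean distance assumed
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

train_X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
train_y = ["Sig1", "Sig1", "Sig1", "Sig2", "Sig2", "Sig2"]
pred_k3 = knn_classify(np.array([0.2, 0.3]), train_X, train_y, k=3)
pred_k1 = knn_classify(np.array([5.4, 5.1]), train_X, train_y, k=1)
```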

Support Vector Machine Classifier
SVMs were introduced on the foundation of statistical learning theory. The basic SVM deals with two-class problems; however, it can be extended to multi-class classification [29]. A binary SVM performs classification by constructing the optimal separating hyperplane (OSH), which maximizes the margin between the two nearest data points belonging to the two separate classes. The performance of an SVM depends on the penalty parameter (C) and the kernel parameter, which are called hyper-parameters. In this paper we use the Gaussian radial basis function (GRBF) kernel, because it shows better performance than other kernels. The hyper-parameters σ and C are set to 1 and 10, respectively, for all SVMs. There are three widely used methods to extend binary SVMs to multi-class problems. The first is the one-against-all (OAA) method. Suppose we have a P-class pattern recognition problem. P independent SVMs are constructed, and each is trained to separate one class of samples from all others. When testing the system after all the SVMs are trained, a sample is input to all the SVMs. Suppose this sample belongs to class P1. Ideally, only the SVM trained to separate class P1 from the others gives a positive response. The second is the one-against-one (OAO) method. For a P-class problem, P(P − 1)/2 SVMs are constructed, and each is trained to separate one class from another class. The decision for a test sample is again based on the voting result of these SVMs. The third is the hierarchical method.
In this method the received signal is fed to the first SVM (SVM1), which determines to which group the received signal belongs. This process continues in the same manner until the signal types are identified by the last SVMs. One of the advantages of this structure is that the number of SVMs is smaller than in the OAO and OAA cases.
Copyright © 2012 SciRes. IJCNS
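The bookkeeping behind the one-against-all and one-against-one schemes can be sketched without training any SVM. In the snippet below, `pairwise_winner` is a hypothetical stand-in for a trained binary SVM that returns the winning class of a pair; only the pair enumeration and the majority vote are illustrated.

```python
from itertools import combinations
from collections import Counter

def count_binary_svms(P):
    """Number of binary SVMs needed for a P-class problem."""
    return {"one_against_all": P, "one_against_one": P * (P - 1) // 2}

def oao_vote(pairwise_winner, classes):
    """One-against-one decision: each class pair casts one vote."""
    votes = Counter(pairwise_winner(a, b) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

classes = [f"Sig{i}" for i in range(1, 9)]
counts = count_binary_svms(len(classes))

def pairwise_winner(a, b):
    # Hypothetical stub: "Sig3" wins every pair it takes part in;
    # otherwise the first class of the pair wins.
    return "Sig3" if "Sig3" in (a, b) else a

decision = oao_vote(pairwise_winner, classes)
```

For the eight chirp signals this gives 8 binary SVMs for OAA and 28 for OAO.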

Neural Network Classifier
We use a multilayer perceptron (MLP) neural network with the back-propagation (BP) learning algorithm as the classifier. An MLP feed-forward neural network consists of an input layer of source nodes, one hidden layer of computation nodes (neurons) and an output layer. The numbers of nodes in the input and output layers depend on the numbers of input and output variables, respectively, and the hidden layer has 17 neurons. The classifier is allowed to run for up to 5000 training epochs with a target mean squared error (MSE) of 10^-6. The activation functions used for the hidden layer and the output layer are the hyperbolic tangent sigmoid and the linear transfer function, respectively [3].
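The network structure described above corresponds to the following forward pass. The sketch assumes six input features and one output node per chirp signal (eight in total), which the text does not state explicitly. The weights are random placeholders rather than trained values, and back-propagation training is omitted.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(W1 @ x + b1)  # hyperbolic tangent sigmoid activation
    return W2 @ hidden + b2        # linear transfer function

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 6, 17, 8   # 17 hidden neurons as in the text
W1 = 0.1 * rng.standard_normal((n_hidden, n_in))   # placeholder weights
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_out, n_hidden))  # placeholder weights
b2 = np.zeros(n_out)

features = rng.standard_normal(n_in)
scores = mlp_forward(features, W1, b1, W2, b2)
predicted = int(np.argmax(scores))  # index of the most likely signal
```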

Fuzzy K-Means Clustering
The main idea behind fuzzy k-means is the minimization of an objective function, which is normally chosen to be the total distance between all patterns and their respective cluster centers. Its solution relies on an iterative scheme, which starts with arbitrarily chosen initial cluster memberships or centers. The distribution of objects among clusters and the updating of cluster centers are the two main steps of the algorithm. The algorithm alternates between these two steps until the value of the objective function cannot be reduced any further [2].

Fuzzy C-Means Clustering
The c-means algorithm allows for a fuzzy partition, rather than a hard partition, through its objective function. Fuzzy c-means clustering is a data clustering algorithm in which each data point belongs to a cluster to a degree specified by a membership grade. This algorithm is proposed as an improvement to the fuzzy k-means clustering technique. FCM partitions a collection of n vectors into c fuzzy groups and finds a cluster center in each group such that a cost function of a dissimilarity measure is minimized. The steps of the FCM algorithm are described in brief in [2].
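The alternating membership and center updates can be sketched as follows. This uses the common FCM update rules with fuzzifier m = 2 and Euclidean distance, which are assumptions here because the text does not list the steps. The function name, stopping rule and toy data are illustrative.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Alternate center and membership updates until memberships settle."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                # memberships of each point sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        dist = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)   # standard FCM membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
centers, U = fuzzy_c_means(X, c=2)
hard_labels = U.argmax(axis=0)  # optional defuzzification
```

Unlike the classifiers above, no labeled training data is needed: the memberships are obtained directly from the feature vectors.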

Simulation Results
In this section, we evaluate the performance of automatic signal classification for the eight multi-user chirp modulation signals considered (Sig1, Sig2, Sig3, Sig4, Sig5, Sig6, Sig7, and Sig8), shown in Figure 1. For each signal type we use 150 realizations, of which 100 are used as training data and 50 as testing data. Each signal is 4096 samples long (1 second). The features are extracted using the three methods after passing these signals through a white Gaussian noise channel.
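The noise and normalization steps can be sketched as below. The sinusoidal test signal and the function names are placeholders, and the SNR is assumed to be defined as the ratio of average signal power to noise variance.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that the result has the requested SNR."""
    signal = np.asarray(signal, dtype=float)
    noise_power = np.mean(signal ** 2) / 10 ** (snr_db / 10)
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.shape)

def normalize(x):
    """Scale a realization to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

rng = np.random.default_rng(0)
t = np.arange(4096) / 4096.0             # 4096 samples over 1 second
clean = np.cos(2 * np.pi * 1000.0 * t)   # placeholder for a chirp realization
noisy = add_awgn(clean, snr_db=5.0, rng=rng)
measured_snr_db = 10 * np.log10(np.mean(clean ** 2)
                                / np.mean((noisy - clean) ** 2))
ready = normalize(noisy)                 # input to feature extraction
```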

Higher Order Moments and Cumulants
The features are extracted using the even-order moments and cumulants up to the eighth order, using the expressions in [18]. Table 1 shows the features for the eight chirp signals. These values are computed under the constraints of zero mean, unit variance and no noise. The results show that the second-order moments and cumulants are the same for all signals, so we do not use them as features; we use the higher-order moments and cumulants instead. The fourth-order moments are the same for each signal, so we use one of them for each signal. Likewise, for the fourth-order cumulants and the sixth- and eighth-order moments and cumulants we use one of each, i.e. six features in total (M4, C4, M6, C6, M8 and C8), which we call F4. This method is compared with one using the features F3 (M6, C6, M8 and C8) and one using the features F1 (M4 and C4) as in [30]. The standard deviation (STD) of each feature across these signals is also ranked, and the two features with the highest values (C8 and C6) are used as features F2. The classification performance for the multi-user chirp modulation signals using the multilayer perceptron neural network and features F1, F2, F3, and F4 is shown in Figure 2. Figure 3 shows the performance for the multi-user chirp modulation signals using different classifiers and features F4. From the results, we note that F4 outperforms F1, F2, and F3, and that the MLP classifier is the best classifier.

Features from Discrete Wavelet Transform
In this section, the features are extracted using higher-order statistics from the wavelet transform coefficients, the approximation coefficients and the details coefficients of the eight chirp signals after white Gaussian noise is added to them. The db2 wavelet with one, two and three decomposition levels is used to obtain the wavelet coefficients. Table 2 shows the six features extracted from the details coefficients of the wavelet transform using one decomposition level. From our results, we note that the features extracted from the details coefficients differ more between signals than those extracted from the approximation coefficients and the wavelet transform coefficients, so we use the details features for classification at the different decomposition levels. Figure 4 shows the performance for the eight signals using the features extracted from these details coefficients with one, two and three decomposition levels, and from the signal itself as in the first method, using the multilayer perceptron neural network classifier. Figure 5 shows the performance for different classifiers using features extracted from the details coefficients with two decomposition levels.

Bispectrum Features Extraction
The features are extracted by first dividing each signal into segments, each of length Ns = 32 samples. We then apply the function bispeci from the Higher Order Spectral Analysis Matlab toolbox [26] to each segment to estimate the bispectrum using the indirect method, with a maximum of 31 lags, no overlap and a biased estimate. The features are then obtained by taking the highest peaks of the absolute value of the bispectrum and the two frequencies corresponding to each peak. Figure 6 shows the contour plot of the magnitude of the bispectrum of the signal S2 for Ns = 32, together with the regions R1 and R2. The number of peaks is high, so it needs to be reduced. First, the region R1 of the bispectrum is used. From our study, we note that there is symmetry, so we use only the region R2. The mesh plot of the magnitude of the bispectrum of the signal S2 for Ns = 32 is shown in Figure 7.
The features for the eight chirp signals when Ns is 32 samples are tabulated under the constraints of zero mean, unit variance and no noise, where f11 and f21 are the frequencies of the highest peak along the horizontal and vertical axes, respectively, and P1 is the value of that peak. Similarly, f12 and f22 are the frequencies of the second-highest peak along the horizontal and vertical axes, respectively, and P2 is the value of that peak. These six features are called F3 and are used for classification. This method is compared with using the features F1 (f11, f21 and P1) and F2 (P1 and P2). If we divide each signal into segments of length Ns = 128 samples, we obtain another six features, called F4. Figure 8 shows the contour plot for the eight signals with Ns = 32 samples, with the region R2 indicated. The performance for the multi-user chirp modulation signals using the multilayer perceptron neural network and features F1, F2, F3, and F4 is shown in Figure 9.
Figure 10 shows the performance for the multi-user chirp modulation signals using different classifiers and features F4. From these results, we note that F4 outperforms F1, F2, and F3. Figure 11 compares the performance for the multi-user chirp modulation signals using the multilayer perceptron neural network classifier with the three feature extraction methods: higher-order moments and cumulants (F4), features extracted from the details coefficients of the discrete wavelet transform with two decomposition levels, and bispectrum features (F4). From this figure, we note that the bispectrum features perform better than both the higher-order moment and cumulant features (F4) and the features extracted from the details coefficients with two decomposition levels, at low signal-to-noise ratio and low signal power.
From these results, we note that the neural network outperforms the other classifiers (maximum likelihood, support vector machine and k-nearest neighbor classifiers, and fuzzy c-means and fuzzy k-means clustering), at the cost of a long training time. In addition, fuzzy c-means clustering performs better than fuzzy k-means clustering at most SNR values when higher-order moments and cumulants are used as features. Features extracted from the WT give better performance than features extracted without the WT, because the denoising properties of the WT remove noise. Features extracted from the details coefficients are better than those extracted from the approximation and wavelet coefficients, two decomposition levels are better than one or three, and bispectrum features are better than both the higher-order moment and cumulant features and the features extracted from the details coefficients with two decomposition levels. We also note that, among the support vector machine classifiers, the one-against-all classifier performs better than the one-against-one and hierarchical support vector machines. Finally, we note that non-clustering techniques perform better than clustering techniques, but clustering does not need classifier training, is easy to implement and is faster than non-clustering techniques.

Conclusions
In this paper, we presented the classification of eight multi-user chirp modulation signals (M = 8, T = 1 s, f_c = 1 kHz, time-bandwidth product ζ = 1500) using wavelet higher-order-statistics features and artificial intelligence techniques. Different types of classifiers and different feature extraction methods are used. The classifier performance depends on the classifier type, the classifier parameters, the features used, the discrete wavelet coefficients, the number of decomposition levels, the method of feature extraction, and the length of each segment. Simulation results show that the multilayer perceptron neural network classifier performs better than the other classifiers (maximum likelihood, support vector machine and k-nearest neighbor classifiers, and fuzzy c-means and fuzzy k-means clustering), at the cost of a long training time. In addition, fuzzy c-means clustering performs better than fuzzy k-means clustering at most SNR values when higher-order moments and cumulants are used as features. We also note that, among the support vector machine classifiers, the one-against-all classifier performs better than the one-against-one and hierarchical support vector machines. Finally, we note that non-clustering techniques perform better than clustering techniques, but clustering does not need classifier training, is easy to implement and is faster than non-clustering techniques.
Features extracted from the WT give better performance than features extracted without the WT, because the denoising properties of the WT remove noise. Features extracted from the details coefficients are better than those extracted from the approximation and wavelet coefficients, two decomposition levels are better than one or three, and bispectrum features are better than both the higher-order moment and cumulant features and the features extracted from the details coefficients of the discrete wavelet transform with two decomposition levels. Because the scheme operates at low signal-to-noise ratio, and therefore at low signal power, this chirp signal can be regarded as a type of UWB signal; and because we deal with a low power spectral level, it is also a type of spread-spectrum signal.

Figure 1. Instantaneous frequency of multi-user chirp modulation signals over the carrier frequency.

Figure 2. The performance of the multi-user chirp modulation signals using MLP Classifier and higher order moment and cumulant features.

Figure 3. The performance of the multi-user chirp modulation signals using different classifiers and higher order moment and cumulant features.

Figure 4. The performance of the multi-user chirp modulation signals using MLP classifier and wavelet based features extraction.

Figure 5. The performance of the multi-user chirp modulation signals using different classifiers and clustering and features from details coefficients and two decomposition levels.

Figure 6. Contour plot of the magnitude of the bispectrum of the signal S2 for Ns = 32 on the bi-frequencies (f1, f2) and the regions R1 and R2.

Figure 7. Mesh plot of the magnitude of the bispectrum of the signal S2, Ns = 32.

Figure 8. Contour plot of the magnitude of the bispectrum of the eight signals on the bi-frequencies (f1, f2) and the region R2.

Figure 9. The performance of the multi-user chirp modulation signals using MLP classifier and bispectrum features.

Figure 10. The performance of the multi-user chirp modulation signals using different classifiers and bispectrum features F4.