Comparison of ICA and WT with S-transform based method for removal of ocular artifact from EEG signals

Ocular artifacts are most unwanted disturbance in electroencephalograph (EEG) signals. These are characterized by high amplitude but have overlapping frequency band with the useful signal. Hence, it is difficult to remove the ocular artifacts by traditional filtering methods. This paper proposes a new approach of artifact removal using S-transform (ST). It provides an instantaneous time-frequency representation of a time-varying signal and generates high magnitude S-coefficients at the instances of abrupt changes in the signal. A threshold function has been defined in S-domain to detect the artifact zone in the signal. The artifact has been attenuated by a suitable multiplying factor. The major advantage of ST-filtering is that the artifacts may be removed within a narrow time-window, while preserving the frequency information at all other time points. It also preserves the absolutely referenced phase information of the signal after the removal of artifacts. Finally, a comparative study with wavelet transform (WT) and independent component analysis (ICA) demonstrates the effectiveness of the proposed approach.


INTRODUCTION
Electroencephalogram (EEG) is the recorded electric potential from the exposed surface of the brain or from the outer surface of the head.The raw EEG data is contaminated with numerous high and low frequency noise known as artifacts.High frequency noise is a result of atmospheric thermal noise and power frequency noise.Low frequency noise is primarily caused by eye movements, respiration and heart beats.Such artifacts are characterized by amplitude in the millivolt range (whereas the actual EEG is in microvolt range) in the frequency band of 0 -16 Hz [1].
The presence of these artifacts in the raw EEG poses a major problem to researchers.Various methods to remove such artifacts have been proposed in the literature.These methods include Principal Component Analysis (PCA), Independent Component Analysis (ICA) and wavelet based thresholding.Croft and Barry [2] and Kandaswamy et al. [3] reviewed several methods of artifact removal.Lagerlund et al. [4] use a Principal Component Analysis (PCA) based filtering of EEG signal for artifact removal.Therefore, a major drawback of the PCA method is that it cannot completely separate ocular artifacts from EEG signals, when both waveforms have similar voltage magnitudes.Decomposition into uncorrelated signal components is possible with PCA; however, it is not sufficient to produce independence between the variables, at least when the variables have non-Gaussian distributions.The PCA method therefore cannot accommodate higher-order statistical dependencies.
Jung et al. [5], Delorme et al. [6] and LeVan [7] developed some artifact removal techniques using ICA which demonstrates the potential not only to decorrelate but also to work with higher-order dependencies.The most significant computational difference between ICA and PCA is that PCA uses only second-order statistics (such as the variance which is a function of the data squared) while ICA uses higher-order statistics (such as functions of the data raised to the fourth power).Variables with a Gaussian distribution have zero statistical moments above second-order, but most signals do not have a Gaussian distribution and do have higher-order moments.These higher-order statistical properties are put to good use in ICA.While ICA is an extension of PCA method, these component-based artifact removal procedures are not automated and require tuning of several parameters such as number of sources, permutation of these sources etc. through visual inspection.
Krishnaveni et al. [1] recently proposed a wavelet based thresholding method to remove ocular artifacts from raw EEG data.Kumar et al. [8] developed a similar wavelet based statistical method for removing ocular artifacts.The wavelet based artifact removal techniques are notable for preserving the shape of the waveform of the signal in the artifact free zone compared with other artifact removal methods.But this method is limited by the introduction of noise beyond the specified frequency band when applied to band limited signals.
In this paper the authors propose a new method of ocular artifact removal from EEG signal using the S-transform [9,10].The high amplitude signal components, like the ocular artifacts are filtered out in S-domain by setting a statistical threshold function.The ST based method is an improvement of the wavelet method since it localizes the power and preserves the absolutely referenced phase information of the original signal.Moreover, integrating the S-transform over time, results in the Fourier transform (FT).This direct relation to the Fourier transform simplifies the task of inverting to the time domain.This property of the S-transform led to the development of S-transform filter, which has been explored in this work to serve the purpose of removing ocular artifacts.
The paper has been organized as follows.The methodology used for signal preprocessing and artifact removal using ICA, WT and S-transform is described in Section 2. Results and discussions are contained in Section 3 followed by the conclusions in Section 4.

METHODOLOGY
The methodology involves pre-filtering followed by the application of various methods such as ICA, wavelet and S-transform as discussed below.

Filtering
As discussed above the raw EEG data is contaminated with high frequency noises other than ocular artifacts.The signal is passed through a low pass filter with cutoff frequency of 30 Hz followed by normalization.Normalization ensures removal of any unwanted bias that may have crept into the experimental recordings.These normalized signals have been used further for artifact removal.

Independent Component Analysis
Independent component analysis is a powerful technique which can be used to separate sources from a given linear mixture of signals.The brain is supposed to have many independent sources which produce electrical signals.The EEG signal that is obtained by various electrodes actually contains a linear mixture of the signals produced by independent sources in the brain.In a signal having artifacts the linear mixture also contains the signals produced by ocular and other visceral sources.If we are able to separate the sources from the signals recorded by the various electrodes we can easily identify the artifact sources and can separate them from the original EEG sources.Once the independent time courses of different brain and artifact sources are extracted from the data, "corrected" EEG signals can be derived by eliminating the contributions of the artifactual sources.The method described here basically tries to separate and eliminate the ocular sources.Subsequently the linear mixture is reconstructed using the artifact free sources.This linear mixture is then called the artifact free signal.
The signal obtained from the electrodes can be expressed in the following manner .

 x As
(1) Here is called the observation vector.The observation vector in our case is composed of the signals obtained from the electrodes.The observation vector has ( ) components.The source vector is given by ; it gives the different independent sources which results in observation vector after linear mixing.The source vector has (  ) components and they are assumed to be statistically independent; and that they are non-Gaussian.A is called the mixing matrix composed of ( m n  ) constant elements ij .This matrix gives the linear transformation between the source vector and the observation vector.

a
The independent component analysis problem is formulated as finding a demixing or separating matrix W from the given observation matrix x, such that . y Wx (2) here is the estimate of original source vector s and the components i are as independent as possible.This can be achieved by maximizing some function , , F y y  m n y that measures independence.The various approaches differ in the specific objective function and the optimization method that is used.There are certain basic assumptions in the problem formulation.Firstly the number of observed signals should be equal to the number of independent sources, i.e.,  .Therefore A represents a full rank square mixing matrix.This is not really a restriction since PCA can always be applied to reduce the dimension of the data set x, to equal that of the source data set s. Another assumption is that source signals must be statistically independent of each other or in practice as independent as possible (including uncorrelatedness).The independence condition can be defined by stating that the joint probability density of the source signals is equal to the product of the marginal probability densities of the indi- Two categories of ICA algorithm exist.In the first type, source separation can be obtained by optimizing an objective function [11] which can be scalar measure of some distributional property of the output y.More general measures are entropy, mutual independence, divergence between joint distribution of y and some given mode and higher order decorrelation.
The ICA method can be formulated as optimization of a suitable objective function which is also termed as contrast function.The problem in optimization of contrast function is that, they rely on batch computation using the higher order statistics (HOS) of data or lead to complicated adaptive separation.It is often sufficient to use simple HOS such as kurtosis, which is the fourth order cummulant with zero time lags.The kurtosis of the source signal Cum 3 If i s is Gaussian, then its kurtosis is zero.Source signals that have negative kurtosis are called sub-Gaussian and have a probability distribution flatter than usual Gaussian distribution.Source signals having a positive kurtosis are called super-Gaussian and have a probability distribution with sharp peak and longer tails then the standard Gaussian ones.A contrast function based on kurtosis is given by where stands for cummulant.It is maximized by a separating matrix W, if the sign of the kurtosis is same as all the source signals.For pre-whitened input vector x and orthogonal separating matrices,  is minimized for sources having negative kurtosis and maximized for sources with positive kurtosis.
The second category uses neural implementation of ICA algorithms like non-linear PCA based subspace learning [12,13] for achieving source separation.In this category, there are adaptive algorithms [13] like minimum mutual information method [14] and maximum entropy method [15] based on stochastic gradient optimization.An excellent treatment of the various approaches, their strengths and weaknesses can be found in Cichicki et al. [16], as well as Hyvärinen et al. [17].
The method that has been used in this study is based on minimization of mutual information between outputs which is equivalent to maximization of their joint en-tropy.The details of the algorithm which is known as infomax principle can be found in [18].
In this study we have used 19 electrodes to collect EEG data from a human subject.Hence exactly 19 independent components (sources) can be separated out using infomax based ICA, from the observed recordings (mixtures).Out of these 19 sources, the sources containing ocular artifacts can be identified and removed and subsequently artifact free signals are obtained reconstructing from the rest of the components.

Artifact Removal Using Wavelet Transform
Wavelet transform is a useful tool for time frequency analysis of neurophysiological signals.Wavelets are small wave like oscillating functions that are localized in time as well as in frequency [19,20] A similar expression holds for , the scaled and shifted version of the scaling function A signal   h t can be expressed mathematically in terms of the above wavelets at level as j where   j a k and   j d k are the approximate and detailed coefficients at level .These coefficients are computed using filter bank approach as proposed by Rioul and Vetterli [21].
The original signal   h t is first passed through a pair of high pass and low pass filters.The low frequency component approximates the signal while the high frequency components represent residuals between original and approximate signal.The approximate component is further decomposed at successive levels.The output time series is down-sampled by two and then fed to next level of input after each stage of filtering.
As the wavelet coefficients represent the correlation of signal with the mother wavelet, the signal will generate high amplitude coefficients at places where artifacts are present.These coefficients can be eliminated using a simple thresholding technique.The threshold can be computed as here j C is the wavelet coefficient at level of decomposition.If the value of any coefficient is greater than the threshold it is reduced to a suitable fraction of the actual coefficient value [8].

S-Transform
The S-transform provides a useful extension to the wavelets by having a frequency dependent progressive resolution as opposed to the arbitrary dilations used in wavelets.The kernel of the S-transform does not translate with the localizing window function, in contrast to the wavelet counterpart.For that reason the S-transform retains both the amplitude and absolutely referenced phase information [9].Absolutely referenced phase information means that the phase information given by the S-transform is always referenced to time .This is the same meaning of the phase given by the Fourier transform, and is true for each S-transform sample of the time-frequency space.The wavelet transform (WT), in comparison can only localize the amplitude or power spectrum, not the absolute phase.There is, in addition, an easy and direct relation between the S-transform and the natural Fourier transform which cannot be achieved by wavelets.Since its development, the S-transform has been widely used in applications ranging from geophysics [10], oceanography [22], atmospheric physics [9,[23][24][25], medicine [26,27], hydrogeology [28] and mechanical engineering [29].
The S-transform of a continuous time signal   h t is defined as [30,31] The window function w, is generally chosen to be positive and normalized Gaussian [25]   where f represents the frequency, t is the time and  is the delay.It is worth emphasizing that in order to have an inverse S-transform, the window must be normalized The S-transform can also be written as the convolution of two functions over the variable t where Here    respectively.Equation ( 14) can also be written as where is the Fourier transform of ( 12) and the exponential term is the Fourier transform of the Gaussian function (13).Thus the S-transform can be retrieved by applying the inverse Fourier transform (  to  ) to the above equation for One of the important characteristics of the S-transform

 
H f   , applying Fubini's theorem and taking into account the normalization condition (10), the above equation reduces to a simple Fourier transform This spectral property enables the definition of an inverse S-transform through the inverse Fourier transform.Thus This is clearly different from the concept of wavelet transform.The S-transform can be written as operations on the Fourier spectrum The discrete analog of ( 20) is used to compute the discrete S-transform by taking advantage of the Fast Fourier Transform (FFT) and the convolution theorem.Using ( 13), the S-transform of a discrete time series is given by (letting and for it is equal to the constant defined as 0, where and The discrete inverse of S-transform [9] is given by The amplitude of the S-transform has the same meaning as the amplitude of the Fourier transform.This provides a frequency invariant amplitude response in contrast to the wavelet approach [31].The term "frequency invariant amplitude response" means that for a sinusoidal signal with amplitude , the S-transform returns amplitude A regardless of the frequency f.

Artifact Removal Using S-Transform
When only single channel signal is available, component based methods like PCA and ICA cannot be applied.In such case, the use of S-transform has been demonstrated to be an important tool to remove ocular artifacts from EEG signals, since it not only solves the task efficiently but also retains the local phase information of the original signal even after denoising.The following is a summary of the proposed algorithm.
1) Decompose the entire EEG signal using S-transform.
3) Determine the mean and standard deviation of the absolute values of S-transform coefficients   , S i j in the selected frequency band.
4) Set the threshold function as: is the absolute value of .

 
, S i j 5) Set the multiplying factor .
Reconstruct the signal using the modified S-transform coefficients.
High amplitude S-transform coefficients are generated at the places where artifacts are present.These coefficients are eliminated by using the above threshold tech-nique.The time average of the local spectral representation (i.e., the S-transform) is the complex-valued global Fourier spectrum as in (18).As a result, the ST can be interpreted as a generalization of Fourier transform to a nonstationary signal.Since the EEG signal is highly nonstationary, the S-transform is the ideal method for representing its time-frequency distribution.It is also customary in Fourier spectra to show only the positive frequency part of the spectrum, because the amplitude spectrum is symmetric, and the phase spectrum is anti symmetric.However, any operation (here, thresholding) in the S-domain must be applied to both the positive and negative frequency parts of the spectrum.Otherwise, the summed ST will not collapse to FT.When any operation is applied to the positive or negative part of the spectrum only, the inverse FT leads to a complex (not real) time series, and would be in error.
Here, signal reconstruction performance is measured by the signal-to-noise ratio (SNR) and is defined by where h and rec h represent the original and reconstructed signal respectively.

RESULT AND DISCUSSIONS
In this paper we have evaluated the performance of the proposed denoising method using S-transform to remove ocular artifacts from EEG signal with respect to the ICA and wavelet based method discussed in Section II.The Figures related to ICA presented in this paper are produced using the EEGLAB software package which operates in the MATLAB environment and available at http://sccn.ucsd.edu/eeglab.
In Figure 1, the removal of ocular artifacts from EEG signals collected from multiple sensors is demonstrated.A five-second epoch of raw EEG time series containing prominent artifact due to eye movement is shown in Figure 1(a).First, the raw 19-channel EEG data are decomposed into nineteen independent components using ICA based on infomax principle (Figure 1(b)).As expected, correlation between ICA traces are close to zero.The 2-D scalp component map of all nineteen ICs is illustrated in Figure 1(c).Since the EEG spectrum of eye artifacts decreases smoothly, their scalp topographies look like component 1.This artifactual component is also relatively easy to identify by visual inspection of component time course (Figure 1(b)).Since this component accounts for eye activity, we may wish to subtract it from the data.After removing component 1 we reconstruct the data using the rest 18 components in order to get ocular artifacts free EEG signals which is shown in Figure 1(d).ICA decomposition is possible only for multi channel data and hence cannot be applied when the data is available in single channel.Therefore the proposed S-transform technique is effectively applied in order to remove artifacts from single channel signal.A raw EEG signal of 8 sec duration sampled at 256 Hz and its S-transform is shown in Figure 2(a).Using infomax based ICA method, ocular artifacts are removed from all the nineteen channel EEG signals and the exact segment of the signal (Figure 2(b)) that has been used for WT and ST study also is taken for comparison.It can be observed in Figure 2(b), that the high amplitude ocular artifacts are significantly reduced in the reconstructed signal but its S-transform plot reveals that the time-frequency distribution as compared to the original signal is massively disturbed which implies substantial loss in data in the useful signal outside the artifact range.In Figure 2(d), the time frequency plot after thresholding the S-coefficients and the corresponding reconstructed signal are illustrated.The S-coefficients in the range of 0.5 -16 Hz have been subjected to thresholding since most artifacts lie within this band of signal.It can be observed in the Figure 2(d), that the high amplitude ocular artifacts in the specified band are significantly reduced in the reconstructed signal.In addition, the reconstructed signal retains all other amplitudes and frequencies as well, of the signals.This is also apparent in the corresponding ST plot and can be explained by the "frequency invariant amplitude response" property of S-transform.The artifact removed signal using wavelet transform and its corresponding time frequency representation using S-transform is demonstrated in Figure 2(c).This time-frequency plot contains some unwanted frequency components beyond the actual frequency range of the signal.
A comparison of the power spectral density (PSD) of the raw signal and the reconstructed signal is shown in     In this example, reducing the multiplying factor of the wavelet coefficients, results in a distortion of the smooth profile of the signal, which introduces sharp changes in the artifact zone (Figure 4(a)).This is also observed in the time-frequency plot (Figure 2(c)) of the WT based artifact free signal.The variation of multiplying factor for ST does not affect the PSD in the non-artifact zone within the specified band (16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30).The signal to noise ratio in the reconstructed signals for all the three methods is also compared and observed that, it is more in the case of ST based reconstructed signal (2.33 dB) as compared to the WT (1.81 dB) and ICA (0.98 dB) based methods.The proposed method is applied to ten epochs of an EEG signal of duration of four seconds and compared with the method based on wavelet transform.As shown in Figure 5, in each epoch the reduction of ocular artifacts is improved in the case of the proposed method with minimal impact to the other part of signal.

CONCLUSIONS
This paper demonstrates the effectiveness of the Stransform based approach in removing the ocular artifacts from the EEG signal.It outperforms some of the recently proposed methods based on wavelet transform and ICA.The wavelet transform only localizes the power spectrum as a function of time.It does not retain the absolutely referenced phase information that the S-transform contains and is therefore not directly invertible to the Fourier transform spectrum.Additionally, we have also demonstrated that the PSD of the raw EEG is not affected by the proposed method, which implies the preservation of the information content is better than the other method based on wavelets and ICA.
In conclusion, the S-transform method can be used as a very effective tool for removing ocular artifacts from EEG signals.Additional research in this field will refine our techniques and lead to improved methodology for filtering noise from EEG signals.


be the Fourier transform (from  to  ) of the S-transform .By the convolution theorem the convolution in the  , (time) domain becomes a multiplication in the  (frequency) domain:

Figure 1 .
Figure 1.Demonstration of ocular artifact removal using ICA in EEG signals.(a) A five-second epoch of a nineteen-channel EEG time series containing prominent artifact due to eye movement; (b) Corresponding ICA component activations and (c) scalp map of all the nineteen ICs; (d) Ocular artifact-free EEG signals by removing component 1 in (b).

Figure 3 .
The PSD of S-transform based reconstructed signal almost replicates the PSD of the raw signal be

Figure 2 .
Figure 2. (a) Normalized raw EEG signal; (b) Artifact free EEG signal using ICA; (c) Artifact free EEG signal using wavelet transform; (d) Artifact free EEG signal using S-transform.The figures in the right side are the corresponding time-frequency plot using S-transform, of the signals in the left.

Figure 3 .
Figure 3.Comparison of artifacts free signals using ICA, WT, ST in time domain and their PSD.
yond the artifact range ([0,16] Hz).It can be seen that the PSD is least affected due to the removal of artifacts within the range (below 16 Hz).With the wavelet based reconstructed signal the PSD has a good correlation with that of the raw EEG, but beyond the specified band it has wide variations.When thresholding is applied to any frequency band of the wavelet transformed signal, the sample values in the time domain are replaced.Replacing sample values in the time domain introduces a step discontinuity after the threshold replacement.A series of such replacements in the time domain at arbitrary time locations will give rise to high frequency noise in the frequency domain which is portrayed in Figure 2(c).

Figure 4 (
b) shows the comparison of PSD of artifact removed signals for various multiplying factors   mf .

Figure 4 .Figure 5 .
Figure 4. (a) An artifact zone of the signal and its removal using WT and ST for three different multiplying factors (mf = 0.9, 0.7 and 0.4) and (b) corresponding PSDs; (c) An artifact zone of the signal and its removal using ICA, WT and ST.
. In discrete domain, any finite energy time domain signal can be decomposed and expressed in terms of scaled and shifted versions of a mother wavelet t .The scaled and shifted version of the mother wavelet is mathematically represented as

Table 1 .
Comparison of signal-to-noise ratio.