An Improved Signal Segmentation Using Moving Average and Savitzky-Golay Filter

The analysis of long-term EEG signals requires that they be segmented into pseudo-stationary epochs. Segmentation is performed with regard to statistical characteristics of the signal such as amplitude and frequency. Time series measured in the real world are frequently non-stationary, and to extract important information from them it is important to apply a filter or smoother as a pre-processing step. In the proposed approach, the signal is first filtered by a Moving Average (MA) or Savitzky-Golay filter to attenuate its short-term variations. Then, changes in the amplitude or frequency of the signal are calculated by the Modified Varri method, which is an accepted algorithm for segmenting a signal. Using synthetic and real EEG data, the proposed methods are compared with the original approach (simple Modified Varri). The simulation results indicate a clear advantage of the proposed methods.


Introduction
Biomedical signals such as the electroencephalogram (EEG) and electrocardiogram (ECG) are usually nonstationary, i.e., their statistical characteristics change over time [1]. The purpose of signal segmentation is to divide a signal into several epochs with the same statistical characteristics, such as amplitude and frequency [2,3]. Since the analysis of a stationary signal is easier than that of a nonstationary signal, signal segmentation is usually applied as a pre-processing step for nonstationary signal analysis.
There are two kinds of signal segmentation, namely constant segmentation and adaptive segmentation. In constant segmentation, the signal is divided into epochs of fixed length [4]. Although constant segmentation is simple to implement, it has low reliability. Because the durations of the signal segments are usually not equal, segmentation is nowadays performed automatically, which is called adaptive segmentation [1]. Different methods of adaptive segmentation can be found in [5][6][7][8][9][10][11].
In the Modified Varri method, two sliding windows are used. The method is based on a combination of an amplitude measure, computed from the amplitude values of the signal in the relevant window, and a frequency measure, estimated by the sum of the differences of consecutive signal samples, as follows [1]:
$$A_{dif} = \sum_{k=1}^{l} |x_k| \tag{1}$$
$$F_{dif} = \sum_{k=1}^{l} |x_k - x_{k-1}| \tag{2}$$
where $l$ and $x_k$ are the window length and the $k$th signal point, respectively. The measure difference function $G$ is then defined as:
$$G_m = A_1 \left| A_{dif,m+1} - A_{dif,m} \right| + F_1 \left| F_{dif,m+1} - F_{dif,m} \right| \tag{3}$$
where $m$ is the number of the window, and $A_1$ and $F_1$ are constant coefficients whose values depend on the application. Local maxima of $G$ that lie above a predefined threshold specify the boundaries of the segments [1].
Real data usually contain noise. When data are noisy, they may not be analyzed correctly in many applications, including signal segmentation. Therefore, reducing noise in experimental time series is an important issue, and it has become increasingly important in health sciences, systems biology, nano-sciences, information systems, and physical sciences [12]. Filters and smoothers not only reduce destructive noise and short-term components of a signal but can also increase the speed of the simulation.
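The measure difference function G of the Modified Varri method described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the text does not settle: absolute values are used in both measures, the frequency measure uses only differences inside the window, the two windows are adjacent, and they slide one sample at a time. The function names are hypothetical, and the default coefficients 7 and 1 are the values of A 1 and F 1 quoted in the figure captions.

```python
def window_measures(window):
    """Return the amplitude and frequency measures of one window."""
    # Amplitude measure: sum of absolute sample values (assumed form).
    a_dif = sum(abs(x) for x in window)
    # Frequency measure: sum of absolute differences of consecutive
    # samples inside the window (assumed form).
    f_dif = sum(abs(window[k] - window[k - 1]) for k in range(1, len(window)))
    return a_dif, f_dif


def varri_g(signal, window_len, a1=7.0, f1=1.0):
    """G for every position of two adjacent windows sliding by one sample.

    Entry m compares signal[m:m + window_len] with the window that
    immediately follows it, so the candidate boundary for entry m
    lies at sample index m + window_len.
    """
    g = []
    for start in range(len(signal) - 2 * window_len + 1):
        left = signal[start:start + window_len]
        right = signal[start + window_len:start + 2 * window_len]
        a_left, f_left = window_measures(left)
        a_right, f_right = window_measures(right)
        g.append(a1 * abs(a_right - a_left) + f1 * abs(f_right - f_left))
    return g


# Toy signal with one amplitude change at sample 10.
toy_signal = [1.0] * 10 + [5.0] * 10
toy_g = varri_g(toy_signal, window_len=4)
peak_index = toy_g.index(max(toy_g))  # position where the windows straddle the change
```

In this toy example the largest value of G occurs when the junction of the two windows coincides with the amplitude change, which is the behaviour the segmentation step relies on.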
Since the simple moving average (MA) is the easiest digital filter to understand and use, it is the most common filter in signal analysis. In spite of its simplicity, the MA filter is optimal for a common task: reducing random noise. The Savitzky-Golay filter is an effective tool for de-noising and smoothing a signal. Its output contains less noise than the original signal and exhibits less distortion than the output of a moving average filter of the same order [13][14][15]. In this paper, MA and Savitzky-Golay filters are used to increase the accuracy of the Modified Varri method. After the signal is filtered, Modified Varri, which is a powerful method for segmenting a signal, is applied.
This paper is organized as follows. The next section explains the MA and Savitzky-Golay filters. The third section briefly describes the proposed method. The performance of the suggested method is evaluated on synthetic data and a real EEG signal in Section 4. Finally, Section 5 provides the conclusion.

Moving Average Filter
One of the most common tools for smoothing data is the MA filter, which is often used to capture important trends in repeated statistical surveys. This approach, a type of Finite Impulse Response (FIR) filter, is applied to a set of data points by averaging different subsets of the full data set. In this paper, the MA is defined over a window of samples as in Equation (4), where x(n) is the original signal.
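As an illustration of the averaging described above, the following Python sketch implements a centred moving average. The exact form of Equation (4) and its window length are not recoverable here, so the centred window, the odd `width` parameter, and the shortened window at the signal edges are all assumptions of this sketch, and the function name is hypothetical.

```python
def moving_average(signal, width):
    """Centred moving average with an odd window width.

    Near the edges the window is truncated to the available samples,
    which is one of several possible edge conventions.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for n in range(len(signal)):
        lo = max(0, n - half)
        hi = min(len(signal), n + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


ramp_smoothed = moving_average([1, 2, 3, 4, 5], width=3)
```

A centred window is chosen here because it introduces no time shift between input and output, which matters when the filtered signal is later used to locate segment boundaries.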

Savitzky-Golay Filter
The Savitzky-Golay filter is a powerful tool for smoothing a signal that was proposed by Savitzky and Golay in 1964. The filter is defined as a weighted moving average whose weights are given by a polynomial of a specific degree [13][14][15]. When the filter is applied to a signal, a polynomial $P$ of degree $k$ is fitted to $N = N_r + N_l + 1$ points of the signal, where $N$ is the window size, and $N_r$ and $N_l$ are the numbers of signal points to the right and to the left of the current signal point, respectively [13][14][15]. One of the main advantages of this filter is that it tends to preserve features of the distribution such as relative width, maxima, and minima, which are often flattened by other smoothing techniques such as MA [13][14][15].
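The polynomial fit described above can be turned into a fixed set of weights, as the following Python sketch shows. It solves the least-squares normal equations for the value of the fitted polynomial at the current point, using only the standard library. The function names are hypothetical, and leaving the first and last few samples unfiltered is an assumption of this sketch rather than something the text specifies.

```python
def savgol_weights(n_left, n_right, degree):
    """Weights that give the fitted polynomial's value at the current point."""
    offsets = list(range(-n_left, n_right + 1))
    size = degree + 1
    # Normal-equation matrix: entry (i, j) is the sum of t**(i + j) over the window.
    ata = [[float(sum(t ** (i + j) for t in offsets)) for j in range(size)]
           for i in range(size)]
    rhs = [1.0] + [0.0] * degree  # selects the constant term of the fit
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(size):
            if row != col:
                factor = ata[row][col] / ata[col][col]
                ata[row] = [a - factor * b for a, b in zip(ata[row], ata[col])]
                rhs[row] -= factor * rhs[col]
    coeffs = [rhs[i] / ata[i][i] for i in range(size)]
    # The weight for offset t is the solved polynomial evaluated at t.
    return [sum(c * t ** j for j, c in enumerate(coeffs)) for t in offsets]


def savgol_smooth(signal, n_left, n_right, degree):
    """Apply the weights at interior points; edge samples are copied unchanged."""
    weights = savgol_weights(n_left, n_right, degree)
    smoothed = [float(x) for x in signal]
    for n in range(n_left, len(signal) - n_right):
        window = signal[n - n_left:n + n_right + 1]
        smoothed[n] = sum(w * x for w, x in zip(weights, window))
    return smoothed


five_point = savgol_weights(2, 2, 2)
```

For a symmetric five-point window and a quadratic fit, the weights are the classical values (-3, 12, 17, 12, -3)/35. Assuming a symmetric window, `n_left = n_right = 4` with `degree = 3` would correspond to an order 3 polynomial with a frame size of 9 samples. Library routines such as `scipy.signal.savgol_filter` provide the same kind of filter with more careful edge handling.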

Proposed Method
The proposed method is a three-stage procedure, described below:
1) First, the original signal is filtered by the MA or Savitzky-Golay filter to reveal the important underlying unadulterated form of the time series by attenuating its short-term variations. The MA is very fast and simple to implement. The main advantage of the Savitzky-Golay filter is that it tends to preserve features of the time series such as its relative minima and maxima, which is very important in signal segmentation. Unlike wavelet-based filtering, these filters do not shift the signal, which is an important property for detecting the true boundaries of epochs. Mathematically, these filters are described in Sections 2.1 and 2.2, respectively.
2) As mentioned before, two sliding windows move along the signal, and G is determined for each window in order to find the boundaries of the signal segments. The variations of G are computed as follows:
$$G_m = A_1 \left| A_{dif,m+1} - A_{dif,m} \right| + F_1 \left| F_{dif,m+1} - F_{dif,m} \right|$$
where $m$ is the number of the window, and $A_1$ and $F_1$ are constant coefficients whose values depend on the application.
3) Local maxima of the G function that lie above a threshold, defined as the mean value of G, identify the boundaries of the segments.
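The third stage can be sketched in Python as follows, taking an already computed sequence of G values as input. The threshold is the mean of G, as stated in the procedure. The treatment of flat peaks (only the first sample of a plateau is reported) and the function name are assumptions of this sketch.

```python
def detect_boundaries(g):
    """Indices of local maxima of G that exceed the mean value of G."""
    threshold = sum(g) / len(g)
    boundaries = []
    for m in range(1, len(g) - 1):
        rises = g[m] > g[m - 1]          # strictly above the previous value
        does_not_rise = g[m] >= g[m + 1]  # next value is not larger
        if g[m] > threshold and rises and does_not_rise:
            boundaries.append(m)
    return boundaries


example_g = [0, 1, 5, 1, 0, 0, 2, 9, 2, 0]
example_boundaries = detect_boundaries(example_g)
```

The returned values are indices into G, so they still have to be mapped back to sample positions according to how the sliding windows were indexed when G was computed.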

Performance Evaluation
The following methods were implemented in MATLAB R2009a from MathWorks. The performance and efficiency of these methods were evaluated using 50 synthetic multi-component signals and real EEG data.

Figures 1(a) and (b) show 50 seconds of the original signal and the result of applying simple Modified Varri, respectively. Figure 1(b) indicates that this algorithm fails to detect one segment boundary of the signal. The output also shows that this method produces three False Boundaries (FBs). The threshold for this method is chosen as the mean value of the output. In the performance ratios, $N_t$, $N_m$, and $N_f$ denote the numbers of true, missed, and falsely detected boundaries, respectively, and $N$ is the actual number of segment boundaries.

In Table 1, the results of segmenting the 50 synthetic signals with the proposed methods are shown next to the results of the simple Modified Varri method. The results indicate that the proposed methods, which use filters such as MA and Savitzky-Golay as a pre-processing step, improve the TP, FN, and FP ratios. As can be seen in Table 1, the TP, FN, and FP ratios obtained by Modified Varri with the Savitzky-Golay filter are better than those obtained with the MA filter. With the Savitzky-Golay filter, TP and FP ratios of 100% and 24%, respectively, are achieved on the set of 50 synthetic signals without noise.

Real EEG
Electroencephalography (EEG) is the neurophysiologic measurement of the electrical activity of the brain using electrodes placed on the scalp [4]. As described before, signal segmentation is a pre-processing step for EEG signals. In this part, we use a real newborn EEG signal, shown in Figure 4(a). The length of this signal is 500 milliseconds and the sampling frequency is 256 Hz.
The results of applying simple Modified Varri and Modified Varri with the Savitzky-Golay filter are shown in Figures 4(b) and 5(c), respectively. For the real EEG data, we used a Savitzky-Golay filter with a polynomial of order 3 and a frame size of 51 samples. Figure 5(c) shows that all five segments are segmented accurately, whereas simple Modified Varri fails to detect one segment boundary of the signal. Comparing the two outputs shows the benefit of the proposed method.

Conclusion
Modified Varri is one of the existing methods for signal segmentation. Because real signals usually contain different kinds of noise, this method is unreliable on its own. To overcome this problem, we use moving average and Savitzky-Golay filters. These filters reduce short-term noise in the signal, which considerably increases the reliability of the method. Although the moving average filter is simpler than the Savitzky-Golay filter, the Savitzky-Golay filter performs better. The results indicate that Modified Varri with the moving average filter performs better than simple Modified Varri, and that Modified Varri with the Savitzky-Golay filter is more accurate than Modified Varri with the moving average filter.

The signal in Figure 1(a) is also segmented using Modified Varri with MA as a pre-processing step (Figure 2). In Figure 3, the Savitzky-Golay filter is applied as the pre-processing step instead. For the synthetic data, we used a Savitzky-Golay filter with a polynomial of order 3 and a frame size of 9 samples. As can be seen in Figures 2(c) and 3(c), the boundaries of all seven segments are accurately detected. To make the signals more similar to real signals, Gaussian noise is added to the original signal, and the performance of the proposed methods is then assessed. In this paper, 50 synthetic multi-component signals are used. Three parameters are used to assess the performance of the proposed methods: the True Positive (TP), Miss or False Negative (FN), and False Alarm or False Positive (FP) ratios. These parameters are defined as follows:
$$TP = \frac{N_t}{N}, \qquad FN = \frac{N_m}{N}, \qquad FP = \frac{N_f}{N}$$
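The three ratios can be computed with a short Python sketch. It assumes that each ratio is the corresponding count (true, missed, or falsely detected boundaries) divided by the actual number of boundaries N, and that a detected boundary counts as true when it lies within a tolerance of an actual boundary. The tolerance-based matching rule, the `tolerance` parameter, and the function name are hypothetical and are not taken from the paper.

```python
def segmentation_ratios(true_bounds, detected_bounds, tolerance):
    """TP, FN and FP ratios relative to the actual number of boundaries.

    Each actual boundary is matched to the nearest still-unmatched
    detected boundary within `tolerance` samples (assumed rule).
    Detected boundaries are assumed to be distinct.
    """
    matched = set()
    n_true = 0
    for t in true_bounds:
        candidates = [d for d in detected_bounds
                      if d not in matched and abs(d - t) <= tolerance]
        if candidates:
            matched.add(min(candidates, key=lambda d: abs(d - t)))
            n_true += 1
    n = len(true_bounds)
    n_missed = n - n_true
    n_false = len(detected_bounds) - len(matched)
    return {"TP": n_true / n, "FN": n_missed / n, "FP": n_false / n}


ratios = segmentation_ratios(
    true_bounds=[100, 200, 300, 400],
    detected_bounds=[102, 198, 250, 395, 450],
    tolerance=10,
)
```

In this toy example three of the four actual boundaries are found, one is missed, and two detections match no actual boundary, so the ratios are 0.75, 0.25, and 0.5. Because FP is normalized by N rather than by the number of detections, it can exceed the share of detections that are false.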

Figure 2. Signal segmentation in synthetic signal. (a) Original signal; (b) Signal filtered by MA; (c) Output of the Modified Varri (window length = 2 s, $A_1 = 7$ and $F_1 = 1$). As can be seen, all seven segments are correctly detected.

Figure 5. Signal segmentation in real EEG data. (a) Original signal; (b) Signal filtered by Savitzky-Golay filter; (c) Output of the Modified Varri (window length = 0.04 s, $A_1 = 7$ and $F_1 = 1$). As can be seen, all five segments are correctly detected.