Criteria for Weighted Moving-Mean Method

The moving-mean method is one of the conventional approaches for extracting a trend from a data set, and it is usually applied empirically. The degree of smoothing of the trend depends on the choice of window length and weight coefficients, which are associated with the change pattern of the data. Are there any uniform criteria for determining them? The present article responds to this fundamental problem. An investigation of many kinds of data shows that: 1) Within a certain range, the more points that participate in the moving-mean, the better the trend function. However, if the window is too long, the trend function tends to the ordinary global mean. 2) For a given window length, what matters is the choice of weight coefficients. For the five-point case, the local-midpoint, local-mean and global-mean criteria hold. Among these three criteria, the local-mean criterion has the strongest adaptability and is the one recommended.


Introduction
The moving-mean method is commonly used for removing noise from data or for finding the trend curve of a given data set [1] [2]. The documented improvements to this method lie in weighting the smoothing coefficients in various ways, such as the exponential moving-mean [3], the autoregressive moving-mean [4] and mean filtering [5]. When the values of a time series are affected by cyclical and random fluctuations, particularly large ones, it is difficult to grasp the development trend of the event. The moving-mean method can meet this need: it eliminates the influence of these factors and draws out the trend. If a uniform or regular criterion can be found for selecting the window length and weight coefficients for different data, it will bring great convenience and feasibility to the application of the moving-mean.
Regarding the window length, the present study explores the m-point moving-mean with a smoothing coefficient of 1/m, which defines the moving-mean value at the k-th point of the given data. Regarding the weight coefficients, this study takes the five-point moving-mean as an example. For convenience of calculation, the moving coefficients are set to be symmetrical, and the data after the moving-mean are then given by formula (1.2).
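As a hedged reconstruction of the two moving-mean formulas described in this paragraph, one plausible form is sketched below. The symbols $x_k$ (original data), $y_k$ (smoothed data) and the odd window $m = 2p + 1$ are assumptions, as is the assignment of $a$, $b$ and $c$ to the center, adjacent and outer points. That assignment is chosen because it makes the coefficient sets reported later in this paper (for example a = 0.3, b = 0.2, c = 0.15) sum to one.

```latex
% Assumed notation: x_k original data, y_k smoothed data, m = 2p + 1 (odd window).
% Equal-weight m-point moving-mean at the k-th point:
y_k = \frac{1}{m} \sum_{j=-p}^{p} x_{k+j}

% Symmetric five-point weighted moving-mean; a is assumed to weight the center point:
y_k = c\,x_{k-2} + b\,x_{k-1} + a\,x_k + b\,x_{k+1} + c\,x_{k+2},
\qquad a + 2b + 2c = 1
```

Under this reading, the traditional 1/5 moving-mean is the special case a = b = c = 0.2.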

Research on Window Length of Moving-Mean Method
An investigation of many kinds of data shows that, within a certain range, the more points that participate in the moving-mean, the better the trend function. However, if the window is too long, the trend function tends to the ordinary global mean: the range of the trend line obtained after the moving-mean becomes smaller, and it no longer reflects the fluctuation of the data (see Figure 1(a)). This situation is particularly evident for periodic functions (see Figure 1(b)). The figure also shows that when the number of smoothing points is too large, the local fluctuation direction of the trend line does not match the original data, because it is strongly affected by the surrounding data.
For periodically varying data, multiple-point smoothing shows that the best number of moving points is close to, but does not exceed, the number of points in one period. For example, data with a period of 12 points are smoothed best with 11 points (see Figure 2).
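The shrinking range of the trend line can be illustrated with a minimal sketch. The test signal (a constant level of 5.0 plus a 12-point cycle) and the function name `moving_mean` are made up for illustration and are not the data used in this study.

```python
import math

def moving_mean(x, m):
    """Equal-weight (1/m) moving mean over an odd window of m points.

    Only windows that fit entirely inside the data are evaluated,
    so the result has len(x) - m + 1 values.
    """
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    return [sum(x[k:k + m]) / m for k in range(len(x) - m + 1)]

# Made-up test signal: constant level 5.0 plus a cycle of 12 points.
x = [5.0 + math.sin(2 * math.pi * k / 12) for k in range(240)]

ranges = {}
for m in (3, 5, 11, 35):
    y = moving_mean(x, m)
    ranges[m] = max(y) - min(y)  # spread of the smoothed line
    print(m, round(ranges[m], 3))
```

For this signal the spread of the smoothed line decreases as the window grows, and with a 35-point window the smoothed values stay close to the global mean of 5.0.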

Research on Weight Coefficient of Moving-Mean Method
In the sense of least squares, the variance between the trend line and the data is computed from the squared differences between the two.
Journal of Applied Mathematics and Physics
It can be seen that this variance is smallest when the smoothed data coincide with the original data. Minimizing the variance between the smoothed data and the original data in the least-squares sense therefore cannot be used to determine the moving-mean coefficients, because the minimum simply reproduces the original data.
In this study, we propose the following three criteria for the selection of the moving-mean weight coefficients: 1) local-midpoint criterion; 2) local-mean criterion; 3) global-mean criterion.
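The following sketch illustrates why the original data cannot serve as the least-squares target. It assumes a symmetric five-point moving-mean in which a weights the center point, b the two adjacent points and c the two outer points, with a + 2b + 2c = 1. The data values, the grid step and all function names are hypothetical.

```python
def smooth5(x, a, b, c):
    """Symmetric five-point weighted moving mean (interior points only)."""
    return [c * x[k - 2] + b * x[k - 1] + a * x[k] + b * x[k + 1] + c * x[k + 2]
            for k in range(2, len(x) - 2)]

def variance(y, target):
    """Mean squared difference between two equally long sequences."""
    return sum((u - v) ** 2 for u, v in zip(y, target)) / len(y)

def coefficient_grid(steps=10):
    """All (a, b, c) with a, b on a 1/steps grid, c >= 0 and a + 2b + 2c = 1."""
    for i in range(steps + 1):
        for j in range((steps - i) // 2 + 1):
            yield i / steps, j / steps, (steps - i - 2 * j) / (2 * steps)

# Made-up data for illustration only.
x = [3.1, 4.7, 2.9, 5.3, 6.0, 4.2, 3.8, 5.9, 7.1, 5.5, 4.9, 6.3]
interior = x[2:-2]  # original values at the points where smooth5 is defined

best = min(coefficient_grid(), key=lambda w: variance(smooth5(x, *w), interior))
print(best)  # (1.0, 0.0, 0.0): no smoothing at all
```

The search returns the identity weights, so a different reference curve is needed to obtain useful coefficients.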

Local-Midpoint Criterion
One period lies between a maximum point and its two adjacent minimum points, or between a minimum point and its two adjacent maximum points. That is to say, two adjacent extreme points are half a period apart. Extreme-point symmetry refers to the midpoints of the line segments joining adjacent maximum and minimum points: for symmetric data, all of these midpoints lie on the zero-value line. Such a midpoint is called a local midpoint.
By performing cubic spline interpolation on the obtained local midpoints, a median curve $m(k)$ can be obtained. Using this curve as the criterion, the variance $\varepsilon^2$ between the data after the multiple-point weighted moving-mean and the corresponding values on the interpolation line of the local midpoints is obtained from formula (1.3). The weight coefficients for which this variance is a minimum are then solved for.
Section 2 shows that the window length of the moving-mean should be close to, but not more than, the number of points in one period. This study focuses on the 5-point weighted moving-mean of formula (1.2). Five points is somewhat short for data with a 12-point seasonal cycle, for which the calculated result is a = 0, b = 0, c = 0.5. A five-point window is better suited to wind speed data, which contain many varying frequencies. Therefore, this study solves for the smoothing coefficients with the smallest variance for 15 groups of wind speed data.
When the calculation accuracy is 0.1, 11 groups of data give a = 0.3, b = 0.2, c = 0.15; 3 groups give a = 0.4, b = 0.2, c = 0.1; and 1 group gives a = 0.3, b = 0.3, c = 0.05. The line after the moving-mean is smoother. Figure 3 shows the local-midpoint interpolation line and the curve smoothed with the corresponding coefficients for one set of wind speed data. For ease of comparison, the same set of data is used in the following sections. Following the two-line interpolation in the ESMD method [6] [7], the local midpoints are also divided, according to their coordinates, into odd-order and even-order midpoints, each with its own interpolation line, and the boundary is processed as in the ESMD method. The mean of the two interpolation lines is then taken, and the smoothing coefficients with the smallest variance are solved using this mean line as the standard. In this case, most of the data give a = 0.2, b = 0.2, c = 0.2, which is the same as the traditional 1/5 moving-mean.
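The single-line version of the local-midpoint criterion can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this study: piecewise-linear interpolation stands in for the cubic spline to keep the example dependency-free, no ESMD boundary processing is applied, a is taken as the center weight, and the test signal and function names are made up.

```python
import math

def smooth5(x, a, b, c):
    """Symmetric five-point weighted moving mean (interior points only)."""
    return [c * x[k - 2] + b * x[k - 1] + a * x[k] + b * x[k + 1] + c * x[k + 2]
            for k in range(2, len(x) - 2)]

def variance(y, target):
    """Mean squared difference between two equally long sequences."""
    return sum((u - v) ** 2 for u, v in zip(y, target)) / len(y)

def extrema_indices(x):
    """Indices of strict interior maxima and minima (assumes no ties)."""
    return [k for k in range(1, len(x) - 1)
            if (x[k] - x[k - 1]) * (x[k + 1] - x[k]) < 0]

def local_midpoints(x):
    """(position, value) midpoints of segments joining adjacent extreme points."""
    idx = extrema_indices(x)
    return [((i + j) / 2, (x[i] + x[j]) / 2) for i, j in zip(idx, idx[1:])]

def interpolate(points, t):
    """Piecewise-linear stand-in for the spline; flat beyond the end points."""
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]

def best_weights(x, target, steps=10):
    """Grid search over a, b (step 1/steps); c follows from a + 2b + 2c = 1."""
    grid = [(i / steps, j / steps, (steps - i - 2 * j) / (2 * steps))
            for i in range(steps + 1) for j in range((steps - i) // 2 + 1)]
    return min(grid, key=lambda w: variance(smooth5(x, *w), target))

# Made-up signal: slow cycle plus a faster, smaller fluctuation.
x = [math.sin(2 * math.pi * k / 20) + 0.3 * math.sin(2 * math.pi * k / 3.3 + 1.0)
     for k in range(100)]
mids = local_midpoints(x)
median_curve = [interpolate(mids, k) for k in range(2, len(x) - 2)]
a, b, c = best_weights(x, median_curve)
print(a, b, c)
```

With `steps=10` the grid matches a calculation accuracy of 0.1 for a and b, and c takes values in steps of 0.05.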

Local-Mean Criterion
Unlike Smith's local mean method [8], in this study all the data in each half cycle are added and divided by the number of points, which gives the local mean. Cubic spline interpolation is performed on the obtained local means to obtain a mean curve. Figure 4 shows the local-mean interpolation line and the curve smoothed with the corresponding coefficients for one set of wind speed data.
Most of the results are similar to the local-midpoint results, but 5 groups of data give different coefficients. When these five groups are recalculated with an accuracy of 0.01, the weight coefficients obtained by the local-mean and local-midpoint criteria are very similar (see Table 1), which indicates that the differences come from the limited calculation accuracy.
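The difference between the local mean and the local midpoint can be seen in a small sketch. It assumes that a half cycle runs from one extreme point to the next with both end points included, and that each local mean is placed at the center of its half cycle. These choices, the function names and the data are illustrative assumptions.

```python
def extrema_indices(x):
    """Indices of strict interior maxima and minima (assumes no ties)."""
    return [k for k in range(1, len(x) - 1)
            if (x[k] - x[k - 1]) * (x[k + 1] - x[k]) < 0]

def local_means(x):
    """(position, mean) of all data in each half cycle, end points included."""
    idx = extrema_indices(x)
    return [((i + j) / 2, sum(x[i:j + 1]) / (j - i + 1))
            for i, j in zip(idx, idx[1:])]

def local_midpoints(x):
    """(position, value) midpoints of segments joining adjacent extreme points."""
    idx = extrema_indices(x)
    return [((i + j) / 2, (x[i] + x[j]) / 2) for i, j in zip(idx, idx[1:])]

# Made-up asymmetric data: a slow rise followed by a sharp peak.
x = [0, 4, 1, 1.5, 2, 10, 3]
print(local_means(x))      # [(1.5, 2.5), (3.5, 3.625)]
print(local_midpoints(x))  # [(1.5, 2.5), (3.5, 5.5)]
```

In the second half cycle the midpoint is pulled up by the single peak value, whereas the local mean stays closer to the bulk of the data.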

Global-Mean Criterion
It has been explained above that, in the least-squares sense, the variance between the moving-mean and the original data is smallest when the smoothed data coincide with the original data. If the global mean alone is used as the standard, the calculated result will differ greatly from the original data. Therefore, taking both the original data and the global mean as constraints on the coefficient solution, the global-mean criterion is proposed.
The global-mean criterion refers to first finding the variance between the smoothed data and the original data, then finding the variance between the smoothed data and the global mean of the original data, and finally summing the two variances. The weight coefficients corresponding to the minimum sum are the best coefficients. The line is not smooth after the moving-mean processing. Figure 5 shows the curve smoothed with the corresponding coefficients for the wind speed data.
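The global-mean criterion can be sketched as a grid search over the summed variances. The sketch assumes a symmetric five-point moving-mean with center weight a and the constraint a + 2b + 2c = 1. The data, the grid step and the function names are hypothetical.

```python
def smooth5(x, a, b, c):
    """Symmetric five-point weighted moving mean (interior points only)."""
    return [c * x[k - 2] + b * x[k - 1] + a * x[k] + b * x[k + 1] + c * x[k + 2]
            for k in range(2, len(x) - 2)]

def variance(y, target):
    """Mean squared difference between two equally long sequences."""
    return sum((u - v) ** 2 for u, v in zip(y, target)) / len(y)

def global_mean_objective(x, weights):
    """Variance to the original data plus variance to the global mean."""
    y = smooth5(x, *weights)
    g = sum(x) / len(x)  # global mean of the original data
    return variance(y, x[2:-2]) + variance(y, [g] * len(y))

def best_weights(x, steps=10):
    """Grid search over a, b (step 1/steps); c follows from a + 2b + 2c = 1."""
    grid = [(i / steps, j / steps, (steps - i - 2 * j) / (2 * steps))
            for i in range(steps + 1) for j in range((steps - i) // 2 + 1)]
    return min(grid, key=lambda w: global_mean_objective(x, w))

# Made-up data for illustration only.
x = [3.1, 4.7, 2.9, 5.3, 6.0, 4.2, 3.8, 5.9, 7.1, 5.5, 4.9, 6.3, 5.1, 4.4]
a, b, c = best_weights(x)
print(a, b, c)
```

Because the first term pulls the result toward the original data and the second toward a constant, the minimizer lies between no smoothing and full flattening.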

Comprehensive Comparison of Criteria
The study [9] has shown that, in the processing of asymmetric data, the local mean is more representative than the local midpoint. For general data, however, the two differ little, which is why the weight coefficients obtained by the local-mean criterion and the local-midpoint criterion are relatively close. Therefore, in the comprehensive comparison here, the local-midpoint and the local-mean criteria are classified into one category.
The local-mean criterion, the global-mean criterion and the traditional criterion are comprehensively compared. The comparison covers the variance between the smoothed data and the original data, the weight coefficients corresponding to the minimum variance, and the characteristics of the data line after smoothing. The unit of variance is the square of the unit of the original data, whereas the unit of standard deviation is the same as that of the original data, which makes its physical meaning more intuitive. Moreover, the values of the variance are small, which makes the relationship between the data difficult to observe. The differences between the smoothed and original data for the 15 groups are therefore compared using the standard deviation (see Table 2).
It can be seen from Table 2 that the variances and corresponding weight coefficients of the local-midpoint and local-mean criteria are similar, but the variance of the local-mean criterion is generally smaller. This is because, for uniformly distributed data, the position of the local midpoint is similar to the local mean, but the local midpoint is susceptible to extreme points, so the local mean reflects the trend of the data better than the local midpoint.
Table 3 (fragment): non-smooth; can retain peaks; the degree of noise reduction after smoothing is generally large.
3) The line smoothed under the local-mean criterion is smooth as a whole, and the fluctuation of the smoothed data is small, which achieves a better noise removal effect and makes the overall trend of the data easier to observe.
As can be seen from the comprehensive comparison in Table 3, in general the variance between the data smoothed under the local-mean criterion and the original data is the smallest, and the data smoothed under the local-mean criterion are smoother than those obtained under the other criteria. However, for data with many points per cycle, its smoothing speed is slower than that of the other two criteria.

Conclusions
This paper draws some uniform conclusions from a study of the weighted moving-mean method. The extent to which the data are smoothed depends on the choice of the moving-mean window length and the weight coefficients. Both parameters need to be selected according to the change pattern of the original data.
Regarding the smoothing window length, within a certain range, the more points that participate in the moving-mean, the better the trend function. However, if the window is too long, the trend function tends to the ordinary global mean: the range of the trend line obtained after the moving-mean becomes smaller, and it no longer reflects the fluctuation of the data. This situation is particularly evident for periodic functions. When there are too many smoothing points, the local fluctuation direction of the trend line does not match the original data, because it is strongly affected by the surrounding data. For periodically varying data, the best number of moving points is close to, but does not exceed, the number of points in one period.
Regarding the choice of smoothing weight coefficients, this study compares the three newly established criteria with the traditional one.
1) If the smoothed data are to achieve noise removal with smooth variation, the local-midpoint criterion and the local-mean criterion can be used to solve for the weight coefficients.
2) If the data are to be smoothed quickly without being affected by intermediate data, the traditional criterion can be used to solve for the weight coefficients.
3) If the smoothed data are to retain the peak details and have greater smoothness, the global-mean criterion can be used to solve for the weight coefficients.
Observation of the experimental data shows that the weight coefficients obtained for the five-point weighted moving-mean under the local-midpoint and local-mean criteria are similar to the binomial coefficients. It is therefore conjectured that the n-point weighted moving-mean coefficients are similar to the binomial coefficients of order n − 1 (that is, the coefficients in the n-th row of the Yanghui triangle). Whether such a relationship really exists remains to be judged from multiple-point smoothing experiments on a large number of data sets.
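For reference, the binomial coefficients mentioned in this conjecture can be computed directly. The sketch below normalizes them so that they sum to one, which is an assumption made here so that they can be compared with moving-mean weights. The function name `binomial_weights` is hypothetical.

```python
from math import comb

def binomial_weights(n):
    """Binomial coefficients C(n-1, k), k = 0..n-1, scaled to sum to one."""
    total = 2 ** (n - 1)
    return [comb(n - 1, k) / total for k in range(n)]

print(binomial_weights(5))  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

Read as (c, b, a, b, c), these values can be set beside the most frequent result reported above, c = 0.15, b = 0.2, a = 0.3. The ordering is the same, but the values are not identical.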