An Integrated Use of Advanced T^2 Statistics and Neural Network and Genetic Algorithm in Monitoring Process Disturbance

The integrated use of statistical process control (SPC) and engineering process control (EPC) performs better than SPC or EPC alone. However, the integrated scheme introduces the problems of the "Window of Opportunity" and autocorrelation. In this paper, an advanced T^2 statistics model and a neural network scheme are combined to solve these problems: the T^2 statistics technique addresses autocorrelation, and the neural network technique addresses the "Window of Opportunity" and the identification of disturbance causes. Because neural network training converges slowly and is easily trapped in a local optimum, a genetic algorithm is used to train the network. Results of the simulation experiments show that this method detects the process disturbance quickly and accurately and also identifies the disturbance type.


Introduction
In an intensely competitive market, product quality plays an important role in gaining and keeping competitiveness. Both Statistical Process Control (SPC) and Engineering Process Control (EPC) are effective techniques for maintaining and improving product quality. EPC adjusts process variables to compensate for short-term output deviations caused by uncontrollable factors. For long-term process improvement, SPC is an effective technique for detecting out-of-control conditions and removing controllable factors. Many scholars have therefore proposed the integrated use of SPC and EPC. However, it is very difficult to monitor an EPC process with common SPC methods because of the problems of the "Window of Opportunity" and autocorrelation [1]. In the past, SPC techniques monitored only the process output, and information about the process inputs was usually ignored. In an EPC process, once the output deviation has been compensated by the feedback control action, only a short window remains for detecting the process disturbance. SPC charts may even fail to signal an out-of-control condition when the output deviation is small, because the EPC feedback mechanism compensates for such small disturbances quickly and completely. In addition, the optimality of SPC techniques rests on the assumption of independence over time, whereas process outputs at different times are autocorrelated.
To overcome these shortcomings, a few papers have developed joint-monitoring methods for feedback-controlled processes. These methods fall into two categories. In the first, various types of conventional SPC charts are combined to monitor the process [2][3]; for example, Huang C.H. proposed using a Shewhart control chart and a CUSUM control chart simultaneously to monitor the manufacturing process. This method can detect an out-of-control condition and can also recognize the disturbance type. However, the inherent problems of conventional SPC charts caused by the effects of feedback control actions remain unsolved. In the second, the controlled outputs and the manipulated inputs are monitored jointly using bipartite SPC, such as the multivariate CUSUM chart, the multivariate EWMA chart, T^2 statistics and the multivariate profile chart [4,5]. Although these methods effectively solve the autocorrelation problem, the "Window of Opportunity" (WO) problem is not completely settled, because they cannot detect a small process disturbance quickly within the scope of the WO. Furthermore, these methods do not easily identify the disturbance type, which is a crucial step in confirming and removing the controllable factors.
In this research, we put forward a new scheme that integrates the T^2 statistics technique, artificial neural networks and a genetic algorithm: the T^2 statistics technique is used to solve the problems of autocorrelation and missing information, and the neural network technique and genetic algorithm are used to solve the problems of the "Window of Opportunity" and the identification of disturbance causes.

Feedback-Controlled Process
For better understanding, we consider the following process under the feedback mechanism shown in Figure 1.
Here, θ and Φ are constants. a_t represents white noise that follows a standard normal distribution with mean 0 and variance σ^2 = 1. Let B be the usual backward shift operator, i.e., B a_t = a_{t-1}. m_t represents the process disturbance, which may take forms such as a step change or a process drift. y_t is the measured output value. Without loss of generality, the target value is assumed to be zero, so y_t represents the output deviation from the target value.
u_t is the feedback control action determined by the feedback mechanism. In industrial practice, several feedback controllers are used, such as PI controllers, I controllers, PID controllers and EWMA controllers, of which PID controllers are the most widely adopted. The PID feedback control rule is given in [6]. According to Equation 4, outputs at different times are autocorrelated, and inputs at different times are autocorrelated. Moreover, outputs and inputs are correlated with each other. So traditional SPC control charts, such as the Shewhart chart, the EWMA chart and the CUSUM chart, are not valid for monitoring the above process.
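To make the closed-loop behaviour concrete, the following is a minimal simulation sketch. The process equations are not legible in this text, so the model form used here is an assumption: the output is y_t = u_{t-1} + N_t + m_t, with ARMA(1,1) noise N_t = Φ N_{t-1} + a_t - θ a_{t-1} and a discrete PID rule u_t = -k_P y_t - k_I Σ y_j - k_D (y_t - y_{t-1}). The function name, the default k_D = 0 and the `noise_sd` switch are hypothetical choices for illustration only.

```python
import random

def simulate_closed_loop(n=200, phi=0.8, theta=0.5, kp=0.5, ki=0.5, kd=0.0,
                         disturbance=None, noise_sd=1.0, seed=0):
    """Simulate an ASSUMED ARMA(1,1) noise process under discrete PID feedback.

    Returns (y, u): output deviations and control actions, each of length n.
    kd=0.0 is a placeholder default, not a value taken from the paper.
    """
    rng = random.Random(seed)
    m = disturbance if disturbance is not None else [0.0] * n
    y, u = [], []
    noise_prev = a_prev = 0.0            # N_{t-1} and a_{t-1}
    u_prev = y_prev = y_sum = 0.0        # u_{t-1}, y_{t-1}, running sum of y
    for t in range(n):
        a = rng.gauss(0.0, noise_sd)                     # white noise a_t
        noise = phi * noise_prev + a - theta * a_prev    # ARMA(1,1) noise N_t
        y_t = u_prev + noise + m[t]                      # assumed process equation
        y_sum += y_t
        u_t = -kp * y_t - ki * y_sum - kd * (y_t - y_prev)  # assumed PID rule
        y.append(y_t)
        u.append(u_t)
        noise_prev, a_prev, u_prev, y_prev = noise, a, u_t, y_t
    return y, u

# A step of size 5 introduced at index 50, with the noise switched off so the
# compensation by the integral action is visible on its own.
step = [0.0] * 50 + [5.0] * 150
y, u = simulate_closed_loop(disturbance=step, noise_sd=0.0)
```

Under these assumptions the output jumps to 5 when the step arrives and is then driven back towards zero while u_t settles near -5, which illustrates why only a short window exists for detecting the disturbance from the output alone.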

Design of Standard T^2 Statistics Technique
The standard T^2 statistics method is used to deal with multiple-input processes. In this paper, the devised approach is similar to the standard T^2 statistics, but the data vectors are made up of the process inputs and outputs at different times. It measures the overall distance of an observation from the reference values, covering the process output, the input and the covariance of output and input; hence it is among the most commonly used schemes. According to Equation 4, complete monitoring information should include the control action at times t and t-1 and the process output at times t, t-1 and t-2. However, in a closed-loop process these five sets are collinear. In other words, any one set is equal to a linear combination of the other sets. So we can only select two, three or four of the five sets to build the monitoring scheme. We design the T^2 monitoring statistic as a quadratic form in the data vector Z_t weighted by the inverse of Σ, where Σ is the covariance matrix of Z_t. In light of the above analysis, the number of options N for Z_t is

N = C_5^4 + C_5^3 + C_5^2 = 25
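The count of candidate data vectors, obtained by choosing two, three or four of the five monitored quantities, can be checked by direct enumeration. The labels in `candidates` are illustrative names for the five quantities listed above, not notation from the paper.

```python
from itertools import combinations
from math import comb

# The five monitored quantities: control action at t, t-1; output at t, t-1, t-2.
candidates = ["u_t", "u_t-1", "y_t", "y_t-1", "y_t-2"]

# Every subset of size 2, 3 or 4 is one possible definition of Z_t.
options = [c for k in (2, 3, 4) for c in combinations(candidates, k)]

# Closed-form count: C(5,4) + C(5,3) + C(5,2).
n_options = comb(5, 4) + comb(5, 3) + comb(5, 2)
```

Both the enumeration and the closed form give 25 possible choices of Z_t.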
There is no commonly accepted approach for choosing the best Z_t in the T^2 statistics model. The selection of the model parameters depends on the problem to be solved, so the design of the model is as much an art as a science. Hotelling, Montgomery and Alt discussed the possibilities and advantages of using the T^2 statistics method to monitor the EPC process [7][8][9]. They designed the simplest and most basic form of Z_t, i.e., Z_t = [y_t, u_t]. Building on that work, Fugee Tsung elaborated on the problem and proposed that one could define Z_t = [y_t, u_t, u_{t-1}, u_{t-2}]^T or Z_t = [y_t, u_t, y_{t-1}, y_{t-2}]^T [1]. However, in these methods Σ is not estimated directly from the historical data but obtained from a very complex function of the parameters Φ, θ, k_P, k_D and k_I. So the T^2 control chart is not available for these methods.
According to Equations 2 and 3, since outputs and inputs are correlated, all inputs can be expressed as a combination of the process outputs at different times. In other words, all the information contained in the process inputs and outputs can be monitored as long as we monitor the outputs at different times. We propose to define Z_t = [y_t, y_{t-1}, y_{t-2}, …, y_{t-s}]^T. Selecting the value of s is a difficult and challenging task, and there is currently no universally recognized method for choosing it. In this research, simulation experiments are carried out to determine the value of s. For each choice, the experiments simulate the feedback-controlled process with step changes of step=5, 2 and 0.8. The values of the parameters Φ, θ, k_P, k_I and k_D are arbitrarily set to 0.8, 0.5, 0.5, 0.5 and -0. In light of these figures, for the same step value, the larger s is, the larger the value of T^2 and the quicker the disturbance is detected. However, the larger s is, the more false alarms occur, as in Figures 2 to 6. For the process with step=5, when s is equal to 4 there are two out-of-control points, but when s is equal to 5 there are four out-of-control points, two of which are false alarms. In the same way, for the processes with step=2 and step=0.8, when s increases from s=2 to s=4 the number of false-alarm points grows from 0 to 2.
Across the different step changes, the smaller the step value, the larger the value of s needed to detect the disturbance. For example, s=1 is enough to monitor the process with step=5, but s=3 is needed to detect the process with step=2 and step=0.8. According to Figures 2 to 14 and the above analysis, Z_t can be expressed as Z_t = [y_t, y_{t-1}, y_{t-2}, y_{t-3}]^T. According to Equation 7, y_t is a linear combination of the y_{t-i} (i = 0, 1, 2, 3). So Z_t follows a multivariate normal distribution, and the T^2 statistic D_t computed from Z_t follows a chi-squared distribution with p degrees of freedom, where p is the dimension of Z_t. The upper control limit UCL for D_t should therefore be χ^2_{α,p}. D_t contains the information of the output, the input and their correlation with each other. So the advanced T^2 statistics can effectively solve the problem of autocorrelation and reduce the problem of the "Window of Opportunity". However, once the system signals an out-of-control condition, it is difficult to interpret the result and search for the root cause of the process disturbance, as in Figure 15.
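The monitoring statistic built from lagged outputs can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the reference value is taken as zero (the target), Σ is estimated from an in-control history as the average outer product of the lagged vectors, and the chi-squared limit uses a closed-form tail that holds only for even p (p = 4 for s = 3). All function names are hypothetical.

```python
import math

def lagged_vectors(y, s):
    """Build Z_t = [y_t, y_{t-1}, ..., y_{t-s}] for every t with enough history."""
    return [[y[t - i] for i in range(s + 1)] for t in range(s, len(y))]

def covariance_about_zero(rows):
    """Estimate Sigma as the mean outer product (reference value assumed zero)."""
    p, n = len(rows[0]), len(rows)
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(p)]
            for i in range(p)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def t2_statistic(z, sigma):
    """Quadratic form z^T Sigma^{-1} z."""
    w = solve(sigma, z)
    return sum(zi * wi for zi, wi in zip(z, w))

def chi2_upper_limit(alpha, p):
    """Upper-alpha chi-squared quantile for even p, found by bisection."""
    if p % 2:
        raise ValueError("this closed-form tail is only valid for even p")
    k = p // 2

    def tail(x):
        # P(X > x) = exp(-x/2) * sum_{j<k} (x/2)^j / j!  for p = 2k
        h = x / 2.0
        return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(k))

    lo, hi = 0.0, 1.0
    while tail(hi) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if tail(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

In use, Σ would be estimated once from in-control data via `covariance_about_zero(lagged_vectors(y_ref, 3))`, and each new Z_t would signal when `t2_statistic(z, sigma)` exceeds `chi2_upper_limit(alpha, 4)`.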
Figure 15 shows the process with a drift disturbance of slope=1. However, there is no essential distinction between Figure 15 and Figures 2 to 14, such as a significant upward or downward trend, that would identify the disturbance type. So the advanced T^2 statistics technique cannot be used alone.

Artificial Neural Networks
Artificial neural networks are modeled on neural activity in the human brain and have developed rapidly since the 1980s [10]. The main characteristics of neural networks are holistic use of the network, large-scale parallel distributed processing, the ability to learn associations, and a high degree of fault tolerance and robustness. However, neural networks easily fall into a local optimum, converge slowly and can oscillate. A genetic algorithm [11] has strong macro-search capabilities and a greater probability of finding the global optimal solution. So a genetic algorithm can overcome the shortcomings of neural networks if it is used to carry out the pre-search. In this paper, a novel algorithm combining the neural network algorithm and the genetic algorithm is proposed. The framework of the neural network is shown in Figure 16.
The network is composed of an input layer, a hidden layer and an output layer. The input layer has three neurons, D_t, D_{t-1} and D_{t-2}, representing the value of the T^2 statistic at times t, t-1 and t-2. The choice of input layer is a key decision that has a great impact on the effectiveness of the network, and there is currently no commonly accepted method for selecting it. In this paper, an all-possible-regressions analysis [12,13] is used to define the input layer according to the R^2_P, AIC and C_P criteria. The input layer is assumed to be a possible combination of D_t, D_{t-1}, D_{t-2}, D_t - D_{t-1}, D_{t-1} - D_{t-2} and D_t - D_{t-2}. The purpose of this method is to select a good combination so that the regression models can be examined in detail, leading to the selection of the final input vectors to be used [12]. The result is shown in Table 1. In light of the R^2_P, AIC and C_P criteria, we select the (D_t, D_{t-1}, D_{t-2}) combination because it has the largest R^2_P and the smallest AIC and C_P values in Table 1.

Neural Network Training Based on Genetic Algorithm
1) Determination of fitness function
The genetic algorithm is used to optimize the weights and thresholds of the neural network, that is, to obtain the optimum combination of weight and threshold values. The output error measures the quality of a combination. Hence, the fitness function of an individual chromosome should be a function of the output error of the BP network. The ideal output value is denoted D_j and the actual output value is denoted A_j. The fitness function f(E) is defined from the error between D_j and A_j (Equation 1).
2) Genetic manipulation
Assume that the group size is M and the fitness of individual i is F_i. The probability of individual i being selected is computed from its fitness F_i over the group of M individuals. An arithmetic crossover operator, which is designed for floating-point encoding, is adopted, and a uniform mutation operator is introduced.
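The genetic operators named above can be sketched in a few lines. The exact fitness and selection formulas are not legible in this text, so two assumptions are made and should be read as illustrative only: fitness is taken as 1 / (1 + E) with E the sum of squared errors between D_j and A_j, and selection is fitness-proportional (roulette wheel). The arithmetic crossover and uniform mutation follow their standard textbook definitions; all function names are hypothetical.

```python
import random

def sum_squared_error(desired, actual):
    """E = sum_j (D_j - A_j)^2 over the network outputs."""
    return sum((d - a) ** 2 for d, a in zip(desired, actual))

def fitness(desired, actual):
    """ASSUMED fitness form: larger when the output error is smaller."""
    return 1.0 / (1.0 + sum_squared_error(desired, actual))

def selection_probabilities(fitnesses):
    """ASSUMED roulette-wheel rule: P_i = F_i / sum of all F_k."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def arithmetic_crossover(parent1, parent2, rng):
    """Blend two real-valued chromosomes with a random weight in [0, 1)."""
    lam = rng.random()
    child1 = [lam * a + (1.0 - lam) * b for a, b in zip(parent1, parent2)]
    child2 = [(1.0 - lam) * a + lam * b for a, b in zip(parent1, parent2)]
    return child1, child2

def uniform_mutation(chromosome, rate, low, high, rng):
    """Replace each gene, with probability `rate`, by a uniform draw in [low, high]."""
    return [rng.uniform(low, high) if rng.random() < rate else gene
            for gene in chromosome]

rng = random.Random(0)
probs = selection_probabilities([1.0, 1.0, 2.0])
c1, c2 = arithmetic_crossover([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], rng)
mutated = uniform_mutation(c1, 0.05, -1.0, 1.0, rng)
```

Each chromosome here stands for the flattened weights and thresholds of the network; arithmetic crossover keeps the children on the line segment between the parents, which suits floating-point encoding.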

Simulation Experiments
The group size M, crossover probability, mutation probability, training error target and generation gap are set to 100, 0.8, 0.05, 0.005 and 0.7, respectively. To verify the performance of the above method, we carry out a large number of simulation experiments modeled on actual production. The experiments are divided into three stages.
First, 500 "in-control" sample sets (m_t = 0) and 500 "out-of-control" sample sets (m_t ≠ 0), each of which contains 200 data points generated from an ARMA(1,1) noise model, are selected to train the neural network. The 500 out-of-control sample sets represent a process that is upset by a step change of size 0.5/1/2/3/5 at data point 50, which is eliminated quickly and completely at data point 150. Likewise, out-of-control sample sets are generated with a process drift of slope 0.25/0.5/1/2/3 at data point 50, which is removed quickly and completely at data point 100.
Second, once the output error is within the permitted range, the objective of training the neural network with the genetic algorithm has been achieved, and the neural network can be used to monitor the process disturbance. 200 out-of-control sample sets, generated with step=0.5, 1, 1.5, 2, 3, 5 at time t and slope=0.5, 1, 1.5, 2, 3 at time t, are used to verify the performance of the above method. The result is shown in Table 2.
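The step and drift disturbances used to build the out-of-control sample sets can be written as simple profiles of m_t. This is a sketch under assumptions: data points are indexed from 0, the disturbance is active from `start` up to but not including `end`, and the drift is taken to be zero at its starting point and to grow linearly afterwards. The function names are hypothetical.

```python
def step_disturbance(n, size, start, end):
    """m_t = size while start <= t < end, and 0 otherwise."""
    return [float(size) if start <= t < end else 0.0 for t in range(n)]

def drift_disturbance(n, slope, start, end):
    """m_t = slope * (t - start) while start <= t < end, and 0 otherwise."""
    return [float(slope) * (t - start) if start <= t < end else 0.0
            for t in range(n)]

# Profiles matching the training description: 200 points per sample set,
# a step present from point 50 to 150 and a drift present from point 50 to 100.
step_profiles = {size: step_disturbance(200, size, 50, 150)
                 for size in (0.5, 1, 2, 3, 5)}
drift_profiles = {slope: drift_disturbance(200, slope, 50, 100)
                  for slope in (0.25, 0.5, 1, 2, 3)}
```

Each profile would be added to the noise of one simulated sample set, giving the m_t ≠ 0 cases that the network learns to separate from the in-control sets.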
Finally, for comparison, the Shewhart chart in the Minitab software is applied to the above sample sets with step changes. The result is shown in Table 3.

Result Analysis of Simulation Experiment
As seen from Figures 17 and 18, the neural network alone needs 1200 steps to converge to the error target value, whereas the neural network trained with the genetic algorithm needs only 550 steps. So the genetic algorithm reduces training time significantly. Furthermore, if the neural network is used alone, the error target value cannot be reached when the step is small, such as 1 and 0.8.
In actual manufacturing, parameters often change with the environment. So we choose five combinations of Φ, θ, k_P, k_D and k_I in order to verify the method and cover a reasonable range of the parameter space. According to Table 1, the values of the parameters Φ and θ have a serious impact on the resolution capability of the integrated method. The method works well when a large positive Φ is combined with a small positive θ. By contrast, combinations of a positive Φ and a negative θ reduce the ability to identify the process disturbance accurately. There is no obvious correlation between changes in the controller parameters k_P, k_D, k_I and the monitoring ability. Compared with a drift disturbance, a step change is easier to monitor.
According to Tables 3 and 4, the advantage of the integrated method is significant. The neural network requires only one sample to recognize the disturbance and identify the disturbance type, whereas the Shewhart chart requires an average of 3 to 7 samples to recognize the process disturbance with step=5. When step=2 and 3, an average of 70 to 100 samples are required to detect the disturbance. Disturbances with step=1 and 0.5 cannot be detected at all.

Figure 14. T^2 chart detecting the disturbance with step=0.8 and s=4

Figure 16. A three-layer neural network

Figure 17. Training error curve of neural networks