The data topology of uniform experimental design (UD) is too complex for a regression to be reliable without care. In this paper, the principle and method of distinguishing training data from testing data are described so that a sound regression can be obtained when uniform experimental design is combined with support vector regression (SVR). Two equivalent tools, the smallest enclosing hypersphere (SEH) perceptron and the enclosing simplex (ES) perceptron, are provided to discover the topological relationships among the process parameter data. As an application, a series of experiments on laser cladding layer quality was conducted by UD to obtain the relationship between load, velocity, and wearing capacity. The results show that only the testing data recommended by the two perceptrons are forecast well by SVR. Therefore, the two perceptrons can guide experiments whose process parameter data have a complex topological structure, and the approach can be extended to a much wider range of experiments.

Much research focuses on combining experimental design with nonlinear regression. Experimental design methods include uniform design, central composite design, and Taguchi's approach; nonlinear regression methods include artificial neural networks, support vector machines, and so on. Some studies use experimental design to optimize the nonlinear regression parameters; others use nonlinear regression to optimize the experimental design so as to find the best process parameters for the desired results.

Yanwei Li et al. [ ] Ti_{6}Al_{4}V powder on TC4 substrate. Response surface methodology was used to build the mathematical model. Dongxia Yang et al. [

However, an important step, the validity of the regression, has been neglected in existing research. A search point may leave the experimental domain without this being obvious, and it will then receive a poor forecast from the regression surface. This is particularly likely when the process parameters arranged by the experimental design have a complex topological boundary, as in uniform design.

In this paper, we take the case of uniform design combined with support vector regression and propose two perceptrons to determine the topological boundary of the parameters (i.e., to distinguish training data from testing data). An experimental data set on the wear behavior of a laser cladding layer is then studied to demonstrate how the two perceptrons work.

The process parameter vector should lie inside the experimental domain; if it does not, the forecast is unreasonable because of data absence. For example, in

Should the forecast value remain constant outside the experimental data space simply because no experiments were performed there? Certainly not. SVR forecasts well within the experimental data space, but it cannot forecast outside it.

Outside the experimental domain, SVR forecasts are false: the regression curve/surface levels off toward a constant value (the constant b of the SVR pattern function, Equation (9)), which produces a false forecast.

Although distinguishing the inside from the outside of the experimental data space is easy in one dimension (only one parameter in

In this paper, the task is to distinguish training data from testing data among many experimental data points when the experiment is arranged by UD. The key question is whether the testing data lie inside the training data domain, because the regression function is generated from the training data. If the testing data lie outside that domain, the regression is unreasonable because of data insufficiency.

Two equivalent perceptrons are provided to discover this topological relationship in a higher-dimensional parameter space: the smallest enclosing hypersphere (SEH) perceptron and the enclosing simplex (ES) perceptron.

The SEH in a feature space defined by a kernel k enclosing a dataset

The pattern function is:

where:

And k is a radial basis kernel:

where
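
The paper's SEH perceptron is trained iteratively in the kernel feature space. As a simplified, input-space sketch (using the well-known Bādoiu-Clarkson minimum-enclosing-ball iteration as a stand-in for the paper's kernelized procedure; the function names and data are illustrative, not the paper's), membership can be tested by comparing a point's distance to the ball center against the radius:

```python
import math

def enclosing_ball(points, iterations=1000):
    """Approximate the smallest enclosing ball of `points`:
    repeatedly step the centre toward the farthest point
    with a shrinking step size (Badoiu-Clarkson iteration)."""
    c = list(points[0])
    for t in range(1, iterations + 1):
        far = max(points, key=lambda p: sum((pi - ci) ** 2 for pi, ci in zip(p, c)))
        step = 1.0 / (t + 1)
        c = [ci + step * (fi - ci) for ci, fi in zip(c, far)]
    r = max(math.dist(c, p) for p in points)
    return c, r

def inside_ball(x, c, r, tol=1e-9):
    """SEH-style membership test: does x lie inside the enclosing ball?"""
    return math.dist(x, c) <= r + tol

train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
c, r = enclosing_ball(train)
print(inside_ball((0.5, 0.5), c, r))   # interior point
print(inside_ball((2.0, 2.0), c, r))   # clearly outside the data
```

A kernelized SEH replaces the Euclidean distance with the feature-space distance induced by the radial basis kernel; the membership test is otherwise the same.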

Another simple method can judge whether a point lies inside or outside a point set. A point inside the specimen point set should be enclosed by the simplex consisting of its n closest points; in an s-dimensional space, n equals s + 1. For example, in 2D space, an internal point should be enclosed by its 3 closest points (for example, in

Generally, in an s-dimensional space, there are s + 2 determinants associated with a point

If
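
In 2D (s = 2), the ES test therefore reduces to taking the s + 1 = 3 nearest training points and checking, via the signs of the edge determinants, whether the query point is enclosed by that triangle. A minimal sketch (the determinant convention and all names are assumptions for illustration, not the paper's exact formulation):

```python
import math

def sign(o, a, b):
    """2x2 determinant giving the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (b[0] - o[0]) * (a[1] - o[1])

def es_perceptron(x, points, k=3):
    """ES test in 2D: x counts as 'inside' if the triangle formed by
    its k = s + 1 = 3 nearest neighbours encloses it, i.e. the three
    edge determinants share one sign (zero means on the boundary)."""
    nearest = sorted(points, key=lambda p: math.dist(p, x))[:k]
    a, b, c = nearest
    d1, d2, d3 = sign(x, a, b), sign(x, b, c), sign(x, c, a)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
print(es_perceptron((0.4, 0.4), pts))   # enclosed by its 3 nearest points
print(es_perceptron((3.0, 0.0), pts))   # edge point: not enclosed
```

In higher dimensions the triangle becomes a simplex of s + 1 vertices and the sign test uses the s + 2 determinants mentioned above.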

The distance L between two points

The ES perceptron involves only distance calculation, reordering, and determinant evaluation, while the SEH perceptron iterates and depends on 3 artificial parameters (

Considering a training dataset:

Choosing parameter

With the optimization solution

where

SVR can obtain a nonlinear regression function g(x) from a training dataset without an a priori, manual choice of the functional form. The training dataset is the one recommended by the SEH or ES perceptron.
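
The SVR pattern function has the form g(x) = Σ_i α_i k(x_i, x) + b with the radial basis kernel. As a self-contained stand-in (regularized kernel least squares with b fixed to the target mean, rather than the ε-insensitive SVR of the paper; all names and data are illustrative), the sketch below also reproduces the failure mode discussed earlier: far outside the data, every kernel term vanishes and g(x) collapses to the constant b:

```python
import math

def rbf(x, z, delta2=0.5):
    """Radial basis kernel k(x, z) = exp(-||x - z||^2 / delta2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / delta2)

def solve(A, y):
    """Solve the linear system A w = y by Gaussian elimination
    with partial pivoting (no external libraries needed)."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def fit_kernel_regressor(X, y, lam=1e-9):
    """Fit g(x) = sum_i alpha_i * k(x_i, x) + b, with b fixed to the
    target mean and alpha from a ridge-regularised kernel solve."""
    b = sum(y) / len(y)
    K = [[rbf(xi, xj) + (lam if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    alpha = solve(K, [yi - b for yi in y])
    return lambda x: sum(a * rbf(xi, x) for a, xi in zip(alpha, X)) + b

X = [(0.0,), (0.25,), (0.5,), (0.75,), (1.0,)]
y = [math.sin(2 * math.pi * x[0]) for x in X]
g = fit_kernel_regressor(X, y)
print(abs(g((0.25,)) - y[1]) < 1e-3)   # reproduces a training target
print(g((100.0,)))                      # far from the data: only b survives
```

The flat far-field value is exactly why a perceptron must confirm that a testing datum lies inside the training domain before g(x) is trusted.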

UD was proposed by Wang Yuan and Fang Kaitai [

A laser cladding layer quality forecasting experiment is conducted to make the principle clear. We coat a Ni-based alloy onto CrMo by the laser cladding process in order to improve the wear resistance of the substrate material. The experimental task is to obtain the wearing capacity under various combinations of the load (the pressure applied to the material) and the loading velocity (the velocity at which the material is moved). Thus, this experiment has two experimental factors (process parameters), load and velocity, and the experimental target is the wearing capacity.

The number of levels of the load and the velocity can be chosen as any natural number, but a higher number of levels requires more experiments. Once the experimental factors and levels are determined, the experiment can be arranged by the UD table and its application table, which can be found in Ref. [

Here, the number of levels is set to 16, so only 16 experiments are needed (an orthogonal experiment would need 16^{2} = 256). The levels of the load and velocity are 1 - 16 MPa and 0.24 - 3.87 m/s, respectively. The combinations of the two parameters and the corresponding experimental results are listed in

The wearing capacity varies with load and velocity, so the aim of the regression is to obtain the function wearing capacity = f(load, velocity). The trouble is that the domain is not well ordered: if the testing data we choose lie outside the domain, they are invalid. Here, we use the SEH or ES perceptron to judge whether the testing data lie inside the domain.

First, the process parameters are normalized. The corresponding nondimensional quantities are shown in

Order number | Load l (MPa) | Velocity v (m/s) | Wearing capacity w (mg)
---|---|---|---
1 | 1 | 2.42 | 1.56
2 | 2 | 0.73 | 91.38
3 | 3 | 3.14 | 21.45
4 | 4 | 1.45 | 9.53
5 | 5 | 3.87 | 90.18
6 | 6 | 2.18 | 38.01
7 | 7 | 0.48 | 100.58
8 | 8 | 2.90 | 127.95
9 | 9 | 1.21 | 44.91
10 | 10 | 3.63 | 141.38
11 | 11 | 1.93 | 74.76
12 | 12 | 0.24 | 126.53
13 | 13 | 2.66 | 147.12
14 | 14 | 0.97 | 11.44
15 | 15 | 3.38 | 212.07
16 | 16 | 1.69 | 204.93

Order number | Load l | Velocity v | Wearing capacity w
---|---|---|---
1 | 0 | 0.6 | 0
2 | 0.0666666666666667 | 0.133241379310345 | 0.426705301159035
3 | 0.133333333333333 | 0.8 | 0.0945278358350751
4 | 0.2 | 0.333241379310345 | 0.0379218126543796
5 | 0.266666666666667 | 1 | 0.421005130153905
6 | 0.333333333333333 | 0.533241379310345 | 0.173142694280828
7 | 0.4 | 0.0664827586206897 | 0.470406612198366
8 | 0.466666666666667 | 0.733241379310345 | 0.600418012540376
9 | 0.533333333333333 | 0.26648275862069 | 0.205918677560327
10 | 0.6 | 0.933241379310345 | 0.664212426372791
11 | 0.666666666666667 | 0.46648275862069 | 0.347710431312939
12 | 0.733333333333333 | 1.95904116683588E−18 | 0.593767813034391
13 | 0.8 | 0.66648275862069 | 0.69147824434733
14 | 0.866666666666667 | 0.2 | 0.04678890366711
15 | 0.933333333333333 | 0.86648275862069 | 1
16 | 1 | 0.4 | 0.965941478244347

where p is a given process parameter, p_{min} and p_{max} are the minimum and maximum of p, and u is the normalized result. The 16 load-velocity 2D points are shown in
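
The normalization is the usual min-max map u = (p − p_{min})/(p − p_{max} − p_{min} form, i.e. u = (p − p_{min})/(p_{max} − p_{min}). A minimal sketch (variable names illustrative) that reproduces the load column of the normalized table:

```python
def normalize(values):
    """Min-max normalization u = (p - p_min) / (p_max - p_min),
    mapping each process parameter onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

loads = list(range(1, 17))        # 1 - 16 MPa, as in the experiment
u = normalize(loads)
print(u[0], u[4], u[-1])          # point 5: (5 - 1) / 15 = 0.2666...
```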

The strategies described in this paper have been implemented as a Delphi package, and the artificial parameters for SEH are set to δ^{2} = 0.5, iteration step length = 0.01, and 100,000 iterations. ES needs no artificial parameters.

If the training and testing data are chosen at random among the 16 points, the testing data may lie outside the training data, making the regression analysis invalid. For example, if point 15 (an edge point) is chosen as the testing datum and the remainder as the training data, point 15 is found to lie outside the training data, which is detected by the SEH perceptron, as shown in

However, if an internal point is chosen as the testing datum, the situation changes. As shown in

If a point at the edge of the training data that still lies inside the SEH is chosen as a testing datum, the forecast is just as good, as shown in

When point i is the testing datum and the others (excluding i) are the training data, the forecast value is compared with the experimental value in

Category | Order number | Regressive value | Experimental value | Error
---|---|---|---|---
Training data | 1 | 9.99999999998066E−5 | 0 | 9.99999999998066E−5
 | 2 | 0.426605301159035 | 0.426705301159035 | 0.000100000000000194
 | 3 | 0.0946278358350749 | 0.0945278358350751 | 9.9999999999808E−5
 | 4 | 0.0380218126543794 | 0.0379218126543796 | 9.9999999999808E−5
 | 5 | 0.420905130153904 | 0.421005130153905 | 0.000100000000000193
 | 6 | 0.173242694280828 | 0.173142694280828 | 9.99999999998061E−5
 | 7 | 0.470506612198366 | 0.470406612198366 | 9.99999999998062E−5
 | 8 | 0.600318012540376 | 0.600418012540376 | 0.000100000000000191
 | 9 | 0.205818677560327 | 0.205918677560327 | 0.000100000000000193
 | 10 | 0.664312426372791 | 0.664212426372791 | 9.99999999998063E−5
 | 12 | 0.593667813034391 | 0.593767813034391 | 0.000100000000000191
 | 13 | 0.69157824434733 | 0.69147824434733 | 9.9999999999804E−5
 | 14 | 0.0468889036671098 | 0.04678890366711 | 9.99999999998016E−5
 | 15 | 0.9999 | 1 | 0.0001
 | 16 | 0.965841478244347 | 0.965941478244347 | 0.000100000000000188
Testing data | 11 | 0.290105323107535 | 0.347710431312939 | 0.057605108205404

Category | Order number | Regressive value | Experimental value | Error
---|---|---|---|---
Training data | 1 | 0.000100000000000065 | 0 | 0.000100000000000065
 | 2 | 0.426605301159035 | 0.426705301159035 | 9.9999999999935E−5
 | 3 | 0.0946278358350751 | 0.0945278358350751 | 0.000100000000000066
 | 4 | 0.0380218126543797 | 0.0379218126543796 | 0.000100000000000067
 | 5 | 0.420905130153905 | 0.421005130153905 | 9.99999999999346E−5
 | 6 | 0.173242694280828 | 0.173142694280828 | 0.000100000000000065
 | 7 | 0.470506612198366 | 0.470406612198366 | 0.000100000000000066
 | 8 | 0.600318012540376 | 0.600418012540376 | 9.99999999999322E−5
 | 9 | 0.206018677560327 | 0.205918677560327 | 0.000100000000000065
 | 10 | 0.664312426372791 | 0.664212426372791 | 0.000100000000000065
 | 11 | 0.347610431312939 | 0.347710431312939 | 9.99999999999336E−5
 | 12 | 0.593667813034391 | 0.593767813034391 | 9.99999999999296E−5
 | 13 | 0.691578244347331 | 0.69147824434733 | 0.000100000000000063
 | 14 | 0.0468889036671101 | 0.04678890366711 | 0.000100000000000061
 | 15 | 0.9999 | 1 | 9.99999999999317E−5
 | 16 | 0.965841478244347 | 0.965941478244347 | 9.99999999999297E−5

If point 5 is chosen as a testing datum, the three closest points are 3, 8, and 10 (

Category | Order number | Regressive value | Experimental value | Error
---|---|---|---|---
Training data | 1 | 9.99999999998298E−5 | 0 | 9.99999999998298E−5
 | 2 | 0.426605301159035 | 0.426705301159035 | 0.0001
 | 3 | 0.0946278358350749 | 0.0945278358350751 | 9.99999999998314E−5
 | 4 | 0.0380218126543795 | 0.0379218126543796 | 9.99999999998319E−5
 | 5 | 0.420905130153904 | 0.421005130153905 | 0.0001
 | 6 | 0.173242694280828 | 0.173142694280828 | 9.99999999998295E−5
 | 8 | 0.600318012540376 | 0.600418012540376 | 0.0001
 | 9 | 0.206018677560327 | 0.205918677560327 | 9.99999999998301E−5
 | 10 | 0.664312426372791 | 0.664212426372791 | 9.99999999998302E−5
 | 11 | 0.347610431312939 | 0.347710431312939 | 0.000100000000000169
 | 12 | 0.593667813034391 | 0.593767813034391 | 0.0001
 | 13 | 0.69157824434733 | 0.69147824434733 | 9.99999999998279E−5
 | 14 | 0.0468889036671098 | 0.04678890366711 | 9.99999999998255E−5
 | 15 | 0.9999 | 1 | 0.0001
 | 16 | 0.965841478244347 | 0.965941478244347 | 0.0001
Testing data | 7 | 0.511627554231799 | 0.470406612198366 | 0.0412209420334334

Testing datum | Forecasting value | Experimental value | Error
---|---|---|---
1^{*} | 0.0946121461542633 | 0 | 0.0946121461542633
2^{*} | 0.304917078854684 | 0.426705301159035 | 0.121788222304351
3 | 0.105857476045276 | 0.0945278358350751 | 0.0113296402102009
4 | 0.058732997030738 | 0.0379218126543796 | 0.0208111843763584
5^{*} | 0.697172327371774 | 0.421005130153905 | 0.276167197217869
6 | 0.23007565198397 | 0.173142694280828 | 0.0569329577031421
7 | 0.511627554231799 | 0.470406612198366 | 0.0412209420334334
8 | 0.638136672782094 | 0.600418012540376 | 0.037718660241718
9 | 0.228954275971837 | 0.205918677560327 | 0.0230355984115102
10 | 0.720643430299488 | 0.664212426372791 | 0.0564310039266972
11 | 0.290105323107535 | 0.347710431312939 | 0.057605108205404
12^{*} | 0.0340008498964455 | 0.593767813034391 | 0.559766963137946
13 | 0.62181360078527 | 0.69147824434733 | 0.06966464356206
14 | 0.017009590761577 | 0.04678890366711 | 0.029779312905533
15^{*} | 0.564395828283023 | 1 | 0.435604171716977
16^{*} | 0.142331101478923 | 0.965941478244347 | 0.823610376765424

^{*}Vertex of the specimen parameter point set.

If a point lies slightly outside the domain, ES will detect this while SEH will not. Thus, the domain determined by ES is smaller than that determined by SEH.

The main contribution of this paper is to answer the question of why and how to distinguish training data from testing data when uniform experimental design is combined with nonlinear regression.

In this paper, two equivalent perceptrons, the SEH perceptron and the ES perceptron, are proposed to discover the topological boundary of the process parameter vectors and to distinguish training data from testing data. The distinguishing procedure determines whether a testing datum lies inside the training data domain. As an application, experiments on laser cladding layer quality forecasting, arranged by uniform design, are used to examine how SEH and ES combine with SVR: the forecast values of the testing data recommended by the two perceptrons are compared with the experimental values. The results show that only the testing data recommended by the two perceptrons are forecast well by SVR, and that the domain determined by ES is smaller than that determined by SEH.

Thus, the two perceptrons can guide experiments whose process parameter data have a complex topological structure. Moreover, the application is not restricted to the experiment in this paper and can be extended to a much wider range of experiments.

This work was supported by China Post-doctoral Foundation No. 2012M520572, Tianjin Municipal Education Commission Grant No.20120401, and Tianjin Municipal Science and Technology Commission Key Grant No. 14JCZDJC39500.

No. | Three closest points | Determinants | Inside or outside the triangle | Recommend or not
---|---|---|---|---
15 | 13, 16, 10 | | Outside | No
11 | 13, 9, 14 | | Inside | Yes
5 | 3, 8, 10 | | Outside | No
8 | 10, 6, 11 | | Inside | Yes
17^{*} | 7, 2, 4 | | Outside | No
18^{*} | 13, 11, 8 | | Inside | Yes

^{*}Represents the order number of a new experiment.

