Experimental Design for Optimizing a Mixture of Materials plus an Evaporating Solvent

This paper illustrates a further way of using the CARSO procedure for response surface analyses derived from experimental designs based on Double Circulant Matrices (DCMs). We report a case study on a design for a mixture of three chemicals plus an evaporating solvent, in order to compare the reliability of designs based either on the three solid components only or on all four factors. We show that both designs give the same results, but the second is preferable because it represents the real situation at the beginning of the process, so that the amount of solvent required for each experiment is known in advance. The same approach applies to any number of factors, using the corresponding DCM.


Introduction
In our previous paper, published last November [1], we presented a new strategy, based on Double Circulant Matrices (DCMs), for collecting the data needed to define a response surface. The strategy requires only a very small number of experiments and gave better results than the previous method based on D-optimal design [2], in the context of industrial research aimed at reaching the best level of the technological properties a mixture is expected to exhibit. DCMs have requirements similar to those of Central Composite Designs (CCDs), which represent the best way to generate response surfaces. The final response surface model is obtained by the previously developed CARSO method [3]: the surface equation is derived from a PLS model on the expanded matrix, which contains linear, squared and bifactorial terms, and its extreme points are studied by Lagrange analysis. Here we present the results for cases where one of the components of the mixture is a solvent that vaporizes during deposition. This situation is quite frequent with industrial painting materials, and we wish to understand the best way of choosing the experimental design. The most intriguing case is a mixture of three materials and an evaporating solvent.

Problem Formulation
In principle one could work with either a three- or a four-component mixture, depending on whether the solvent is excluded or included, but this would generate two different nomenclatures for the same formula, which may be difficult to handle. Since the solvent must be used from the beginning, we decided to include it in the experimental design, even though it is no longer present at the end of the process. This greatly simplifies the handling of the experimental work. The solvent is a temporary ingredient that brings the solid chemicals into a fluid state. It is, however, desirable that the mixture assume the solid state very quickly. Solvents remain in the mixture during the whole printing process until they are removed immediately after impression. They are practically absent in the finished print, but are extremely important in all stages before drying. The objective of this paper is to investigate whether the experimental designs with 3 or 4 variables give the same or different results.

Experimental Design and Data Collection
The experimental design with coded data is the one used in ref. 1 for DCM4, but with only one central point (Table 1). Table 2 shows the decoded (true) values of the variables, and Table 3 reports the true compositions of our nine mixtures (objects). The experimental values of the technological property on polypropylene for each of these mixtures are reported in Table 4. The variables x1, x2 and x3 are solid components, while x4 is the volatile part of the mixture.
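The decoding from coded to real values can be sketched as a linear map between the lowest and highest level of each variable. This is only an illustration: the function name `decode`, the assumption that the coded range is [-1, 1], and the numerical limits are hypothetical and are not taken from Table 2.

```python
def decode(coded, low, high):
    """Map a coded level (assumed to lie in [-1, 1]) to a real amount.

    The coded value -1 is mapped to `low`, +1 to `high`, and 0 to the midpoint.
    """
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + coded * half_range


# Hypothetical limits for one component (placeholders, not the paper's values).
low, high = 10.0, 30.0
levels = [decode(c, low, high) for c in (-1.0, 0.0, 1.0)]
```

Intermediate coded levels are mapped proportionally by the same function.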
As stated in our previous paper [1], the response property should be modelled by linear, quadratic and bifactorial terms; in other words, the matrix must be expanded to obtain a quadratic model.
Using only the non-evaporating compounds, the expansion adds to the three linear terms their three squares and their three bifactorial products. The X block after expansion is reported in Table 5.
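The matrix expansion can be sketched as follows for a single object (one row of coded values). The function name `expand_quadratic` and the column order (linear, then squared, then bifactorial) are assumptions made for illustration; the order used in Table 5 may differ.

```python
from itertools import combinations


def expand_quadratic(row):
    """Expand one row of coded values into linear, squared and bifactorial terms."""
    linear = list(row)
    squared = [v * v for v in row]
    # One product per unordered pair of distinct variables (x1x2, x1x3, ...).
    bifactorial = [row[i] * row[j] for i, j in combinations(range(len(row)), 2)]
    return linear + squared + bifactorial


# Three variables give 3 + 3 + 3 = 9 terms; four variables give 4 + 4 + 6 = 14.
terms_3 = expand_quadratic([1, 2, 3])
n_terms_4 = len(expand_quadratic([0, 0, 0, 0]))
```

Applying the function to every row of the coded design yields the expanded X block on which the PLS model is fitted.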

Linear PLS Models on an Expanded Matrix
To preserve the symmetry of the experimental design as far as possible, we decided to use a single point at the centre. The absence of replicates does not prevent the extraction of systematic information, if it exists in the experimental plan. We first adopted linear models on an expanded matrix because, within the CARSO method [3], the GOLPE procedure [4] [5] is used to describe the y parameter by a quadratic model, in order to find the combination of x variables that gives its highest value. To search for a minimum, one should instead find the maximum of the inverse of y.
The characteristics of the models DCM4 and DCM4-S are very close. Both use nine objects and one dimension, show good plots, and have similar values for the SDEC (Standard Deviation of Error of Computations; 0.33 and 0.35) and for the y variance explained (93.6% and 93.1%). However, the best comparison is given by the coefficients that evaluate the relative importance of each x parameter (GAM1), listed in Table 6.
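For readers who wish to reproduce this kind of comparison, the following is a minimal sketch of a one-dimension PLS model with the two fit statistics quoted above. It is a generic textbook PLS1 step and not the CARSO/GOLPE implementation. The function names and the data are placeholders, not the values of Tables 4 and 5. SDEC is computed here as the root mean squared difference between experimental and computed y values, which is the definition we assume.

```python
import math


def pls1_one_component(X, y):
    """Fit a PLS1 model with a single latent variable (X: list of rows)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weights: direction of maximum covariance with y, scaled to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores of the objects and inner regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    y_hat = [y_mean + q * ti for ti in t]
    return w, y_hat


def sdec(y, y_hat):
    """Root mean squared difference between experimental and computed y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def explained_variance(y, y_hat):
    """Fraction of the y variance reproduced by the model."""
    y_mean = sum(y) / len(y)
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return 1.0 - ss_res / ss_tot


# Placeholder data: y = 2*x1 - x2 + 5 on a small orthogonal two-level design.
X = [[-1, -1], [1, -1], [-1, 1], [1, 1]]
y = [4.0, 8.0, 2.0, 6.0]
w, y_hat = pls1_one_component(X, y)
fit_sdec = sdec(y, y_hat)
fit_r2 = explained_variance(y, y_hat)
```

On this noise-free placeholder the single latent variable reproduces y essentially exactly; on real data the same two statistics would be compared between the models with and without x4.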

Comparative Evaluation of Variables Contributions
The most important step of the SIMCA philosophy is the dissection of each element of a data table into two contributions: one relative to the objects (scores, which evaluate the relative position of each object in the reduced space of the main information) and one relative to the variables (loadings, which evaluate the relative importance of each variable, scaled so that the sum of their squares is equal to one) [6].
In PCA, where there are no dependent variables, the parameters of the x variables are called beta and their sum of squares is equal to one. The corresponding parameters in PLS are called gamma: they are very similar to the betas, but their sum of squares is not equal to one. However, according to the program inventors, using gamma instead of beta gives better predictions [7].
The terms gamma 1 reported in Table 6 are the gamma values (i.e. the relative importance of each variable) for the first latent variable. These results are the core of this paper. The first and main result is that the gammas of the two models are almost identical, which demonstrates that the experimental design with four variables (including the solvent) and that with three variables (without the solvent) give essentially the same picture.
The relative importance of the variables in the model, guided by the values of the technological property, shows that the most important variables in the mixture are x3 (gamma 0.67/0.68) and x1 (gamma −0.66/−0.67). Since the squares of these figures are around 0.45, their sum reaches about 90%; as a consequence, by diminishing the amount of x1 and increasing x3 we could obtain a mixture with a slightly higher y value. The relative contributions of the other terms are much lower: the squared gamma 1 is about 6% for x2 (gamma −0.23/−0.24) and about 3% for the bifactorial term x2x3 (gamma 0.17).
In other words, the model DCM4 shows: a) negligible roles for all terms containing x4 (the vaporizing solvent), and b) the same coefficients of the quadric for the PLS models with and without x4. Moreover, it makes it possible to know from the beginning how much solvent should be used for each mixture to be tested.
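The arithmetic behind these percentages can be checked with a few lines. The gamma values below are the first figure of each pair quoted in the text; the dictionary name and keys are only illustrative labels.

```python
# Gamma 1 values quoted in the text (first figure of each pair).
gamma1 = {"x1": -0.66, "x2": -0.23, "x3": 0.67, "x2x3": 0.17}

# Squared gammas, read as relative contributions of each term.
squared = {name: g * g for name, g in gamma1.items()}

# Joint contribution of the two dominant variables, x1 and x3.
share_x1_x3 = squared["x1"] + squared["x3"]
```

With these figures the joint share of x1 and x3 is about 0.88; with the second figure of each pair (−0.67 and 0.68) it is about 0.91. Both are consistent with the roughly 90% quoted above.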

Application of DCM4 and DCM4-S Models on the Training and Testing Set
In this section we show, in Table 7, the results of our investigation of the technological property on polypropylene: the first 9 objects were measured experimentally (training set), while for the subsequent 16 objects (testing set) we report the predictions of the model obtained from the first block. As testing set we selected the extreme points of the domain covered: since we have 4 variables, we have a hypercube with 16 vertices, ordered according to Yates. The results confirm once more that the two models are equivalent; however, since the planning was meant to this end, the models are somewhat flat and cannot find significant improvements of the response. Data collected on polystyrene (not reported) show exactly the same trend.
The first part of Table 7 lists the experimental y values (y exp) of the objects of the training set and their predictions by each of the two models, DCM4 and DCM4-S, i.e. with and without x4. The predictions are very close, confirming that x4 does not contribute to the mixture performance.
Columns 10 and 11 list the differences between the experimental and the predicted values: the largest differences are shown by objects 4, 7 and 9 (from −0.73 to 0.70), whereas column 12 lists their differences. In order to explore the range of variation outside the experimental design, we decided to use as testing set all the extreme points of the domain covered. Since we have 4 variables, we have a hypercube with 16 vertices. The second part of Table 7 reports the predictions of y for the 16 hypervertices of the 4D space.
On inspecting the training set we find only 3 objects out of 9 with a significant difference between experimental and predicted values, so the model is reliable. However, all predictions are somewhat similar to the experimental values: this means that the surface model is flat, without sufficient information to find a way of increasing the technological property. The results confirm once more that the two models are equivalent, but since the planning was meant to this end the models cannot find significant improvements of the response. The data collected on polystyrene (not reported) also show exactly the same trend.
To help raise the response, we can go back and inspect the relationship between the y values and the amounts of the x variables. The results on the vertices H5, H7, H13 and H15 show that reducing x1 and increasing x3 should increase the response.
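The numbering of the hypervertices can be illustrated with a short sketch. It assumes that the vertices are coded as −1/+1 and listed in standard Yates order, with x1 changing fastest. The function name `yates_vertices` is hypothetical, and the actual coded levels of the testing set in Table 7 may differ.

```python
from itertools import product


def yates_vertices(k):
    """Return the 2**k vertices of a two-level design in standard Yates order.

    itertools.product varies the last position fastest, so each tuple is
    reversed to make the first variable (x1) the fastest-changing one.
    """
    return [tuple(reversed(v)) for v in product((-1, 1), repeat=k)]


vertices = yates_vertices(4)

# 1-based labels of the vertices with x1 at its low and x3 at its high level.
selected = [i + 1 for i, v in enumerate(vertices) if v[0] == -1 and v[2] == 1]
```

Under these assumptions the vertices with low x1 and high x3 are numbers 5, 7, 13 and 15, which agrees with the labels H5, H7, H13 and H15 quoted in the text.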

Extension to DCM5
The same strategy can also be applied to larger DCM matrices, always selecting the first submatrix and the most different sequence of the other coded values. To illustrate how to plan experiments with a DCM5, we show in Table 8 a selected pair of the 24 submatrices (1 + ?) with the maximum diversity in the sequence of coded values from x2 to x5. The complete plan requires 11 experiments. The positive and negative values reported as 0.7 and −0.7 should be read as 0.71 and −0.71. Details will be reported in a future paper.

Table 2.
Real values of 4 variables by simple linear functions, on defining the highest and lowest levels for each variable by the equation in the last row.

Table 3.
Real values of the compositions of the nine mixtures (objects) on reporting the real values of Table 2 instead of the coded ones in Table 1.

Table 4.
Experimental responses of the coded compositions of the nine mixtures.

Table 5.
Coded values of DCM and DCM4-S.

Table 6.
Comparison of the parameters gamma 1 for the two models.

Table 7.
Results for the training and testing sets.

Table 8.
Experimental plan for a DCM5.