Identify the Optimal Baseline Design from the Plackett-Burman Design
1. Introduction
Previous research on regular and nonregular designs has largely been based on orthogonal parameterization. In recent years, baseline designs, which are based on baseline parameterization, have attracted significant attention due to their wide applicability. We use specific examples below to explain baseline parameterization and orthogonal parameterization.
For an orthogonal design with $N$ rows and $m$ columns, where the factor levels are 0 and 1, select any $k$ columns from this design to form a set $s$. Let $v(s)$ denote the vector obtained by summing, modulo 2, the corresponding elements of the columns in $s$. Let $w(s)$ denote the sum of the elements in the vector $v(s)$. Let $J_k(s) = |N - 2w(s)|$. It is important to emphasize that when the factor levels of the design under discussion are −1 and 1, the definition of $J_k(s)$ changes correspondingly to $J_k(s) = \left| \sum_{i=1}^{N} \prod_{j \in s} c_{ij} \right|$, where $c_{ij}$ represents the element in the $i$-th row and $j$-th column of the design. Regardless of whether the factor levels in the orthogonal design are 0 and 1 or −1 and 1, the following terminology applies. If $J_k(s) = 0$, then the $k$ columns are said to be completely orthogonal. If $J_k(s) = N$, then the $k$ columns are said to be completely confounded. If $0 < J_k(s) < N$, then the $k$ columns are said to be partially confounded. What has been introduced above is orthogonal parameterization. The specific details can be found in Deng and Tang (1999) [1]. The left table in Table 1 represents an orthogonal design with factor levels of 0 and 1, while the middle table in Table 1 represents an orthogonal design with factor levels of −1 and 1. The first and second columns of both tables are completely orthogonal, while the first, second, and fourth columns are completely confounded.
The difference between baseline parameterization and orthogonal parameterization lies in the fact that, for a baseline design with $N$ rows and $m$ columns, the $m$ factors are not described in terms of orthogonality or confounding. Furthermore, in a baseline design, the factor levels are restricted to 0 and 1. When selecting $k$ factors from the $m$ factors in the baseline design, the interaction effect among these $k$ factors is represented by the elementwise product of the corresponding $k$ columns. The specific details can be found in Mukerjee and Tang (2012) [2]. In Table 1, columns 1 to 5 in the right-hand table represent the design matrix of the baseline design, while column 6 represents the interaction between columns 1 and 2, and column 7 represents the interaction between columns 1, 2, and 3. A baseline design restricts the factor levels to 0 and 1, and the numbers of 0s and 1s in its design matrix can vary arbitrarily. In contrast, in an orthogonal design under orthogonal parameterization, the factor levels can be 0 and 1 or −1 and 1, and the design matrix must satisfy orthogonality between any two columns.
Table 1. The table on the left and the table in the middle represent orthogonal designs, while the table on the right represents the baseline design.
Left table (orthogonal design, levels 0 and 1):
1 | 2 | 3 | 4 | 5
0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 0 | 1
0 | 1 | 0 | 1 | 1
0 | 1 | 1 | 1 | 0
1 | 0 | 0 | 1 | 0
1 | 0 | 1 | 1 | 1
1 | 1 | 0 | 0 | 1
1 | 1 | 1 | 0 | 0
Middle table (orthogonal design, levels −1 and 1):
1 | 2 | 3 | 4 | 5
−1 | −1 | −1 | −1 | −1
−1 | −1 | 1 | −1 | 1
−1 | 1 | −1 | 1 | 1
−1 | 1 | 1 | 1 | −1
1 | −1 | −1 | 1 | −1
1 | −1 | 1 | 1 | 1
1 | 1 | −1 | −1 | 1
1 | 1 | 1 | −1 | −1
Right table (baseline design in columns 1 to 5, interaction columns 6 and 7):
1 | 2 | 3 | 4 | 5 | 6 | 7
0 | 0 | 0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 0 | 1 | 0 | 0
0 | 1 | 0 | 1 | 1 | 0 | 0
0 | 1 | 1 | 1 | 0 | 0 | 0
1 | 0 | 0 | 1 | 0 | 0 | 0
1 | 0 | 1 | 1 | 1 | 0 | 0
1 | 1 | 0 | 0 | 1 | 1 | 0
1 | 1 | 1 | 0 | 0 | 1 | 1
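The statements about Table 1 can be checked numerically. The sketch below is only an illustration: it assumes the usual form of the J-characteristic for levels −1 and 1 (the absolute value of the sum, over runs, of the product of the selected columns), and the function names `j_characteristic` and `interaction_column` are hypothetical helpers, not notation from the paper.

```python
from math import prod

# Middle panel of Table 1 (levels -1 and 1), one tuple per run.
middle = [
    (-1, -1, -1, -1, -1),
    (-1, -1, 1, -1, 1),
    (-1, 1, -1, 1, 1),
    (-1, 1, 1, 1, -1),
    (1, -1, -1, 1, -1),
    (1, -1, 1, 1, 1),
    (1, 1, -1, -1, 1),
    (1, 1, 1, -1, -1),
]

# The same runs recoded to levels 0 and 1 (columns 1 to 5 of the right panel).
baseline = [tuple((x + 1) // 2 for x in row) for row in middle]


def j_characteristic(design, cols):
    """Assumed J-characteristic for -1/1 coding; cols are 1-based."""
    return abs(sum(prod(row[c - 1] for c in cols) for row in design))


def interaction_column(design, cols):
    """Elementwise product of the chosen 0/1 columns; cols are 1-based."""
    return [prod(row[c - 1] for c in cols) for row in design]


j12 = j_characteristic(middle, (1, 2))      # columns 1 and 2
j124 = j_characteristic(middle, (1, 2, 4))  # columns 1, 2 and 4
col6 = interaction_column(baseline, (1, 2))
col7 = interaction_column(baseline, (1, 2, 3))
print(j12, j124, col6, col7)
```

Under these assumptions, a value of 0 corresponds to completely orthogonal columns and a value equal to the number of runs corresponds to completely confounded columns, while `col6` and `col7` reproduce columns 6 and 7 of the right-hand table.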
We use
to denote a baseline design with
rows and
columns. The matrix
is a matrix with elements 0 and 1, where 0 represents the baseline level and 1 denotes the test level. Let
be the set of all
submatrices of
. Unless otherwise specified, we will denote this set by
. Here is an example of a baseline design with 8 rows and 5 columns.
Mukerjee and Tang (2012) [2] proposed a theory related to the optimality of baseline designs. They focused on main effects designs and proved that designs with strength 2 have universal optimality in estimating main effects. When estimating main effects, the presence of active interaction effects can cause bias in the estimation of the main effects. Consider the baseline design
, where the
factors of
are denoted as
. Let
be any
numbers selected from
. We denote
as the interaction effects of the
factors
. Let
denote the number of occurrences of the row vector
in the submatrix of
formed by the
-th columns. Here, we denote
as an
column vector. We then define the
-th element of the vector
as follows. When
, let the
-th element of
be
; when
, let the
-th element of
be
, where
represents the ascending order of
. Mukerjee and Tang (2012) [2] pointed out that if there are interaction effects among
, these interaction effects will contribute a bias of magnitude
when estimating the main effects, where
. Here,
represents a column vector of size
, where all elements are ones. According to the principle that interaction effects of the same order are of equal importance, they provided that
. It is evident that
measures the bias effect of all
-th order interaction effects on the estimation of the main effects. Since lower-order interaction effects are more important than higher-order interaction effects, we aim to find a design that sequentially minimizes
$K_2, K_3, \ldots$. This is precisely the minimum $K$-aberration criterion proposed by Mukerjee and Tang (2012) [2], which is stated in Definition 1.
Definition 1. Consider two baseline designs $D_1$ and $D_2$ with the same number of rows and columns, both of which are orthogonal designs with strength 2. Let $r$ be the smallest integer at which the values of $K_r$ for $D_1$ and $D_2$ first differ. If the value of $K_r$ for $D_1$ is smaller than that for $D_2$, then $D_1$ is said to have a smaller $K$-aberration than $D_2$. A minimum $K$-aberration design is one for which no design with a smaller $K$-aberration exists.
Mukerjee and Tang (2012) [2] proved the following expression, where
, and let
denote the number of rows in
in which all elements are equal to 1.
(1)
where
and
. The derivation details of this sequence can be found in the literature by Mukerjee and Tang (2012) [2].
In order to reduce the computational effort required to find optimal baseline designs, Mukerjee and Tang (2016) [3] provided an equivalent transformation form
for
based on the original approach, and referred to
as a moment confounding of a design. The paper points out that the sequential minimization of a design’s
is equivalent to the sequential minimization of the design’s
. The method of finding an optimal baseline design by minimizing the sequence of
is applicable to both regular and nonregular designs. The definition of
is
. The matrix
is a baseline design matrix of size
with elements 0 and 1.
, where
is an
matrix with all elements equal to 1.
This paper starts with the Plackett-Burman design to identify designs with good $K$-aberration properties. Section 2 introduces the relevant symbols and definitions of the Plackett-Burman design. Section 3 establishes the properties of baseline designs obtained as sub-designs of the Plackett-Burman design. Section 4 presents an application of the theory, in which the minimum $K$-aberration sub-design with 24 rows and 7 columns is identified for a Plackett-Burman design with 24 rows and 23 columns.
2. Symbols and Definitions Related to Plackett-Burman Design
In this section, we introduce some symbols and definitions related to Plackett-Burman design that will be used in the derivations of subsequent lemmas or theorems.
We use $P$ to denote a Plackett-Burman design of size $N \times m$, where the elements are either 0 or 1. A Plackett-Burman design is typically generated by a special cyclic method [4]. Let the first row of $P$ be $r_1$. $r_1$ is a row vector containing only 0s and 1s, with at least one occurrence of both 0 and 1. We represent $r_1$ as $r_1 = (a_1, a_2, \ldots, a_m)$. The element $a_j$ refers to the $j$-th element in the first row of $P$. Let $r_2$ represent the second row of $P$. Then, the 2nd to the $m$-th elements of $r_2$ are respectively equal to the 1st to the $(m-1)$-th elements of $r_1$. The first element of $r_2$ is equal to the $m$-th element of $r_1$. That is, $r_2 = (a_m, a_1, \ldots, a_{m-1})$. Let $r_3$ represent the third row of $P$. Then, the 2nd to the $m$-th elements of $r_3$ are respectively equal to the 1st to the $(m-1)$-th elements of $r_2$. The first element of $r_3$ is equal to the $m$-th element of $r_2$. That is, $r_3 = (a_{m-1}, a_m, a_1, \ldots, a_{m-2})$. By the same logic, we can derive $r_{i+1}$ from $r_i$ for $i = 3, \ldots, N-2$. Let $r_N$ represent the $N$-th row of $P$. Here, we define $r_N = (0, 0, \ldots, 0)$ or $r_N = (1, 1, \ldots, 1)$ (this depends on the first row of the Plackett-Burman design). Therefore, $P$ can be expressed as $P = (r_1^{\top}, r_2^{\top}, \ldots, r_N^{\top})^{\top}$.
Here is an example of Plackett-Burman design. Consider an
Plackett-Burman design, where its first row is
According to the cyclic generation method of Plackett-Burman design, we can obtain its design matrix as follows.
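The cyclic construction described in this section can be sketched in a few lines. The generator row used here is an assumption for illustration: it is the commonly tabulated 12-run Plackett-Burman row (+ + − + + + − − − + −) recoded to 1 and 0, and it is not necessarily the example used in this paper. The helper names `plackett_burman` and `is_strength_two` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def plackett_burman(first_row, last_level=0):
    """Rows 1..m are successive right cyclic shifts of first_row.

    The final row is constant (all 0s here, which suits a first row
    containing more 1s than 0s).
    """
    m = len(first_row)
    rows = [first_row[m - i:] + first_row[:m - i] for i in range(m)]
    rows.append([last_level] * m)
    return rows


def is_strength_two(design):
    """Check that every pair of columns shows each level pair equally often."""
    n, m = len(design), len(design[0])
    for a, b in combinations(range(m), 2):
        counts = Counter((row[a], row[b]) for row in design)
        if any(counts[p] != n // 4 for p in [(0, 0), (0, 1), (1, 0), (1, 1)]):
            return False
    return True


# Assumed generator: the standard 12-run row, coded 1 for "+" and 0 for "-".
first_row = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
design = plackett_burman(first_row)

for row in design:
    print(" ".join(str(x) for x in row))
print("strength 2:", is_strength_two(design))
```

The second row is the first row shifted one position to the right with wrap-around, which matches the verbal description of how each row is derived from the previous one.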
3. The Properties of Plackett-Burman Design
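Select an arbitrary $s$-column submatrix $P_s$ from an $N \times m$ Plackett-Burman design. Let $d_i$ represent the number of columns between the $i$-th column and the $(i+1)$-th column of $P_s$ within the $m$ columns of the Plackett-Burman design; in this case, $i = 1, 2, \ldots, s-1$, and it is evident that $0 \le d_i \le m - s$. In addition, let $d_s$ represent the number of columns between the $s$-th column and the first column of $P_s$ within the $m$ columns of the Plackett-Burman design. Specifically, $d_s$ is the sum of the number of columns after the $s$-th column and the number of columns before the first column of $P_s$ within the $m$ columns of the Plackett-Burman design. It is evident that $d_1 + d_2 + \cdots + d_s = m - s$. The vector $d = (d_1, d_2, \ldots, d_s)$ is referred to as the distance vector of $P_s$. Let the set of distance vectors $\{(d_1, d_2, \ldots, d_s), (d_2, d_3, \ldots, d_s, d_1), \ldots, (d_s, d_1, \ldots, d_{s-1})\}$ be denoted as $\Omega(d)$.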
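A small sketch may help fix the idea of a distance vector. It assumes the reading given in the text: each entry counts the unselected columns strictly between two consecutive selected columns, and the last entry wraps around the end of the design. The set returned by `rotations` is an assumed interpretation of the set of distance vectors (all cyclic rotations of one vector). Both function names are hypothetical.

```python
def distance_vector(cols, m):
    """Distance vector of a set of 1-based column indices in an m-column design."""
    cols = sorted(cols)
    s = len(cols)
    # Unselected columns strictly between consecutive selected columns.
    gaps = [cols[i + 1] - cols[i] - 1 for i in range(s - 1)]
    # Wrap-around entry: columns after the last one plus columns before the first.
    gaps.append((m - cols[-1]) + (cols[0] - 1))
    return tuple(gaps)


def rotations(vec):
    """All cyclic rotations of a distance vector (assumed meaning of the set)."""
    return {vec[i:] + vec[:i] for i in range(len(vec))}


d1 = distance_vector([1, 2, 3, 4, 5, 7, 9], 23)
d2 = distance_vector([2, 3, 4, 5, 6, 8, 10], 23)   # same columns moved right by one
d3 = distance_vector([1, 3, 17, 18, 19, 20, 21], 23)
print(d1, d2, d3, d3 in rotations(d1))
```

Moving every selected column one position to the right leaves the distance vector unchanged, and starting the count from a different selected column produces a rotation of the same vector.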
To ensure the smooth progress of the proofs of the lemma and theorem in this section, we now provide the definitions of function
and function
, respectively. The definition of function
can be found in Equation (2), and the definition of function
can be found in Equation (3).
(2)
The function
is a binary function, with independent variables
and
, where
,
. In Equation (2),
,
and
represent the
-th,
-th and
-th elements in the first row of Plackett-Burman design, respectively.
(3)
The function
is a unary function, with
as its independent variable, where
.
Lemma 1. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. If
, then
and
have the same
.
Proof. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. Now, let
. By considering the distance vector of
,
can be further expressed in the form of Equation (4).
(4)
Similarly, by considering the distance vector of
,
can be further expressed in the form of Equation (5).
(5)
(i)
. From Equations (4) and (5), it can be observed that when
,
and
are two identical
-column submatrices selected from an
Plackett-Burman design. Therefore, it is evident that
and
have the same
.
(ii)
. Without loss of generality, let
. Let the number of columns between the
-th and
-th columns in the
columns of Plackett-Burman design be
, thus, we have
. From this,
can be further expressed as
,
. The first row of
can be represented as
(6)
By considering the Plackett-Burman design generation method and the definition of Equation (2), it can be concluded that the
-th row of
can be represented as
(7)
The first row of
can be represented as
(8)
By considering the Plackett-Burman design generation method and the definitions of Equations (2) and (3), the
-th row of
can be represented as
(9)
The element at the
-th position in the
-th row of
is
,
. From Equation (2), we can deduce that
. Since
is the element in the
-th row and
-th column of
, it follows that the element in the
-th row and
-th column of
is equal to the element in the
-th row and
-th column of
.
By considering all values of
, it follows that the
-th row of
is identical to the
-th row of
. Since the last row of the Plackett-Burman design consists entirely of zeros or ones, by traversing
, it can be concluded that
and
have the same
. ◻
Lemma 2. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. If both
and
belong to the set
, then
and
have the same
.
Proof. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. Let both
and
belong to
. Similar to the discussion in Lemma 1,
can be further expressed as
,
.
can be further expressed as
,
.
(i)
. In this case,
and
have identical distance vectors. By Lemma 1, it follows that, at this point,
and
have the same
.
(ii)
. Let
and
. Discuss the values of
for
and
under these two distance vectors, respectively. As stated in Lemma 1, we only need to consider the case where both
and
are equal to 1. Now, let both
and
be equal to 1, so
can be further represented as
.
can be further represented as
. The first row of
is
(10)
Based on the Plackett-Burman design generation method and the definition of Equation (2), it can be concluded that the
-th row of
can be represented as
(11)
The first row of
is
(12)
Based on the generation method of Plackett-Burman design and the definitions of Equations (2) and (3), we can deduce that the
-th row of
can be represented as
(13)
The
-th element in the
-th row of
is
,
. From Equation (2), we can deduce that
. That is, the
-th element in the
-th row of
is equal to the
-th element in the
-th row of
. In particular, the
-th element in the
-th row of
is equal to
. From Equation (2), we can deduce that
. That is, the
-th element in the
-th row of
is equal to the first element in the
-th row of
. The first element in the
-th row of
is denoted as
. From Equation (2), we can deduce that
. That is, the first element in the
-th row of
is equal to the second element in the
-th row of
.
By iterating over all possible values of
, it can be observed that the
-th row of
corresponds to the
-th row of
, with both rows containing the same number of 0 s and 1 s. Therefore, when the distance vectors of
and
are equal to
and
respectively, the
of
is identical to that of
. From the scenario where
and
, extending all the way to
and
, the same conclusion holds as in the case where
,
. By the principle of transitivity, if both
and
are derived from
, then the
of
is identical to that of
. ◻
Theorem 1. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. If both
and
belong to the set
, then, the
-column sub-design constructed from
and the
-column sub-design constructed from
possess the same
$K$-aberration sequence.
Proof. Select two
-column submatrices,
and
, from an
Plackett-Burman design. Let the distance vectors of the two
-column submatrices be denoted as
and
, respectively. Select any
columns from
. Let the selected
columns be denoted as
, and it holds that
. Let the
-th column
among the
columns
be the
-th column among the
columns
. Let
denote the distance vector of
. For the selected columns
, we choose a corresponding set of
specific columns
from
. Let
be the
-th column among the
columns
. Let
denote the distance vector of
.
(i)
.
Without loss of generality, let
. When
, it follows that
. In particular,
. By setting
, we obtain
. From Lemma 1, it follows that the
of
and
is identical under this condition. Therefore, when
, for every
, there exists a corresponding
such that the
of
and
is identical.
(ii)
.
Let
and
. When
, it follows that
. In particular,
. When
, let
and let
. When
and
, let
and
. When
and
, let
and
. We can discover that
precisely corresponds to a
-column submatrix of
, with the distance vector
of
being equal to
. From Lemma 2, it follows that the
of
and
is identical under this condition. Therefore, when
and
, for every
, there exists a corresponding
such that the
of
and
is identical.
Let the
-th order aliasing of the sub-design formed by
and the sub-design formed by
be denoted as
and
, respectively. Furthermore, let
and
for
, and
and
for
. Through a discussion of the two distinct scenarios involving
and
, we can deduce that for any
columns
selected from
, there exist uniquely corresponding
columns
in
, such that
and
have the same
. That is,
.
For any
columns
of
, there exist unique
columns
of
such that
and
have the same
. For cases (i) and (ii) concerning
and
, the distance vector
of
and the distance vector
of
respectively satisfy the following two scenarios.
(I)
.
(II)
.
Similarly to the discussion on
and
, for any
-column submatrix of
, there exists a unique
-column submatrix of
such that the value of
is identical for these two
-column submatrices, which consequently leads to
. Given that
and
, it ultimately follows that
. Therefore, when
and
satisfy conditions (i) and (ii), the
-column-subdesign formed by
and the
-column-subdesign formed by
have the same
$K$-aberration sequence.
In the case where
, the same conclusion holds when
,
,
,
and
. From transitivity, it follows that if both
and
belong to
, then the
-column-subdesign formed by
and the $s$-column-subdesign formed by
have the same
$K$-aberration sequence. ◻
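The mechanism behind this result can be illustrated numerically. The sketch below assumes the 12-run generator row commonly tabulated for Plackett-Burman designs (not necessarily the design used in this paper) and checks that shifting every selected column by the same amount yields a sub-design with exactly the same runs in a different order. Any criterion that depends only on which runs occur and how often, which is assumed here to include the aberration sequence, then takes the same value. All function names are hypothetical.

```python
def plackett_burman(first_row, last_level=0):
    """Cyclic construction: right shifts of first_row plus one constant row."""
    m = len(first_row)
    rows = [first_row[m - i:] + first_row[:m - i] for i in range(m)]
    rows.append([last_level] * m)
    return rows


def subdesign(design, cols):
    """Keep the given 1-based columns, in the given order."""
    return [tuple(row[c - 1] for c in cols) for row in design]


def shift_columns(cols, t, m):
    """Move every selected column t positions to the right, wrapping around."""
    return [(c - 1 + t) % m + 1 for c in cols]


first_row = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]  # assumed 12-run generator
design = plackett_burman(first_row)
m = len(first_row)

cols = [1, 2, 4, 7]
reference = sorted(subdesign(design, cols))

# Compare the sorted runs of every shifted sub-design with the reference.
all_match = all(
    sorted(subdesign(design, shift_columns(cols, t, m))) == reference
    for t in range(m)
)
print(all_match)
```

The shifted column sets are exactly those whose distance vectors are rotations of the original one, so this check covers the situation treated in Lemma 2 and Theorem 1 for one small example.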
Theorem 2. The conclusion provided by Theorem 1 improves the search efficiency for the optimal baseline sub-design of the Plackett-Burman design by a factor of $m$, the number of columns of the Plackett-Burman design, independent of the number of columns in the selected sub-design.
Proof. Select
columns, where
, from the
columns of the Plackett-Burman design, denoted as
. Let
, then the distance vector of
can be represented as
. Thus, according to Theorem 1, the
-column sub-designs formed by
and
have the same
$K$-aberration sequence.
Let the distance vector of
be denoted as
. Let
. Thus, by Theorem 1, the
-column sub-designs formed by
and
have the same
$K$-aberration sequence. The
-column sub-designs formed by
and
have the same
$K$-aberration sequence.
Let the distance vector of
be denoted as
. Let
. Thus, by Theorem 1, the
-column sub-designs formed by
and
have the same
$K$-aberration sequence. The
-column sub-designs formed by
and
have the same
$K$-aberration sequence.
Similarly, let the distance vector of
be denoted as
. Let
. Thus, by Theorem 1, the
-column sub-designs formed by
and
have the same
$K$-aberration sequence. The
-column sub-designs formed by
and
have the same
$K$-aberration sequence.
Therefore, each time we select an
-column sub-design, there will be
distinct
-column sub-designs that have the same
$K$-aberration sequence as the selected one. Next, we select
columns from the
columns of the Plackett-Burman design, denoted as
. If the distance vector of
does not belong to
, then the sub-design formed by
will have the same
$K$-aberration sequence as
other
-column sub-designs. In summary, we conclude that the result provided by Theorem 1 improves the search efficiency for the optimal baseline sub-designs of the Plackett-Burman design by a factor of
$m$, independent of the number of columns in the selected sub-design. ◻
4. Application
Wu and Hamada (2000) [4] presented a Plackett-Burman design with 24 runs and 23 columns. Based on the theory presented in this paper, we identify the minimum $K$-aberration design among all the 24-run, 7-column sub-designs. After excluding the sub-designs that Theorem 1 shows to have the same $K$-aberration sequence as one already retained, a total of 10,659 sub-designs remain. By calculating $K_2$ for these sub-designs, we identified two sub-designs with the smallest $K_2$, which are determined by columns 1, 2, 3, 4, 5, 6, 7 and columns 1, 2, 3, 4, 5, 7, 9, respectively. By calculating $K_3$ for these two sub-designs, we found that the sub-design determined by columns 1, 2, 3, 4, 5, 7, 9 has the smaller $K_3$. Therefore, the minimum $K$-aberration design among all the 24-run, 7-column sub-designs of this 24-run, 23-column Plackett-Burman design is the sub-design determined by columns 1, 2, 3, 4, 5, 7, 9. If we were to evaluate the $K$-values for all possible 24-run, 7-column sub-designs, we would need to assess $\binom{23}{7}$, which equals 245,157 designs. The theory presented in this paper reduces this search space by a factor of 23, to 10,659 designs.
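The two counts quoted here can be reproduced with a short enumeration. The sketch treats two column sets as equivalent when one is a cyclic shift of the other, which is the equivalence implied by Theorem 1, and counts one representative per class. The function name `canonical` is a hypothetical helper, and columns are numbered from 0 for convenience.

```python
from itertools import combinations
from math import comb

m, s = 23, 7  # columns of the Plackett-Burman design, columns per sub-design


def canonical(cols):
    """Lexicographically smallest cyclic shift of a 0-based column set.

    The smallest shift always contains column 0, so it is enough to try
    the shifts that move one of the selected columns to position 0.
    """
    return min(tuple(sorted((c - a) % m for c in cols)) for a in cols)


# Every shift class has a member containing column 0, so fixing column 0
# visits each class at least once while shortening the enumeration.
classes = {canonical((0,) + rest) for rest in combinations(range(1, m), s - 1)}

total = comb(m, s)
print(total, len(classes), total // len(classes))
```

Because 23 is prime, no 7-column set is mapped to itself by a nontrivial shift, so every class contains exactly 23 column sets and the number of classes is the total divided by 23.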
We implemented the results presented in the application using R software. The R code is provided in Appendix A. The specific results are presented in Table 2.
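Table 2. The $K$-values of the two sub-designs determined by columns 1, 2, 3, 4, 5, 6, 7 and columns 1, 2, 3, 4, 5, 7, 9.
$K$-value | 1, 2, 3, 4, 5, 6, 7 | 1, 2, 3, 4, 5, 7, 9
$K_2$ | 7.5 | 7.5
$K_3$ | 32.01042 | 31.34375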
Appendix