Confirmatory factor analysis (CFA) refers to the FA procedure in which some loadings are constrained to be zeros. A difficulty in CFA is that this constraint must be specified by users in a subjective manner. To deal with this difficulty, we propose a computational method in which the best CFA solution is obtained optimally without relying on users’ judgements. The method consists of procedures at a lower (L) and a higher (H) level: at the L level, for a fixed number of zero loadings, it is determined both which loadings are to be zeros and what values are to be given to the remaining nonzero parameters; at the H level, the L-level procedure is performed over different numbers of zero loadings, to provide the best solution. In the L-level procedure, Kiers’ (1994) simplimax rotation fulfills a key role: the CFA solution under the constraint computationally specified by that rotation is used for initializing the parameters of a new FA procedure called simplimax FA. The task at the H level can easily be performed using information criteria. The usefulness of the proposed method is demonstrated numerically.

In factor analysis (FA), the variation of p observed variables is assumed to be explained by m common factors and p unique factors, with m < p and the two types of factors mutually uncorrelated. The m common factors serve to explain the variations of all p variables. On the other hand, each of the p unique factors has a one-to-one correspondence to a variable: a unique factor specifically explains the variation of the corresponding variable that remains unaccounted for by the common factors.

The parameters to be estimated in FA are a factor loading matrix Λ = (λ_{ij}) (p × m), a unique variance matrix Ψ = (ψ_{ii′}) (p × p), and a factor correlation matrix Φ = (φ_{jk}) (m × m). Here, the loadings in Λ stand for how the variables load on the common factors, Ψ is the diagonal matrix whose ith diagonal element ψ_{ii} expresses the variance of the ith unique factor, and Φ contains the correlation coefficients among the m common factors. Upon making certain distributional assumptions for the factors, the covariance matrix Σ among the p observed variables is modeled as

Σ = Λ Φ Λ ′ + Ψ . (1)

(e.g., Adachi, 2019).

FA can be classified into two types: confirmatory (CFA) and exploratory (EFA). In CFA, some loadings are constrained to be zeros, and the elements of a binary matrix B = (b_{ij}) (p × m) are defined generally as

b i j = { 0 , iff λ i j = 0 1 , otherwise , (2)

in other words, b_{ij} = 1 if variable i is linked to factor j; otherwise, b_{ij} = 0. In this sense, we call B a link matrix. Any CFA constraint can be expressed as Λ = B • Λ, where • denotes the element-wise (Hadamard) product with B • Λ = (b_{ij}λ_{ij}). It should be kept in mind that specifying a CFA model amounts to selecting a particular link matrix B. Thus, CFA can be formally expressed as

min Λ , Ψ , Φ f ( Λ , Ψ , Φ | S ) s.t. Λ = B • Λ for a specified B ∈ S B (3)

with “s.t.” the abbreviation for “subject to” and S B denoting a set of considered matrices B.
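As a small numerical illustration of the constraint Λ = B • Λ, the following sketch uses a hypothetical 4 × 2 link matrix and loading matrix (all values are illustrative, not taken from the paper’s analyses):

```python
import numpy as np

# Hypothetical link matrix B: b_ij = 1 links variable i to factor j.
B = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [1, 1]])
# Hypothetical unconstrained loadings.
Lam = np.array([[0.8, 0.3],
                [0.7, -0.2],
                [0.1, 0.9],
                [0.5, 0.6]])

# The Hadamard (element-wise) product B * Lam zeroes every unlinked loading,
# which is exactly the CFA constraint Lambda = B . Lambda.
Lam_cfa = B * Lam
print(Lam_cfa)
```

Selecting a CFA model thus amounts to selecting the binary pattern B; the optimization problems that follow search over such patterns.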

A problem in CFA (3) is that the link matrix B = (b_{ij}) defined as (2) must be selected by users. That is, it must be decided in a subjective manner which elements in B are to be zeros/ones, in other words, which pairs of variables and factors are linked as in (2). Moreover, the number of candidate link matrices is 2^{pm}, since each of the pm elements in B takes zero or one as in (2): for example, for p = 12 and m = 3 one finds 2^{pm} ≈ 6.87 × 10^{10}. In short, the problems in CFA can be summarized as follows:

[P1] The link matrix B (specifying a CFA model) must be selected subjectively by users.

[P2] An enormous number of possible matrices B in S B must be considered for finding the best B.
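The size of the search space in [P2] can be checked directly; a minimal sketch for the p = 12, m = 3 example in the text:

```python
# Each of the pm elements of B takes the value 0 or 1, so the number of
# candidate link matrices is 2^(pm).
p, m = 12, 3
n_models = 2 ** (p * m)
print(f"{n_models:.3e}")  # about 6.87e+10 candidate matrices B
```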

To the best of our knowledge, the problems [P1] and [P2] in CFA have not been considered in the existing literature. To deal with those problems, we propose an FA procedure for computationally and optimally identifying a suitable CFA model, where model identification includes estimating the parameter values. The outline of our proposed procedure is described in the next section. Then, we detail the procedure in Section 3, report its assessment in a simulation study in Section 4, give numerical examples in Section 5, and conclude this paper in Section 6.

First, we outline our approach to the CFA model identification under the condition that the number of zero loadings is fixed to a particular integer in Section 2.1. Then, in Section 2.2, the approach is extended to the cases with the number of zero loadings not being fixed. We then summarize the prospects for the following sections.

To deal with the difficulties [P1] and [P2] in CFA, we can consider the two procedures introduced in the next paragraphs, on the condition that Card(Λ) = Card(B), with Card(·) denoting the number of nonzero elements of the matrix between parentheses, and that this common number equals a specified integer c; hence

Card ( Λ ) = c , or equivalently, Card ( B ) = c . (4)

Clearly, pm − c equals the number of zeros in B or Λ.

The first procedure considered can be formulated as

min Λ , Ψ , Φ f ( Λ , Ψ , Φ | S ) s.t. Λ = B • Λ , after B is estimated optimally s.t. (4). (5)

This minimization amounts to performing CFA under the constraint indicated by the link matrix B, with B estimated by another method in advance. Thus, problem [P1] is overcome. Further, [P2] is also dealt with, provided that the value c in (4) and the resulting B are suitable.

The above procedure consists of two stages: first B is estimated, and then CFA is performed, see (5). In contrast, the second procedure is formulated in a single stage as follows:

min B , Λ , Ψ , Φ f ( B , Λ , Ψ , Φ | S ) s.t. Λ = B • Λ and (4). (6)

Here, B has been added to the subscripts of “min”: the link matrix B, which indicates the pairs of variables and factors to be linked, is estimated jointly with the other parameters Λ, Ψ, and Φ.

Between (5) and (6), we can find the following difference: in minimizing the loss function f(Λ, Ψ, Φ | S), the value of B is kept fixed in (5), but allowed to change in (6). This difference implies that the resulting loss function value in (6) cannot exceed that in (5):

min B , Λ , Ψ , Φ f ( B , Λ , Ψ , Φ | S ) ≤ min Λ , Ψ , Φ f ( Λ , Ψ , Φ | S ) . (7)

This inequality shows that (6) can provide a better solution than (5). We will propose an iterative algorithm for (6), initialized at the optimum of (5). As empirically shown later, in almost all cases the solutions of (5) and (6) are equivalent: we could obtain the final solutions by (5) alone, without performing the iterative algorithm for (6). However, it is worthwhile to perform the latter step, as (6) provides a better solution in a few cases.

In this paper, we use the maximum likelihood (ML) method for estimating the parameters in (5) and (6). This implies that the loss function to be minimized is given as the negative of the log likelihood. It is explicitly expressed as

f(Λ, Ψ, Φ | S) = log|Σ| + tr(SΣ^{−1}) = log|ΛΦΛ′ + Ψ| + tr[S(ΛΦΛ′ + Ψ)^{−1}], (8)

following from the normality assumptions for the factors.
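A direct numerical sketch of (8), with illustrative parameter values (not estimates from any data set). When S happens to equal Σ, the trace term reduces to tr I_p = p, so f = log|S| + p, which is the minimum of (8) over Σ:

```python
import numpy as np

def fa_loss(Lam, Psi, Phi, S):
    # f = log|Sigma| + tr(S Sigma^{-1}) with Sigma = Lam Phi Lam' + Psi, see (1) and (8).
    Sigma = Lam @ Phi @ Lam.T + Psi
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(S @ np.linalg.inv(Sigma))

Lam = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])
Psi = np.diag([0.19, 0.36, 0.51, 0.64])
S = Lam @ Phi @ Lam.T + Psi        # a perfectly fitting S, for illustration

f = fa_loss(Lam, Psi, Phi, S)      # equals log|S| + p here (p = 4)
```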

We should notice that the above approach is conditional upon c in (4). Thus, it remains to select the best value for c. This can be attained by the following procedure:

Select the value c with the lowest IC(c) among c = c min , ⋯ , c max . (9)

Here, c_{min}/c_{max} expresses the reasonable minimum/maximum of c, and IC(c) denotes the value of an information criterion statistic for the solution with Card(Λ) = c. The widest possible range [c_{min}, c_{max}] is obviously [1, pm]. It can be reasonably reduced as

c_{min} = p and c_{max} = pm − m(m − 1)/2. (10)

This is because solutions with c smaller than p would surely have one or more zero rows of loadings. On the other hand, the c_{max} value takes into account that m(m − 1)/2 elements in Λ can be set to zeros without a change in the value of loss function (8).
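Range (10) is easy to evaluate; for the p = 12, m = 3 setting used throughout the paper:

```python
# c_min = p: fewer nonzeros would force a zero row in Lambda.
# c_max = pm - m(m-1)/2: m(m-1)/2 loadings can be zeroed without loss of fit.
p, m = 12, 3
c_min = p
c_max = p * m - m * (m - 1) // 2
print(c_min, c_max, c_max - c_min + 1)  # 12 33 22: only 22 values of c to scan
```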

Now, let us discuss how our proposed procedure is related to the set S B containing all possible B, using S B ( c ) for the subset of S B that contains the matrices B with c nonzeros. The number of B in S B ( c ) is given by

N_{B}(c) = C(pm, c), (11)

i.e., the number of the combinations of the c elements being ones among all pm elements in B. Thus, the number of all B contained in S B is given by

N B = ∑ c = c min c max N B ( c ) (12)

with (10) and (11). For example, when p = 12 and m = 3, the value in (12) is approximately 1.25 × 10^{9}, which is enormous (though less than 2^{pm} ≈ 6.87 × 10^{10} presented in the last section, where (10) was not yet considered). However, our proposed procedure does not require assessing all N_{B} matrices B in S_B: performing (6) after (5) allows us to find the optimal B in S_B(c) for a particular c, and this is done for c = c_{min}, ⋯, c_{max} as in (9). That is, our proposed procedure can find the optimal B among all N_{B} matrices in S_B by the c_{max} − c_{min} + 1 runs of (6) following (5), rather than by assessing every B. The example of c_{max} − c_{min} + 1 = 22 for p = 12 and m = 3 demonstrates how easily the optimal B can be reached.

The remaining parts of this paper are organized as follows: in Sections 3.1 and 3.2, we detail (5) and (6) in turn. There, it is described that Kiers’ simplimax rotation fulfills a key role.

In this section, we detail the procedure formulated as (5) with f(Λ, Ψ, Φ | S) defined as (8). A key point is that we will use exploratory FA (EFA) followed by Kiers’ simplimax rotation.

Let T denote any m × m nonsingular matrix satisfying

diag ( T T ′ ) = I m , (13)

where diag(TT′) is the diagonal matrix whose diagonal elements are those of TT′. EFA has rotational indeterminacy, as shown next:

Σ = ΛT^{−1}(TT′)(T^{−1})′Λ′ + Ψ = Λ_{T}ΦΛ_{T}′ + Ψ (14)

with Φ = TT′ and Λ_{T} = ΛT^{−1}. Here, the latter matrix can also be regarded as a loading matrix. Thus, even if Φ is fixed to I_{m} by choosing T that meets TT′ = I_{m}, the value of (8) remains unchanged when Λ is replaced by ΛT^{−1}. Taking this property into account, f(Λ, Ψ, Φ = I_{m} | S), i.e., (8) with Φ = I_{m}, is minimized over Λ and Ψ in EFA. This is attained with the EM algorithm described in Appendix A1. We use Λ_{E} for the Λ resulting from EFA.
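The indeterminacy (14) can be verified numerically. The sketch below draws a hypothetical Λ_{E} and Ψ, normalizes a random nonsingular T so that diag(TT′) = I_{m} as in (13), and checks that Σ is unchanged when Λ_{E} is replaced by Λ_{E}T^{−1} and Φ = I by TT′:

```python
import numpy as np

rng = np.random.default_rng(0)

Lam_E = rng.normal(size=(6, 2))                    # illustrative EFA loadings
Psi = np.diag(rng.uniform(0.2, 0.8, size=6))       # illustrative unique variances

T = rng.normal(size=(2, 2))
T = np.diag(1 / np.sqrt(np.diag(T @ T.T))) @ T     # rescale rows so diag(TT') = I, see (13)

Sigma1 = Lam_E @ Lam_E.T + Psi                     # Phi = I
Lam_T = Lam_E @ np.linalg.inv(T)
Phi = T @ T.T
Sigma2 = Lam_T @ Phi @ Lam_T.T + Psi               # rotated representation (14)

print(np.allclose(Sigma1, Sigma2))                 # True
```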

The indeterminacy (14) implies that Λ_{T} = Λ_{E}T^{−1} and Φ = TT′ can also be EFA solutions of the factor loading and correlation matrices, respectively, with f(Λ_{E}, Ψ, Φ = I_{m} | S) = f(Λ_{T}, Ψ, Φ | S). This fact is often exploited by obtaining a T that allows Λ_{T} = Λ_{E}T^{−1} to be interpretable. The procedures for obtaining T are generally referred to as factor rotation.

This subsection concerns the factor rotation intermediating between the first EFA stage and the final CFA stage. Here, the loading matrix is unconstrained in EFA, but constrained to meet Λ = B • Λ in CFA as in (3). The latter constraint leads to

min_{Λ,Ψ,Φ} f(B • Λ, Ψ, Φ | S) ≥ f(Λ_{E}, Ψ, Φ = I | S) = f(Λ_{T}, Ψ, Φ | S), (15)

where f(B • Λ, Ψ, Φ | S) stands for the loss function in (3), in which the CFA constraint Λ = B • Λ has been substituted. Inequality (15) implies that the value of the EFA loss function f(Λ_{E}, Ψ, Φ = I | S) = f(Λ_{T}, Ψ, Φ | S) gives the lower limit of f(B • Λ, Ψ, Φ | S) to be minimized in CFA (3). This suggests that the best attainable CFA solution would be one whose min_{Λ,Ψ,Φ} f(B • Λ, Ψ, Φ | S) value is close to its EFA counterpart, the f(Λ_{E}, Ψ, Φ = I | S) = f(Λ_{T}, Ψ, Φ | S) value.

Now, if T can be chosen in the factor rotation such that Λ_{T} = Λ_{E}T^{−1} is similar to a matrix that can be written as B • Λ for particular matrices B and Λ, then we can expect that such B and Λ will be good candidates for giving a CFA solution with a fit close to that of EFA. This can be attained by the simplimax rotation, which minimizes

s p x ( B , Λ , T ) = ‖ B • Λ − Λ E T − 1 ‖ 2 (16)

over B, L, and T subject to (4) and (13). Thus, this rotation serves as a suitable bridge between the first and final stages.

For the constrained minimization of (16), two steps are iterated alternately. In one of them, the simplimax function (16) is minimized over T under (13) for a given B • Λ, using Browne’s algorithm. In the other step, (16) is minimized over B = (b_{ij}) and Λ under (4) for a given Λ_{E}T^{−1}. As this problem is also related to the procedure in the next section, we express the minimization in a generalized form as

min_{B,Λ} spx(B, Λ | H) = ‖B • Λ − H‖² for a given p × m matrix H = (h_{ij}), (17)

with H = Λ_{E}T^{−1} in the present context. As explained in Appendix A3, (17) can be attained for

b_{ij} = 1 if h_{ij}² ≥ h_{[c]}², and b_{ij} = 0 otherwise; Λ = H, (18)

with h_{[c]}² denoting the cth largest value of all elements in H • H = (h_{ij}²). The steps of the simplimax rotation algorithm are listed in Appendix A2.
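The closed-form update (18) is simple to implement; the sketch below thresholds the c largest squared elements of a hypothetical H (ties at the threshold would need extra care):

```python
import numpy as np

def simplimax_update(H, c):
    # (18): B keeps the c largest squared entries of H; Lambda = H.
    H2 = H ** 2
    thresh = np.sort(H2, axis=None)[-c]   # h^2_[c], the c-th largest squared element
    B = (H2 >= thresh).astype(int)
    return B, H

H = np.array([[0.9, 0.1],
              [0.8, -0.05],
              [0.2, 0.7],
              [-0.1, 0.6]])
B, Lam = simplimax_update(H, c=4)
print(B)   # [[1 0] [1 0] [0 1] [0 1]]
```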

The final stage is simply to perform CFA using the B obtained by the simplimax rotation. Thus, this simplimax-based CFA procedure (SbCFA) is formulated by making (5) concrete as

min Λ , Ψ , Φ f ( Λ , Ψ , Φ | S ) s.t. Λ = B • Λ

after B is estimated s.t. Card(B) = c by the simplimax rotation, (5′)

with f(Λ, Ψ, Φ | S) defined as (8). The minimization in (5), i.e., (5′), is attained with the EM algorithm described in Appendix A1.

We will now describe simplimax FA (SimpFA), the procedure minimizing (6) using the solution from SbCFA as a starting configuration. Specifically, we minimize the negative log likelihood (8), with B • Λ substituted for Λ, over B, Λ, Ψ, and Φ subject to Card(Λ) = c (see (4)). The name reflects that this is a new FA procedure in which the minimization of the simplimax function (17) fulfills a key role, as will be seen in the next paragraph.

The EM algorithm described in Appendix A1 can also be used for SimpFA. The algorithm for SimpFA differs from those for the other FA procedures in that it includes a step for minimizing (8) over both B and Λ with Ψ and Φ fixed. An innovative feature of the algorithm is to use Kiers’ majorization technique, in which

spx(B, Λ | W) = ‖B • Λ − W‖², (19)

i.e., the simplimax function in (17) with H = W, fulfills a key role. Here, W = Λ_{C} − β^{−1}V′ with V = Q′Λ_{C}′Ψ^{−1} − C′Ψ^{−1}, where Q and C are defined in Appendix A1, Λ_{C} is the current Λ value (before update), and β is a majorization constant; see Appendix A4 for the derivation.

Here, we describe the whole procedure of SimpFA following SbCFA, in which SbCFA provides the initial parameter values for SimpFA and SimpFA provides the final solution. We take a multiple-run approach in order to reduce the possibility of selecting a local minimizer as the optimal solution: SbCFA followed by SimpFA is run multiple times with mutually different initial values for SbCFA, and the best solution is selected among the resulting SimpFA solutions. The procedure is listed as follows:

Stage 1. For S, perform EFA to provide Λ_{E}.

Stage 2. For each of l = 1 , ⋯ , 100 , perform the following sub-stages:

Stage 2.1. Initialize T to T_{l} and perform simplimax. Express the resulting B as B_{l}.

Stage 2.2. For S, perform CFA using B_{l} as B. Express the resulting Λ, Ψ, and Φ as Λ̃_{l}, Ψ̃_{l}, and Φ̃_{l}, respectively, with their set Θ̃_{l} = {Λ̃_{l}, Ψ̃_{l}, Φ̃_{l}}.

Stage 2.3. For S, perform SimpFA with Λ, Ψ, and Φ initialized at Λ̃_{l}, Ψ̃_{l}, and Φ̃_{l}, respectively. Express the resulting Λ, Ψ, and Φ as Λ_{l}, Ψ_{l}, and Φ_{l}, respectively, with their set Θ_{l} = {Λ_{l}, Ψ_{l}, Φ_{l}}.

Stage 3. Select { Λ ^ , Ψ ^ , Φ ^ } with f ( Λ ^ , Ψ ^ , Φ ^ | S ) = min l f ( Λ l , Ψ l , Φ l | S ) as the optimal solution.

Here, the initial value T_{l} for T in Stage 2.1 is chosen as follows: T_{l} is obtained by the varimax rotation of Λ_{E} if l = 1; otherwise, T_{l} is set to diag(T_{0}T_{0}′)^{−1/2}T_{0} with the elements of T_{0} chosen randomly, so that the resulting T_{l} satisfies (13).

The final solution {Λ̂, Ψ̂, Φ̂} resulting from Stage 3 in the last subsection depends on the cardinality value c in (4). We thus use {Λ̂_{c}, Ψ̂_{c}, Φ̂_{c}} for the solution {Λ̂, Ψ̂, Φ̂} for a particular value of c. The best value for c can be selected with procedure (9), which can be rewritten using {Λ̂_{c}, Ψ̂_{c}, Φ̂_{c}} as

Choose {Λ̂_{c*}, Ψ̂_{c*}, Φ̂_{c*}} for the value c* = arg min_{c_{min} ≤ c ≤ c_{max}} IC(c) (20)

with c_{min} and c_{max} defined as (10).

As IC(c) in (20), we consider using either Akaike’s information criterion (AIC) or the Bayesian information criterion (BIC), given by

A I C ( c ) = 2 f ( Λ ^ c , Ψ ^ c , Φ ^ c | S ) + 2 κ ( c ) , (21)

B I C ( c ) = 2 f ( Λ ^ c , Ψ ^ c , Φ ^ c | S ) + log n κ ( c ) , (22)

respectively, for a particular value of c. Here, n is the number of observations, and κ(c) = c + α with α the number of unique variances plus the number of inter-factor correlations, i.e., α = p + m(m − 1)/2. Thus, (21) or (22) is substituted into IC(c) in (20). Whether we should use (21) or (22) is assessed in the next section.
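A sketch of (21) and (22), with κ(c) = c + p + m(m − 1)/2 and purely illustrative numbers for the loss value and n:

```python
import numpy as np

def aic_bic(f_value, c, p, m, n):
    # kappa(c): c loadings + p unique variances + m(m-1)/2 inter-factor correlations.
    kappa = c + p + m * (m - 1) // 2
    aic = 2 * f_value + 2 * kappa               # (21)
    bic = 2 * f_value + np.log(n) * kappa       # (22)
    return aic, bic

# Illustrative: f = 3.2 at c = 15 nonzero loadings, p = 12, m = 3, n = 200.
aic, bic = aic_bic(f_value=3.2, c=15, p=12, m=3, n=200)
print(aic, bic)
```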

We performed a simulation study to assess the procedure proposed in the last section, with respect to how well the cardinality c and the parameter values are recovered.

With p = 12 and m = 3, we set the true {Λ, Ψ, Φ} as shown in the table below. We then drew an n × p data matrix X from the multivariate normal distribution whose mean vector is 0_{p} and whose covariance matrix is defined as Σ = ΛΦΛ′ + Ψ, see (1). The resulting X provided the sample covariance matrix S = n^{−1}X′X. We replicated this procedure 200 times to obtain 200 matrices S.

For each S, we carried out the procedure of SimpFA following SbCFA with the selection of the best c by (20), using both AIC and BIC. It gives the SimpFA solutions { Λ ^ c ∗ , Ψ ^ c ∗ , Φ ^ c ∗ } , which are classified into two types according to whether c was chosen by AIC or BIC.

Let c_{true} denote the true Card(Λ); as found in the table below, c_{true} = 15. On the other

Λ | ψ_{ii} | Φ | ||||
---|---|---|---|---|---|---|

−0.9 | 0.2 | 1 | 0.2 | −0.3 | ||

0.8 | 0.3 | 0.2 | 1 | 0.1 | ||

0.7 | 0.5 | −0.3 | 0.1 | 1 | ||

0.6 | −0.6 | 0.4 | ||||

0.9 | 0.2 | |||||

−0.8 | 0.4 | |||||

0.7 | 0.5 | |||||

0.6 | −0.6 | 0.3 | ||||

0.9 | 0.2 | |||||

0.8 | 0.3 | |||||

−0.7 | 0.5 | |||||

−0.6 | 0.6 | 0.4 |

hand, c^{*} expresses the Card(Λ) selected by the proposed procedure (20). We obtained the deviation c^{*} − c_{true} for each of the 200 data sets. The averages (standard deviations) of these deviations over all data sets are 2.43 (1.40) when using AIC and 0.34 (0.62) when using BIC. This result shows that AIC tends to overestimate Card(Λ). Moreover, the absolute deviation |c^{*} − c_{true}| for BIC was smaller than that for AIC for 184 of the 200 data sets. These results show that BIC is substantially better than AIC. Accordingly, we take only the BIC-based solution into consideration from here on.

It was found that 98 percent of the 200 SimpFA solutions were equivalent to the SbCFA ones used for initializing the SimpFA parameters: no iteration in the SimpFA algorithm was required. In the remaining two percent of the runs, the SbCFA and SimpFA solutions were almost equivalent, with the average of the differences between the two solutions over the 200 data sets being 0.001. Here, the difference is defined as (‖Λ̂_{c*} − Λ̃_{c*}‖₁ + ‖Ψ̂_{c*} − Ψ̃_{c*}‖₁ + ‖Φ̂_{c*} − Φ̃_{c*}‖₁)/{pm + p − m(m − 1)}, with ‖Γ‖₁ = Σ_{i}Σ_{j}|γ_{ij}| denoting the L₁ norm of a matrix Γ = (γ_{ij}) and {Λ̃_{c*}, Ψ̃_{c*}, Φ̃_{c*}} being the SbCFA counterpart of {Λ̂_{c*}, Ψ̂_{c*}, Φ̂_{c*}}.

The above results do not imply that SimpFA (following SbCFA) is useless: SimpFA serves to show the equivalence of its solution to the SbCFA one, which allows us to establish the optimality of the SbCFA solution.

We assess how well the true parameter values are recovered by the SimpFA solution Θ̂_{c*} = {Λ̂_{c*}, Ψ̂_{c*}, Φ̂_{c*}}, which is provided by (20) with IC(c) = BIC(c). As indices standing for the badness of recovery of Λ, Ψ, and Φ, we obtained ‖Λ̂_{c*} − Λ_{true}‖₁/(pm), ‖Ψ̂_{c*} − Ψ_{true}‖₁/p, and ‖Φ̂_{c*} − Φ_{true}‖₁/M, respectively, with M = m(m − 1). Further, for assessing the incorrectness in identifying the true zero and nonzero loadings, we recorded, for each data set, MIR₀ = (the number of true zero loadings estimated as nonzero)/(pm − c) and MIR_{#} = (the number of true nonzero loadings estimated as zero)/c. Here, MIR stands for misidentification rate.
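The two misidentification rates can be sketched as follows (the matrices are illustrative, not the simulation’s true values):

```python
import numpy as np

def mir(Lam_true, Lam_hat):
    # MIR_0: fraction of true zero loadings estimated as nonzero.
    # MIR_#: fraction of true nonzero loadings estimated as zero.
    true_zero = Lam_true == 0
    mir0 = np.sum(true_zero & (Lam_hat != 0)) / np.sum(true_zero)
    mir_nz = np.sum(~true_zero & (Lam_hat == 0)) / np.sum(~true_zero)
    return mir0, mir_nz

Lam_true = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
Lam_hat  = np.array([[0.9, 0.1], [0.8, 0.0], [0.0, 0.7], [0.0, 0.0]])
print(mir(Lam_true, Lam_hat))   # (0.25, 0.25)
```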

The statistics of the resulting five index values over the 200 data sets are presented in the table below.

In order to demonstrate how useful our proposed procedure is, we apply it to two data sets that have already been analyzed by CFA with zero constraints selected by users. The solutions of the latter user-based CFA are compared with those for our proposed procedure.

The first data set is Carlson and Mulaik’s [

Another data set is Kojima’s housing preference one with n = 1120 (participants) and p = 13 (features). It describes to what degree the participants wish to

Index | 5th pctl. | 25th pctl. | 50th pctl. | 75th pctl. | 95th pctl. | Avg. | SD
---|---|---|---|---|---|---|---

‖ Λ ^ c ∗ − Λ true ‖ 1 / ( p m ) | 0.011 | 0.013 | 0.015 | 0.017 | 0.088 | 0.023 | 0.035 |

‖ Ψ ^ c ∗ − Ψ true ‖ 1 / p | 0.029 | 0.035 | 0.039 | 0.045 | 0.053 | 0.040 | 0.008 |

‖ Φ ^ c ∗ − Φ true ‖ 1 / M | 0.020 | 0.035 | 0.049 | 0.069 | 0.227 | 0.071 | 0.089 |

MIR_{0} | 0.000 | 0.000 | 0.000 | 0.000 | 0.095 | 0.006 | 0.026 |

MIR_{#} | 0.000 | 0.000 | 0.000 | 0.067 | 0.200 | 0.032 | 0.067 |

Variable | User-based CFA:BIC= 924.1 | Proposed: BIC = 12.3 | ||||||
---|---|---|---|---|---|---|---|---|

Λ | ψ_{ii} | Λ | ψ_{ii} | |||||

Friendly | 0.85 | 0.53 | 0.85 | 0.29 | ||||

Sympathetic | 0.92 | 0.39 | 1.03 | -0.19 | 0.15 | |||

Kind | 0.93 | 0.38 | 1.10 | -0.28 | 0.11 | |||

Affectionate | 0.90 | 0.43 | 0.90 | 0.19 | ||||

Intelligent | 0.88 | 0.48 | 0.88 | 0.23 | ||||

Capable | 0.93 | 0.38 | 0.93 | 0.15 | ||||

Competent | 0.93 | 0.38 | 0.93 | 0.15 | ||||

Smart | 0.90 | 0.43 | 0.90 | 0.19 | ||||

Talkative | 0.80 | 0.60 | 0.80 | 0.35 | ||||

Outgoing | 0.95 | 0.32 | 0.95 | 0.10 | ||||

Gregarious | 0.89 | 0.45 | 0.89 | 0.21 | ||||

Extrovert | 0.90 | 0.44 | 0.90 | 0.19 | ||||

Helpful | 0.73 | 0.22 | 0.59 | 0.74 | 0.20 | 0.35 | ||

Cooperative | 0.72 | 0.23 | 0.60 | 0.73 | 0.21 | 0.36 | ||

Sociable | 0.17 | 0.83 | 0.34 | 0.18 | 0.81 | 0.12 | ||

Factor | Φ | Φ | ||||||

1 | 1.00 | 0.22 | 0.56 | 1.00 | 0.25 | 0.64 | ||

2 | 0.22 | 1.00 | 0.30 | 0.25 | 1.00 | 0.30 | ||

3 | 0.56 | 0.30 | 1.00 | 0.64 | 0.30 | 1.00 |

live in the houses featured by the variables. The correlation matrix for this data set is presented in the table below.

In both of the above examples, the proposed procedure gave better (lower) BIC values, while its solutions are similar to the counterparts of CFA with the users’ selected constraints.

A problem in the confirmatory factor analysis (CFA) is that users must select

Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

1 | Food services | 1.000 | 0.423 | 0.404 | 0.127 | 0.179 | 0.132 | 0.148 | 0.214 | 0.225 | 0.252 | 0.125 | 0.122 | 0.192 |

2 | Tea services | 0.423 | 1.000 | 0.751 | 0.187 | 0.267 | 0.115 | 0.188 | 0.223 | 0.265 | 0.300 | 0.144 | 0.097 | 0.193 |

3 | Home for the old | 0.404 | 0.751 | 1.000 | 0.238 | 0.276 | 0.178 | 0.239 | 0.249 | 0.281 | 0.325 | 0.174 | 0.147 | 0.220 |

4 | Lush greenery | 0.127 | 0.187 | 0.238 | 1.000 | 0.411 | 0.399 | 0.350 | 0.286 | 0.274 | 0.273 | 0.150 | 0.192 | 0.176 |

5 | Walking and jogging | 0.179 | 0.267 | 0.276 | 0.411 | 1.000 | 0.545 | 0.330 | 0.302 | 0.296 | 0.333 | 0.235 | 0.248 | 0.224

6 | Large park | 0.132 | 0.115 | 0.178 | 0.399 | 0.545 | 1.000 | 0.298 | 0.327 | 0.229 | 0.362 | 0.173 | 0.242 | 0.143 |

7 | River and lake | 0.148 | 0.188 | 0.239 | 0.350 | 0.330 | 0.298 | 1.000 | 0.266 | 0.205 | 0.240 | 0.140 | 0.153 | 0.161 |

8 | Communal events | 0.214 | 0.223 | 0.249 | 0.286 | 0.302 | 0.327 | 0.266 | 1.000 | 0.417 | 0.554 | 0.338 | 0.293 | 0.324 |

9 | Multigenerational living | 0.225 | 0.265 | 0.281 | 0.274 | 0.296 | 0.229 | 0.205 | 0.417 | 1.000 | 0.371 | 0.223 | 0.171 | 0.222 |

10 | Active community | 0.252 | 0.300 | 0.325 | 0.273 | 0.333 | 0.362 | 0.240 | 0.554 | 0.371 | 1.000 | 0.286 | 0.291 | 0.283 |

11 | Interactions for hobbies | 0.125 | 0.144 | 0.174 | 0.150 | 0.235 | 0.173 | 0.140 | 0.338 | 0.223 | 0.286 | 1.000 | 0.335 | 0.452 |

12 | House party | 0.122 | 0.097 | 0.147 | 0.192 | 0.248 | 0.242 | 0.153 | 0.293 | 0.171 | 0.291 | 0.335 | 1.000 | 0.333 |

13 | Utilizing one’s careers | 0.192 | 0.193 | 0.220 | 0.176 | 0.224 | 0.143 | 0.161 | 0.324 | 0.222 | 0.283 | 0.452 | 0.333 | 1.000

Variable | (A) User-based CFA: BIC = 10915.5 | (B) Proposed: BIC = 10864.2 | ||||||||
---|---|---|---|---|---|---|---|---|---|---|

Λ | ψ_{ii} | Λ | ψ_{ii} | |||||

Food services | 0.48 | 0.77 | 0.39 | 0.16 | 0.76 | |||||

Tea services | 0.86 | 0.27 | 0.89 | 0.21 | ||||||

Home for the old | 0.88 | 0.23 | 0.81 | 0.09 | 0.28 | |||||

Lush greenery | 0.59 | 0.65 | 0.58 | 0.67 | ||||||

Walking and jogging | 0.74 | 0.46 | 0.82 | −0.11 | 0.45 | |||||

Large park | 0.69 | 0.52 | −0.18 | 0.78 | 0.48 | |||||

River and lake | 0.49 | 0.77 | 0.48 | 0.77 | ||||||

Communal events | 0.74 | 0.45 | −0.17 | 0.84 | 0.40 | |||||

Multigenerational living | 0.55 | 0.70 | 0.55 | 0.70 | ||||||

Active community | 0.73 | 0.47 | 0.72 | 0.49 | ||||||

Interactions for hobbies | 0.66 | 0.57 | 0.67 | 0.55 | ||||||

House party | 0.53 | 0.72 | 0.15 | 0.43 | 0.73 | |||||

Utilizing one’s careers | 0.66 | 0.57 | 0.67 | 0.55 | |||||

Factor | Φ | Φ | ||||||||

1 | 1.00 | 0.38 | 0.46 | 0.32 | 1.00 | 0.41 | 0.50 | 0.28 | ||

2 | 0.38 | 1.00 | 0.65 | 0.47 | 0.41 | 1.00 | 0.69 | 0.44 | ||

3 | 0.46 | 0.65 | 1.00 | 0.65 | 0.50 | 0.69 | 1.00 | 0.62 | ||

4 | 0.32 | 0.47 | 0.65 | 1.00 | 0.28 | 0.44 | 0.62 | 1.00 |

what pairs of variables and factors are linked, in other words, which loadings are to be zero or nonzero, in a subjective manner. To deal with this problem, we proposed the procedure of SimpFA following SbCFA for computationally identifying an optimal CFA model and its solution without relying on users’ judgments. The simulation study showed that the true CFA model and parameter values can be recovered fairly well by the proposed procedure. Real data examples demonstrated that it can outperform CFA with users’ selected constraints in terms of the BIC statistic.

In Section 4.3, we found the SbCFA solutions to be equivalent to the SimpFA ones in almost all cases. This good performance of SbCFA was somewhat surprising; explaining it theoretically is a subject for future study.

The authors declare no conflicts of interest regarding the publication of this paper.

Cai, J.Y., Kiers, H.A.L. and Adachi, K. (2021) Computational Identification of Confirmatory Factor Analysis Model with Simplimax Procedures. Open Journal of Statistics, 11, 1044-1061. https://doi.org/10.4236/ojs.2021.116062

In Rubin and Thayer’s (1982) EM algorithm for FA, an auxiliary function of (8) is defined as

ϕ ( Λ , Ψ , Φ | S ) = log | Ψ | + tr Ψ − 1 ( S − 2 C Λ ′ + Λ Q Λ ′ ) + log | Φ | + tr Q Φ − 1 . (A1)

Here, C = SA and Q = A′SA + U, with A and U computed as

A = ( Λ Φ Λ ′ + Ψ ) − 1 Λ Φ and U = Φ 1 / 2 [ I m + ( Φ 1 / 2 Λ ′ ) Ψ − 1 ( Λ Φ 1 / 2 ) ] − 1 Φ 1 / 2 (A2)

using the current Λ, Ψ, and Φ values. In the EM algorithm, (A1) rather than (8) is minimized iteratively. This minimization is known to allow (8) to be minimized (Rubin & Thayer, 1982). This fact also holds true when Λ is constrained as Λ = B • Λ. Thus, the minimization of (8) in EFA, SbCFA (5), and SimpFA (6) is attained with the EM algorithm.
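The quantities in (A2) can be sketched numerically. Note that Φ^{1/2}[I_{m} + (Φ^{1/2}Λ′)Ψ^{−1}(ΛΦ^{1/2})]^{−1}Φ^{1/2} = (Φ^{−1} + Λ′Ψ^{−1}Λ)^{−1}, so U can be computed without a matrix square root; by the Woodbury identity, this also equals Φ − ΦΛ′Σ^{−1}ΛΦ. The parameter values below are illustrative:

```python
import numpy as np

Lam = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])
Psi = np.diag([0.19, 0.36, 0.51, 0.64])
Sigma = Lam @ Phi @ Lam.T + Psi

A = np.linalg.solve(Sigma, Lam @ Phi)              # A = Sigma^{-1} Lam Phi, see (A2)
U = np.linalg.inv(np.linalg.inv(Phi) + Lam.T @ np.linalg.inv(Psi) @ Lam)
U_woodbury = Phi - Phi @ Lam.T @ np.linalg.solve(Sigma, Lam @ Phi)

S = Sigma                                          # a perfect-fit S, for illustration
C = S @ A                                          # C = SA
Q = A.T @ S @ A + U                                # Q = A'SA + U
# With S = Sigma, A'SA = Phi Lam' Sigma^{-1} Lam Phi, so Q reduces to Phi.
print(np.allclose(U, U_woodbury), np.allclose(Q, Phi))
```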

The EM algorithm for EFA, SbCFA, and SimpFA can be summarized as follows:

Step 1. Initialize Λ, Ψ, and Φ, with B also initialized in SimpFA.

Step 2. Obtain C and Q in (A1) through (A2).

Step 3. Minimize (A1) over Ψ with Λ and Φ fixed.

Step 4. Minimize (A1) over Λ with Ψ and Φ fixed, where “over Λ” is replaced by “over B and Λ under (4)” only in SimpFA.

Step 5. Minimize (A1) over Φ with Ψ and Λ fixed.

Step 6. Transform Φ and Λ so that Φ is a correlation matrix; finish if convergence is reached, otherwise go back to Step 2.

Here, the minimization in Step 3 is attained when the ith diagonal element of Ψ is updated as ψ_{ii} = s_{ii} − 2λ_{i}′c_{i} + λ_{i}′Qλ_{i} (i = 1, ⋯, p), with λ_{i}′ the ith row of Λ, c_{i}′ that of C, and s_{ii} the (i, i) element of S. The task of Step 5 is attained simply by updating Φ as Φ = Q. This is because Φ may be regarded simply as a covariance matrix during the iterations and finally be transformed into a correlation matrix. Thus, if convergence is reached in Step 6, we must transform Φ and Λ as Φ_{R} = diag(Φ)^{−1/2} Φ diag(Φ)^{−1/2} and Λ_{R} = Λ diag(Φ)^{1/2}, then regard the resulting Φ_{R} and Λ_{R} as the solutions of Φ and Λ, respectively. Here, we should notice that this transformation does not change the value of (8), which is shown by ΛΦΛ′ = ΛD(D^{−1}ΦD^{−1})(ΛD)′ with D an m × m diagonal matrix. Convergence in Step 6 is defined here as the decrease in the function value from the previous round being less than 10^{−6}. The details of Steps 1 and 4 differ among EFA, SbCFA, and SimpFA, as described in the next paragraphs.

The initialization in Step 1 is detailed here. In EFA, the eigenvalue decomposition S = LD²L′ is used, with D² the p × p diagonal matrix whose diagonal elements are arranged in descending order and LL′ = I_{p}. That is, in EFA the initial Λ and Ψ are set to L_{m}Δ_{m} and diag(S − L_{m}Δ_{m}²L_{m}′), respectively, with Δ_{m}² the first m × m diagonal block of D² and L_{m} the p × m matrix containing the first m columns of L. In SbCFA, the initial Ψ is set to the one obtained in the preceding EFA, while the initial Λ and Φ are set to the matrices B • Λ and TT′, respectively, obtained by the preceding simplimax rotation. In SimpFA, Λ, Ψ, and Φ are initialized at their SbCFA solutions and B at its simplimax rotation solution.
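The EFA part of Step 1 can be sketched directly from the eigenvalue decomposition of S (the data below are simulated only to produce some S):

```python
import numpy as np

def efa_init(S, m):
    # S = L D^2 L' with eigenvalues in descending order; Lambda_0 = L_m Delta_m,
    # Psi_0 = diag(S - L_m Delta_m^2 L_m').
    evals, L = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, L = evals[order], L[:, order]
    Lm = L[:, :m]
    Lam0 = Lm * np.sqrt(evals[:m])                 # L_m Delta_m (column-wise scaling)
    Psi0 = np.diag(np.diag(S - Lm @ np.diag(evals[:m]) @ Lm.T))
    return Lam0, Psi0

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
S = X.T @ X / 50
Lam0, Psi0 = efa_init(S, m=2)
# By construction, Lam0 Lam0' + Psi0 reproduces the diagonal of S exactly.
print(np.allclose(np.diag(Lam0 @ Lam0.T + Psi0), np.diag(S)))   # True
```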

The minimization in Step 4 is attained by updating Λ as Λ = CQ^{−1} in EFA, and row-wise as λ_{i} = B_{i}′(B_{i}QB_{i}′)^{−1}B_{i}c_{i} (i = 1, ⋯, p) in SbCFA. Here, B_{i} is the m_{i} × m binary matrix satisfying 1_{m_i}′B_{i} = b_{i}′, with b_{i}′ the ith row of the link matrix B = (b_{ij}) defined as (2) and m_{i} = b_{i}′1_{m}: for example, if b_{i}′ = [1, 0, 1],

then B_{i} = [1 0 0; 0 0 1]. Step 4 in SimpFA is detailed in Appendix A4.
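The row-wise SbCFA update of Step 4 can be sketched as below; B, Q, and C are illustrative values, and the selection matrix B_{i} is realized by index slicing rather than explicit binary matrices:

```python
import numpy as np

def update_loadings(B, C, Q):
    # lambda_i = B_i' (B_i Q B_i')^{-1} B_i c_i for each row i:
    # solve the restricted equations on the factors linked to variable i,
    # leaving the unlinked loadings at zero.
    p, m = B.shape
    Lam = np.zeros((p, m))
    for i in range(p):
        idx = np.flatnonzero(B[i])                     # factors linked to variable i
        Lam[i, idx] = np.linalg.solve(Q[np.ix_(idx, idx)], C[i, idx])
    return Lam

B = np.array([[1, 0], [1, 0], [0, 1], [1, 1]])
Q = np.array([[1.0, 0.3], [0.3, 1.0]])
C = np.array([[0.9, 0.2], [0.8, 0.1], [0.1, 0.7], [0.5, 0.6]])
Lam = update_loadings(B, C, Q)
print(Lam)
```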

A2. Algorithm for Simplimax Rotation

The algorithm for the simplimax rotation can be summarized as follows:

Step 1. Initialize T.

Step 2. Update B and Λ as (18) with H = Λ_{E}T^{−1}.

Step 3. Update T with Browne’s (1972) algorithm.

Step 4. Finish if convergence is reached; otherwise, go back to Step 2.

How T is initialized is described in the section on the whole process of the proposed procedure. Convergence is defined here as the decrease in (16) from the previous round being less than 10^{−6}.

The simplimax function s p x ( B , Λ | H ) = ‖ B • Λ − H ‖ 2 in (17) can be rewritten as

s p x ( B , Λ | H ) = ∑ ( i , j ) ∈ ℵ ( b i j λ i j − h i j ) 2 + ∑ ( i , j ) ∈ ℵ ⊥ h i j 2 ≥ ∑ ( i , j ) ∈ ℵ ⊥ h i j 2 . (A3)

with H = (h_{ij}). Here, ℵ denotes the set of index pairs (i, j) with b_{ij} = 1, while ℵ⊥ is the set of (i, j) with b_{ij} = 0, and we have used Σ_{(i,j)∈ℵ⊥}(b_{ij}λ_{ij} − h_{ij})² = Σ_{(i,j)∈ℵ⊥}h_{ij}². The inequality in (A3) shows that the lower limit of spx(B, Λ | H) is Σ_{(i,j)∈ℵ⊥}h_{ij}², which is attained by setting λ_{ij} = h_{ij} for (i, j) ∈ ℵ. Furthermore, the limit Σ_{(i,j)∈ℵ⊥}h_{ij}² is minimal when ℵ⊥ contains the (i, j) with h_{ij}² ≤ [h_{ij}²]_{q}, where q = pm − c and [h_{ij}²]_{q} is the qth smallest value among h_{ij}², i = 1, ⋯, p; j = 1, ⋯, m. This can be rewritten as (18). That is, (17) is attained by (18).

Let us rewrite (A1) as φ*(Λ) + const, where const is the part not depending on Λ = B • Λ and

ϕ * ( Λ ) = − 2tr Ψ − 1 C Λ ′ + tr Ψ − 1 Λ Q Λ ′ = tr Ψ − 1 Λ Q Λ ′ − 2tr C ′ Ψ − 1 Λ . (A4)

The minimization of this function over B • Λ is the task of Step 4 in Appendix A1 for SimpFA. Using Δ = Λ − Λ_{C}, i.e., Λ = Λ_{C} + Δ, with Λ_{C} the current Λ value, (A4) can be rewritten as

φ*(Λ) = tr Ψ^{−1}(Λ_{C} + Δ)Q(Λ_{C} + Δ)′ − 2tr C′Ψ^{−1}(Λ_{C} + Δ) = φ*(Λ_{C}) + tr Ψ^{−1}ΔQΔ′ + 2tr VΔ (A4′)

with V = Q′Λ_{C}′Ψ^{−1} − C′Ψ^{−1}. Kiers (1990, Theorem 1) shows that tr Ψ^{−1}ΔQΔ′ in (A4′) satisfies tr Ψ^{−1}ΔQΔ′/‖Δ‖² ≤ α, i.e., tr Ψ^{−1}ΔQΔ′ ≤ α‖Δ‖², where α is the greatest eigenvalue of Ψ^{−1} ⊗ Q, with ⊗ denoting the Kronecker product. Using that inequality, a function majorizing (A4′) can be defined as

η(Λ) = φ*(Λ_{C}) + β‖Δ‖² + 2tr VΔ = φ*(Λ_{C}) + β‖Δ + β^{−1}V′‖² − β^{−1}‖V‖², (A5)

with β ≥ α and β > 0: (A5) satisfies φ*(Λ_{C}) = η(Λ_{C}) ≥ η(Λ̂) ≥ φ*(Λ̂) for Λ̂ the minimizer of η(Λ). Since only β‖Δ + β^{−1}V′‖² is a function of Λ on the right side of (A5), the task of Step 4 is attained by minimizing η*(Λ) = ‖Δ + β^{−1}V′‖². From Δ = Λ − Λ_{C}, we can rewrite η*(Λ) as η*(Λ) = ‖Λ − Λ_{C} + β^{−1}V′‖² = ‖Λ − W‖², which is the simplimax function (19), with W = Λ_{C} − β^{−1}V′. Thus, the B and Λ to be obtained in Step 4 are given by (18) with H = (h_{ij}) and h_{[c]}² replaced by W = (w_{ij}) and w_{[c]}², respectively. Here, w_{[c]}² is the cth largest value of all elements in W • W = (w_{ij}²).