A Note on Approximation of Likelihood Ratio Statistic in Exploratory Factor Analysis
1. Introduction
Factor analysis [1] [2] is used in various fields to study interdependence among a set of observed variables by postulating underlying factors. We consider the model of exploratory factor analysis in the form
$$\Sigma = \Lambda\Lambda' + \Psi, \tag{1}$$
where $\Sigma$ is the $p \times p$ covariance matrix of observed variables, $\Lambda$ is a $p \times m$ matrix of factor loadings, and $\Psi$ is a diagonal matrix of error variances with positive diagonal elements. Under the assumption of multivariate normal distributions for observations, the parameters are estimated by the method of maximum likelihood, and the goodness-of-fit of the model can be judged by the likelihood ratio (LR) test of the null hypothesis $H_0: \Sigma = \Lambda\Lambda' + \Psi$ for a specified $m$ against the alternative that $\Sigma$ is unconstrained. From the theory of LR tests, the degrees of freedom, $\nu$, of the asymptotic chi-square distribution is the difference between the number of free parameters in the alternative model and in the null model. In (1), $\Sigma$ remains unchanged if $\Lambda$ is replaced by $\Lambda T$ for any $m \times m$ orthogonal matrix $T$. Hence, $m(m-1)/2$ restrictions are required to eliminate this indeterminacy. Then, the difference between the number of nonduplicated elements in $\Sigma$ and the number of free parameters in $\Lambda$ and $\Psi$ is given by
$$\nu = \frac{p(p+1)}{2} - \left\{pm + p - \frac{m(m-1)}{2}\right\} = \frac{1}{2}\left\{(p-m)^2 - (p+m)\right\}. \tag{2}$$
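The parameter count behind (2) can be sketched in a few lines of Python. The function names below are illustrative and not part of the original note; the sketch simply counts the nonduplicated elements of $\Sigma$, subtracts the free parameters of the null model, and compares the result with the closed form.

```python
def lr_degrees_of_freedom(p: int, m: int) -> int:
    """Degrees of freedom of the LR test with p observed variables and m factors."""
    n_sigma = p * (p + 1) // 2        # nonduplicated elements of Sigma
    n_rotation = m * (m - 1) // 2     # restrictions removing the rotational indeterminacy
    n_free = p * m + p - n_rotation   # free parameters in Lambda and Psi
    return n_sigma - n_free


def lr_degrees_of_freedom_closed_form(p: int, m: int) -> int:
    """Closed form {(p - m)^2 - (p + m)} / 2; the numerator is always even."""
    return ((p - m) ** 2 - (p + m)) // 2


if __name__ == "__main__":
    for p, m in [(5, 2), (6, 2), (9, 3)]:
        print(p, m, lr_degrees_of_freedom(p, m), lr_degrees_of_freedom_closed_form(p, m))
```

For example, $p = 6$ and $m = 2$ give $21 - 17 = 4$ degrees of freedom under both forms.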
2. LR Statistic in Exploratory Factor Analysis
2.1. Approximation of LR Statistic
Let $S$ be the usual unbiased estimator of $\Sigma$ based on a random sample of size $N = n + 1$ from the multivariate normal population $N_p(\mu, \Sigma)$ with $\Sigma = \Lambda\Lambda' + \Psi$. For the existence of consistent estimators, we assume that the solution $\Psi$ of $\Sigma = \Lambda\Lambda' + \Psi$ is unique. A necessary condition for the uniqueness of the solution $\Psi$ up to multiplication on the right of $\Lambda$ by an orthogonal matrix is that each column of $\Lambda A$ has at least three non-zero elements for every non-singular matrix $A$ ([3] , Theorem 5.6). This condition implies that $\mathrm{rank}(\Lambda) = m$.
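The maximum Wishart likelihood estimators $\hat\Lambda$ and $\hat\Psi$ are defined as the values of $\Lambda$ and $\Psi$ that minimize
$$F(\Lambda, \Psi) = \mathrm{tr}(S\Sigma^{-1}) - \log|S\Sigma^{-1}| - p, \qquad \Sigma = \Lambda\Lambda' + \Psi. \tag{3}$$
Then, $\hat\Lambda$ and $\hat\Psi$ can be shown to be the solutions of the following equations:
$$(S - \hat\Sigma)\hat\Psi^{-1}\hat\Lambda = 0, \tag{4}$$
$$\mathrm{diag}(S - \hat\Sigma) = 0, \tag{5}$$
where $\hat\Sigma = \hat\Lambda\hat\Lambda' + \hat\Psi$. The motivation behind the minimization of $F(\Lambda, \Psi)$ in (3) is that
$$-2\log\frac{\max_{\Lambda,\Psi} L(\Lambda\Lambda' + \Psi)}{\max_{\Sigma} L(\Sigma)} = nF(\hat\Lambda, \hat\Psi), \tag{6}$$
where $L(\Sigma)$ denotes the Wishart likelihood of $S$; that is, $n$ times the minimum value $F(\hat\Lambda, \hat\Psi)$ is the LR statistic described in the previous section. Under (4) and (5), $(S - \hat\Sigma)\hat\Sigma^{-1} = (S - \hat\Sigma)\hat\Psi^{-1}$ and $\mathrm{tr}(S\hat\Sigma^{-1}) = p$ can be shown to hold. Hence, $F(\hat\Lambda, \hat\Psi) = \log|\hat\Sigma| - \log|S|$.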
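From the second-order Taylor formula, we have an approximation of the LR statistic as
$$nF(\hat\Lambda, \hat\Psi) \approx \frac{n}{2}\,\mathrm{tr}\left[\{(S - \hat\Sigma)\hat\Psi^{-1}\}^2\right] = n\sum_{i<j}\frac{(s_{ij} - \hat\sigma_{ij})^2}{\hat\psi_i\hat\psi_j}, \tag{7}$$
by virtue of (5) [1] [2] , where $s_{ij}$ and $\hat\sigma_{ij}$ are the elements of $S$ and $\hat\Sigma$, and $\hat\psi_i$ is the $i$-th diagonal element of $\hat\Psi$. While the approximation on the right-hand side of (7) shows how the LR statistic is related to the sum of squares of standardized residuals [4] , it does not enable us to investigate the distributional properties of the LR statistic. To overcome this difficulty, we express the LR statistic as a function of $U = S - \Sigma$.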
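The second-order Taylor step can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical $2 \times 2$ covariance matrix and perturbation direction (neither comes from the note), and using an arbitrary fixed $\Sigma$ rather than the fitted $\hat\Sigma$. It compares the discrepancy $\mathrm{tr}(S\Sigma^{-1}) - \log|S\Sigma^{-1}| - p$ with the quadratic form $\frac{1}{2}\mathrm{tr}[\{(S - \Sigma)\Sigma^{-1}\}^2]$ as $S$ approaches $\Sigma$.

```python
import math


def inv2(a):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def discrepancy(s, sigma):
    """F = tr(S Sigma^{-1}) - log|S Sigma^{-1}| - p with p = 2."""
    r = mul2(s, inv2(sigma))
    det_r = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return r[0][0] + r[1][1] - math.log(det_r) - 2


def quadratic_approx(s, sigma):
    """Second-order term (1/2) tr[{(S - Sigma) Sigma^{-1}}^2]."""
    diff = [[s[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    d = mul2(diff, inv2(sigma))
    d2 = mul2(d, d)
    return 0.5 * (d2[0][0] + d2[1][1])


# Hypothetical values chosen only for illustration.
sigma = [[2.0, 1.0], [1.0, 3.0]]
u = [[1.0, 0.5], [0.5, 0.5]]

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05, 0.01):
        s = [[sigma[i][j] + eps * u[i][j] for j in range(2)] for i in range(2)]
        print(eps, discrepancy(s, sigma), quadratic_approx(s, sigma))
```

The two printed columns should agree more closely as `eps` decreases, because the neglected terms are of third order in $S - \Sigma$.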
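Let $\Lambda^{(1)}$ and $\Psi^{(1)}$ denote the terms of $\hat\Lambda - \Lambda$ and $\hat\Psi - \Psi$ linear in the elements of $U = S - \Sigma$. Then we have the following proposition.
Proposition 1. An approximation of the LR statistic is given by
$$nF(\hat\Lambda, \hat\Psi) \approx \frac{n}{2}\left[\mathrm{tr}\{(\Phi U)^2\} - \mathrm{tr}\{(\Phi\Psi^{(1)})^2\}\right], \tag{8}$$
where $\Phi$ is defined by
$$\Phi = \Psi^{-1} - \Psi^{-1}\Lambda\Gamma^{-1}\Lambda'\Psi^{-1}, \tag{9}$$
with $\Gamma = \Lambda'\Psi^{-1}\Lambda$.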
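Proof. By substituting $\hat\Lambda = \Lambda + \Lambda^{(1)} + \cdots$, and $\hat\Psi = \Psi + \Psi^{(1)} + \cdots$ into (4) and (5) and considering only linear terms, we have
$$(U - \Sigma^{(1)})\Psi^{-1}\Lambda = 0, \tag{10}$$
$$\mathrm{diag}(U - \Sigma^{(1)}) = 0, \tag{11}$$
where $U = S - \Sigma$ and $\Sigma^{(1)} = \Lambda^{(1)}\Lambda' + \Lambda\Lambda^{(1)\prime} + \Psi^{(1)}$. From (10) we derive
$$(U - \Sigma^{(1)})\Psi^{-1}\Lambda\Gamma^{-1}\Lambda' = 0,$$
$$\Lambda\Gamma^{-1}\Lambda'\Psi^{-1}(U - \Sigma^{(1)}) = 0,$$
where $\Gamma = \Lambda'\Psi^{-1}\Lambda$. Then
$$U - \Sigma^{(1)} = \Psi\Phi(U - \Sigma^{(1)})\Phi\Psi$$
by virtue of $\Psi\Phi = I - \Lambda\Gamma^{-1}\Lambda'\Psi^{-1}$. Thus, because $\Phi\Lambda = 0$,
$$U - \Sigma^{(1)} = \Psi\Phi(U - \Psi^{(1)})\Phi\Psi. \tag{12}$$
By replacing $S - \hat\Sigma$ in (7) with $U - \Sigma^{(1)}$, we have
$$nF(\hat\Lambda, \hat\Psi) \approx \frac{n}{2}\,\mathrm{tr}\left[\{\Phi(U - \Psi^{(1)})\}^2\right],$$
since $\Phi\Psi\Phi = \Phi$. It follows from (11) and (12) that
$$\mathrm{diag}\{\Phi(U - \Psi^{(1)})\Phi\} = 0. \tag{13}$$
Because $\Psi^{(1)}$ is diagonal, (13) implies $\mathrm{tr}(\Phi U\Phi\Psi^{(1)}) = \mathrm{tr}\{(\Phi\Psi^{(1)})^2\}$, thus establishing the desired result.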
2.2. Evaluating Expectation
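For the purpose of demonstrating the usefulness of the derived approximation, we show explicitly that the expectation of (8) agrees with the degrees of freedom, $\nu$, in (2) of the asymptotic chi-square distribution. We now evaluate the expectation of (8) by using
$$nE(u_{ij}u_{kl}) = \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk}, \tag{14}$$
where $u_{ij}$ and $\sigma_{ij}$ are the elements of $U = S - \Sigma$ and $\Sigma$; see, for example, Theorem 3.4.4 of [1] . By noting $\Phi\Sigma\Phi = \Phi$ and $\mathrm{tr}(\Phi\Sigma) = p - m$, we see that the expectation of the first term in (8) is
$$\frac{n}{2}E\left[\mathrm{tr}\{(\Phi U)^2\}\right] = \frac{1}{2}\left[\{\mathrm{tr}(\Phi\Sigma)\}^2 + \mathrm{tr}\{(\Phi\Sigma)^2\}\right] = \frac{1}{2}\left\{(p-m)^2 + (p-m)\right\}. \tag{15}$$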
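To evaluate the expectation of the second term in (8), we need to express $\Psi^{(1)}$ in terms of $U$. Let the symbol $\odot$ denote the Hadamard product of matrices, and define $\Xi$ by $\Xi = \Phi \odot \Phi$. Because $\Phi$ is positive semidefinite, so is $\Xi$ [5] . If $\Xi$ is positive definite, then (13) can be solved for $\Psi^{(1)}$ in terms of $U$ [3] . An expression of the $i$-th diagonal element $\psi_i^{(1)}$ of $\Psi^{(1)}$ is
$$\psi_i^{(1)} = \mathrm{tr}(\Phi\Delta_i\Phi U), \tag{16}$$
where $\Delta_i$ is a diagonal matrix whose diagonal elements are the $i$-th column (row) of $\Xi^{-1}$ [6] . An interesting property of $\Phi\Delta_i\Phi$ is
$$(\Phi\Delta_i\Phi)_{jj} = \delta_{ij}, \tag{17}$$
where $\delta_{ij}$ is the Kronecker delta with $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$. Hence, using $\mathrm{diag}(\Phi\Psi^{(1)}\Phi) = \mathrm{diag}(\Phi U\Phi)$ from (13), we have
$$\frac{n}{2}E\left[\mathrm{tr}\{(\Phi\Psi^{(1)})^2\}\right] = \frac{n}{2}\sum_{i=1}^{p}E\left\{\psi_i^{(1)}(\Phi U\Phi)_{ii}\right\} = \sum_{i=1}^{p}(\Phi\Delta_i\Phi)_{ii} = p. \tag{18}$$
By combining (15) and (18), we obtain the desired result, since $\frac{1}{2}\{(p-m)^2 + (p-m)\} - p = \nu$.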
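The identities used in this subsection can be checked in exact rational arithmetic. The sketch below assumes a hypothetical one-factor population with $p = 4$; the loadings, error variances, and helper names are illustrative and do not come from the note. It verifies $\Phi\Sigma\Phi = \Phi$, evaluates the right-hand side of (15), checks the property (17), and confirms that the difference of the two expectations equals $\nu$.

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def inverse(a):
    """Gauss-Jordan inverse in exact Fraction arithmetic."""
    n = len(a)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical one-factor population (p = 4, m = 1), chosen only for illustration.
lam = [Fraction(1), Fraction(2), Fraction(2), Fraction(3)]   # single column of Lambda
psi = [Fraction(1), Fraction(2), Fraction(2), Fraction(3)]   # diagonal of Psi
p, m = len(lam), 1

gamma = sum(l * l / s for l, s in zip(lam, psi))             # Gamma is a scalar when m = 1
sigma = [[lam[i] * lam[j] + (psi[i] if i == j else 0) for j in range(p)] for i in range(p)]
phi = [[(1 / psi[i] if i == j else 0) - lam[i] * lam[j] / (psi[i] * psi[j] * gamma)
        for j in range(p)] for i in range(p)]

phi_sigma = matmul(phi, sigma)
assert matmul(phi_sigma, phi) == phi                         # Phi Sigma Phi = Phi

# Right-hand side of (15).
first_term = (trace(phi_sigma) ** 2 + trace(matmul(phi_sigma, phi_sigma))) / 2

xi = [[phi[i][j] ** 2 for j in range(p)] for i in range(p)]  # Hadamard product of Phi with itself
xi_inv = inverse(xi)


def phi_delta_phi_diag(i):
    """Diagonal of Phi Delta_i Phi, with Delta_i built from row i of the inverse of Xi."""
    return [sum(phi[j][k] * xi_inv[i][k] * phi[k][j] for k in range(p)) for j in range(p)]


second_term = sum(phi_delta_phi_diag(i)[i] for i in range(p))   # final sum in (18)
nu = Fraction((p - m) ** 2 - (p + m), 2)

if __name__ == "__main__":
    print(first_term, second_term, first_term - second_term, nu)
```

For this example the first term is 6, the second is 4, and their difference equals $\nu = 2$. Exact fractions are used so that the identities hold without rounding error.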