A New Definition of P-Value Involving the Maximum Likelihood Estimator
1. Introduction
A principal goal of statistics is to obtain evidence from data for comparing alternative decisions. For example, statistical evidence may allow one to decide that a population mean $\mu$ satisfies $\mu \le \mu_0$ as opposed to $\mu > \mu_0$ for some specified $\mu_0$. Unfortunately, evidence is an ambiguous concept in statistics, though [1]-[7] among others have attempted to define it. Currently, the P-value [8] [9] is perhaps the most frequently applied measure of evidence used to reject or fail to reject a hypothesis. In Section 2, we review the standard P-value and the notion of a maximum likelihood estimator (MLE). We then propose the new MLE P-value involving the MLE of the parameter of interest. The MLE P-value is neither tied to significance levels nor defined under the assumption that the null hypothesis is true. Both the standard and new MLE P-values may be considered measures of evidence, but the latter is more intuitive. In Section 3, we present five examples and compare the two approaches.
2. The Standard and MLE P-Values
The notion of P-value is a fundamental tool in statistical inference and has been widely used for reporting outcomes of hypothesis tests. Yet in practice, the P-value is often misinterpreted, misused, or miscommunicated. Moreover, it does not unequivocally reflect the available evidence for the null hypothesis since H0 is assumed to hold. In this section, we propose the new MLE P-value that may give different values in some cases than do existing definitions. The MLE P-value provides a simple and intuitive interpretation of P-value. Our definition appears applicable to a wide range of hypothesis testing problems and yields an interpretation of P-value as both a cardinal and ordinal measure of the evidence. We restrict our development to standard one-sided hypothesis testing, but Example 3 illustrates that the approach here can also be used for two-sided hypothesis testing.
We first summarize the two standard ways of defining P-value for the general hypothesis test $H_0: \theta \in \Theta_0$ vs $H_1: \theta \in \Theta_0^c$ with a parameter space $\Theta$. Let $W(X_1, \ldots, X_n)$ be a test statistic for a random sample $X_1, \ldots, X_n$ from a random variable $X$ with a single scalar parameter $\theta$. Denote the observed data for $X_1, \ldots, X_n$ as $x_1, \ldots, x_n$.
Definition 1 (Definitions of P-value from [10] and [11]). Under the assumption that $H_0$ is true, the P-value associated with the observed data $x_1, \ldots, x_n$ is defined as either

$p = P\left( W(X_1, \ldots, X_n) \ge W(x_1, \ldots, x_n) \right)$ (1)

or

$p = \inf\left\{ \alpha : W(x_1, \ldots, x_n) \in R_\alpha \right\}$ (2)

where $R_\alpha$ is the rejection region for a level of significance $\alpha$.
Equation (1) is usually interpreted as follows. Under the assumption that $H_0$ is true, the P-value is the probability that $W(X_1, \ldots, X_n)$ is at least as extreme as its observed value $W(x_1, \ldots, x_n)$. This interpretation can lead to the common misunderstanding that this definition of P-value is the probability that $H_0$ is true. On the other hand, under the assumption that $H_0$ is true, Equation (2) is based on significance levels (Type I Error probabilities) and can lead to the misunderstanding that the P-value is only a measure of Type I Error and not related to the likelihood that $H_0$ is true.
Both definitions lead to the question: how can the assumption that $H_0$ is true produce evidence about whether $H_0$ is true? The answer is that $H_0$ is rejected when, under the assumption that it is true, there is a probability of at most 0.05 (for example) of obtaining a result as extreme as the one observed. In other words, a contradiction involving probability is reached. For other issues with these definitions, see [10]-[12], for example. To address such issues, we utilize the well-known maximum likelihood estimator (MLE) [13] [14] and define a P-value based on the MLE of the parameter under consideration.
Definition 2 (Likelihood Function and MLE). Let $x_1, \ldots, x_n$ be sample data from a random sample $X_1, \ldots, X_n$ from a random variable $X$ with sample space $S$ and real-valued parameter $\theta \in \Theta$. Let $f(x_1, \ldots, x_n; \theta)$ be the joint pdf of the random sample $X_1, \ldots, X_n$. For any sample data $x_1, \ldots, x_n$, the likelihood function of $\theta$ is defined as

$L(\theta; x_1, \ldots, x_n) = f(x_1, \ldots, x_n; \theta)$ (3)

where $L$ in (3) is a function of the variable $\theta$ for given data $x_1, \ldots, x_n$. The MLE $\hat{\theta}$ maximizes $L$ for the data sample $x_1, \ldots, x_n$; i.e., $\hat{\theta} = \arg\max_{\theta \in \Theta} L(\theta; x_1, \ldots, x_n)$ in terms of $\theta$.
Definition 3 (MLE P-value). Let $x_1, \ldots, x_n$ be observed random sample data for a random sample $X_1, \ldots, X_n$ from a continuous random variable $X$ with a single real-valued parameter $\theta$ and pdf $f(x; \theta)$. Let $\hat{\theta}$ denote the MLE for $\theta$, with pdf $g(t; \theta)$. For the hypothesis test $H_0: \theta \in \Theta_0$ vs $H_1: \theta \in \Theta_0^c$, the MLE P-value (MPV) at the sample data $x_1, \ldots, x_n$ for the null hypothesis $H_0$ is defined as

$\mathrm{MPV} = \int_{\Theta_0} g(t; \hat{\theta})\, dt$ (4)

where the integration is over the scalar values $t$ of $\Theta_0$.
In the integration of (4), $\theta$ is estimated by the numerical MLE value $\hat{\theta}$ computed from the sample data $x_1, \ldots, x_n$. Then the pdf $g(t; \hat{\theta})$ of the MLE $\hat{\theta}$ is integrated over the values $t$ in the null hypothesis set $\Theta_0$. For example, in the one-tailed hypothesis test $H_0: \theta \le \theta_0$ vs $H_1: \theta > \theta_0$, the set $\Theta_0$ would be $(-\infty, \theta_0]$. In this case, (4) approximates the frequentist probability that $\theta \le \theta_0$, and so the MPV approximates the likelihood that $H_0$ is true. Any inaccuracies are due (i) to integrating over $\Theta_0$ since $\theta$ has no prior or posterior distribution and (ii) to setting $\theta = \hat{\theta}$. Both result from using the MLE $\hat{\theta}$ as a surrogate for the unknown parameter $\theta$. Doing so, however, utilizes the fact that distributions of MLEs for a parameter $\theta$ are often known for a continuous random variable $X$ with a common distribution such as the normal or exponential. In such cases, (4) may be analytically integrable. If not, a numerical integration could possibly be employed. However, only analytical integration is considered here.
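When (4) is not analytically integrable, the numerical integration mentioned above can be sketched as follows. This minimal Python example assumes the normal-mean setting (MLE = sample mean, with sampling distribution $N(\mu, \sigma^2/n)$) and uses illustrative values; it checks a trapezoid-rule evaluation of (4) against the closed form $\Phi\left( (\mu_0 - \bar{x}) / (\sigma/\sqrt{n}) \right)$.

```python
import math
from statistics import NormalDist

def mpv_normal_mean(xbar, sigma, n, mu0, steps=20000):
    """Numerically integrate (4) for H0: mu <= mu0 with sigma known.

    The MLE of mu is the sample mean, whose sampling distribution is
    N(mu, sigma^2/n); following Definition 3, mu is replaced by the
    observed xbar and the pdf is integrated over Theta_0 = (-inf, mu0].
    """
    se = sigma / math.sqrt(n)
    def pdf(t):
        return math.exp(-0.5 * ((t - xbar) / se) ** 2) / (se * math.sqrt(2.0 * math.pi))
    lo = min(xbar, mu0) - 8.0 * se        # mass below lo is negligible
    h = (mu0 - lo) / steps
    total = 0.5 * (pdf(lo) + pdf(mu0))    # trapezoid rule
    for i in range(1, steps):
        total += pdf(lo + i * h)
    return h * total

# Illustrative values (not from the paper): the numerical integral should
# agree with the closed form Phi((mu0 - xbar)/(sigma/sqrt(n))).
numeric = mpv_normal_mean(xbar=52.0, sigma=10.0, n=25, mu0=50.0)
closed = NormalDist().cdf((50.0 - 52.0) / (10.0 / math.sqrt(25)))
```

The same trapezoid scheme applies to any MLE whose pdf can be evaluated pointwise, which is what makes the numerical route a plausible fallback.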
3. Examples
Five examples are now presented to illustrate the MLE P-value approach. The first three involve the normal random variable, and the fourth involves an exponential random variable. The fifth demonstrates a possible limitation of the method.
Example 1. For a random sample of size $n$, consider the hypothesis test $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$ for the parameter $\mu$ of a normal random variable $X \sim N(\mu, \sigma^2)$. In this case, we rewrite the set $\Theta_0$ in (4) as the interval $(-\infty, \mu_0]$. We also show that (4) gives the standard P-value for this test.
Result 1. Let $X_1, \ldots, X_n$ be a random sample from a random variable $X \sim N(\mu, \sigma^2)$ with unknown $\mu$, and consider the one-sided hypothesis test $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$. Let $\bar{x}$ be the sample mean for sample values $x_1, \ldots, x_n$. When $\sigma^2$ is known,

$\mathrm{MPV} = \Phi\left( \frac{\mu_0 - \bar{x}}{\sigma/\sqrt{n}} \right)$ (5)

and when $\sigma^2$ is unknown,

$\mathrm{MPV} = T_{n-1}\left( \frac{\mu_0 - \bar{x}}{s/\sqrt{n}} \right)$ (6)

where $\Phi$ is the cdf for the standard normal distribution, $T_{n-1}$ is the cdf for the Student t distribution with $n - 1$ degrees of freedom, and $s$ is the sample standard deviation.
Proof. When $\sigma^2$ is known, to prove (5) we use the fact [9] that the MLE $\hat{\mu}$ for $\mu$ is the sample mean $\bar{x}$ and that the sample mean has distribution $N(\mu, \sigma^2/n)$. The integral of (4) therefore becomes

$\int_{-\infty}^{\mu_0} \frac{\sqrt{n}}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{n(t - \mu)^2}{2\sigma^2} \right] dt = \Phi\left( \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} \right)$ (7)

Substituting $\bar{x}$ for $\mu$ in (7) gives (5). When $\sigma^2$ is unknown, Equation (6) follows from the definition of the Student t distribution [9]. ■
Numerically, for given values of $n$, $\bar{x}$, $\sigma$, and $\mu_0$, the MLE P-value of (5) for $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$ is obtained from the standard normal table [9]. For a larger observed sample mean $\bar{x}$, the MLE P-value of (5) is smaller, and it is much less likely that $H_0$ is true. It should be noted that the right sides of (5) and (6) are the standard P-values for the hypothesis test $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$ using the definition of (2). In particular, the usual test statistic for this case is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ with a critical region of $z \ge z_\alpha$ [9], where $\alpha$ is the level of significance, i.e., the probability of committing a Type I error. The P-value is then the lowest level of $\alpha$ for which the observed $z$ is significant. But this lowest level of $\alpha$ in standard hypothesis testing is simply $1 - \Phi(z) = \Phi(-z) = \Phi\left( \frac{\mu_0 - \bar{x}}{\sigma/\sqrt{n}} \right)$, which is the right side of (5).
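The agreement between (5) and the standard P-value can be verified numerically. A minimal sketch with illustrative values (not from the paper):

```python
import math
from statistics import NormalDist

def mpv_known_sigma(xbar, sigma, n, mu0):
    """Right side of (5): Phi((mu0 - xbar)/(sigma/sqrt(n)))."""
    return NormalDist().cdf((mu0 - xbar) / (sigma / math.sqrt(n)))

def standard_p_value(xbar, sigma, n, mu0):
    """Standard one-sided P-value P(Z >= z) for z = (xbar - mu0)/(sigma/sqrt(n))."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return 1.0 - NormalDist().cdf(z)

# The two quantities coincide, and both shrink as xbar moves above mu0,
# i.e., as the data become less compatible with H0: mu <= mu0.
p1 = mpv_known_sigma(52.0, 10.0, 25, 50.0)
p2 = standard_p_value(52.0, 10.0, 25, 50.0)
```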
Example 2. For a random sample of size $n$, consider the hypothesis test $H_0: \sigma^2 \le \sigma_0^2$ vs $H_1: \sigma^2 > \sigma_0^2$ for the variance $\sigma^2$ of a normal random variable $X \sim N(\mu, \sigma^2)$. From [9], the MLE for the variance of a normal distribution is $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$, so $\frac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-1}$, from which we have the following.
Result 2. Let $X_1, \ldots, X_n$ be a random sample from a random variable $X \sim N(\mu, \sigma^2)$ with unknown $\sigma^2$, and consider the one-sided hypothesis test $H_0: \sigma^2 \le \sigma_0^2$ vs $H_1: \sigma^2 > \sigma_0^2$. Then the integral of (4) becomes

$\mathrm{MPV} = F_{\chi^2_{n-1}}\left( \frac{n\sigma_0^2}{\hat{\sigma}^2} \right)$ (8)

where $F_{\chi^2_{n-1}}$ is the cdf of the chi-square distribution with $n - 1$ degrees of freedom.
Proof. From [9], the MLE $\hat{\sigma}^2$ for $\sigma^2$ of a normal distribution is $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$, and so $\frac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-1}$. Hence

$\int_{0}^{\sigma_0^2} g(t; \sigma^2)\, dt = P\left( \hat{\sigma}^2 \le \sigma_0^2 \right) = F_{\chi^2_{n-1}}\left( \frac{n\sigma_0^2}{\sigma^2} \right)$ (9)

Substituting $\hat{\sigma}^2$ for $\sigma^2$ in the right side of (9) yields (8). ■
Numerically, for given values of $n$, $\hat{\sigma}^2$, and $\sigma_0^2$, the MLE P-value of (8) for $H_0: \sigma^2 \le \sigma_0^2$ vs $H_1: \sigma^2 > \sigma_0^2$ is obtained from a chi-square table or standard software.
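A sketch of (8) in Python, assuming SciPy is available for the chi-square cdf; the sample data are illustrative, not from the paper.

```python
from scipy.stats import chi2   # assumes SciPy is installed

def mpv_normal_variance(data, sigma0_sq):
    """MLE P-value for H0: sigma^2 <= sigma0^2 for a normal sample.

    The MLE of sigma^2 is sigma_hat^2 = sum((x - xbar)^2)/n, and
    n*sigma_hat^2/sigma^2 follows a chi-square distribution with n-1
    degrees of freedom.  Setting sigma^2 = sigma_hat^2 as in Definition 3
    gives P(sigma_hat^2 <= sigma0^2) = F_{chi2,n-1}(n*sigma0^2/sigma_hat^2).
    """
    n = len(data)
    xbar = sum(data) / n
    s2_mle = sum((x - xbar) ** 2 for x in data) / n
    return chi2.cdf(n * sigma0_sq / s2_mle, df=n - 1)

data = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 5.0, 4.6]   # illustrative data
p_small = mpv_normal_variance(data, sigma0_sq=0.2)  # tight null bound
p_large = mpv_normal_variance(data, sigma0_sq=2.0)  # loose null bound
```

As expected, a looser bound $\sigma_0^2$ makes $H_0$ more plausible and yields a larger MPV.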
Example 3. In this example, we illustrate that the MLE approach is applicable in two-sided hypothesis testing. For a random sample of size $n$, consider the hypothesis test $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$ for the parameter $\mu$ of $X \sim N(\mu, \sigma^2)$ with $\sigma^2$ known. To be more realistic, we modify this test to $H_0: |\mu - \mu_0| \le \varepsilon$ vs $H_1: |\mu - \mu_0| > \varepsilon$, where $\varepsilon > 0$ is an acceptable tolerance level, i.e., an acceptable deviation from $\mu_0$. In this case, (4) becomes

$\mathrm{MPV} = \Phi\left( \frac{\mu_0 + \varepsilon - \bar{x}}{\sigma/\sqrt{n}} \right) - \Phi\left( \frac{\mu_0 - \varepsilon - \bar{x}}{\sigma/\sqrt{n}} \right)$ (10)

Numerically, for given values of $n$, $\bar{x}$, $\sigma$, $\mu_0$, and $\varepsilon$, the MLE P-value of (10) is obtained from the standard normal table [9].
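Equation (10) is straightforward to evaluate. A minimal sketch with illustrative values (not from the paper):

```python
import math
from statistics import NormalDist

def mpv_two_sided(xbar, sigma, n, mu0, eps):
    """MPV of (10) for H0: |mu - mu0| <= eps with sigma known.

    Theta_0 is the interval [mu0 - eps, mu0 + eps]; the N(xbar, sigma^2/n)
    pdf of the MLE (the sample mean) is integrated over that interval.
    """
    se = sigma / math.sqrt(n)
    Phi = NormalDist().cdf
    return Phi((mu0 + eps - xbar) / se) - Phi((mu0 - eps - xbar) / se)

# The MPV is largest when xbar equals mu0 and decays as the observed
# mean moves away from the tolerance interval around mu0.
p_center = mpv_two_sided(50.0, 10.0, 25, 50.0, 1.0)
p_far = mpv_two_sided(56.0, 10.0, 25, 50.0, 1.0)
```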
Example 4. Let the random variable $X$ have an exponential distribution with pdf $f(x; \lambda) = \lambda e^{-\lambda x}$, $x \ge 0$, with parameter $\lambda > 0$. For a random sample of size $n$, consider the hypothesis test $H_0: \lambda \le \lambda_0$ vs $H_1: \lambda > \lambda_0$. From [15], the MLE for $\lambda$ is $\hat{\lambda} = 1/\bar{x}$. It can be shown [16] that the sample mean $\bar{X}$ for the exponential random variable $X$ follows the gamma distribution with shape parameter $n$ and rate parameter $n\lambda$. Hence $\hat{\lambda} = 1/\bar{X}$ follows an inverse gamma distribution [16], so the right side of (4) becomes the regularized gamma function $Q\left( n, \frac{n\hat{\lambda}}{\lambda_0} \right)$, which is available in various software packages such as MATLAB.
Example 5. To illustrate the difficulty of finding the pdf $g$ for the MLE $\hat{\theta}$ in the integral of (4), consider the random variable $X$ with pdf $f(x; \theta) = \theta x^{\theta - 1}$ for $0 \le x \le 1$ and the parameter $\theta > 0$. Then it can be shown [9] that $\hat{\theta} = -n / \sum_{i=1}^{n} \ln x_i$, whose pdf is difficult to determine.
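Even though the pdf of $\hat{\theta}$ is hard to obtain here, the closed-form MLE itself can be checked numerically. A hypothetical sketch with illustrative data:

```python
import math

# For f(x; theta) = theta * x^(theta-1) on (0, 1], the log-likelihood is
# n*log(theta) + (theta - 1)*sum(log x_i), which is concave in theta and
# maximized at theta_hat = -n / sum(log x_i).

def log_likelihood(theta, data):
    """Log of L(theta; x_1..x_n) in (3) for this density."""
    return len(data) * math.log(theta) + (theta - 1.0) * sum(math.log(x) for x in data)

def theta_hat_closed(data):
    """Closed-form MLE -n / sum(ln x_i) from Example 5."""
    return -len(data) / sum(math.log(x) for x in data)

data = [0.7, 0.4, 0.9, 0.6, 0.8]     # illustrative sample in (0, 1]
th = theta_hat_closed(data)
# Concavity implies th beats any nearby candidate value of theta.
```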
4. Conclusions
The MLE P-value defined here gives a value that estimates the probability that a one-sided null hypothesis on a single parameter is true. Two-sided hypotheses can be treated similarly using tolerance levels as in Example 3. In the approach of this paper, the MLE $\hat{\theta}$ may be thought of as a new type of test statistic that is integrated over $\Theta_0$, while its numerical value from the data $x_1, \ldots, x_n$ is used to estimate $\theta$ in this integration. Obtaining such an approximation of the probability that the null hypothesis is true is, in fact, the ultimate goal of hypothesis testing. Certainty is not possible. But given the MLE P-value, a decision maker would need to decide whether this value is sufficiently large to accept the null hypothesis. Different decision makers might judge a given MLE P-value differently, but a metric for the decision has now been provided.
The principal limitation of the analytical approach here is that the pdf of the maximum likelihood estimator must be both known and reasonable to integrate. Future work should be directed at the numerical integration of such integrals, and tables could be developed for certain parameters of particular random variables. Although MLEs have been used here as the surrogate for single parameters, other estimators could also be used. The advantage of MLEs is that they are well studied and often immediately available. Finally, an approach similar to that of this paper could be applied to hypothesis testing for parameters of discrete random variables.
Acknowledgements
The author thanks his former student Maryham Moghimi for her discussion about statistics and the P-value.