People go through different stages after HIV infection. Immediately after infection, the immune system has not yet produced antibodies and the virus replicates very quickly. This initial stage is characterized by a high HIV viral load and no detectable antibody level. We refer to this early stage as the acute stage or acute infection [1]. After acute infection, antibodies are produced which, along with other processes, cause the viral load to drop to a lower and more stable level. This later stage is characterized by detectable HIV-specific antibodies. We refer to it as chronic infection. In HIV screening, standard tests such as the rapid HIV test (OraQuick) look for antibodies and can detect only chronic infection. For acute infection, HIV RNA-based tests such as PCR (polymerase chain reaction) and NAAT (nucleic acid amplification test), which detect the virus itself, are used.
Detection of early HIV infection has been named as a priority [2]. Diagnosis of acute infections, in the antibody-negative “window period”, is particularly important. Individuals diagnosed at the acute infection stage have more treatment options for reducing viral loads and enhancing the HIV-specific immune response, and consequently have a better chance of delaying progression to AIDS. For public health, since the acutely infected have the maximal transmission potential due to their high viral loads [3], effective early diagnosis can help reduce the spread of HIV by allowing critical prevention services to be delivered at the acute infection stage [4]. Consequently, early diagnosis is an important aspect of HIV control.
As the antibody-negative window is relatively short, about 3 weeks [5], diagnosis of acute infection requires HIV testing shortly after infection. For a population at a given time, those going for HIV testing fall into three categories: the acutely infected, the chronically infected, and the uninfected. Because diagnosis of acute infection requires HIV testing within the antibody-negative window, a low proportion of acute infection can be due either to a low infection rate or to testing delayed beyond the window. We propose a new metric, the ratio of the proportion of acute infection to that of chronic infection. It measures the odds of the infected being diagnosed during the acute versus the chronic stage, and so reflects how rapidly infected subjects seek and obtain HIV testing. All else being equal, that is, provided the population of those who seek testing is stable over time or across locations, a high acute-to-chronic ratio is desirable.
We illustrate the new metric with data from sexually transmitted disease (STD) clinics in Malawi, North Carolina [6], San Francisco [7], and Washington DC [8], which were collected around the years 2000 to 2003. Table 1 lists the number of subjects going for HIV testing, the number of diagnosed acute infections, and the number of diagnosed chronic infections. Also listed are the per capita incomes around the time of data collection. We are interested in whether the four regions share the same acute-to-chronic ratio; that is, whether the infected individuals in these regions have the same chance of being diagnosed during the acute stage.
2. Acute-to-Chronic Ratio
Denote by $(n_i, a_i, c_i)$ the counts for the subjects from the $i$-th population who go for HIV testing over a specific period of time, where $n_i$ is the total number screened, $a_i$ is the number with acute infection, $c_i$ is the number with chronic infection, and the remaining $n_i - a_i - c_i$ subjects are free of infection. Denote $p_i$ as the proportion of acute infection and $q_i$ as the proportion of chronic infection. Then $(a_i, c_i, n_i - a_i - c_i)$ follows a multinomial distribution $\mathrm{Multinomial}(n_i;\ p_i,\ q_i,\ 1 - p_i - q_i)$. Up to an additive constant, the log likelihood over the $K$ populations takes the form
$$\ell_1(p, q) = \sum_{i=1}^{K} \left\{ a_i \log p_i + c_i \log q_i + (n_i - a_i - c_i) \log(1 - p_i - q_i) \right\}. \tag{1}$$
Maximum likelihood estimation (MLE) gives the sample proportions
$$\hat{p}_i = \frac{a_i}{n_i}, \qquad \hat{q}_i = \frac{c_i}{n_i}, \qquad i = 1, \dots, K.$$
Define $\pi_i$ as the proportion of total infection in the $i$-th population, so that $\pi_i = p_i + q_i$. Among the infected who seek a test, let $T_i$ be the time from infection to HIV testing. Let $W_i$ be the period from infection to detectable antibodies, also known as the antibody-negative window. If $T_i \le W_i$, an acute infection is detected. If $T_i > W_i$, a prevalent (chronic) infection is detected. Let $F_i$ be the cumulative distribution function of $T_i - W_i$; then
$$p_i = \pi_i F_i(0), \qquad q_i = \pi_i \{1 - F_i(0)\}.$$
We define the acute-to-chronic (AC) ratio as
$$r_i = \frac{p_i}{q_i} = \frac{F_i(0)}{1 - F_i(0)},$$
the odds of being diagnosed during the acute stage versus the chronic stage among the infected who seek an HIV test.
The distribution function $F_i$ reflects how quickly the infected seek HIV testing. As the exact time of infection is hard to retrieve and the length of the antibody-negative window varies over populations [9,10], both $T_i$ and $W_i$ are not easy to determine. It is thus hard to evaluate $F_i$ and even harder to compare $F_i$ across populations. The AC ratio can be considered as a parameter of the distribution which captures the information around 0. It quantifies the chance of an infected individual being diagnosed within versus beyond the acute infection stage. A major attraction of the AC ratio is that it is conveniently constructed from regular testing results in STD clinics.
Note that this ratio is free of $\pi_i$, the overall infection rate in population $i$. While $\pi_i$ is very important to monitor, $r_i$ gives important complementary information about what share of the infected is diagnosed during the acute infection stage. All else being equal, larger values of $r_i$ are more desirable.
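As a small illustration of the point that the AC ratio does not depend on the overall infection rate, the sketch below computes both quantities from raw counts. The clinic names and counts are made up for illustration and are not the data of Table 1; the function names are likewise only illustrative.

```python
def ac_ratio(acute: int, chronic: int) -> float:
    """Observed AC ratio: number of acute infections per chronic infection."""
    if chronic <= 0:
        raise ValueError("AC ratio is undefined without chronic infections")
    return acute / chronic


def infection_rate(screened: int, acute: int, chronic: int) -> float:
    """Observed proportion of screened subjects who are infected."""
    return (acute + chronic) / screened


# Hypothetical counts (screened, acute, chronic); NOT data from the paper.
clinics = {
    "clinic_A": (2000, 5, 100),
    "clinic_B": (500, 5, 100),
}

for name, (n, a, c) in clinics.items():
    # The two clinics differ in infection rate but share the same AC ratio.
    print(name, infection_rate(n, a, c), ac_ratio(a, c))
```

The two hypothetical clinics have a fourfold difference in infection rate yet an identical AC ratio, since the ratio only uses the split of the infected into acute and chronic.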
3. MLE Inference
3.1. Common AC Ratio
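If the distribution functions $F_i$ are the same across the populations, then the AC ratio $r_i = p_i / q_i = r$ is constant for $i = 1, \dots, K$ regardless of the infection rate $\pi_i = p_i + q_i$ within each population. Writing $p_i = \pi_i r / (1 + r)$ and $q_i = \pi_i / (1 + r)$, the log likelihood under a common AC ratio is
$$\ell_2(\pi, r) = \sum_{i=1}^{K} \left\{ a_i \log \frac{\pi_i r}{1 + r} + c_i \log \frac{\pi_i}{1 + r} + (n_i - a_i - c_i) \log(1 - \pi_i) \right\}, \tag{2}$$
where $a_i$ and $c_i$ are the numbers of acute and chronic infections among the $n_i$ subjects screened in population $i$. The number of parameters under model (2) is $K + 1$. Maximum likelihood estimation of $(\pi_1, \dots, \pi_K, r)$ gives
$$\hat{\pi}_i = \frac{a_i + c_i}{n_i}, \qquad \hat{r} = \frac{\sum_{i=1}^{K} a_i}{\sum_{i=1}^{K} c_i}.$$
Model (2) can hold across populations even if the risks of infection are different. To test for a constant AC ratio we specify two hypotheses,
$H_0$: populations $1, \dots, K$ share a common AC ratio following model (2), versus
$H_1$: populations $1, \dots, K$ have their own AC ratios following model (1).
The log-likelihood ratio test statistic for testing between these two hypotheses is
$$\Lambda = 2 \left\{ \ell_1(\hat{p}, \hat{q}) - \ell_2(\hat{\pi}, \hat{r}) \right\},$$
where $\hat{\pi}_i$, $\hat{r}$ are the MLEs under (2) and $\hat{p}_i$, $\hat{q}_i$ the MLEs under (1). Under $H_0$, $\Lambda$ is asymptotically $\chi^2$ distributed with $K - 1$ degrees of freedom as $n_i \to \infty$ for all $i$, and $H_0$ can be tested accordingly. When the $n_i$'s are not large, we can compute the p-value following a parametric bootstrap procedure.
Step 1. Generate $(a_i^*, c_i^*, n_i - a_i^* - c_i^*)$ from the multinomial distribution $\mathrm{Multinomial}\left(n_i;\ \hat{\pi}_i \hat{r} / (1 + \hat{r}),\ \hat{\pi}_i / (1 + \hat{r}),\ 1 - \hat{\pi}_i\right)$ for $i = 1, \dots, K$.
Step 2. Using $(n_i, a_i^*, c_i^*)$, estimate $\hat{\pi}_i^*$ and $\hat{r}^*$ from (2) and $\hat{p}_i^*$ and $\hat{q}_i^*$ from (1).
Step 3. Compute the log-likelihood ratio
$$\Lambda^* = 2 \left\{ \ell_1(\hat{p}^*, \hat{q}^*) - \ell_2(\hat{\pi}^*, \hat{r}^*) \right\}.$$
Step 4. Repeat Steps 1 to 3 $B$ times. The empirical p-value for rejecting $H_0$ is the percentage of $\Lambda^*$ above the observed log-likelihood ratio over the $B$ repetitions.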
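The four bootstrap steps can be sketched in a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the counts are hypothetical (they are not Table 1), the function names are illustrative, the null model is parameterized by the pooled acute fraction among the infected (which equals the common AC ratio divided by one plus itself), and the p-value counts simulated statistics at or above the observed one.

```python
import math
import random


def xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def fit_common(data):
    """MLEs under the common-ratio model: per-population infection rates and
    the pooled acute fraction theta = r / (1 + r) among the infected."""
    pis = [(a + c) / n for n, a, c in data]
    acute = sum(a for _, a, _ in data)
    infected = sum(a + c for _, a, c in data)
    theta = acute / infected if infected else 0.0
    return pis, theta


def lr_statistic(data):
    """Twice the gap between the maximized log likelihoods of the
    population-specific model and the common-ratio model."""
    pis, theta = fit_common(data)
    free = common = 0.0
    for (n, a, c), pi in zip(data, pis):
        u = n - a - c  # uninfected count
        free += xlogy(a, a / n) + xlogy(c, c / n) + xlogy(u, u / n)
        common += (xlogy(a, pi * theta) + xlogy(c, pi * (1 - theta))
                   + xlogy(u, 1 - pi))
    return max(0.0, 2.0 * (free - common))  # clip tiny negative rounding error


def simulate_null(data, rng):
    """Step 1: draw one data set from the multinomial fitted under the null."""
    pis, theta = fit_common(data)
    sample = []
    for (n, _, _), pi in zip(data, pis):
        weights = (pi * theta, pi * (1 - theta), 1 - pi)
        draws = rng.choices((0, 1, 2), weights=weights, k=n)
        sample.append((n, draws.count(0), draws.count(1)))
    return sample


def bootstrap_pvalue(data, n_boot=500, seed=0):
    """Steps 2-4: share of simulated statistics at or above the observed one."""
    rng = random.Random(seed)
    observed = lr_statistic(data)
    hits = sum(lr_statistic(simulate_null(data, rng)) >= observed
               for _ in range(n_boot))
    return observed, hits / n_boot


# Hypothetical (screened, acute, chronic) counts; NOT the paper's Table 1.
data = [(1500, 4, 120), (900, 9, 80), (1200, 3, 60)]
observed, p_value = bootstrap_pvalue(data)
print(f"LR = {observed:.2f}, bootstrap p = {p_value:.3f}")
```

Working with the acute fraction rather than the ratio itself avoids a division by zero when a simulated data set happens to contain no chronic infections.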
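We can also evaluate whether two groups of populations, with a common AC ratio across the populations within each group, share a common AC ratio. This may happen, for example, with one group from a developed country versus the other from a developing country, or one group with an early HIV testing campaign versus the other without such a campaign. Let $G_1$ represent one group of populations with a common AC ratio $r_{(1)}$, and $G_2$ represent the other group of populations with a common AC ratio $r_{(2)}$. As before, $n_i$, $a_i$ and $c_i$ are the numbers screened, acutely infected and chronically infected in population $i$, and $\pi_i$ is its infection rate. The log likelihood under the null hypothesis of a common AC ratio $r$ over the two groups is
$$\ell_0 = \sum_{i \in G_1 \cup G_2} \left\{ a_i \log \frac{\pi_i r}{1 + r} + c_i \log \frac{\pi_i}{1 + r} + (n_i - a_i - c_i) \log(1 - \pi_i) \right\},$$
and the log likelihood under the alternative hypothesis of group-specific AC ratios is
$$\ell_A = \sum_{g=1}^{2} \sum_{i \in G_g} \left\{ a_i \log \frac{\pi_i r_{(g)}}{1 + r_{(g)}} + c_i \log \frac{\pi_i}{1 + r_{(g)}} + (n_i - a_i - c_i) \log(1 - \pi_i) \right\}.$$
Whether the two groups of populations share a common AC ratio can be tested via the log likelihood ratio test.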
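For concreteness, the estimates entering this two-group test can be written out. The display below is a sketch that follows from maximizing the two log likelihoods, not a formula quoted from the source; it writes $G_1$ and $G_2$ for the two groups, $a_i$ and $c_i$ for the acute and chronic counts in population $i$, and $\hat{r}_{(g)}$, $\hat{r}$ and $\Lambda_G$ as assumed notation for the group-specific estimate, the pooled estimate and the test statistic.

$$\hat{r}_{(g)} = \frac{\sum_{i \in G_g} a_i}{\sum_{i \in G_g} c_i} \quad (g = 1, 2), \qquad \hat{r} = \frac{\sum_{i \in G_1 \cup G_2} a_i}{\sum_{i \in G_1 \cup G_2} c_i}, \qquad \Lambda_G = 2 \left\{ \max \ell_A - \max \ell_0 \right\}.$$

In both models the infection rates are estimated by the observed proportions of infected subjects, so their contribution cancels in $\Lambda_G$. The alternative has one more parameter than the null, so $\Lambda_G$ would be referred to a $\chi^2$ distribution with 1 degree of freedom when the counts are large.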
3.2. AC Ratio Depending on a Factor
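Some population factors, such as the education level and income level of the population, may affect the AC ratio. For example, in developed countries, a majority of HIV-infected individuals have access to HIV testing at the time of acute infection, but access is limited in developing countries [11]. Such ecological analyses are subject to substantial confounding, but may be useful to generate hypotheses. Here we use simple logistic regression to explore the impact of a factor on the AC ratio. Let $z_i$ be a factor or an index measured at the population level, such as the mean income for the catchment area of the HIV clinic(s) for population $i$. We assume $r_i = g(z_i)$ for some function $g$. If there is knowledge about the function $g$, then its parameters and the infection rates $\pi_i$ can be jointly estimated by maximum likelihood. As $F_i(0)$ is the probability and $r_i$ is the odds, it is plausible to assume a logistic relationship; that is, $\log r_i = \beta_0 + \beta_1 z_i$. With $a_i$ acute and $c_i$ chronic infections among $n_i$ screened subjects, the log-likelihood function under an AC ratio that depends on $z_i$ in this fashion is
$$\ell_3(\pi, \beta) = \sum_{i=1}^{K} \left\{ (a_i + c_i) \log \pi_i + (n_i - a_i - c_i) \log(1 - \pi_i) + a_i (\beta_0 + \beta_1 z_i) - (a_i + c_i) \log\left(1 + e^{\beta_0 + \beta_1 z_i}\right) \right\}. \tag{3}$$
Maximization of (3) with respect to $\pi_i$ gives
$$\hat{\pi}_i = \frac{a_i + c_i}{n_i},$$
and the profile log likelihood of $\beta = (\beta_0, \beta_1)$ is
$$\ell_p(\beta) = C + \sum_{i=1}^{K} \left\{ a_i (\beta_0 + \beta_1 z_i) - (a_i + c_i) \log\left(1 + e^{\beta_0 + \beta_1 z_i}\right) \right\},$$
with
$$C = \sum_{i=1}^{K} \left\{ (a_i + c_i) \log \hat{\pi}_i + (n_i - a_i - c_i) \log(1 - \hat{\pi}_i) \right\}.$$
The parameter $\beta$ can be estimated as the maximizer of $\ell_p(\beta)$. In fact, $\ell_p(\beta)$ corresponds to a logistic regression of $y$ versus $z$ over the $\sum_{i=1}^{K} (a_i + c_i)$ infected subjects, with $y$ the indicator variable for acute infection. Thus, $\beta$ can be estimated by logistic regression where, within the $i$-th population, the $a_i + c_i$ subjects all have $z = z_i$, $a_i$ subjects have the value of $y = 1$, and $c_i$ subjects have the value of $y = 0$. Denote $\hat{\beta}_0$ and $\hat{\beta}_1$ as the estimates from this logistic regression; then
$$\hat{r}_i = \exp(\hat{\beta}_0 + \hat{\beta}_1 z_i), \qquad \hat{p}_i = \frac{\hat{\pi}_i \hat{r}_i}{1 + \hat{r}_i}, \qquad \hat{q}_i = \frac{\hat{\pi}_i}{1 + \hat{r}_i}.$$
A test of whether the AC ratio depends on $z$ can be formed as a likelihood ratio test with hypotheses
$H_0$: populations $1, \dots, K$ have a $z$-related AC ratio following model (3), versus
$H_1$: populations $1, \dots, K$ have their own AC ratios following model (1).
The corresponding log likelihood ratio is
$$\Lambda_z = 2 \left\{ \ell_1(\hat{p}, \hat{q}) - \ell_3(\hat{\pi}, \hat{\beta}) \right\},$$
which is asymptotically $\chi^2$ distributed with $K - d$ degrees of freedom, with $d$ the number of parameters in $g$, which is $2$ in model (3). $H_0$ can be tested via the log likelihood ratio, or by a bootstrap similar to that of Section 3.1 when the $n_i$'s are not large.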
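The logistic regression described above only needs the grouped counts, so it can be fitted without expanding the data to one row per infected subject. The sketch below uses a plain Newton-Raphson iteration for the two coefficients; it is one possible way to carry out the fit, not the authors' code, and the function names and the income figures and counts are hypothetical.

```python
import math


def fit_logistic_ac(data, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of log(AC ratio) = b0 + b1 * z on grouped
    (z, acute, chronic) data. Needs acute > 0, chronic > 0 overall and
    at least two distinct z values."""
    acute = sum(a for _, a, _ in data)
    chronic = sum(c for _, _, c in data)
    b0, b1 = math.log(acute / chronic), 0.0  # start at the common AC ratio
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for z, a, c in data:
            m = a + c                                     # infected subjects
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))   # fitted acute fraction
            resid = a - m * mu                            # score contribution
            w = m * mu * (1.0 - mu)                       # information weight
            g0 += resid
            g1 += z * resid
            h00 += w
            h01 += w * z
            h11 += w * z * z
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton system
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def fitted_ac_ratios(data, b0, b1):
    """Fitted AC ratio exp(b0 + b1 * z) for each population."""
    return [math.exp(b0 + b1 * z) for z, _, _ in data]


# Hypothetical (income in thousands of dollars, acute, chronic);
# NOT the paper's data.
data = [(0.5, 4, 110), (25.0, 6, 90), (30.0, 9, 80), (35.0, 8, 60)]
b0, b1 = fit_logistic_ac(data)
print(b0, b1, fitted_ac_ratios(data, b0, b1))
```

Expressing income in thousands keeps the Newton system well scaled; any standard logistic regression routine applied to the same grouped data should give the same coefficients.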
4. Application to STD Clinic Data
4.1. Testing for a Common AC Ratio
We first investigate whether there is a common AC ratio across the four populations in Table 1. The results are shown in Figure 1. On the left is the scatter plot of the proportion of acute infection versus that of chronic infection. Each circle represents the observed proportion of acute infection versus that of chronic infection, which is also the MLE under model (1). Around each circle are the exact confidence bounds based on the binomial distributions $\mathrm{Bin}(n_i, p_i)$ and $\mathrm{Bin}(n_i, q_i)$ for $p_i$ and $q_i$, respectively. Each solid dot represents the MLEs under the common AC ratio model (2). Under (2), the fitted proportions $(\hat{p}_i, \hat{q}_i)$ are on the line through the origin with slope $\hat{r}$. We observe the four solid dots falling on this line, indicating that there is about 1 acute infection for every 22 chronic infections (that is, $\hat{r} \approx 1/22$) should a common AC ratio be shared across the four populations. The observed AC ratio within each population is the slope of the line connecting the origin to the circle for that population (lines not shown in the figure). For example, Malawi has an observed AC ratio of 0.036, indicating 1 acute infection for every 27 chronic infections; San Francisco has an observed AC ratio of 0.136, indicating 1 acute infection for every 7 chronic infections. We observe that $\hat{r}$ is lower than the observed AC ratio at Washington DC and San Francisco but higher than the observed AC ratio at North Carolina and Malawi. On the right of Figure 1 is the histogram of the log-likelihood ratio under the null hypothesis $H_0$ from the parametric bootstrap with $B$ repetitions, on which the density of the $\chi^2$ distribution with $K - 1 = 3$ degrees of freedom is superimposed. Based on the empirical p value, the null hypothesis is rejected at the observed log-likelihood ratio of 13.23. Therefore, the four populations do not share a common AC ratio.
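As a rough cross-check of the bootstrap conclusion, the observed statistic of 13.23 can be referred to its asymptotic reference distribution. The sketch below assumes 3 degrees of freedom (four populations minus one) and uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom; it gives the asymptotic p value only, not the empirical bootstrap p value reported by the authors, and the function name is illustrative.

```python
import math


def chi2_sf_3df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with
    3 degrees of freedom, from its closed form."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


observed_lr = 13.23  # observed log-likelihood ratio quoted in the text
print(f"asymptotic p value: {chi2_sf_3df(observed_lr):.4f}")
```

The resulting tail probability is well below 0.05, which agrees in direction with the rejection of a common AC ratio.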
4.2. Testing for per Capita Income Related AC Ratio
For illustration, we investigate whether the AC ratio is related to per capita income, following model (3). The estimation and testing results are presented in Figure 2. The scatter plot shows that the MLEs of $p_i$ and $q_i$ under model (3) are very close to the observed proportions, all within the confidence bounds. Under (3), $r_i = \exp(\beta_0 + \beta_1 z_i)$ varies with the income $z_i$, and consequently the solid dots do not fall on a straight line. The empirical p value from the bootstrap does not lead to rejection of hypothesis $H_0$, so the data are consistent with an AC ratio that follows an increasing log-linear function of income. Since San Francisco has the highest and Malawi the lowest per capita income, they have the highest and the lowest AC ratios, respectively. It could be that per capita income is one cause of the differential AC ratios. It could also be that other social and demographic factors affect the AC ratio $r_i$ through per capita income.
References