Estimation of Bounded Populations and Carrying Capacity with the Logistic Model
1. Introduction
Sample surveys are widely used as a cost-effective means of data collection and for making valid inferences about population parameters. Government bureaus and other organizations use such methods to obtain current information. The foremost aim of a statistician in a sample survey is to obtain information about the population by deriving reliable estimates of unknown population parameters.
This study uses the logistic model to estimate a bounded population and its carrying capacity. Unlike the local polynomial regression estimator, the model does not require a choice of step size, nor is it restricted to a fixed behavior; instead, we allow the data to reveal its nature. The logistic model is used for data fitting. The logistic equation was introduced (around 1840) by the Belgian mathematician and demographer P.F. Verhulst as a possible model for human population growth [1].
Under a simple random sampling (SRS) without replacement design, [2] proposed an exactly unbiased estimator for the population ratio $\theta_{yx}$. The proposed estimator is given by
(1.1)
where,
,
,
,
,
$\theta_{yx} = t_y / t_x$ is the population ratio, where $t_y = \sum_{i \in U} y_i$ is the population total for the variable $Y$, $t_x = \sum_{i \in U} x_i$ is the population total for the variable $X$, and $U = \{1, \dots, N\}$ is a finite population of $N$ units. Under a general sampling design $p(\cdot)$ this estimator can be rewritten; in that case it is no longer unbiased, but its bias remains negligible [3].
Under a general sampling design, [4] proposed an estimator of the population ratio $\theta_{yx}$. This estimator has negligible relative bias, especially for small sample sizes $n$, and the bias approaches zero as $n$ increases. Under SRS, simulation results indicate that this estimator performs better than (1.1). Their estimator is defined by
(1.2)
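Define $\pi_i$, the first-order inclusion probability, by
$$\pi_i = P(i \in s) = \sum_{s \ni i} p(s) \tag{1.3}$$
For $i \neq j$, the second-order inclusion probability is defined by
$$\pi_{ij} = P(i \in s,\ j \in s) = \sum_{s \ni i,\, j} p(s) \tag{1.4}$$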
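The [5] estimator of the population total $t_y$ is defined by
$$\hat{t}_{y\pi} = \sum_{i \in s} \frac{y_i}{\pi_i} = \sum_{i \in U} \frac{y_i I_i}{\pi_i} \tag{1.5}$$
where $I_i$ is one if $i \in s$ and zero otherwise. Further,
$$\hat{\bar{Y}}_{\pi} = \frac{1}{N} \sum_{i \in s} \frac{y_i}{\pi_i} \tag{1.6}$$
can be used to estimate the population mean $\bar{Y}$. It can be noted that $\hat{t}_{y\pi}$ and $\hat{\bar{Y}}_{\pi}$ are unbiased estimators for $t_y$ and $\bar{Y}$, respectively. However, $\hat{t}_{y\pi}$ and $\hat{\bar{Y}}_{\pi}$ do not make use of the auxiliary variables available in the study. In a similar way, $\hat{t}_{x\pi} = \sum_{i \in s} x_i / \pi_i$ and
$$\hat{\bar{X}}_{\pi} = \frac{1}{N} \sum_{i \in s} \frac{x_i}{\pi_i} \tag{1.7}$$
are unbiased estimators for $t_x$ and $\bar{X}$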
respectively. Where
is the sample mean of the inclusion probability of the auxiliary variable.
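As a small illustration of weighting by first-order inclusion probabilities, the sketch below computes an inverse-probability-weighted total of the form $\sum_{i \in s} y_i / \pi_i$ and the corresponding mean. That the estimators in (1.5) and (1.6) take exactly this form is an assumption; the function names and the toy data are hypothetical and are not taken from the study.

```python
def weighted_total(y_sample, pi_sample):
    """Inverse-probability-weighted estimate of a population total.

    y_sample  : observed values y_i for the sampled units
    pi_sample : first-order inclusion probabilities pi_i of the same units
    """
    if len(y_sample) != len(pi_sample):
        raise ValueError("y_sample and pi_sample must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in pi_sample):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(y_sample, pi_sample))


def weighted_mean(y_sample, pi_sample, N):
    """Estimate of the population mean: weighted total divided by N."""
    return weighted_total(y_sample, pi_sample) / N


# Hypothetical toy data: SRS without replacement of n = 4 units from N = 10,
# for which every unit has inclusion probability n / N.
N, n = 10, 4
y_sample = [12.0, 7.0, 9.0, 15.0]
pi_sample = [n / N] * n

total_hat = weighted_total(y_sample, pi_sample)  # equals (N / n) * sum(y_sample)
mean_hat = weighted_mean(y_sample, pi_sample, N)
print(total_hat, mean_hat)
```

Under SRS the weights are all equal, so the weighted mean reduces to the ordinary sample mean; unequal probabilities would simply change the entries of `pi_sample`.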
The availability of more than one auxiliary variable is used in the literature for estimating the finite population total $t_y$ or the finite population mean $\bar{Y}$.
Under SRS, [6] was the first to deal with the problem of estimating the population mean using more than one auxiliary variable. His estimator is given by
$$\bar{y}_{MR} = \sum_{i=1}^{p} w_i \, \frac{\bar{y}}{\bar{x}_i} \, \bar{X}_i \tag{1.8}$$
where $p$ is the number of auxiliary variables, $w_i$ is the weight of the $i$th auxiliary variable such that $\sum_{i=1}^{p} w_i = 1$, $\bar{y}$ is the sample mean of $Y$, and $\bar{X}_i$ and $\bar{x}_i$ are the population mean and the sample mean of $X_i$, respectively, for $i = 1, \dots, p$. [7] proposed the following estimator
(1.9)
for estimating the population mean
,
.
[8] studied the general form of (1.9). They proposed two classes of estimators using two auxiliary variables to estimate the population mean for the variable of interest Y.
[9] suggested a new multivariate ratio estimator using the regression estimator instead of
which is used in (1.9). Their estimator is given by
(1.20)
where $b_i$, $i = 1, 2$, are the regression coefficients. Based on the mean squared error (MSE), they found that their estimator is more efficient than (1.9) when
,
where
, and
are defined by Equations (2.4) and (1.2) of Kadilar and Cingi (2004), respectively.
Subsection 2.1 introduces a population model that accommodates birth and death rates that are not necessarily constant, Subsection 2.2 discusses the asymptotic properties, Section 3 presents the empirical study, and Section 4 concludes. Throughout, our population P(t) is a continuous approximation to the actual population, which of course changes only by integral increments, that is, by one birth or death at a time.
Suppose that the population changes only by the occurrence of births and deaths; there is no immigration or emigration from outside the country or environment under consideration. It is customary to track the growth or decline of a population in terms of its birth rate and death rate functions, defined as follows:
B(t) is the number of births per unit of population per unit of time at time t;
D(t) is the number of deaths per unit of population per unit of time at time t.
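Then the numbers of births and deaths that occur during the time interval $[t, t + \Delta t]$ are given (approximately) by:
Births: $B(t)\,P(t)\,\Delta t$, Deaths: $D(t)\,P(t)\,\Delta t$
Hence the change $\Delta P$ in the population during the time interval $[t, t + \Delta t]$ of length $\Delta t$ is
$$\Delta P = \{\text{births}\} - \{\text{deaths}\} \approx B(t)\,P(t)\,\Delta t - D(t)\,P(t)\,\Delta t \tag{1.21}$$
So
$$\frac{\Delta P}{\Delta t} \approx [B(t) - D(t)]\,P(t) \tag{1.22}$$
The error in this approximation should approach zero as $\Delta t \to 0$, so, taking the limit, we get the differential equation
$$\frac{dP}{dt} = (B - D)\,P \tag{1.23}$$
in which we write $B = B(t)$, $D = D(t)$, and $P = P(t)$ for brevity. Equation (1.23) is the general population equation. If B and D are constants, Equation (1.23) reduces to the natural growth equation with $k = B - D$. But it also includes the possibility that B and D are variable functions of t. The birth and death rates need not be known in advance; they may well depend on the unknown function $P(t)$.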
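The limiting argument behind the general population equation can be checked numerically. The sketch below steps a population forward with the finite-difference rule $\Delta P \approx (B - D)\,P\,\Delta t$ (forward Euler) and, for constant rates, compares the result with the natural growth solution $P_0 e^{(B - D)t}$. The rate values, the initial population, and the function name are hypothetical choices made only for illustration.

```python
import math


def euler_population(P0, birth_rate, death_rate, t_end, steps):
    """Step dP/dt = (B - D) * P forward with the forward Euler rule.

    birth_rate and death_rate are callables of (t, P), so they may be
    constant, time dependent, or population dependent.
    """
    dt = t_end / steps
    t, P = 0.0, P0
    for _ in range(steps):
        # Change over [t, t + dt]: births minus deaths.
        P += (birth_rate(t, P) - death_rate(t, P)) * P * dt
        t += dt
    return P


# Hypothetical constant rates (per unit of population per unit of time).
B, D = 0.03, 0.01
P0 = 1000.0
t_end = 10.0

approx = euler_population(P0, lambda t, P: B, lambda t, P: D, t_end, steps=10000)
exact = P0 * math.exp((B - D) * t_end)  # natural growth with k = B - D
print(approx, exact)
```

Shrinking the step (raising `steps`) brings the stepped value closer to the exponential solution, which is the sense in which the approximation error vanishes as the interval length goes to zero.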
2. Estimation of Bounded Population and Carrying Capacity
This section considers the logistic model as an estimator of the bounded population and its carrying capacity.
2.1. Proposed Logistic Model
Suppose the birth rate B is a linear decreasing function of the population size P, so that $B = B_0 - B_1 P$, where $B_0$ and $B_1$ are positive constants. If the death rate $D = D_0$ remains constant, then Equation (1.23) takes the form
$$\frac{dP}{dt} = (B_0 - B_1 P - D_0)\,P \tag{1.24}$$
That is,
$$\frac{dP}{dt} = aP - bP^2 \tag{1.25}$$
where $a = B_0 - D_0$ and $b = B_1$.
If the coefficients a and b are both positive, then Equation (1.25) is called the logistic equation. For the purpose of relating the behavior of the population $P(t)$ to the values of the parameters in the equation, it is useful to rewrite the logistic equation in the form
$$\frac{dP}{dt} = kP(M - P) \tag{1.26}$$
where $k = b$ and $M = a/b$ are constants. Solving Equation (1.26) with the initial condition $P(0) = P_0$ gives
$$P(t) = \frac{M P_0}{P_0 + (M - P_0)\,e^{-kMt}} \tag{1.27}$$
Actual human populations are positive valued. If $P_0 = M$, then (1.27) reduces to the unchanging (constant-valued) “equilibrium population” $P(t) \equiv M$. Otherwise, the behavior of a logistic population depends on whether $0 < P_0 < M$ or $P_0 > M$. If $0 < P_0 < M$, then we see from (1.26) and (1.27) that $P' > 0$ and
$$P(t) = \frac{M P_0}{P_0 + \{\text{positive number}\}} < \frac{M P_0}{P_0} = M.$$
However, if $P_0 > M$, then we see from (1.26) and (1.27) that $P' < 0$ and
$$P(t) = \frac{M P_0}{P_0 + \{\text{negative number}\}} > \frac{M P_0}{P_0} = M.$$
In either case, the “positive number” or “negative number” in the denominator has absolute value less than $P_0$ and, because of the exponential factor, approaches 0 as $t \to +\infty$. It follows that
$$\lim_{t \to +\infty} P(t) = \frac{M P_0}{P_0 + 0} = M \tag{1.28}$$
Thus a population that satisfies the logistic equation does not grow without bound. Instead, it approaches the finite limiting population M as $t \to +\infty$. The population $P(t)$ steadily increases and approaches M from below if $0 < P_0 < M$, but steadily decreases and approaches M from above if $P_0 > M$. Sometimes M is called the carrying capacity of the environment, considering it to be the maximum population that the environment can support on a long-term basis.
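The limiting behavior described above can be illustrated with the standard closed-form solution of the logistic equation $dP/dt = kP(M - P)$ with $P(0) = P_0$, namely $P(t) = M P_0 / (P_0 + (M - P_0)e^{-kMt})$. The sketch below evaluates it for one starting value below $M$ and one above $M$; the values of $M$, $k$, and the starting populations are hypothetical and chosen only to make the trend visible.

```python
import math


def logistic_population(t, P0, M, k):
    """Closed-form solution of dP/dt = k * P * (M - P) with P(0) = P0."""
    return M * P0 / (P0 + (M - P0) * math.exp(-k * M * t))


# Hypothetical parameters: carrying capacity M and rate constant k.
M, k = 200.0, 0.0005
times = (0, 20, 40, 200)

# Start below M: the population should rise toward M.
below = [logistic_population(t, 50.0, M, k) for t in times]
# Start above M: the population should fall toward M.
above = [logistic_population(t, 300.0, M, k) for t in times]

for t, lo, hi in zip(times, below, above):
    print(t, round(lo, 3), round(hi, 3))
```

Both sequences move monotonically toward $M$ without crossing it, which is the sense in which $M$ acts as a carrying capacity.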
The five census years obtained from a sample frame are shown in Table 1. We select the 1969 population as the initial population and fit the model through the 1989 and 2009 populations from the table. These values are then used to estimate the population total for the 2019 census using the proposed technique.
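Here, the 1969 population is taken as the initial population. Substituting the 1989 time and population into the solution (1.27) gives
(1.29)
Similarly, substituting the 2009 time and population into (1.27) gives
(1.30)
Solving Equations (1.29) and (1.30) simultaneously, we have
(1.31)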
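How the authors solved the two equations is not recoverable from the text, so the following is only one standard way to do it when the three census dates are equally spaced (1969, 1989, and 2009 are 20 years apart). Writing $u = 1/P$, the logistic solution gives $u(t) - 1/M = (u_0 - 1/M)\,e^{-kMt}$, so the three values $u_0, u_1, u_2$ differ from $1/M$ by a geometric sequence, and $M$ and $k$ follow in closed form. The function name and the numbers below are hypothetical; the data are synthetic values generated from known parameters, not the census figures of Table 1.

```python
import math


def fit_logistic_three_points(P0, P1, P2, T):
    """Recover (M, k) from populations at times 0, T and 2T.

    Uses u = 1/P, for which u(t) - 1/M decays geometrically with ratio
    r = exp(-k * M * T) over each interval of length T.
    """
    u0, u1, u2 = 1.0 / P0, 1.0 / P1, 1.0 / P2
    denom = u0 + u2 - 2.0 * u1
    if denom == 0.0:
        raise ValueError("points do not determine a logistic curve")
    c = (u0 * u2 - u1 * u1) / denom  # c = 1 / M
    if c <= 0.0:
        raise ValueError("data imply no finite positive carrying capacity")
    M = 1.0 / c
    r = (u1 - c) / (u0 - c)  # r = exp(-k * M * T)
    if not 0.0 < r < 1.0:
        raise ValueError("data are not consistent with logistic behavior")
    k = -math.log(r) / (M * T)
    return M, k


# Synthetic check: generate three points from known (hypothetical) parameters.
M_true, k_true, P0, T = 100.0, 0.001, 10.0, 20.0


def P(t):
    return M_true * P0 / (P0 + (M_true - P0) * math.exp(-k_true * M_true * t))


M_hat, k_hat = fit_logistic_three_points(P(0.0), P(T), P(2.0 * T), T)
print(M_hat, k_hat)
```

With real census figures the three points will not lie exactly on a logistic curve, so the validity checks may fail or the recovered parameters may be sensitive to small changes in the data; a least-squares fit over all census years is a common alternative.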
2.2. Asymptotic Properties
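Theorem: Law of large numbers:
Let $X_1, X_2, \dots, X_n$ be iid random variables with common expectation $\mu = E(X_i)$. Define $A_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Then for any $\varepsilon > 0$, we have
$$P\left(|A_n - \mu| \geq \varepsilon\right) \to 0 \quad \text{as } n \to \infty.$$
Proof of Theorem:
Let $\sigma^2 = \mathrm{Var}(X_i)$ be the common variance of the random variables; we assume that $\sigma^2$ is finite. With this (relatively mild) assumption, the Law of Large Numbers (LLN) is an immediate consequence of Chebyshev’s inequality.
By linearity of expectation and independence, $E(A_n) = \mu$ and $\mathrm{Var}(A_n) = \sigma^2 / n$, so by Chebyshev we have
$$P\left(|A_n - \mu| \geq \varepsilon\right) \leq \frac{\mathrm{Var}(A_n)}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2} \to 0 \quad \text{as } n \to \infty.$$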
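The Chebyshev bound $\sigma^2 / (n\varepsilon^2)$ on the probability that the sample mean deviates from $\mu$ by at least $\varepsilon$ can be compared with simulated frequencies. The sketch below does this for iid Uniform(0, 1) draws, for which $\mu = 1/2$ and $\sigma^2 = 1/12$; the distribution, the tolerance, the sample sizes, and the seed are hypothetical choices for illustration and are unrelated to the census data.

```python
import random


def chebyshev_bound(variance, n, eps):
    """Chebyshev upper bound on P(|A_n - mu| >= eps), capped at 1."""
    return min(1.0, variance / (n * eps ** 2))


def empirical_deviation_rate(n, eps, trials, rng):
    """Fraction of simulated sample means of n Uniform(0, 1) draws
    that land at least eps away from the true mean 0.5."""
    mu = 0.5
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - mu) >= eps:
            hits += 1
    return hits / trials


rng = random.Random(0)   # fixed seed so the run is repeatable
eps = 0.1
variance = 1.0 / 12.0    # variance of Uniform(0, 1)

results = {}
for n in (10, 100, 1000):
    results[n] = (
        empirical_deviation_rate(n, eps, 2000, rng),
        chebyshev_bound(variance, n, eps),
    )
    print(n, results[n])
```

The bound shrinks like $1/n$, and the simulated frequencies sit below it; Chebyshev's inequality is typically loose, so the observed frequencies fall off much faster than the bound.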
Table 2 presents the census populations from 1969 to 2009 in the eight provinces of Kenya. Successive samples of increasing size are selected below to illustrate the law of large numbers.
Sample 1: Nairobi (1969 to 2009)
Sample 2: Nairobi and Central
Sample 3: Nairobi, Central and Coast
Sample 4: Nairobi, Central, Coast and Eastern
Sample 5: Nairobi, Central, Coast, Eastern and N/Eastern
Sample 6: Nairobi, Central, Coast, Eastern, N/Eastern and Nyanza
Sample 7: Sample 6 and Rift Valley
Remark: The sample mean tends to the population mean as the sample size approaches the population size N, which is in line with the Law of Large Numbers (LLN).
Therefore,
Comment: This technique tracks the population reasonably well over a sufficiently long range, after which the initial condition needs to be shifted to the point where the error margin starts increasing in order to maintain precision.
3. Main Results
Empirical Analysis
Table 3 presents the actual population totals, the estimated population totals, and their corresponding errors from 1969 to 2009.
4. Conclusion
In this work, the logistic model proved effective at maintaining precision, particularly in the presence of outliers. It performs well when the sample size is sufficiently large, and it can be more efficient for prediction than a regression model, especially where the regression model is ill-conditioned.
Table 3. Estimated population and error calculations.
Acknowledgements
We are grateful to God for the grace and mercy rendered to us in seeing us through this work. Special thanks go to the African Union for making it possible to pursue this course through a scholarship.
Disclosure of Potential Conflicts of Interest
The authors declare no conflict of interest with regard to the publication of this paper.