Properties of the Estimators for the Effective Bandwidth in a Generalized Markov Fluid Model ()
1. Introduction
A multiplexing system can be thought as a buffer with capacity B and output velocity C, fed by many different data sources that share the common output port. One of the more interest study subject is to know how many resources will each source required from the system. This knowledge has different applications to call admission, control and building. This magnitude of the required resources is known as the Effective Bandwidth of the traffic source and is an useful and realistic measure of channel occupancy. The best interpretation and collection of results on Effective Bandwidth are given in Kelly [2] where the Effective Bandwidth for a process
, with stationary increments, that represents the total amount of work arriving from a source in the interval
is defined as
(1)
This definition can be motivated in several ways, perhaps the most important is that the logarithmic moments generating function is naturally associated with the additive property of the sources in a node of the network. The space parameter s, measured in data units, point out the degree of the multiplexing. The time parameter t, measured in time units, it is related with period of greater buffer occupancy before overflow; slow accumulation of work load in the buffer corresponds to large values of t as well as fast accumulation corresponds to small values of t. Both parameters characterize the link operating point depending on the context of the stream.
The general problem is that resources are shared by a set of heterogeneous communications and when a new communication is accepted, its workload must be estimated to allocate part of the available resources. Therefore two problems motivate our work: to propose a realistic model for communications traffic and calculate the effective bandwidth for this model for the allocation of resources, and to obtain consistent estimators for the parameters.
The paper is organized as follows, the Generalized Markov Fluid Model model is introduced in Section 2, In Section 3, we compute de effective bandwidth for the introduced model and in Section 4 the proposed estimator is presented with its properties. Section 5 contains the numerical results of simulations, conclusions are presented in Section 6 and finally some useful lemmas are in the Appendix.
2. The Model
The needs for networks that integrate various telecommunication services leads to the emergence of the concept of integrated services digital network, which involves the use of a single infrastructure for transport, at high speeds, of data, voice and images. An important issue is the selection of the transfer process, which could be defined as the set of multiplexing mechanisms for commu- nication in the network. Through the concept of statistical multiplexing of sources, high efficiency is achieved in the use of network resources. If a source sends data at a variable rate, multiplexing the sources in a link, it reserves for each a capacity greater than the average rate but lower than the maximum emission rate. The price to pay is that the probability that many sources agree on dispatching the maximum rate is not zero, in which case overflow would occur with consequent damage to the Quality of Service (QoS).
To minimize these effects of data loss and maintain quality of service, both for existing sources as well as new, is necessary to have connection admission control (CAC) which to decide whether it can accept a new connection, as well as of congestion control mechanisms. For this we need mathematical models to describe the behavior of the sources.
The Generalized Markov Fluid
Markov Fluid models have been applied to model many kinds of digital sources but there are limitations when speed data transfers take too much values and the Generalized Markov Fluid model, introduced in [1] , can be used to describe properly those kind of traffic in a source.
In the GMFM, a source in a data network assumes the state
at time t, where Z is a continuous time, homogeneous and irreducible Markov chain, with finite state space
, invariant distribution
and infinitesimal generator
. That is, at time t the chain Z reaches states i and the rate data transfer of the sources is drawn, independently of the chain Z, by the law
. So, the random variable
, is distributed according the probability law
and the density functions
, are known and with disjoint support for
.
The process Y does not change until the chain Z changes its state and since the supports of the k laws of probability are known and disjoint, observe the process
allow us to restore the process
.
The work load received from the source that delivers information with speed
is represented by the Markov flow modulated by the chain
and can be written as
(2)
The advantage of the GMFM is that makes manageable those networks where the speed of the source in each state is a random variable. In this model, abrupt changes in the transfer speed report a change of state in the chain, but within a state is allowed the rate to assume randomly any value according to some probability distribution. The laws of probability may be discrete or continuous.
To interpret this model we could think that each state in the chain is interpreted as the activity performed by a user, like chat or video conferences, then speed data transfer assumes values that depend specifically for such activity.
3. Effective Bandwidth Estimation
One of the main issues in QoS for admission control is the estimation of the resources needed for guaranteed communications, which cannot be the peak rate because would be too pessimistic and would lead to resource waste, nor the mean rate of the service, because would be a too optimistic estimation, that would cause frequent losses. Given an expected QoS, interpreted as the probability of buffer overflow, the Effective Bandwidth (EB) of the traffic sources defined in (1), was proposed by Kelly in [2] and is a realistic measure of channel occupancy. The space and time parameters, s and t respectively, depend not only on the source itself, but on the context on which this source is acting as the capacity, the buffer size, the scheduling policy of the multiplexer, the QoS parameter to be achieved and the number of other sources in the channel, this is the actual traffic mix. The EB concept can be applied to sources or to aggregated traffic, as it can be the networks core link, but also it can be used for any shared resource models.
In order to estimate EB for a given GMFM, our first goal is to find the type of formulas obtained by Kesidis, Walrand and Chang [3] [4] to calculate the EB, that depends on estimable data from traffic traces.
Computation of the EB for the GMFM
Let
be a GMFM modulated by a continuous time, homogeneous and irreducible Markov chain Z with finite state space
and invariant distribution
and infinitesimal generator
. Let
be the random variables with density function
, mean
, variance
and Laplace transform
, for
. Let us also assume that each
has known and disjoint support
with
. Let us denote
the diagonal matrix of dimension k, whose nonzero elements are the first moments
of each distribution.
Theorem 1. Let
be a GMFM, then the effective bandwidth has the following expression
(3)
where
is a column vector with all entries equal to 1.
Proof. By definition (1), it is enough to proof
The process
can be represented as in (2) and
is uniformly bounded, hence applying Lebesgue’s dominated convergence theorem
(4)
The Markov chain
is homogeneous and
is the invariant distribution, so the argument of the limit in (4) can be written as
(5)
Each integral in (5) represents the Laplace transform of the density function, so we can express the right side as the product of the transition matrix and a
diagonal matrix
, whose non zero elements are
, in the next way
Applying Taylor’s formula to each matrix we obtain
(6)
(7)
with I the identity matrix and
a diagonal matrix which non zero element are
.
Then,
(8)
and the right side of (8) tends to
. ☐
The importance of this result is that provides an expression for the EB that depends on the infinitesimal generator of the modulating chain, its invariant distribution and a matrix containing information of the transfer rate, and all these elements can be estimated with traffic traces.
4. The estimator and its Properties
In order to introduce an estimator for the EB to this traffic model, the elements of the matrices
and
are the parameters that must be estimated according to the equation (3).
For the first, a result presented by Lebedev and Lukashuk [5] [6] plays a key role in the construction of our estimator, providing asymptotically gaussian estimator, based on traffic traces, of the infinitesimal generator matrix. The maximum likelihood estimator
of each element
not belonging to the diagonal of infinitesimal generator matrix, is the ratio between the number of transitions of the chain Z from the state i to the state j and the time spent by the chain Z in the state i, both during the same unitary time interval. So defined
is unbiased with variance
, where
is the i-th element in the vector of the invariant distribution
.
4.1. The estimator of the elements of
The elements of
are the mean data transfer rate
at each state i of the chain Z. The proposed estimator is
(9)
where
denotes the r-th observed rate corresponding to the range of Y when modulating chain is in state i and
the number of times that the modulating chain Z reaches the state i in the interval
.
Before proving the following results let us remark that the random value
grows as the observed range n increases and due to assumptions about the chain Z, each states i is positive recurrent with average turnaround time
satisfying this relationship
(10)
Proposition 2. Let
be a GMFM and
defined in (9), then
1)
is an unbiased and consistent estimator of
.
2)
.
Proof. 1) Let us compute the expected value of (9) using conditional expectation
(11)
(12)
(13)
so
is unbiased.
To prove consistence it is enough to show that the variance of (9) tend to 0 as n grows. The second moment can be compute similarly
(14)
(15)
Replacing
, where
is the Kronecker delta, this is
if
and 0 elsewhere, we obtain
(16)
(17)
and then
(18)
that tends to 0 as n grow, hence consistence is proved.
2) Applying a classic Central Limit Theorem [7] , to the variables
, it is true that
(19)
A classic result of the stochastic process theory, see [8] , will allow us to achieve the result by just proving that
.
(20)
(21)
Calculating the first term by conditional expectation we obtain
(22)
(23)
that tends to 0 as n grows.
The argument in de expectation of the second term can be written like
(24)
where
and
, so similarly we compute
(25)
(26)
But
, so (26) becomes into
(27)
that also tends to 0 as n grows, and the theorem is proved. ☐
4.2. The estimator of
From the maximum likelihood estimators of
and the estimator of
in (9), a estimator of (3) can be construct. Let us define
and
the vector with de no diagonals elements of
and
res- pectively.
Let us also define some functions that allow to build the matrices from the vectors presented above, they are
such that
, where
such that
if
and
if
;
such that
, where
is de- fined
if
and 0 otherwise.
Finally, another function whose response is a matrix that gives the same information that
but that has the advantage of admitting inverse is defined
as
, where
is such that
if
and 1 if
.
Since
and
contain exactly the same information, we can think any parameter that depends on
as a function of
.
We are now able to present the following result that gives the asymptotic distribution of (3).
Theorem 3. Let
be a GMFM, the vectors
, and
containing the estimators of
and
respectively. Let us define the following functions:
(28)
(29)
(30)
Then, for fixed s and t,
follows that
(31)
with
, where
.
Proof. As
,
are unbiased, asymptotically gaussian, and the functions used to construct
in (31) are differentiable so applying Lemma (6) and Lemma (7) in Appendix the result holds. Then
can be written more precisely like
(32)
where
(33)
(34)
and
,
.
☐
4.3. Consistency of the Estimators
We show now the main result that allows us to find consistent estimators for each parameter involved in the formula of the variance
obtained in the Theorem 3.
Proposition 4. Let
, and
containing the estimators of
and
respectively then
1)
is a consistent estimator of
.
2)
is a consistent estimator of
.
3)
is a consistent estimator of
4)
is a consistent estimator of
.
Proof.
1) By definition
and
, then by Lema 8 in Appendix,
. As
and
is continuous then
.
On the other hand
is also continuous, then for n large enough,
admits inverse and it is fulfilled that
, so
is a consistent estimator of
.
2) As
, then
and by Lemma 8 in Appendix
(35)
But
is a consistent estimator of
, then
is a consistent estimator of
.
3) To prove that
, or equivalently
, let us remember that the matrix
it can be written in two equivalent ways as follows
(36)
Then
(37)
The second term is the tail of a convergent series, therefore it tends to 0 when n grows.
For the first term, we will apply the Mean Value Theorem defining the function
such that
, to express the increment of the function as a proportion of the argument increment through the differential operator in the following way then
(38)
where
is between
and
, or in other way (38) can be written
By definition of differential operator to
, equation (38) becomes into
(39)
so we have
(40)
(41)
(42)
(43)
As
is bounded by being the partial sum of a convergent
series, and both
as
are consistent estimator of
and
respectively, each terms of the first factor tend to 0, so
is a consistent estimator of
.
4) This point is derived directly from the points 1 and 3.
☐
4.4. Confidence interval for
The following theorem and corollary show how to perform numerical computation using the main result.
Theorem 5. Let
be the maximum likelihood estimators of Q,
and
the estimators presented in the proposition above, and
a succession of positive real numbers such that
, then
(44)
is a consistent estimator of
, with
and
defined as in (33) and (34) respectively.
The main argument in the proof is the differentiability , and a straightforward consequence of the theorem is the following result,
Corollary 1. Taking
,
and
defined in (3), (31) and (44) respectively, we have
1)
2) If
where
is such that
for
, then
(45)
5. Simulation and Numerical results
In this section we will carry out the analysis with traffic traces generated by simulations from the model introduced in Section 2 to perform the estimations.
5.1. Parameters for the Simulation
To validate the results obtained, we performed several traffic simulations according to the GMFM model presented.
In the model chosen, the modulating Markov chain has
states and each state is associated with a data transfer rate interval as shown in Table 1.
It is expected the usual state to be that of the highest transfer rate available in the transmission channel, so the most probable state is 13. It is also more common to jump from one state to the adjacent ones, or to the maximum transfer rate, or minimum or no transfer rate. With these considerations it is designed the infinitesimal generator of the modulating chain that is given by the matrix
(46)
Within each of these intervals, it is raffled how much is actually dispatched by means of a normal distribution truncated to the interval with mean equal to the midpoint of the interval and deviation equal to one sixth of the length of the interval. The diagonal matrix
will contain the mean values of these distri- butions.
An example of the traces generated for the simulations can be seen in Figure 1.
5.2. Estimation of the Effective bandwidth from traces
The first objective is to calculate the EB of the presented model, for which we will use the result shown in Theorem 1, and the EB is then calculated according to the formula (3). Figure 2 shows the EB calculated for the GMFM.
For each simulated trace we estimate the EB using the estimator presented in Theorem 2 according to the equation (31) and in Figure 3 the comparison of the estimated effective bandwidth for a trace with the theoretical value is shown.
Figure 1. Trace generated with the GMFM.
Figure 2. Effective bandwidth for the GMFM model.
Figure 3. Effective bandwidth vs. Estimated effective bandwidth.
6. Conclusions
In this work, we have presented contributions in two areas related to data networks. About the modeling, we have proposed the GMFM that has the advantage of being very realistic for the current requirements of telecommunication networks, in which it is possible to apply refined tools and mathematical statistics results. We have also found a formula for the effective bandwidth where can be visualized the role that play each parameters of the model.
Regarding the estimation of parameters, we have proposed a methodology to estimate the effective bandwidths, from traffic traces of a GMFM source, which has the expected properties to ensure that it complies with a Central Theorem of Limit and thus be able to build a confidence interval. These results enable the calculation of the effective bandwidth from simulated traffic traces. A numerical example has been presented, where the results were applied to traffic traces and ideal results were obtained.
It is expected to extend statistical effective bandwidth calculation to other stochastic phenomena where the supports of each probability law are not disjoint, or which do not need to be Markovian and the application of these techniques to real data scenarios.
Acknowledgements
The authors express their gratitude to Dr. Gonzalo Perera for introducing them to the topic and for his valuable suggestions.
Appendix
Lemma 6. Let
be a sequence of random variables in
,
sequence of positive numbers satisfying
and
such that
(a1)
Let us consider
differentiable in an neighborhood of Z, then
(a2)
Lemma 7. Let us consider
as in (30),
as in (29) and
as in (28), and
and
defined in Section 4.2, then
1)
.
2)
.
3)
with
.
4)
,
with
.
Lemma 8. The matrix
supports inverse
that is differentiable and fulfills
(A3)