Properties of the Estimators for the Effective Bandwidth in a Generalized Markov Fluid Model

The Generalized Markov Fluid Model (GMFM), introduced in [1], is assumed for modeling sources in the network because it is versatile to describe the traffic fluctuations. In order to estimate resources allocations or in other words the channel occupation of each source, the concept of effective bandwidth proposed by Kelly [2] is used. In this paper, we present a formula for calculating the effective bandwidth, developed for the Generalized Markovian Flow model, which is of particular interest because it allows expressing said magnitude depending on the parameters of the model. We present unbiased estimators for these parameters that can be obtained from real data. The convergence and the consistency of the estimation are studied, and confidence bands are found. Illustrative calculation and performance of the proposed estimators were tested with simulated data and ideal results were obtained.


Introduction
A multiplexing system can be thought as a buffer with capacity B and output velocity C, fed by many different data sources that share the common output port.One of the more interest study subject is to know how many resources will each source required from the system.This knowledge has different applications to call admission, control and building.This magnitude of the required resources is known as the Effective Bandwidth of the traffic source and is an useful and realistic measure of channel occupancy.The best interpretation and collection of results on Effective Bandwidth are given in Kelly [2] where the Effective Open Journal of Statistics Bandwidth for a process t X , with stationary increments, that represents the total amount of work arriving from a source in the interval [ ] 0,t is defined as ( ) This definition can be motivated in several ways, perhaps the most important is that the logarithmic moments generating function is naturally associated with the additive property of the sources in a node of the network.The space parameter s, measured in data units, point out the degree of the multiplexing.The time parameter t, measured in time units, it is related with period of greater buffer occupancy before overflow; slow accumulation of work load in the buffer corresponds to large values of t as well as fast accumulation corresponds to small values of t.Both parameters characterize the link operating point depending on the context of the stream.
The general problem is that resources are shared by a set of heterogeneous communications and when a new communication is accepted, its workload must be estimated to allocate part of the available resources.Therefore two problems motivate our work: to propose a realistic model for communications traffic and calculate the effective bandwidth for this model for the allocation of resources, and to obtain consistent estimators for the parameters.
The paper is organized as follows, the Generalized Markov Fluid Model model is introduced in Section 2, In Section 3, we compute de effective bandwidth for the introduced model and in Section 4 the proposed estimator is presented with its properties.Section 5 contains the numerical results of simulations, conclusions are presented in Section 6 and finally some useful lemmas are in the Appendix.

The Model
The needs for networks that integrate various telecommunication services leads to the emergence of the concept of integrated services digital network, which involves the use of a single infrastructure for transport, at high speeds, of data, voice and images.An important issue is the selection of the transfer process, which could be defined as the set of multiplexing mechanisms for communication in the network.Through the concept of statistical multiplexing of sources, high efficiency is achieved in the use of network resources.If a source sends data at a variable rate, multiplexing the sources in a link, it reserves for each a capacity greater than the average rate but lower than the maximum emission rate.The price to pay is that the probability that many sources agree on dispatching the maximum rate is not zero, in which case overflow would occur with consequent damage to the Quality of Service (QoS).
To minimize these effects of data loss and maintain quality of service, both for existing sources as well as new, is necessary to have connection admission control (CAC) which to decide whether it can accept a new connection, as well Open Journal of Statistics as of congestion control mechanisms.For this we need mathematical models to describe the behavior of the sources.

The Generalized Markov Fluid
Markov Fluid models have been applied to model many kinds of digital sources but there are limitations when speed data transfers take too much values and the Generalized Markov Fluid model, introduced in [1], can be used to describe properly those kind of traffic in a source.
In the GMFM, a source in a data network assumes the state t Z at time t, where Z is a continuous time, homogeneous and irreducible Markov chain, with finite state space { } The process Y does not change until the chain Z changes its state and since the supports of the k laws of probability are known and disjoint, observe the process t Y allow us to restore the process t Z .
The work load received from the source that delivers information with speed t Y is represented by the Markov flow modulated by the chain t Z and can be written as The advantage of the GMFM is that makes manageable those networks where the speed of the source in each state is a random variable.In this model, abrupt changes in the transfer speed report a change of state in the chain, but within a state is allowed the rate to assume randomly any value according to some probability distribution.The laws of probability may be discrete or continuous.
To interpret this model we could think that each state in the chain is interpreted as the activity performed by a user, like chat or video conferences, then speed data transfer assumes values that depend specifically for such activity.

Effective Bandwidth Estimation
One of the main issues in QoS for admission control is the estimation of the resources needed for guaranteed communications, which cannot be the peak rate because would be too pessimistic and would lead to resource waste, nor the mean rate of the service, because would be a too optimistic estimation, that would cause frequent losses.Given an expected QoS, interpreted as the probability of buffer overflow, the Effective Bandwidth (EB) of the traffic sources defined in (1), was proposed by Kelly in [2] and is a realistic measure of channel occupancy.The space and time parameters, s and t respectively, depend not only on the source itself, but on the context on which this source is acting as the capacity, the buffer size, the scheduling policy of the multiplexer, the QoS parameter to be achieved and the number of other sources in the channel, this is the actual traffic mix.The EB concept can be applied to sources or to aggregated traffic, as it can be the networks core link, but also it can be used for any shared resource models.
In order to estimate EB for a given GMFM, our first goal is to find the type of formulas obtained by Kesidis, Walrand and Chang [3] [4] to calculate the EB, that depends on estimable data from traffic traces.

Computation of the EB for the GMFM
X ≥ be a GMFM, then the effective bandwidth has the following expression where 1 is a column vector with all entries equal to 1.
Proof.By definition (1), it is enough to proof ( ) ( ) The process t X can be represented as in (2) and t Y is uniformly bounded, hence applying Lebesgue's dominated convergence theorem ( ) The Markov chain s Z is homogeneous and π is the invariant distribution, so the argument of the limit in (4) can be written as Each integral in (5) represents the Laplace transform of the density function, so we can express the right side as the product of the transition matrix and a diagonal matrix , whose non zero elements are Applying Taylor's formula to each matrix we obtain with I the identity matrix and  a diagonal matrix which non zero element are ( ) Then, ( ) and the right side of (8) tends to ( ) The importance of this result is that provides an expression for the EB that depends on the infinitesimal generator of the modulating chain, its invariant distribution and a matrix containing information of the transfer rate, and all these elements can be estimated with traffic traces.

The Estimator and Its Properties
In order to introduce an estimator for the EB to this traffic model, the elements of the matrices Z  and  are the parameters that must be estimated according to the Equation (3).
For the first, a result presented by Lebedev and Lukashuk [5] [6] plays a key role in the construction of our estimator, providing asymptotically gaussian estimator, based on traffic traces, of the infinitesimal generator matrix.The maximum likelihood estimator ( )   n ij q of each element ij q not belonging to the diagonal of infinitesimal generator matrix, is the ratio between the number of transitions of the chain Z from the state i to the state j and the time spent by the chain Z in the state i, both during the same unitary time interval.So defined is the i-th element in the vector of the invariant distribution π .

The Estimator of the Elements of 
The elements of  are the mean data transfer rate i µ at each state i of the chain Z.The proposed estimator is where µ is an unbiased and consistent estimator of i µ . 2) Proof. 1) Let us compute the expected value of (9) using conditional expectation To prove consistence it is enough to show that the variance of (9) tend to 0 as n grows.The second moment can be compute similarly Replacing , where kj δ is the Kronecker delta, this is 1 kj δ = if k j = and 0 elsewhere, we obtain 1 , , that tends to 0 as n grow, hence consistence is proved.
2) Applying a classic Central Limit Theorem [7], to the variables ( ) A classic result of the stochastic process theory, see [8], will allow us to achieve the result by just proving that .
Calculating the first term by conditional expectation we obtain ( ) that tends to 0 as n grows.
The argument in de expectation of the second term can be written like , , so similarly we compute .
that also tends to 0 as n grows, and the theorem is proved.☐

The Estimator of ( ) , s t α
From the maximum likelihood estimators of ij λ and the estimator of i µ in (9), a estimator of (3) can be construct.Let us define Finally, another function whose response is a matrix that gives the same information that  but that has the advantage of admitting inverse is defined as ( ) Since , Λ  and  contain exactly the same information, we can think any parameter that depends on  as a function of Λ .
We are now able to present the following result that gives the asymptotic distribution of (3).
Theorem 3. Let t X be a GMFM, the vectors containing the estimators of ij λ and i µ respectively.Let us define the following functions: , that , exp , , that , , , , that , log , .

Consistency of the Estimators
We show now the main result that allows us to find consistent estimators for each parameter involved in the formula of the variance 2 σ obtained in the Theorem 3. ( ) Proof.


, then by Lema 8 in Appendix, On the other hand  is also continuous, then for n large enough, ( ) ( ) 3) To prove that ( ) . . , , let us remember that the matrix ( ) written in two equivalent ways as follows The second term is the tail of a convergent series, therefore it tends to 0 when n grows.
For the first term, we will apply the Mean Value Theorem defining the function , to express the increment of the function as a proportion of the argument increment through the differential operator in the following way then where n  is between n  and  , or in other way (38) can be written By definition of differential operator to ( ) so we have As ( ) is bounded by being the partial sum of a convergent series, and both n Λ as n ϒ are consistent estimator of Λ and ϒ respectively, each terms of the first factor tend to 0, so n  is a consistent estimator of  .
4) This point is derived directly from the points 1 and 3. ☐ is a consistent estimator of 2 σ , with ( ) where z  is such that ( )

Simulation and Numerical Results
In this section we will carry out the analysis with traffic traces generated by simulations from the model introduced in Section 2 to perform the estimations.

Parameters for the Simulation
To validate the results obtained, we performed several traffic simulations according to the GMFM model presented.
In the model chosen, the modulating Markov chain has 13 k = states and each state is associated with a data transfer rate interval as shown in Table 1.

Estimation of the Effective Bandwidth from Traces
The first objective is to calculate the EB of the presented model, for which we will use the result shown in Theorem 1, and the EB is then calculated according to the formula (3). Figure 2 shows the EB calculated for the GMFM.
For each simulated trace we estimate the EB using the estimator presented in Theorem 2 according to the Equation (31) and in Figure 3 the comparison of the estimated effective bandwidth for a trace with the theoretical value is shown.

Conclusions
In this work, we have presented contributions in two areas related to data networks.About the modeling, we have proposed the GMFM that has the advantage of being very realistic for the current requirements of telecommunication networks, in which it is possible to apply refined tools and mathematical statistics results.We have also found a formula for the effective bandwidth where can be visualized the role that play each parameters of the model.
Regarding the estimation of parameters, we have proposed a methodology to estimate the effective bandwidths, from traffic traces of a GMFM source, which has the expected properties to ensure that it complies with a Central Theorem of Limit and thus be able to build a confidence interval.These results enable the calculation of the effective bandwidth from simulated traffic traces.A numerical example has been presented, where the results were applied to traffic traces and ideal results were obtained.
It is expected to extend statistical effective bandwidth calculation to other stochastic phenomena where the supports of each probability law are not disjoint, or which do not need to be Markovian and the application of these techniques to real data scenarios.
asymptotically gaussian, and the functions used to construct ( ) ( ) , n s t α in (31) are differentiable so applying

4. 4 .α
Confidence Interval for ( )The following theorem and corollary show how to perform numerical computation using the main result.Theorem 5. Let ( ) and n  the estimators presented in the proposition above, and n m a succession of positive real numbers such that n n m →∞   → ∞ , then ⋅ defined as in(33)    and (34) respectively.The main argument in the proof is the differentiability , and a straightforward consequence of the theorem is the following result, Corollary 1. Taking ( )

Figure 2 .
Figure 2. Effective bandwidth for the GMFM model.
 .That is, at time t the chain Z reaches states i and the rate data transfer of the sources is drawn, independently of the chain Z, by the law i f .So, grows as the observed range n increases and due to assumptions about the chain Z, each states i is positive recurrent with average turnaround time 1 i λ satisfying this relationship ( ) c.s. .
i N n the number of times that the modulating chain Z reaches the state i in the interval [ ] 0, n .Before proving the following results let us remark that the random value ( ) i N n