On Modeling and Accuracy Analysis of the Available Bandwidth Measurement Based on Packet-pair Sampling

Packet-pair sampling, also called the probe gap model (PGM), has been proposed as a lightweight and fast method for measuring available bandwidth. However, measurement tools based on PGM give results with great uncertainty in some cases, and PGM's statistical robustness has not been proved. In this paper we propose a more precise statistical model based on PGM, expressed in terms of probability distributions and statistical parameters. We also investigate a PGM bandwidth evaluation method that considers non-fluid cross traffic, and present an alternative approach in which the bursty nature of the probed traffic can be taken into account. Based on the model, the measurement variance and the required sample size can be calculated to improve measurement accuracy. We evaluated the model in a controlled and reproducible environment using NS simulations.


Introduction
For many applications, such as congestion control [1], service level agreement verification [2], rate-based streaming applications [3], and Grid applications [4,5], efficient and reliable available bandwidth measurement remains a very important goal. Researchers have been trying to create end-to-end measurement algorithms for available bandwidth over the last 15 years. The objective has been to measure available bandwidth accurately, quickly, and without affecting the traffic in the path. However, the diversity of network conditions makes this a very challenging task. We have also worked on bandwidth measurement over the past 5 years [6][7][8][9][10].
Existing measurement techniques fall into two broad categories [11]. The first class of schemes is based on a statistical cross-traffic model, known as the probe gap model (PGM), also called direct probing. Measurement tools such as Delphi [12] and Spruce [13] are based on PGM. The main component of PGM is the mathematical relation between the input and output rates of a probing packet pair under a fluid traffic model. For PGM, the probing packets are sent into the Internet from the probe host with a known separation. By measuring the change in the output gap, the utilization of the bottleneck link can be calculated.
The second class of schemes is the probe rate model (PRM), also known as iterative probing. PRM is based on the concept of self-induced congestion. Pathload [14], PathChirp [11], PTR [16], and TOPP [17] use the PRM model.

Figure 1. PGM model
Our analysis in this paper focuses on PGM. PGM-based tools sample the arrival rate at the bottleneck by sending pairs of packets spaced so that the second probe packet arrives at the bottleneck queue before the first packet departs the queue. These tools then calculate the number of bytes that arrived at the queue between the two probes from the inter-probe spacing at the receiver. A PGM tool computes the available bandwidth as the difference between the path capacity and the arrival rate at the bottleneck.
Figure 1 illustrates a typical PGM probing tool. A probe pair is sent with time gap $g_i$. Specifically, if the size of the probing packets is $I_c$ and the packet pair is sent into the path at the rate of the bottleneck capacity $b_o$ (back-to-back), the input gap of the packet pair is set to $g_i = I_c / b_o$. If the queue does not become empty between the departure of the first probe packet in the pair and the arrival of the second probe packet, the output gap $g_o$ is the time taken by the bottleneck to transmit the second probe packet of the pair and the cross traffic that arrived during $g_i$. Thus, the time to transmit the cross traffic is $g_o - g_i$, and the rate of the cross traffic is:

$$b_c = \frac{g_o - g_i}{g_i}\, b_o \qquad (1)$$

The available bandwidth is:

$$b_o - b_c = b_o \left(1 - \frac{g_o - g_i}{g_i}\right) \qquad (2)$$

The PGM approaches assume [13]:
1) FIFO queuing at all routers along the path.
2) Cross traffic follows a fluid model (non-probe packets have an infinitely small packet size).
3) The average rate of cross traffic changes slowly and is constant for the duration of a single measurement.
4) The single bottleneck is both the narrow and the tight link for the path.
5) The cross traffic follows the same path as the measurement traffic [18].
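The gap-to-rate relation described above can be sketched in a few lines of Python. This is only an illustration under the fluid-model assumptions listed above; the function name `pgm_estimate` and the example gap values are hypothetical and do not come from any existing PGM tool.

```python
def pgm_estimate(g_i, g_o, b_o):
    """Ideal PGM estimate from one packet pair.

    g_i: input gap in seconds, g_o: output gap in seconds,
    b_o: bottleneck capacity in bit/s.
    Returns (cross traffic rate, available bandwidth) in bit/s.
    """
    if g_i <= 0:
        raise ValueError("input gap must be positive")
    # Cross traffic takes (g_o - g_i) seconds to transmit at rate b_o,
    # and it arrived during g_i seconds.
    b_c = (g_o - g_i) / g_i * b_o
    return b_c, b_o - b_c


# Hypothetical example: 1000-byte probes on a 10 Mb/s bottleneck.
I_c = 1000 * 8          # probe size in bits
b_o = 10e6              # bottleneck capacity in bit/s
g_i = I_c / b_o         # back-to-back input gap
g_o = 1.3 * g_i         # assumed output gap: 30% wider than the input gap
b_c, avail = pgm_estimate(g_i, g_o, b_o)
print(f"cross traffic {b_c / 1e6:.2f} Mb/s, available {avail / 1e6:.2f} Mb/s")
```

A single pair can give an estimate far from the true rate when cross traffic is not fluid, which is why the rest of the analysis works with expected values and variances.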
These assumptions are necessary for the analysis, but the model might still work even when some of them do not hold [16]. In this paper, we question assumption 2: published papers only illustrate PGM in its simplest form, with stationary and fluid cross traffic. In real tests [19], the time gap $g_o$ between probing packets grows in discrete steps because long or short cross traffic packets are inserted between the probing packets. Several works in this field have been published [13,16,20,21] and all show very similar features. Paper [20] presents a detailed analysis of path characteristics and their influence on $g_o$ based on a large number of experiments. But no solid theory can be found to support these results.
To cope with bursts of cross traffic, both PGM and PRM tools use a train of probe packets to generate a single measurement. They use statistical methods to estimate the cross traffic, computing the available bandwidth from the average of several sample measurements. Spruce [13] averages individual samples using a sliding window over 100 packets. Abing [20] sends 20 probes for a single test. Why the statistical averages reflect the real cross traffic has still not been clarified in published research.
We put all related symbols in Table 1. If $E(g_o)$ is the expected value of the output gap, $b_c$ is the cross traffic throughput, and $b'_c$ is the probe result, we get equations (3) and (4). Equation (4) and the ideal PGM equation (1) are different. After the above analysis, we are left with several questions:
First, PGM should be statistically robust even when the cross traffic does NOT follow the fluid model. Why can the statistical averages reflect the real cross traffic load?
Second, PGM still has great uncertainty in accuracy. Can we get the variance of a PGM test?
Third, what should be the reasonable sample size of PGM probing on a specific network environment?
Fourth, how much is the PGM's accuracy affected by traffic burst?
This paper is organized as follows. In Section 2, we present a new approach for analyzing the PGM model using probability distributions and statistical parameters on CBR traffic. Section 3 derives the coefficient of variation (CV) of the probe result for the cross traffic; the theoretical study is used to evaluate the impact on CV of different characteristics of the probing and cross traffic. We present the background traffic burst measurement methodology in Section 4. Section 5 presents the NS-2 simulation. The paper concludes in Section 6.

Studies of the network layer have usually focused on service models: the routing algorithms and the protocols. To analyze the interaction between the probing packets and the competing traffic, we have to consider the switching function of a router and examine the details of the actual queuing and transfer of packets from incoming links to outgoing links.
Generally, FIFO queuing is supported on the router's output port. Packets from different input queues are forwarded to the switching fabric and placed on the output port following the FIFO scheduling discipline. Figure 2 shows a scene where four cross traffic packets (black) and two probing packets (shaded black) at the front of two different input queues of a router are destined for the same output port.
Simple traffic models [22] such as ON/OFF sources have been very popular for describing Internet traffic flows. Informally, these models assume that the traffic alternates between active states (ON periods) and idle states (OFF periods). During ON periods packets are sent at a constant rate; during OFF periods no packets are transmitted.
We use a similar technique to analyze the process model of PGM. The time segment for receiving one cross traffic packet is defined as $g_{on}$. The time segment between two adjacent cross traffic packets is defined as $g_{off}$. A queuing period $T$ is defined as the time segment from the first bit of cross traffic packet 1 received in queue 1 to the first bit of cross traffic packet 2 received in the same queue.
The switch fabric always transfers the first-come packet from the input queues to the output queue. In this case (Figure 2), the black packet in the upper-left queue must wait for the first probing packet (shaded black) to go first, because the probing packet arrived first. So all the cross traffic packets that arrive during the time gap $g_i$ are "captured" by the output gap. We use an eye icon to depict this fact. Because the time $\Delta t$ by which the first probing packet is ahead of the first cross traffic packet is completely random, $\Delta t$ is equally likely to fall at any point in a queuing period $T$. So $\Delta t$ is a random variable with a continuous uniform distribution. There are FOUR possible scenarios according to the relationship between $g_{on}$ and $g_i$. We now investigate them one by one.

Scenario 1. $g_i < g_{on}$. Since $\Delta t$ is a continuous random variable in $[0, T]$, we have to identify three more specific conditions (Figure 3).

We can summarize from scenarios 1 to 4 that under any circumstances $E(g_o)$ is constant. Equation (3) is thus proved. It also tells us that the PGM model is statistically robust.
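The uniform-offset argument can be checked numerically with a small Monte Carlo sketch. It assumes, as a simplification, that a cross traffic packet is captured when its arrival instant $\Delta t + kT$ falls inside the input gap, and that each captured packet widens the output gap by its transmission time $g_c$ on the bottleneck. The function names and parameter values below are illustrative assumptions, not part of the original derivation.

```python
import math
import random


def captured_packets(delta_t, g_i, period):
    """Number of CBR cross packets arriving at delta_t + k*period inside [0, g_i)."""
    if delta_t >= g_i:
        return 0
    return math.ceil((g_i - delta_t) / period)


def simulate_mean_gap(g_i, g_c, period, samples, seed=0):
    """Average output gap over many pairs with a uniformly random offset delta_t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        delta_t = rng.uniform(0.0, period)   # offset of the first cross packet
        total += g_i + captured_packets(delta_t, g_i, period) * g_c
    return total / samples


# Assumed setup: 10 Mb/s bottleneck, 1000-byte probes,
# 1500-byte CBR cross packets at 3 Mb/s.
b_o, true_b_c = 10e6, 3e6
I_c, I_p = 1000 * 8, 1500 * 8
g_i = I_c / b_o            # input gap
g_c = I_p / b_o            # transmission time of one cross packet
period = I_p / true_b_c    # queuing period T of the CBR source

mean_g_o = simulate_mean_gap(g_i, g_c, period, samples=200_000)
estimate = (mean_g_o - g_i) / g_i * b_o
print(f"estimated cross traffic: {estimate / 1e6:.3f} Mb/s")
```

Individual pairs see either zero or one cross packet in this setup, yet the estimate built from the average gap settles near the true rate, which is the behavior the scenario analysis describes.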

PGM Probing Variance
This section derives the variance of the probe result. The variance can be calculated as $D(b_c) = E(b_c^2) - (E(b_c))^2$. From equation (3), we get: To get $E(b_c^2)$, we divide equation (5) into 3 parts and then calculate them one by one. Consider the two scenarios below.

Scenario 1. $g_i < T$. According to the probability distribution of $g_o$ (Figure 3(a), Figure 3(b)), we get: For the same probability distribution curve, we can get: Then from equations (5)-(7), we get:
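As a rough numerical companion to this section, the sketch below computes the mean, variance, and coefficient of variation (standard deviation divided by mean) of a single-pair estimate. It is not a transcription of the equations in this section. It rests on an assumption drawn from the uniform-offset argument: with CBR cross traffic the number of captured packets is either $n$ or $n+1$, where $n = \lfloor g_i/T \rfloor$, and the larger count occurs with probability $g_i/T - n$. The function name and the parameter values are hypothetical.

```python
import math


def single_pair_stats(I_c, I_p, b_o, b_c):
    """Mean, variance and CV of a one-pair cross traffic estimate.

    I_c, I_p: probe and cross packet sizes in bits.
    b_o, b_c: bottleneck capacity and true CBR cross rate in bit/s.
    """
    g_i = I_c / b_o                 # back-to-back input gap
    period = I_p / b_c              # queuing period T of the CBR source
    ratio = g_i / period            # expected number of captured packets
    n = math.floor(ratio)
    p = ratio - n                   # probability of capturing n + 1 packets
    scale = I_p / g_i               # rate contributed by one captured packet
    mean = scale * ratio            # equals b_c
    variance = scale ** 2 * p * (1.0 - p)   # two-point (Bernoulli) variance
    cv = math.sqrt(variance) / mean
    return mean, variance, cv


mean, variance, cv = single_pair_stats(1000 * 8, 1500 * 8, 10e6, 3e6)
print(f"mean {mean / 1e6:.2f} Mb/s, variance {variance / 1e12:.2f} (Mb/s)^2, CV {cv:.2f}")
```

Under these assumptions a single pair is a very noisy estimator, so averaging over many pairs is needed before the result becomes useful.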

CV Curve
We use graphical methods to analyze the CV according to equation (13). We put four CV curves in Figure 4 to explore how network parameters affect the measurement accuracy. 1) Figure 4(a): $CV(b_c)$ and traffic packet length. Figure 4(a) clearly shows that $CV(b_c)$ follows an upward trend as the traffic packet length increases. There is some variability about this trend, with the CV increasing over some ranges and decreasing over others.

Sample Size
Sample size is the number of observations in a sample. The Lindeberg-Lévy theorem [15] describes the large-sample behavior of random variables that involve sums of variables. The theorem is often written as

$$\sqrt{N}\,(\bar{X}_N - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$$

which points out that $\bar{X}_N$ converges to its mean $\mu$ at exactly the rate $1/\sqrt{N}$ as $N$ increases, so that the product eventually "balances out" to yield a random variable with a normal distribution. The proof of this result is a bit more involved, requiring manipulation of characteristic functions, and will not be presented here.
From the above analysis, we can treat the sample mean of $g_o$ as a random variable with a normal distribution. From equation (4), the probe result $b'_c$ can be treated in the same way, so the sample size for a normally distributed variable applies:

$$N = \frac{Z^2\, D(b'_c)}{e^2} \qquad (15)$$

where $N$ is the sample size, $Z$ is the standard normal value for the chosen confidence level, $e$ is the desired level of precision (in the same unit of measure as the square root of the variance), and $D(b'_c)$ is the probe variance. We can also calculate the sample size from the CV:

$$N = \frac{Z^2\, CV^2(b_c)}{m^2}$$

where $m$ is the desired relative level of precision, often expressed in percentage points relative to the mean of $b_c$.
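The sample size relation for a normally distributed estimate can be sketched as follows. The helper names are hypothetical, and the variance of 36 (Mb/s)² used in the example is an assumed illustrative value, not a measured one.

```python
import math


def sample_size(z, variance, e):
    """Smallest N with z * sqrt(variance / N) <= e."""
    return math.ceil(z ** 2 * variance / e ** 2)


def sample_size_from_cv(z, cv, m):
    """Same calculation with precision m expressed relative to the mean."""
    return math.ceil((z * cv / m) ** 2)


def precision(z, variance, n):
    """Half-width of the confidence interval reached with n samples."""
    return z * math.sqrt(variance / n)


z = 1.65                 # standard normal value for 90% two-sided confidence
variance = 36.0          # assumed single-pair variance, in (Mb/s)^2
mean = 3.0               # assumed mean cross traffic, in Mb/s

n = sample_size(z, variance, e=0.5217)
n_cv = sample_size_from_cv(z, math.sqrt(variance) / mean, m=0.5217 / mean)
half_width = precision(z, variance, 360)
print(n, n_cv, round(half_width, 4))
```

Rounding up with `ceil` is a conservative choice; a tool that rounds to the nearest integer would land one sample lower for these inputs.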

PGM over Internet
PGM works perfectly over CBR traffic flows. But for Internet probing, how to estimate the accuracy and precision of PGM is still an open question. Recent empirical studies have provided ample evidence that actual network traffic is self-similar [23], or fractal, and bursty over a wide range of time scales. These observations are useful for making PGM probing over the Internet cope with bursty traffic. We suggest a solution to predict the variance of PGM: an observer can derive traffic patterns from actually measured traffic traces to help evaluate PGM probing.

Traffic Burst Coefficient
ON/OFF models assume that traffic alternates between an active state (ON period) and an idle state (OFF period). We define the traffic burst coefficient $B_u$ in terms of the variance of the OFF period, $D(g_{off})$. We capture $n+1$ packets ($n$ can be set to more than 5000 in our experiments) from the link on which PGM probing runs (the sample is often obtained from the MIB database of the router on the link). The timestamps of the first bit and last bit of the packets arriving at the router are recorded as $\{t_1, t_2, t_3, t_4, \ldots, t_{2n-2}, t_{2n-1}\}$, and then we have:
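As an illustration of the trace processing described above, the sketch below extracts OFF periods from per-packet first-bit and last-bit timestamps and computes their variance $D(g_{off})$. How that variance is normalized into $B_u$ is not assumed here, so the code stops at $D(g_{off})$. The input layout, a list of `(first_bit_time, last_bit_time)` tuples, and the function names are assumptions made for the example.

```python
from statistics import pvariance


def off_periods(packets):
    """Idle gaps between the last bit of one packet and the first bit of the next.

    packets: list of (first_bit_time, last_bit_time) tuples in arrival order.
    """
    return [nxt_first - last
            for (_, last), (nxt_first, _) in zip(packets, packets[1:])]


def off_period_variance(packets):
    """Population variance D(g_off) of the OFF periods in a trace."""
    gaps = off_periods(packets)
    if not gaps:
        raise ValueError("need at least two packets")
    return pvariance(gaps)


# CBR-like trace: equally spaced packets, so every OFF period is identical.
cbr_trace = [(4.0 * k, 4.0 * k + 1.0) for k in range(6)]
# Bursty trace: three packets back to back, then a long idle period.
bursty_trace = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (10.0, 11.0)]

print(off_period_variance(cbr_trace), off_period_variance(bursty_trace))
```

The CBR trace gives zero variance, while the bursty trace gives a clearly positive value, which is the contrast a burst coefficient is meant to capture.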

PGM CV Estimation
Clearly, the coefficient of variation depends on whether or not the network traffic is bursty. We can address this by the following reasoning. First we estimate the maximum and minimum values of $B_u$.
Minimum $B_u$: the minimum $B_u$ is zero, when the traffic is a CBR source and all packets are equally spaced.
Maximum $B_u$: maximum burst means that all $n+1$ packets are connected back to back. Traffic self-similarity gives rise to structural models with the distinctive feature that their ON and/or OFF periods are heavy-tailed with infinite variance. So the maximum $B_u$ is: We define $I'_p$ as the average length of the packet trains. It can be proved that $I'_p$ is directly proportional to $B_u$. So: $I'_p$ can be used in place of $I_p$ in the equations to calculate $CV(b_c)$. Thus $CV(b_c)$ can be calculated as:

Test Bed Illustration
Our evaluation is based on NS-2 [24] simulations, since we need to carefully control the competing traffic load in the network. The topology is shown in Figure 5. In this topology, there is one probing source and one receiver. Cbr1, Cbr2 and Cbr3 are used to generate competing traffic. R1 and R2 are FIFO-based routers. The probe sender sends out pairs of UDP packets, which serve as PGM probing pairs. The time gap $g_i$ is calculated from the packet length and the bottleneck bandwidth to ensure that the packets are back-to-back. The cross traffic is generated using CBR streams. Timer-driven stratified random sampling [25] is used in all our tests, because there is often positive correlation between competing traffic packets within the sample space. We use a timer to trigger the sending of probing packets. When the timer expires, we send out a pair of probing packets.
The timer is set to a random time with an average of 50 ms. Our test results indicate that the stratified sampling technique performs well above other sampling methods in eliminating the positive correlation, and that it improves the measurement precision greatly. Further discussion is beyond the scope of this paper.
Our evaluation includes two parts: continuous probing tests and grouped probing tests.
First, we perform continuous probing using 4000 packet pairs, focusing on the measurement accuracy and the convergence time. We inject cross traffic at a rate of $b_c = 3$ Mb/s. The bottleneck bandwidth is set to 10 Mb/s. Cross traffic packets are 1500 bytes long. We send a probing packet pair every 50 ms, using 1000-byte probing packets. Like other PGM tools such as Spruce [13], we average individual samples using a sliding window, here over 360 packets. That is also why we get no output for the first 18 seconds (360 × 50 ms). From equation (15), we can estimate that the bandwidth output will be within a boundary of ±0.5217 Mb/s at a confidence level of 90%.
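The sliding-window averaging and the resulting start-up delay can be sketched as below. This is a generic moving average, not the code used in the simulations; the function name and the constant input samples are placeholders.

```python
from collections import deque


def sliding_average(samples, window):
    """Yield the mean of the last `window` samples, or None until the window is full."""
    buf = deque(maxlen=window)
    total = 0.0
    for x in samples:
        if len(buf) == window:
            total -= buf[0]      # value about to be evicted by append()
        buf.append(x)
        total += x
        yield total / window if len(buf) == window else None


window = 360                 # samples per window
probe_interval = 0.05        # seconds between probe pairs
samples = [3.0] * 400        # placeholder per-pair estimates in Mb/s

outputs = list(sliding_average(samples, window))
first_index = next(i for i, v in enumerate(outputs) if v is not None)
start_delay = (first_index + 1) * probe_interval
print(f"first averaged output after {start_delay:.0f} s")
```

Keeping a running total avoids re-summing the whole window for every new sample; for very long runs, periodically recomputing the sum would limit floating-point drift.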
In the grouped probing tests, we analyze how factors such as probing packet size affect the measurement accuracy of PGM. We also study the accuracy of sample size prediction on a network path, and look into a related issue of the sample size prediction. In total, 32 combined tests in 4 groups were conducted. In each group, we change different characteristics of the probing and cross traffic to evaluate their impact on the CV. The sample size is calculated from equation (15). We set the significance level to 0.10. From the normal distribution table we get $Z = 1.65$, and $e = 0.5217$ Mb/s. Since $Z^2/e^2 \approx 10$, the sample size is just 10 times the value of $D(b'_c)$, so the sample size calculation is simplified.

Test Results
Figure 6 illustrates typical segments of the continuous probing results, plotting the cross traffic measured by PGM probing over a period of 200 seconds. From the figure we can see that the probe result $b'_c$ is a close match for the actual competing traffic of 3 Mb/s. The statistical robustness of PGM is verified. From the output data we can see that only a few points fall outside the boundary of ±0.5217 Mb/s. That matches the preset confidence level of 90%.
The grouped probing test results are shown in Table 2 and Figure 7.
In all 32 tests, 4 results (dots in the probe result curve) are found to be outside the boundary (solid line in the probe result curve) of 0.5217 Mb/s. That matches the preset confidence level of 90%, which predicts about 3 results outside the boundary (10% of 32).
The variance of a PGM probing is the variance about the mean of the results in that group. The theoretical variance is calculated (solid line in the probe variance curve), and the probe variance (dots in the probe variance curve) is a close match for the theoretical value. The test results also tell us that the distribution of $b_c$ is quite close to a normal distribution.
From these two experiments, the mathematical abstraction of Sections 2 and 3 is validated.

Conclusions
The PGM probing method features a strong interplay between the network and the prober. The contribution of this paper is to provide engineers with both descriptive and analytical methods for dealing with the variability in PGM probing. We develop a more precise single-hop gap model that captures the relationship between the competing traffic throughput and the change of the packet pair gap for a single-hop network. We can use the model to understand, describe, and quantify important aspects of PGM and to predict the response from the inputs. We explore how network parameters affect the measurement accuracy. In the future we will work on how to use these analysis methodologies in monitoring efforts on both research and commodity infrastructure.

Modeling of PGM Probing

Figure 3. PGM statistical model and probability distribution curve.
$(n+1)T > g_i > g_{on} + nT$ (Figure 3(c)):
a) When $\Delta t \in [0,\ g_i - nT - g_{on}]$, $n+1$ cross traffic packets are inserted between the two probing packets. $g_o = g_i + (n+1) g_c$.
b) When $\Delta t \in [g_i - nT - g_{on},\ T - g_{on}]$, $n$ cross traffic packets are inserted between the two probing packets. $g_o = g_i + n g_c$.
c) When $\Delta t \in [T - g_{on},\ T]$, $n+1$ cross traffic packets are inserted between the two probing packets. $g_o = g_i + (n+1) g_c$.
In the remaining cases, $n$ cross traffic packets are inserted between the two probing packets. $g_o = g_i + n g_c$.

Specifying only the standard deviation is of little use without also specifying the mean value. It makes a big difference whether $D(b_c) = 5$ goes with a mean of $E(b_c) = 100$ Mb/s or $E(b_c) = 3$ Mb/s. Relating the standard deviation to the mean resolves this problem. The coefficient of variation represents the ratio of the standard deviation to the mean, and it is a useful statistic for comparing the degree of variation from one data series to another, even if the means are drastically different from each other. The coefficient CV is defined by:

$$CV(b_c) = \frac{\sqrt{D(b_c)}}{E(b_c)}$$

2) Figure 4(b): $CV(b_c)$ and cross traffic. The cross traffic $b_c$ is in fact the goal of our probing; the analysis here only shows the relationship between $b_c$ and $CV(b_c)$. Note that if $b_c$ is very small, $CV(b_c)$ becomes very large. That means that accurate probing of small cross traffic is very difficult.
3) Figure 4(c): $CV(b_c)$ and probing packet length. Figure 4(c) shows that $CV(b_c)$ follows a downward trend as the probing packet length increases. If the probing packet length $I_c$ is very small, $CV(b_c)$ becomes very large. That means that small packets are not suitable for probing.
4) Figure 4(d): $CV(b_c)$ and bottleneck capacity. Figure 4(d) shows that $CV(b_c)$ exhibits an upward trend as the bottleneck capacity increases.
Figure 4. CV curve
... treated as a random variable with normal distribution. We can use the equation below to calculate the sample size for normally distributed variables: