Modeling and Performance Analysis of Weighted Priority Queueing for Packet-Switched Networks

Weighted priority queueing is a modification of priority queueing that eliminates the possibility of blocking lower priority traffic. The weights assigned to priority classes determine the fractions of the bandwidth that are guaranteed for individual traffic classes, similarly to weighted fair queueing. The paper describes a timed Petri net model of weighted priority queueing and uses discrete-event simulation of this model to obtain performance characteristics of simple queueing systems. The model is also used to analyze the effects of finite queue capacity on the performance of queueing systems.


Introduction
Although the Internet was originally intended for non-time-critical transport [1], there is a growing interest in adding real-time traffic to the traditional non-time-critical bulk traffic. Real-time traffic is characterized by bounds on some performance metrics (such as delay, jitter or packet loss probability). Voice over IP (VoIP) and Internet Protocol TV (IPTV) are examples of real-time traffic. Because of these performance bounds, real-time traffic requires preferential service during transport.
The strategy for mixing real-time and bulk traffic is to use, at the nodes of the network, separate queues for different classes of traffic, so that the real-time traffic can get the service it requires. Priority queueing [2] is the simplest mechanism that provides preferential service to some classes of traffic; in priority queueing, lower priority traffic can be serviced only when all queues of higher priority classes are empty. Such a policy works well when the traffic is not very intensive, but it can block lower priority traffic for extended periods of time if the traffic in higher priority classes becomes intensive. Therefore, a number of modifications of (strict) priority queueing have been proposed to avoid such blocking and to guarantee some level of service for lower priority classes independently of the traffic in higher priority classes [3], [4]. Weighted priority queueing is one such modification; it assigns fractions of the bandwidth to traffic classes according to class weights.
Modern communication networks [5] are complex structures which, for modeling, require a flexible formalism that can easily handle concurrent activities as well as the synchronization of different events and processes that occur in such networks [6]. Petri nets [7], [8] are such formal models. As formal models, Petri nets are bipartite directed graphs in which the two types of vertices represent, in a very general sense, conditions and events. An event can occur only when all conditions associated with it (represented by arcs directed to the event) are satisfied. An occurrence of an event usually satisfies some other conditions, indicated by arcs directed from the event. So, an occurrence of one event causes some other event to occur, and so on.
In inhibitor Petri nets, in addition to directed arcs, inhibitor arcs provide a "test if zero" condition, which does not exist in "standard" Petri nets. Inhibitor arcs are needed for modeling priority mechanisms.
In order to study performance aspects of systems modeled by Petri nets, the durations of modeled activities must also be taken into account. This can be done in different ways, resulting in different types of temporal nets. In timed Petri nets [9], occurrence times are associated with events, and the events occur in real time (as opposed to instantaneous occurrences in other models). For timed nets with constant or exponentially distributed occurrence times, the state graph of a net is a Markov chain (or an embedded Markov chain), in which the stationary probabilities of states can be determined by standard methods [10]. These stationary probabilities are used for the derivation of many performance characteristics of the model. Timed Petri nets are used in this paper to develop models of weighted priority queueing, and performance characteristics of simple queueing systems are then obtained by discrete-event simulation of the developed models.

Petri Nets and Timed Petri Nets
Petri nets [8] are formal models of systems that exhibit concurrent activities. The set of all marking functions that can be created starting from the initial marking $m_0$ is called the reachability set of a net. This set can be finite or infinite.
A place is shared if it is connected to more than one transition. A shared place $p$ is free-choice if the sets of places connected by directed arcs and inhibitor arcs to all transitions sharing $p$ are identical. All transitions sharing a free-choice place constitute a free-choice class of transitions. For each marking function, either all transitions in each free-choice class are enabled or none of these transitions is enabled. It is assumed that a choice of an occurring transition in each free-choice class is random and can be described by probabilities associated with transitions. A shared place which is not free-choice is a conflict place, and transitions sharing it are conflicting transitions.
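The enabling rule with inhibitor arcs and the notion of a reachability set can be illustrated with a minimal Python sketch. The toy net below is hypothetical (its place and transition names are not taken from the paper's figures): it models strict priority, where serving the low priority place is inhibited while the high priority place is marked. Timing and choice probabilities are ignored here.

```python
from collections import deque

# Hypothetical toy net; names are illustrative only.
PLACES = ("high", "low", "done")

# transition name -> (input places, inhibitor places, output places)
TRANSITIONS = {
    "serve_high": (("high",), (), ("done",)),
    "serve_low": (("low",), ("high",), ("done",)),  # inhibited by "high"
}

def enabled(marking, name):
    """Inputs must be marked; inhibitor places must be empty (test if zero)."""
    inputs, inhibitors, _ = TRANSITIONS[name]
    return (all(marking[p] > 0 for p in inputs)
            and all(marking[p] == 0 for p in inhibitors))

def fire(marking, name):
    """Return the marking obtained by one occurrence of the transition."""
    inputs, _, outputs = TRANSITIONS[name]
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] += 1
    return new

def reachability_set(initial):
    """Breadth-first enumeration of all markings reachable from `initial`."""
    key = lambda m: tuple(m[p] for p in PLACES)
    seen = {key(initial)}
    frontier = deque([initial])
    while frontier:
        marking = frontier.popleft()
        for name in TRANSITIONS:
            if enabled(marking, name):
                nxt = fire(marking, name)
                if key(nxt) not in seen:
                    seen.add(key(nxt))
                    frontier.append(nxt)
    return seen

m0 = {"high": 2, "low": 1, "done": 0}
print(sorted(reachability_set(m0)))
```

For this initial marking the reachability set is finite; the enumeration would not terminate for a net with an infinite reachability set, so a bound on the number of explored markings would be needed in general.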
Temporal behavior can be introduced in Petri nets in several ways, resulting in different classes of Petri nets "with time" [11]. In timed nets [9], occurrence times are associated with transitions, and transition occurrences are real-time events (as opposed to instantaneous occurrences in other models [12]); so, tokens are removed from input places at the beginning of the occurrence period, and they are deposited to the output places at the end of this period. All occurrences of enabled transitions are initiated in the same instants of time in which the transitions become enabled (although some enabled transitions may not initiate their occurrences). If, during the occurrence period of a transition, the transition becomes enabled again, a new, independent occurrence can be initiated, which will overlap with the other occurrence(s). There is no limit on the number of simultaneous occurrences of the same transition (sometimes this is called infinite occurrence semantics). Similarly, if a transition is enabled "several times" (i.e., it remains enabled after initiating an occurrence), it may start several independent occurrences in the same time instant.
Formally, a timed Petri net is a triple, $\mathcal{T} = (\mathcal{M}, c, f)$, where $\mathcal{M}$ is a marked net, $c$ is a choice function which assigns probabilities to transitions in free-choice classes and relative frequencies of occurrences to conflicting transitions, and $f$ is a timing function which assigns an (average) occurrence time to each transition of the net. The occurrence times of transitions can be either deterministic or stochastic (i.e., described by some probability distribution function); in the first case, the corresponding timed nets are referred to as D-timed nets [13]; in the second, for the (negative) exponential distribution of firing times, the nets are called M-timed nets (Markovian nets) [14]. In both cases, the concepts of state and state transitions have been formally defined and used in the derivation of different performance characteristics of the model. In simulation applications, other distributions can also be used; for example, the uniform distribution (U-timed nets) is sometimes a convenient option. In timed Petri nets, different distributions can be associated with different transitions in the same model, providing flexibility that is used in the simulation examples that follow.
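As a hedged illustration of how a simulator might draw occurrence times for D-, M- and U-timed transitions, consider the following Python sketch. The function name `occurrence_time` and the choice of a uniform distribution on [0, 2·mean] for U-timed transitions are assumptions made for this example; the paper does not specify them.

```python
import random

def occurrence_time(kind, mean, rng=random):
    """Sample one occurrence time with the given average value."""
    if kind == "D":            # deterministic: always the same value
        return mean
    if kind == "M":            # (negative) exponential with the given mean
        return rng.expovariate(1.0 / mean)
    if kind == "U":            # assumption: uniform on [0, 2 * mean]
        return rng.uniform(0.0, 2.0 * mean)
    raise ValueError(f"unknown timing kind: {kind!r}")

rng = random.Random(1)  # fixed seed so that runs are repeatable
samples = [occurrence_time("M", 2.0, rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # sample average, close to the mean 2.0
```

Because the kind of distribution is passed per call, different transitions of the same model can use different distributions, as described above.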
In timed nets, it is convenient to allow some events to occur "immediately", i.e., in zero time; all transitions with zero occurrence times are called immediate (while the others are called timed). Since the immediate transitions have no tangible effects on the (timed) behavior of the model, it is convenient to "split" the set of transitions into two parts, the set of immediate and the set of timed transitions, and to first perform all occurrences of the (enabled) immediate transitions, and then (still in the same time instant), when no more immediate transitions are enabled, to start the occurrences of (enabled) timed transitions. It should be noted that such a convention effectively introduces the priority of immediate transitions over the timed ones, so conflicts of immediate and timed transitions are not allowed in timed nets. A detailed characterization of the behavior of timed nets with immediate and timed transitions is given in [9].

Weighted Priority Queueing
In priority queueing [2], lower priority traffic is serviced only when all queues of higher priority classes are empty. Weighted priority scheduling limits the number of consecutive packets of the same class that can be transmitted over the channel; when the scheduler reaches this limit, it switches to the next nonempty priority queue and follows the same rule. These limits are called weights and are denoted $w_i$, $i = 1, \ldots, k$. With $k$ classes of traffic, if there are sufficient numbers of packets in all classes, the scheduler selects $w_1$ packets of class 1, then $w_2$ packets of class 2, …, then $w_k$ packets of class $k$, and again $w_1$ packets of class 1, and so on. Consequently, in such a situation (i.e., for a sufficient supply of packets in all classes), the channel is shared by the packets of all priority classes, and the proportions are:
$$\frac{w_i/\mu_i}{\sum_{j=1}^{k} w_j/\mu_j}, \quad i = 1, \ldots, k,$$
where $\mu_i$ is the transmission rate for packets of class $i$. If the transmission rates are the same for packets of all classes (as is assumed for simplicity in the illustrating examples), the proportions are:
$$\frac{w_i}{\sum_{j=1}^{k} w_j}, \quad i = 1, \ldots, k.$$
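The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the Petri net model itself: the function name `service_order` and the list-based queues are assumptions, no new packets arrive during service, and all packets are assumed to have the same transmission time.

```python
from collections import deque

def service_order(queues, weights):
    """Serve up to weights[i] consecutive packets of class i, then move on.

    queues[0] holds the highest priority class; no arrivals during service.
    """
    queues = [deque(q) for q in queues]
    order = []
    while any(queues):                  # some queue is still nonempty
        for queue, weight in zip(queues, weights):
            for _ in range(weight):     # at most `weight` packets in a row
                if not queue:
                    break               # queue empty: switch to next class
                order.append(queue.popleft())
    return order

weights = (4, 2, 1)
# Packets are labeled by their class number for readability.
order = service_order([[1] * 8, [2] * 4, [3] * 2], weights)
print(order)  # [1, 1, 1, 1, 2, 2, 3, 1, 1, 1, 1, 2, 2, 3]

# Channel shares for equal transmission rates: w_i / sum of all weights.
shares = [w / sum(weights) for w in weights]
print([round(s, 3) for s in shares])  # [0.571, 0.286, 0.143]
```

With weights 4-2-1 and a sufficient supply of packets, each cycle of seven transmissions carries four, two and one packets of classes 1, 2 and 3, which gives the shares 4/7, 2/7 and 1/7.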
A Petri net model of weighted priority scheduling for three classes of packets with weights 4, 2 and 1 is shown in Figure 1. The model is composed of three identical interconnected sections corresponding to the three priority classes.
The main elements of the model are the three queues, represented by places $p_1$, $p_2$ and $p_3$ for traffic classes 1, 2 and 3, respectively, and the timed transitions. The scheduling is based on repeated selection of queues in order of priorities (first class 1, then 2, and so on) for the transmission of queued packets. This selection operation is represented by a loop with places $r_0$, $r_1$, $r_2$ and $r_3$, and $q_1$, $q_2$ and $q_3$. There is a single "control token" in this loop (shown in place $r_0$ in Figure 1). This token indicates the queue that is used for transmission of packets (by the subscript 1, 2 or 3); a token in place $r_0$ indicates that no queue is selected.
Let $r_0$ be marked. If all three queues are empty, the next packet arriving at one of the queues enables one of the transitions $s_1$, $s_2$ or $s_3$, the control token is moved from $r_0$ to the place $r_i$ corresponding to the nonempty queue, and an occurrence of transition $a_i$ selects a token from $p_i$ for transmission. At the same time, one token from place $w_i$ is moved to place $u_i$. When the channel becomes available for transmission (which is indicated by an occurrence of $t_{i0}$), the control token is returned to $r_i$. Now there are three possibilities:
• if the queue (place $p_i$) is nonempty and the weight (place $w_i$) is nonempty, another token is selected from $p_i$ and forwarded for transmission;
• if the queue is empty, an occurrence of transition $d_i$ moves the control token from $r_i$ to $q_i$;
• if the weight is empty, an occurrence of transition $c_i$ also moves the control token from $r_i$ to $q_i$.
A token in $q_i$ moves (by repeated occurrences of $b_i$) all tokens from place $u_i$ back to $w_i$, and when $u_i$ becomes empty, an occurrence of transition $e_i$ moves the control token to the next class, represented by $r_{i+1}$.
The (finite) capacity of the queue is represented by the initial marking of place $p_{14}$ (shown in Figure 2 as $K$). When a packet is generated (by $t_{01}$) and the queue is not full, i.e., place $p_{14}$ is marked, an occurrence of $t_{14}$ enqueues the packet in $p_1$. If, however, the queue is full, place $p_{14}$ is unmarked, the inhibitor arc $(p_{14}, t_{15})$ enables $t_{15}$ and the packet is dropped.
Finally, when a packet is selected for transmission and is removed from the queue, each occurrence of transition $t_{10}$ returns a token to $p_{14}$, indicating that the queue can store another packet.
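The token accounting of the finite queue can be mimicked by a small Python class. This is only a sketch under the assumption that one counter per place is sufficient; the class and attribute names are hypothetical, and the comments indicate which place or transition of Figure 2 each step is meant to correspond to.

```python
class FiniteQueue:
    """Counter-based sketch of the finite queue for class 1."""

    def __init__(self, capacity):
        self.free_slots = capacity  # tokens in p14 (initial marking K)
        self.queued = 0             # tokens in p1
        self.dropped = 0            # packets removed by t15

    def arrive(self):
        """A packet has been generated (t01); enqueue it or drop it."""
        if self.free_slots > 0:     # p14 marked: t14 can occur
            self.free_slots -= 1
            self.queued += 1
            return True
        self.dropped += 1           # p14 empty: inhibitor arc enables t15
        return False

    def select_for_transmission(self):
        """Remove one packet from the queue, if there is one."""
        if self.queued == 0:
            return False
        self.queued -= 1
        self.free_slots += 1        # t10 returns a token to p14
        return True

queue = FiniteQueue(capacity=2)
print([queue.arrive() for _ in range(3)])  # [True, True, False]
print(queue.dropped)                       # 1
```

The sum `queued + free_slots` always equals the capacity, which mirrors the conservation of tokens between $p_1$ and $p_{14}$ in the description above.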

Performance Characteristics
The model shown in Figure 1 (three classes of traffic, weights 4-2-1) is used for performance analysis of weighted priority queueing. The utilizations of the shared communication channel, as functions of the traffic intensity of class 1 (the highest priority), $\rho_1$, with constant traffic intensities for classes 2 and 3, are shown in Figure 3.
When the capacity of a queue is finite, packets which arrive when the queue is full are dropped because they cannot be queued. The percentage of dropped packets is an important metric of the system. Figure 7 shows that the fraction of dropped packets increases for $\rho_1 > 0.25$ and, for classes 2 and 3, reaches the level of 45% for $\rho_1$ close to 0.6. This should not be surprising because, in the same range of values of $\rho_1$, the utilization of the shared channel decreases from 0.5 to 0.286 for class 2 and from 0.25 to 0.143 for class 3 (as shown in Figure 3). This decrease results in dropping about 45% of packets (practically the same for classes 2 and 3).
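The relation between the decrease of utilization and the fraction of dropped packets can be checked with a short calculation. The sketch below assumes that the offered traffic intensities of classes 2 and 3 equal the utilizations observed before saturation (0.5 and 0.25) and that the saturated utilizations equal the weight shares 2/7 and 1/7; the function name is hypothetical.

```python
def dropped_fraction(offered, carried):
    """Fraction of offered traffic that is not carried by the channel."""
    return 1.0 - carried / offered

weights = (4, 2, 1)
total = sum(weights)

# Assumed offered intensities of classes 2 and 3 (utilizations before saturation).
offered = {2: 0.5, 3: 0.25}

for cls, rho in offered.items():
    carried = weights[cls - 1] / total  # saturated share: 2/7 or 1/7
    print(cls, round(carried, 3), round(dropped_fraction(rho, carried), 3))
```

Under these assumptions both classes lose 3/7 (about 43%) of their packets, which is consistent with the level of about 45% read from the simulation results and with the observation that the fraction is practically the same for classes 2 and 3.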
Average waiting times are shown in Figure 8, and the average queue lengths for all three classes of traffic in Figure 9. Results shown in Figure 7, Figure 8 and Figure 9 are related to each other.
For weights 4-2-1 and for high-intensity traffic, each scheduling cycle includes 4 packets from class 1, 2 packets from class 2 and just 1 packet from class 3. Each packet served from class 3 is thus accompanied by 6 other packets, so if the average length of queue 3 is $n$, the average waiting time for class 3 is expected to be $7n$. For $n = 4.2$ (Figure 9), this results in an average waiting time for class 3 that is close to 30 (as shown in Figure 8). For class 2, two packets are served in each scheduling cycle, so its average waiting time is one half of that for class 3 (the average queue lengths are practically the same for classes 2 and 3, as shown in Figure 9).
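The estimate above can be written as a small helper. The sketch assumes a transmission time of one time unit per packet (which is implied by the estimate $7n$) and saturated queues in all classes; the function name `expected_wait` is hypothetical.

```python
def expected_wait(avg_queue_length, weights, cls, service_time=1.0):
    """Waiting time estimate for class `cls` (1-based) under saturation.

    Each scheduling cycle serves sum(weights) packets, of which
    weights[cls - 1] belong to the given class.
    """
    cycle_time = sum(weights) * service_time
    return avg_queue_length * cycle_time / weights[cls - 1]

weights = (4, 2, 1)
n = 4.2  # average queue length read from Figure 9

print(round(expected_wait(n, weights, cls=3), 1))  # 29.4
print(round(expected_wait(n, weights, cls=2), 1))  # 14.7
```

The value for class 2 is one half of that for class 3 because two class 2 packets are served in each cycle of seven transmissions.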
It should be observed that, from a performance point of view, it is not beneficial to have long queues for packets waiting for service. For high-intensity traffic these queues will be practically full, and then the average waiting time will simply increase proportionally with the queue length. Figure 10 and Figure 11 show the average queue length and the average waiting time for the case when all queue lengths are equal to 10.
The average waiting times in Figure 11 are about two times greater than those in Figure 8.
Finally, Figure 12 and Figure 13 show the fraction of dropped packets and the average waiting times for the case when the traffic intensities do not exceed the levels determined by the weights, i.e., $\rho_2 = 0.25$ and $\rho_3 = 0.1$, as in Figure 6.
For class 1, the increase of the fraction of dropped packets is caused by queue 1 becoming full; all arriving packets which cannot be queued are dropped.
For classes 2 and 3, the fraction of dropped packets is very small and the average waiting times are also rather small.

Concluding Remarks
Efficient use of modern networks requires detailed knowledge of network characteristics, traffic statistics, transmission media types, and so on. Some of this information can be obtained by measurements performed under real traffic, but other information can only be provided by detailed models, verified by comparisons with measurement data. On the basis of these characteristics, specific methods can be developed to determine the optimal numbers of links, the transmission capacity of links, the management strategy for resources shared among traffic classes, and others.
The goal of this paper is to provide insight into the behavior of weighted priority queueing, a modification of (strict) priority queueing that eliminates the blocking of lower priority traffic that is typical for priority-based traffic management schemes. The paper shows that when the weights match the characteristics of lower priority traffic, the performance provided by the analyzed scheme is actually quite good. However, since the traffic characteristics often change in real communication networks, a dynamic weight selection method may be needed for adjusting the performance to the changing character of the traffic. Some ideas for such dynamic weighted queueing can be found in [15] and [16].
Weighted priority queueing exhibits several similarities to weighted fair queueing [3], [17] but seems to be simpler to implement. An in-depth comparison of these queueing methods is needed to better understand their relative strengths and weaknesses.

Section 2 recalls basic concepts of Petri nets and timed Petri nets. Section 3 describes the net model of weighted priority queueing, while Section 4 uses the developed model to analyze the performance of simple weighted priority queueing systems. Section 5 concludes the paper.
If the queue for this class is empty, occurrences of transitions $d_{i+1}$ and $e_{i+1}$ move the control token to a subsequent class until $r_0$ is reached, and then the highest priority nonempty class is selected by an occurrence of one of the transitions $s_1$, $s_2$ or $s_3$.
(three classes of traffic, weights 4-2-1) is used for performance analysis of weighted priority queueing. The utilizations of the shared communication channel as functions of traffic intensity of class 1 (the highest priority), $\rho_1$, with constant traffic intensities for classes 2 and 3,

Figure 2. Petri net model for class 1 of weighted priority queueing with a finite queue and weight 4.

Figure 7 shows the fraction of packets which are dropped in weighted priority queueing with weights 4-2-1 and with queue length equal to 5, as functions of traffic intensity $\rho_1$.