Coupled IEEE 802.11ac and TCP Goodput improvement using Aggregation and Reverse Direction

This paper suggests a new model for the transmission of TCP traffic over IEEE 802.11 using the new features of IEEE 802.11ac. The paper examines a first step in this direction, and as such we first consider a single TCP connection, which is typical in a home environment. We show that when the IEEE 802.11ac MAC is aware of QoS TCP traffic, using Reverse Direction improves the TCP Goodput by tens of percent compared to the traditional contention-based channel access. In an error-free channel this improvement is 20%, while in an error-prone channel the improvement reaches 60% when blind retransmission of frames is also used. In our operation modes we also assume the use of the Two-Level aggregation scheme, the ARQ protocol of the IEEE 802.11ac MAC layer, and the data rates and the four Access Categories defined in this standard.


Background
The latest IEEE 802.11-REVmc Standard (WiFi), created and maintained by the IEEE LAN/MAN Standards Committee (IEEE 802.11) [1], which incorporates and updates the IEEE 802.11ac amendment, is currently the most effective solution within the range of Wireless Local Area Networks (WLAN). Since its first release in 1997, the standard has provided the basis for wireless network products using the WiFi brand, and has since been improved upon in many ways. One of the main goals of these improvements is to optimize the Throughput of the MAC layer, and to improve its Quality-of-Service (QoS) capabilities.
To fulfill the promise of increasing IEEE 802.11 performance and QoS capabilities, and to effectively support more client devices on a network, the IEEE 802.11 working group introduced the fifth generation of IEEE 802.11 networking standards: the IEEE 802.11ac amendment, also known as Very High Throughput (VHT) [1,2]. IEEE 802.11ac is intended to support fast, high quality data streaming and nearly instantaneous data syncing and backup to notebooks, tablets and mobile phones. The IEEE 802.11ac final version, 11ac-2013, released in 2013 [2], leverages new technologies to provide improvements over the previous generation, i.e. IEEE 802.11-2012 [3]. Both versions are now included in IEEE 802.11-REVmc [1] which will be published as IEEE 802.11-2016. The IEEE 802.11ac amendment [2] improves the achieved Throughput coverage and QoS capabilities, compared to previous generations, by introducing improvements and new features in the PHY and MAC layers. In the PHY layer, IEEE 802.11ac (VHT) continues the long-existing trend towards higher Modulation and Coding rates (256 QAM 5/6 modulation), working in wider bandwidth channels (up to 160 MHz) and using 8 spatial streams that enable higher spectral efficiency.
In the MAC layer IEEE 802.11ac includes many of the improvements first introduced with IEEE 802.11e and IEEE 802.11n [3], also known as High Throughput (HT). Two key performance features among them are the ability to aggregate packets in order to reduce transmission overheads in the PHY and MAC layers, and the use of Reverse Direction (RD), which enables stations to exchange frames without the need to contend for the channel.
We now describe these features.
Frame aggregation is a feature of IEEE 802.11n and IEEE 802.11ac that increases Throughput by sending two or more consecutive data frames in a single transmission, followed by a single acknowledgment frame, denoted Block Ack (BAck). Aggregation schemes benefit from amortizing the control overhead over multiple packets. The achievable benefit from data aggregation is worth examining, especially in light of several factors that can impact its performance, e.g., link rates, collisions, error-recovery schemes, inter-frame spacing options, QoS guarantees etc. IEEE 802.11n introduces, as a pivotal part of its MAC enhancements, three kinds of frame aggregation mechanisms: the Aggregate MAC Service Data Unit (A-MSDU) aggregation, the Aggregate MAC Protocol Data Unit (A-MPDU) aggregation and the Two-Level aggregation that combines both A-MSDU and A-MPDU. The last two schemes group several MAC Protocol Data Units (MPDUs) into one large frame. IEEE 802.11ac also uses these three aggregation schemes, but enables larger frame sizes.
The basic idea behind the Reverse Direction (RD) feature is a time interval denoted Transmission Opportunity (TXOP). A station gains a TXOP by gaining access to the wireless channel and in a TXOP the station can transmit several PHY Protocol Data Units (PPDU) without interruption. This station is denoted the TXOP holder. The TXOP holder can also allocate some of the TXOP time interval to one or more receivers in order to allow data transmission in the reverse link. This is termed Reverse Direction (RD). For scenarios with bidirectional traffic, such as Transmission Control Protocol (TCP) Data segments/TCP Acks, RD is very attractive because it reduces contention in the wireless channel (no collision).
The IEEE 802.11ac standard also defines an Automatic Repeat-Request (ARQ) protocol that enables a transmitter to retransmit lost MPDUs and guarantee in-order reception of MPDUs at the receiver. This protocol is also used to improve quality of the wireless channel.
Another feature in IEEE 802.11ac related to QoS capabilities is the use of Access Categories (AC). There are 4 ACs: Best Effort (BE), Background (BK), Video (VI) and Voice (VO). The difference between the 4 ACs is in the parameters that control access to the channel, namely the Arbitration Inter-Frame Space (AIFS) length and the values of $CW_{min}$ and $CW_{max}$. These vary among the ACs and are intended to provide priority to traffic streams with QoS requirements such as Video and Voice.

Research question
In this paper we investigate a model for transmitting TCP traffic in an infrastructure IEEE 802.11 network that optimizes the combined performance of the IEEE 802.11 MAC layer and the L4 TCP protocol, using the new features that were developed in the latest generation of IEEE 802.11, i.e. IEEE 802.11ac. These features make it possible to use IEEE 802.11 in a completely different way than before, as we now specify.
Collisions are possible between stations that are involved in different TCP connections and between the AP and stations with which it has TCP connections due to the exchange of TCP Data/Ack segments. Both the AP and the station(s) try to get access to the channel simultaneously and this results in collisions.
As far as we know, no models have been developed for transmitting TCP traffic over IEEE 802.11 using the new features of the standard. In the model we suggest in this paper the AP controls the TCP transmissions in the cell by configuring the stations to use large BackOff intervals, such that effectively they never gain access to the channel. The AP enables the stations to transmit only through time periods granted by the AP as the TXOP holder, i.e. through the Reverse Direction (RD) capability. The AP communicates with one station during a TXOP and in this paper we evaluate the performance of such communication between the AP and the station. Establishing policies for the communication between the AP and several stations with which it maintains TCP connections is an issue for further research.
Therefore, as a first step we assume that the AP communicates with a single station and they have a single TCP connection between them. The AP is the TCP transmitter, transmitting TCP Data segments, and the station is the TCP receiver, transmitting TCP Acks. Our performance criterion is the Goodput, defined as the number of MSDU bits (TCP Data segments) that are successfully transmitted and acknowledged by TCP Acks, in the wireless channel, on average, in a second. Such a scenario is possible, for instance, in a home environment where a Network-Attached Storage (NAS) device [18] is attached to the AP, and a PC downloads data files from the NAS device. The connection uses the aggregation and RD capabilities of the IEEE 802.11ac MAC layer.
We use the four features of the MAC layer of the IEEE 802.11ac mentioned above, namely aggregation, Reverse Direction (RD), the ARQ protocol and the four Access Categories.
Concerning aggregation, we assume the Two-Level aggregation scheme. This scheme enables transmission of several TCP Data segments and several TCP Acks in a single transmission over the wireless medium. Up to 64 MPDUs can be transmitted in a single transmission and every MPDU can contain several MSDUs. We measure the influence of aggregation on the Goodput.
Notice that in a TCP connection over IEEE 802.11ac both sides of the connection compete for the wireless channel -one for transmitting TCP Data segments and the other for transmitting TCP Acks. This competition can result in collisions and reduced Goodput. We examine two operation modes for the transmission of TCP traffic over the wireless medium.
In one operation mode, using RD, the TCP transmitter allocates a TXOP when it acquires the wireless medium, and enables the TCP receiver to transmit TCP Acks during the TXOP without collisions. In the second operation mode, for comparison purposes, which we denote by No-RD, the traditional CSMA/CA random access MAC is used. The TCP transmitter and the TCP receiver contend for the wireless medium in every transmission attempt. The operation mode using RD is more complicated than the contention based one, and we want to check if, and to what degree, using RD improves the Goodput of the No-RD operation mode.
In addition to all the above we assume the ARQ protocol of the IEEE 802.11ac standard at the MAC layer. This protocol guarantees an in-order delivery of MPDUs between communicating entities. However, due to its Transmission Window, the ARQ protocol can sometimes limit the number of MPDUs transmitted in each transmission, i.e. this protocol can limit the amount of aggregation.
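The way a Transmission Window can cap aggregation may be illustrated with a minimal sketch. It is not the standard's ARQ procedure: the window size of 64 MPDUs, the function name `mpdus_to_send` and the absence of sequence-number wraparound are all simplifying assumptions made here for illustration.

```python
def mpdus_to_send(unacked, next_new_seq, k_d, window=64):
    """Pick the MPDUs for the next A-MPDU under a sliding transmission window.

    unacked      -- sorted sequence numbers already sent but not yet acknowledged
    next_new_seq -- sequence number of the first MPDU that was never sent
    k_d          -- maximum number of MPDUs allowed in one A-MPDU
    window       -- assumed window size in MPDUs (hypothetical default of 64)
    """
    # The window starts at the oldest unacknowledged MPDU, if there is one.
    win_start = unacked[0] if unacked else next_new_seq
    win_end = win_start + window  # exclusive upper bound
    # Retransmissions come first, then new MPDUs that still fit in the window.
    retransmissions = [s for s in unacked if s < win_end]
    new_mpdus = list(range(next_new_seq, win_end))
    return (retransmissions + new_mpdus)[:k_d]


# No losses: the window is free to slide and a full A-MPDU is possible.
print(len(mpdus_to_send([], 160, 64)))
# One old MPDU (100) is still missing: only 5 MPDUs fit in the window.
print(mpdus_to_send([100], 160, 64))
```

In the second call a single lost MPDU pins the start of the window, so the transmitter cannot fill the A-MPDU even though it has plenty of new data.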
Finally, we check the influence of the values of the access parameters in the four ACs on the Goodput, namely the Arbitration Inter-Frame Space (AIFS), the minimum Contention Window, $CW_{min}$, and the maximum Contention Window, $CW_{max}$.
We assume that the AP and the station are the end points of the TCP connection.
Following e.g. [5,8,9,17] it is quite common to consider short Round Trip Times (RTT) in this kind of high speed network, such that no retransmission timeouts occur. Notice also that due to the MAC ARQ protocol, the L4 TCP protocol always receives TCP Data segments in order. Therefore, the TCP congestion window increases up to the TCP receiver advertised window. We assume that the TCP receiver window is large enough such that the TCP Transmitter Transmission window can always provide as many MSDUs to transmit as the MAC layer enables. We assume the above following the observation that aggregation is useful in scenarios where the offered load on the channel is high. We therefore do not consider the TCP Transmission Window, and our goal is to find the maximum possible Goodput that the wireless channel enables for a single TCP connection, where TCP itself does not impose any limitations on the offered load, i.e. on the rate at which MSDUs are given for transmission to the MAC layer of IEEE 802.11ac.
Following the above we also do not consider a particular flavor of TCP, e.g. TCP NewReno, Westwood, Cubic [19][20][21], to mention only a few. All the TCP flavors differ in the way they handle the TCP congestion window but in this paper, as mentioned, we assume that the TCP Transmission Window is limited only by the TCP receiver advertised window.
Regarding the wireless channel quality we first assume an error-free channel, i.e. the Bit Error Rate (BER) equals 0. Then we assume another three BERs: $10^{-5}$, $10^{-6}$ and $10^{-7}$.
The scenario of a single TCP connection with various BER values is possible, for instance, in the mentioned home environment where a Network-Attached Storage (NAS) device [18] is attached to the AP, and a PC, which is a client in the IEEE 802.11 system, is located close to the NAS and downloads data files from the NAS device. The various BERs are a function of the channel conditions between the client (e.g. PC) and the AP. If they are stable and have a low path loss channel between them, the BER is very low. However, if the PC is located in the basement for instance, the BER can be larger.
An additional feature that we use was introduced in [22]. In [22] a repetition scheme is introduced, in which several MPDUs in a single transmission are transmitted several times.
This feature improves the achieved Goodput in large BERs, as will later become clear.

Our results
We show that for an error-free channel, i.e. BER=0, using RD improves the Goodput over not using RD by 20%. Moreover, TXOPs of about 20ms are sufficient to achieve that improvement, and this outcome has an impact on the delay at the TCP protocol from the time the TCP transmitter transmits TCP Data segments until it receives the corresponding TCP Acks.
For error-prone channels we show that using RD improves the Goodput by almost 50%, and when the Repetition scheme of [22] is also used the improvement can even reach 60%. TXOPs of about 4ms are sufficient to achieve these Goodput improvements.

Previous works
From the point of view of Transport protocols, the performance of the IEEE 802.11 protocol has been investigated in two models : UDP-like traffic and TCP traffic, i.e. when there is bi-directional traffic that can result in collisions. By UDP-like traffic we mean that the Data receiver does not transmit an Ack at the Transport layer, nor, in terms of IEEE 802.11, does it generate an MSDU for transmission. In TCP traffic, the receiver of TCP Data segments generates an MSDU which contains a TCP Ack, and depends on the channel for its transmission.
Regarding UDP-like traffic, the performance of IEEE 802.11 (taking into account the aggregation schemes) has been investigated in dozens of papers over the years. For example, in [23][24][25][26][27][28][29][30][31][32][33][34][35] the Throughput and Delay performance of the A-MSDU, A-MPDU and Two-Level aggregation schemes are investigated. Several papers assume an error-free channel with no collisions, several papers assume an error-prone channel and others also assume collisions. In [36][37][38][39][40] the performance of 802.11ac is investigated. Papers [37,40] consider the performance of the aggregation schemes in 802.11ac and compare the performance of 802.11ac to that of 802.11n.
Another set of papers [41][42][43][44][45][46] deals with QoS together with the aggregation schemes. In particular, in [46] the use of the ARQ protocol of the IEEE 802.11 standard [1], together with the aggregation schemes, is investigated in relation to QoS guarantee.
Concerning TCP traffic, we can specify a first set of papers that deal with TCP's Throughput, Delay and Fairness performance over legacy IEEE 802.11/a/b/g networks. There are dozens of such papers, such as [4][5][6][7][9][10][11] to mention only a few. None of the papers from this set consider Access Categories or aggregation schemes that were introduced in later versions of the standard, i.e. IEEE 802.11e and IEEE 802.11n respectively.
When IEEE 802.11e was introduced, many papers appeared concerning this standard and the performance of TCP. In IEEE 802.11e the Access Categories are defined, enabling changes to the fixed values of the DIFS (now called AIFS) and $CW_{min}$ of the previous versions of the standard. Also introduced is the TXOP time interval that enables the AP/stations to transmit several frames in a single transmission opportunity. Such frames are acknowledged in the MAC layer, all together, by a newly defined frame, the Block-Ack frame. Papers regarding TCP investigate the use of the above changes in improving TCP performance [12,13]. None of the papers concerning IEEE 802.11e and TCP deal with ACs and aggregation schemes, as does our paper, since aggregation schemes were only introduced in a later version of the standard, namely IEEE 802.11n.
In relation to IEEE 802.11n/ac where aggregation is introduced, we are aware of only three research papers that handle the Throughput performance of TCP in the various aggregation schemes [14,15,25]. In [25] the authors also assume the model of the AP and a single station that maintain a TCP connection. The paper considers the A-MSDU and A-MPDU aggregation schemes only, and does not consider the Two-Level aggregation scheme, the RD and the various ACs. In the analysis the authors assume a TCP Transmission window of one TCP Data segment. On the other hand, in this paper we also handle the Two-Level aggregation scheme, the various ACs, the RD and a TCP Transmission window larger than one Data segment, which complicates the analysis.
In [14] it is argued that aggregation increases the discrepancy among upload TCP connections. The model is an AP with several stations that initiate TCP upload connections. The A-MPDU aggregation is considered and there is no reference to Two-Level aggregation, to RD or to the standard ACs. The authors suggest an algorithm to reduce the discrepancy among TCP connections. Our paper deals with another model: we explore the influence of aggregation on the Goodput of a single TCP connection, e.g. in a home environment, consider Two-Level aggregation and RD, and check the performance of the 4 ACs defined in IEEE 802.11.
In [15] the performance of a single TCP connection is evaluated using all three aggregation schemes and four standard ACs. However, only an error-free channel is considered and there is no reference to RD, i.e. there can be collisions between the two parties of the TCP connection. The current paper is a next step to the research in [15] in the sense that it also considers an error-prone channel and explores the elimination of collisions by RD.
Regarding RD, there are several papers such as [47][48][49][50] that deal with RD's Goodput performance, also in relation to TCP. However, these papers do not consider aggregation, ACs and the IEEE 802.11ac ARQ protocol all together.
Finally, none of the papers mentioned in this literature survey consider the Repetition scheme of [22] and its influence on the Goodput performance.
The rest of the paper is organized as follows: In Section 2 we describe in detail the features of the IEEE 802.11ac that we use in this paper. In Section 3 we describe the model we suggest for TCP transmission over IEEE 802.11 using RD. In Sections 4 and 5 we compute the Goodput performance of the error-free and error-prone channels respectively. Section 6 concludes the paper and in the Appendix we present a Markov chain model for the scenario in which RD is not used and both the AP and the station contend for the channel in every transmission attempt.
Let $O_1 = AIFS + Preamble + SIFS + BAck$ and let $BackOff$ be the BackOff interval that a station uses in a given transmission. The transmission time without collisions of the above A-MPDU is given by Eq. 1 [1]. The additional 22 bits are due to the SERVICE (16 bits) and TAIL (6 bits) fields that are added to every transmission by the PHY layer Convergence Protocol [1].
In Eq. 1 we assume the OFDM PHY layer. $T_{sym}$ is the duration of one Transmission Symbol in OFDM, and it is 4µs. $BitsPerSymbol$ equals 4 in OFDM and $R$ is the PHY rate in Mbps. Any transmission in OFDM must be of an integral number of Symbols.
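The effect of the 22 extra bits and of the rounding to whole symbols can be sketched as follows. This is only an illustration under stated assumptions, not a reproduction of Eq. 1: the function name `ofdm_tx_time_us` is hypothetical, the preamble is excluded, and the number of data bits carried per symbol is taken here as the PHY rate multiplied by the symbol duration.

```python
import math

T_SYM_US = 4.0          # OFDM symbol duration in microseconds (from the text)
PHY_OVERHEAD_BITS = 22  # SERVICE (16 bits) + TAIL (6 bits), from the text


def ofdm_tx_time_us(frame_bits, rate_mbps):
    """Airtime of the data part of a PPDU, rounded up to whole OFDM symbols.

    Assumption: a symbol carries rate_mbps * T_SYM_US data bits
    (Mbps times microseconds gives bits). The PHY preamble is not included.
    """
    data_bits_per_symbol = rate_mbps * T_SYM_US
    symbols = math.ceil((frame_bits + PHY_OVERHEAD_BITS) / data_bits_per_symbol)
    return symbols * T_SYM_US


# A 12000-bit frame at the 1299.9 Mbps rate assumed in the paper.
print(ofdm_tx_time_us(12000, 1299.9))
# A 256-bit control frame at 24 Mbps.
print(ofdm_tx_time_us(256, 24))
```

Because of the ceiling, frames of quite different lengths can occupy the same number of symbols, which is why the rounding matters mostly for short frames at high rates.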

The Error model
We assume that the process of frame loss in a wireless fading channel can be modeled with a good approximation by a low order Markovian chain, such as the two state Gilbert model [51,52].
In this model the state diagram is composed of two states, "Good" and "Bad", meaning successful or unsuccessful reception of every bit arriving at the receiver, respectively. By the above model one can see that as the frame length $B$ increases, so does the failure probability. Thus, in every aggregation scheme, increasing the amount of aggregation increases the frame's size as well as the transmission delay of the frame. The failure probability can sometimes also increase.
We would like to mention that there are other models to represent the quality of the indoor wireless channel, e.g. the one in [53]. This model shows burstiness in the channel quality.
In this paper however, we assume that the communicating stations use Link Adaptation by which they keep the effective SNR stable and in such a scenario the BER is stable.
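The dependence of the failure probability on the frame length can be made concrete with a short sketch. It assumes independent bit errors at a stable BER, which is a common simplification and not necessarily the exact expression used in the paper; the function name `frame_failure_prob` is hypothetical.

```python
def frame_failure_prob(ber, frame_bits):
    """Probability that at least one of frame_bits bits is received in error.

    Assumption: bit errors are independent and occur with probability ber.
    """
    return 1.0 - (1.0 - ber) ** frame_bits


# The same 12000-bit frame under the BER values considered in the paper.
for ber in (0.0, 1e-7, 1e-6, 1e-5):
    print(ber, frame_failure_prob(ber, 12000))
```

Under this assumption a longer frame always has a larger failure probability for a positive BER, which matches the trade-off described above between the amount of aggregation and the loss of whole frames.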

IEEE 802.11ac ARQ protocol
We give only a brief description of the IEEE 802.11ac ARQ protocol. A more detailed description can be found in [46] and in Sections 9.21.7.3-9.21.7.9 of [1].
If the PHY rate $R$ used for data frame transmissions is lower than 24Mbps, then $R$ is also used for the BAck and Ack transmissions. However, in this paper we assume a PHY rate of 1299.9 Mbps, which corresponds to working point MCS9 with 3 spatial streams and an 80MHz channel. With 3 spatial streams the PHY Preamble is 48µs [1].

Successful transmissions
In Figure 2 we show the activity on the channel where a successful transmission occurs, i.e. without collisions. In this case, after a station senses an idle channel for a duration equal to its AIFS and BackOff intervals, it transmits the data frame. After a SIFS and a PHY Preamble the receiver acknowledges reception. In the case of Two-Level aggregation the BAck frame is used.

Collision events
In Figure 3 we show the activity on the channel in the event of collisions. We show two

TCP traffic model over IEEE 802.11
In this section we describe our model for the transmission of TCP traffic over IEEE 802.11.
Figure 4: The Traffic model.

Operation modes for TCP Usage of the channel
We consider 2 operation modes for the transmission of TCP Data/Ack segments over the channel.

Operation mode 1 -No-RD, Competition
Both the TCP transmitter and the TCP receiver contend for the channel in every transmission attempt, i.e. when the TCP receiver has TCP Acks to transmit, it contends for the channel with the TCP transmitter in every transmission. Both stations use the Two-Level aggregation.

Operation mode 2 -Reverse Direction
Reverse Direction is a mechanism in which the owner of a Transmission Opportunity (TXOP) can enable its receiver to transmit back during the TXOP, so that the receiver does not need to contend for the channel. This is particularly efficient for bi-directional traffic such as TCP Data segments and TCP Acks. We examine an operation mode in which the TCP transmitter (AP) transmits A-MPDU frames containing MPDUs of TCP Data segments to the TCP receiver (station), and enables the TCP receiver to answer with an A-MPDU frame containing MPDUs of TCP Acks.
Both stations use the Two-Level aggregation.
We assume the following scenario to use RD, as is illustrated in Fig. 5: After waiting AIFS and BackOff the TCP transmitter (AP) transmits n A-MPDU frames in a row in the TXOP. In Figure 5 we assume n = 2. The TCP receiver (station) responds to every transmission by a BAck frame. In its last A-MPDU frame the TCP transmitter sets the RDG bit [1], enabling the TCP receiver to respond with an A-MPDU frame. The TCP transmitter then responds with a BAck frame and terminates the TXOP with the CF-End frame [1].
We assume that there are no collisions on the channel after the end of a TXOP because the TCP receiver is configured in a way that prevents collisions. For example, the TCP receiver is configured to choose its BackOff interval from a very large contention interval, other than the default ones in Table 1. Thus, the TCP transmitter always wins the channel without collisions. The transmissions on the channel are composed of TXOPs that repeat themselves one after the other. We denote by RD(n) the case where the TCP transmitter transmits n A-MPDU frames in the TXOP.
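The frame exchange of RD(n) described above can be written down as an ordered list of channel events. This is an illustrative sketch only: the function name `rd_txop_sequence` and the event labels are hypothetical, and PHY preambles are not listed separately.

```python
def rd_txop_sequence(n):
    """Ordered channel events of one TXOP cycle in the RD(n) operation mode."""
    if n < 1:
        raise ValueError("n must be at least 1")
    events = ["AIFS", "BackOff"]
    for i in range(1, n + 1):
        # The RDG bit is set only in the last A-MPDU of the TCP transmitter.
        rdg = " (RDG set)" if i == n else ""
        events += [f"AP A-MPDU {i}{rdg}", "SIFS", "STA BAck", "SIFS"]
    # Reverse direction: the station sends its TCP Acks inside the AP's TXOP.
    events += ["STA A-MPDU (TCP Acks)", "SIFS", "AP BAck", "SIFS", "AP CF-End"]
    return events


for event in rd_txop_sequence(2):
    print(event)
```

With n = 2 this lists the same exchange as the scenario illustrated for Figure 5: two data A-MPDUs, each answered by a BAck, followed by the station's A-MPDU of TCP Acks, the AP's BAck and the CF-End frame.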

Error-Free Channel Results
In this section we assume an error-free channel, i.e. BER=0.
Since there are no collisions when using RD, the average BackOff interval is $BO = \frac{CW_{min} - 1}{2} \cdot SlotTime$, where we refer to the $CW_{min}$ of the AP. See Table 1. We now define $C = AIFS + BO + SIFS + CF\text{-}End + Preamble$. The last $Preamble$ in $C$ is the one preceding the transmission of the station.
Let $T(AP)$ and $T(STA)$ be the transmission times of the AP's and the station's A-MPDU frames respectively. $T(AP)$ is given by Eq. 3 and $T(STA)$ by Eq. 4 (the details of how Eqs. 3-5 are derived can be found in [40]), where $K_A$ is the number of MPDUs in the station's A-MPDU frame and $K_A = \lceil \frac{n \cdot K_D \cdot 7}{178} \rceil$.
The cycle length of a TXOP is therefore given by
$$cycle = C + n \cdot (Preamble + T(AP) + SIFS + BAck + SIFS) + T(STA) + SIFS + BAck \quad (5)$$
and the Goodput of the system is given by Eq. 6. Neglecting the rounding of $T(AP)$, $T(STA)$ and $K_A$, the Goodput can be written in the closed form of Eq. 7. One can see that as $n$ increases and/or $K_D$ increases, so does the Goodput.
We see in all the graphs that as the number of transmissions increases and/or as $K_D$ increases, so does the Goodput. We also include a curve showing the maximum possible Goodput using RD. This curve is obtained as follows: For every $K_D$ we first find the maximum number of possible transmissions, $n_{max}$, such that $n_{max} = \lfloor \frac{178 \cdot 64}{7 \cdot K_D} \rfloor$. Recall that $64 \cdot 178$ is the maximum number of TCP Acks that the receiver can transmit in a TXOP. Then, we compute the received Goodput for $n_{max}$ using Eq. 7. For example, for $K_D = 64$ it holds that $n_{max} = 25$ and for $K_D = 1$ it holds that $n_{max} = \lfloor \frac{178 \cdot 64}{7 \cdot 1} \rfloor = 1627$.
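The quantities of this section that survive in the text, namely the average BackOff, the cycle length of Eq. 5 and $n_{max}$, can be checked numerically with a small sketch. The function names are hypothetical, and $T(AP)$, $T(STA)$ and the frame durations are taken as inputs because their expressions are not reproduced here. Reading 7 as the number of TCP Data segments per MPDU and 178 as the number of TCP Acks per MPDU is our interpretation of the constants in the text.

```python
MAX_MPDUS = 64        # MPDUs per A-MPDU (from the text)
ACKS_PER_MPDU = 178   # interpretation: TCP Acks carried by one station MPDU
SEGMENTS_PER_MPDU = 7 # interpretation: TCP Data segments carried by one AP MPDU


def mean_backoff_us(cw_min, slot_us):
    """Average BackOff when the slot count is uniform on 0..cw_min-1."""
    return (cw_min - 1) / 2 * slot_us


def rd_cycle_us(n, t_ap, t_sta, *, aifs, bo, sifs, preamble, back, cf_end):
    """Cycle length of one TXOP in RD(n), following Eq. 5 term by term."""
    c = aifs + bo + sifs + cf_end + preamble
    return c + n * (preamble + t_ap + sifs + back + sifs) + t_sta + sifs + back


def n_max(k_d):
    """Largest n for which all TCP Acks still fit in one station A-MPDU."""
    return (ACKS_PER_MPDU * MAX_MPDUS) // (SEGMENTS_PER_MPDU * k_d)


print(n_max(64), n_max(1))  # the two examples given in the text
```

The two printed values reproduce the examples in the text (25 and 1627), which also confirms that the numerator in both cases is 178 · 64.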
Notice that in the VI and VO ACs and for values of $K_D$ larger than 15, the difference in performance between No-RD and using RD is the largest among all the ACs. This happens because in these ACs $CW_{min}$ and $CW_{max}$ are the smallest among the ACs and so the probability for collisions is the largest. For large $K_D$ collisions waste relatively long intervals of time and so the decrease in the Goodput is significant. As $CW_{min}$ and $CW_{max}$ decrease, the difference between using RD and No-RD increases. Notice that in VO $CW_{min} = 4$ and so the collision probability is 25%. In VI $CW_{min} = 8$ and the collision probability is 12.5%. For BK and BE $CW_{min} = 16$ and the collision probability is only 6.25%.
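The collision probabilities quoted above equal $1/CW_{min}$, which can be verified by direct enumeration. The sketch below assumes a deliberately simple model that is ours, not the Markov chain of the Appendix: both sides draw a BackOff slot uniformly from the same window of $CW_{min}$ values and collide exactly when the draws are equal, ignoring AIFS differences, BackOff freezing and retries.

```python
from fractions import Fraction


def same_slot_probability(cw_min):
    """Probability that two independent uniform draws from 0..cw_min-1 coincide."""
    equal_pairs = sum(1 for a in range(cw_min) for b in range(cw_min) if a == b)
    return Fraction(equal_pairs, cw_min * cw_min)


for ac, cw_min in (("VO", 4), ("VI", 8), ("BE", 16), ("BK", 16)):
    print(ac, float(same_slot_probability(cw_min)))
```

Under this model halving $CW_{min}$ doubles the chance that the two sides pick the same slot, which is the trend the comparison between the ACs relies on.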
In BK and BE the collision probability is small and the AP and the station transmit almost alternately. Therefore, No-RD and RD(1) have almost the same performance, except that in RD(1) there is an extra overhead of CF-End and SIFS at the end of every TXOP.
The larger the value of AIFS, the less significant this overhead is. In BE the AIFS is 43µs compared to 79µs in BK and therefore the CF-End+SIFS = 26 + 16 = 48µs is more significant in BE and No-RD slightly outperforms RD(1), while in BK they perform equally.
In VI and VO the AIFS is smaller than in BK and BE and so the CF-End+SIFS overhead is more significant. Moreover, the AP in these ACs has a higher probability of accessing the channel than the station because its AIFS is shorter by one Slot-Time. This enables the AP in No-RD to transmit several times in a row before the station replies. This also enables a better Goodput in No-RD than in RD(1) where the AP and the station transmit alternately. On the other hand, the collision probability is larger in VI and VO. However, the AP transmits many times without competition in No-RD when the TCP receiver has no TCP Acks to transmit. The overall outcome is a slightly larger Goodput in No-RD, compared to RD(1), than in BK and BE.
In Figure 7 we show the same results as in Figure 6 but now every TCP Ack acknowledges two TCP Data segments, a feature known as TCP Delayed Acks. For clarity, for the No-RD scheme we only show the simulated results. The analytical results are similar, as can be seen in Figure 6. Normally, the TCP receiver does not send an Ack the instant it receives data. Instead, it delays the Ack, hoping to have data going in the same direction as the Ack, so the Ack can be sent along with the data. This delay is usually on the order of 200ms. However, if meanwhile another data segment arrives, the TCP receiver immediately generates an Ack to send.
Using TCP Delayed Acks enables the TCP transmitter to transmit more TCP Data segments in one TXOP: the limiting condition is now $n \cdot K_D \cdot 7 \le 2 \cdot 64 \cdot 178$.
We then checked the cycle length, and how the curves relate between cycles' lengths and Goodputs. For example, if we find that transmitting $S$ TCP Data segments with the largest Goodput $G$ takes $C$ ms, it is easy to verify that $G$ is the largest Goodput possible in $C$ ms.
We see that all the ACs achieve the same Goodput for 'long' TXOPs. This happens because the cycles in the various ACs differ only in the AIFS and BackOff time intervals, which become negligible in long cycles. In shorter cycles the VI and VO ACs achieve the same best performance because their AIFS values are the shortest, 25µs for the AP. BE outperforms BK because its AIFS is 43µs (AP) compared to 79µs in BK. See Table 1.
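The AP's AIFS values quoted above follow from the usual relation AIFS = SIFS + AIFSN · SlotTime. The sketch below assumes a SIFS of 16µs (as used in the CF-End+SIFS overhead earlier), a Slot-Time of 9µs and AIFSN values of 1, 1, 3 and 7 for VO, VI, BE and BK at the AP. The Slot-Time and the AIFSN values are assumptions chosen to reproduce the figures in the text; Table 1 remains the authoritative source.

```python
SIFS_US = 16  # microseconds
SLOT_US = 9   # assumed Slot-Time in microseconds

# Assumed AIFSN values used by the AP for each Access Category.
AP_AIFSN = {"VO": 1, "VI": 1, "BE": 3, "BK": 7}


def aifs_us(aifsn):
    """AIFS duration in microseconds for a given AIFSN."""
    return SIFS_US + aifsn * SLOT_US


for ac, aifsn in AP_AIFSN.items():
    print(ac, aifs_us(aifsn))
```

The resulting differences between the ACs are a few tens of microseconds per cycle, which explains why they matter only when the cycle itself is short.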

Error-Prone channel Results
In this Section we assume the BERs of $10^{-5}$, $10^{-6}$ and $10^{-7}$. We concentrate only on the BE AC. The results for the other ACs are similar, with the same differences compared to BE as described in Section 4. In Figure 9 [...] is different than those of parts (B), (C) and (D). This is because the positive BER can cause the MAC Transmission Window (TW) to limit the number of transmitted MPDUs in a single transmission to be smaller than $K_D$, and so $K_D$ is only the maximum allowed number of MPDUs in a single transmission.
In general using RD results in a larger Goodput. Notice however that as the BER increases, the advantage of RD(1) over No-RD decreases. As the BER increases, the number of MPDUs that the TCP transmitter is able to transmit in every transmission decreases. The MAC TW is not always able to slide so that it will contain $K_D$ MPDUs.
Notice that Figure 9(A) is for BER=0 and it is the same as Figure 6(A). In Figure 9(A) we can provide a curve showing the maximum possible Goodput. However, for BER>0, in order to find such a curve one needs to know, given $K_D$, the actual average number of transmitted MPDUs in every transmission of the TCP transmitter. This number might be smaller than $K_D$, especially for large $K_D$, because it is possible that the MAC TW does not contain $K_D$ MPDUs. Such a computation is difficult [25,46] and it is out of the scope of this paper. This is also the reason why we cannot provide analytical results for the No-RD scheme as for the case BER=0. Notice again that for small $K_D$ No-RD slightly outperforms RD(1), for the same reasons given for this phenomenon in Section 4.
In Figure 10 we show the same results as in Figure 9, but now TCP Delayed Acks are used. Using TCP Delayed Acks does not improve the performance of the No-RD scheme because in the case of collisions, the time wasted is the time of transmitting the TCP Data segments. The shorter time of transmitting the TCP Acks has no influence in this case. On the other hand, in the schemes that use RD the reduced time of transmitting TCP Acks has an influence because the TXOP length is shorter. Therefore, one can see that the difference between the performance of the RD schemes and that of No-RD is larger than in the case of not using TCP Delayed Acks.
In Figure 11 we show the use of the scheme of [22], where each of the first 3 MPDUs in every A-MPDU frame of the TCP transmitter is transmitted twice, i.e. MPDU repetition (Rep.).
Only the first 3 MPDUs are transmitted twice because it is the most efficient scheme. Using Rep. lengthens every A-MPDU frame of the TCP transmitter, since these MPDUs are transmitted twice. Therefore, for BER=0 it is clear that the performance of Rep. is worse than not using it. As the BER increases, the advantage of Rep. increases. We also found that for BER=$10^{-7}$ it is inefficient to use Rep. However, for BER=$10^{-5}$ and $10^{-6}$ Rep. improves the achieved Goodput, as we show in Figure 11. In order to demonstrate the improvement consider Figure 11(A) for BER=$10^{-5}$ without TCP Delayed Acks. One can see that all the schemes, namely No-RD, RD(1), RD(2) and RD(25), benefit from using Rep. in the case of large $K_D$, while for small $K_D$ it is not efficient. Notice that in the case of small $K_D$ the probability that the MAC TW will contain $K_D$ MPDUs ready for transmission is much larger than in the case of larger $K_D$. Therefore using Rep. in the former case only increases the transmission time of the TCP transmitter's A-MPDU frames with no benefit.
In Figure 12 we show the maximum received Goodputs vs. the BER for the No-RD, RD(1), RD(2) and RD(25) schemes. In Figures 12(A) and (B) we consider the cases without and with TCP Delayed Acks respectively. We see that for every BER, using RD is more efficient than not using RD. For BER=$10^{-5}$, and in several cases when BER=$10^{-6}$, using Rep. improves the Goodput even further. For example, for BER=$10^{-5}$ the Goodput of RD(25) is 780Mbps, compared to 600Mbps in the No-RD case. With Rep. the Goodput of RD(25) is 860Mbps, an improvement of over 40% compared to No-RD. For BER=$10^{-7}$ and BER=0, using Rep. decreases the performance for the reasons mentioned previously.
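The relative improvements quoted above follow directly from the three Goodput values. A short Python check, using only the numbers given in the text for BER=$10^{-5}$ (the helper name is illustrative):

```python
def improvement_percent(new_mbps, base_mbps):
    """Relative Goodput improvement of new_mbps over base_mbps, in percent."""
    return 100.0 * (new_mbps - base_mbps) / base_mbps


NO_RD = 600.0      # Mbps, No-RD
RD25 = 780.0       # Mbps, RD(25) without Rep.
RD25_REP = 860.0   # Mbps, RD(25) with Rep.

print(f"RD(25) vs. No-RD:           {improvement_percent(RD25, NO_RD):.1f}%")
print(f"RD(25) with Rep. vs. No-RD: {improvement_percent(RD25_REP, NO_RD):.1f}%")
```

RD(25) alone gives a 30% improvement over No-RD, and adding Rep. raises it to about 43%.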
Finally, in Figure 13 we show the maximum received Goodput as a function of the TXOP for BER=$10^{-5}$. Recall that for BER>0 it is difficult to find the number of MPDUs actually transmitted in every transmission of the TCP transmitter. Therefore, we cannot use the technique of Section 4 for BER=0 to compute the maximum Goodput.
The outcomes and conclusions are similar in trend to those in Figure 8, except that the achieved Goodputs are much lower because of the positive BER. On the other hand the delays, i.e. the lengths of the TXOPs, are shorter. For BER=0 there is no benefit in using TXOPs of more than 20µs, while for BER=$10^{-5}$ there is no benefit in using TXOPs of more than 4µs.

Summary
This paper shows an example of the benefit achieved when different layers in the protocol stack co-operate. In particular, we show the improvement in the TCP Goodput that is achieved when the MAC layer of the IEEE 802.11ac standard is aware of TCP traffic. Using Reverse Direction, the contention between the TCP transmitter and receiver is eliminated, and no time is wasted due to collisions.
Using also the Two-Level aggregation scheme, in an error-free channel the TCP Goodput is improved by 20% compared to contention-based channel access. In an error-prone channel the TCP Goodput is improved by 60% when blind retransmission of frames within A-MPDU frames is also used.
This paper assumes only one TCP connection in the system, a scenario that is possible in small systems such as the home environment. The next research step is to investigate the performance of Reverse Direction and aggregation when the AP maintains several TCP connections at the same time.

Appendix
In this Appendix we describe a Markov chain model for the No-RD scheme in an error-free channel. The Markov chain is based on two assumptions. First, we assume that the case of 3 or more consecutive collisions on the channel is very rare. Notice that for the VO AC the probability of 2 consecutive collisions is $(\frac{1}{4})^2 (\frac{1}{8})^2 \sim 10^{-3}$. Therefore, we assume that only two sizes of contention intervals are used, $[0, ..., CW_{min} - 1]$ and $[0, ..., 2 \cdot CW_{min} - 1]$.
Second, as already mentioned, we assume the saturated scenario where the TCP transmitter always has TCP Data segments to transmit and the TCP transmission window does not limit the offered load. In particular, we assume that for every $K_D$, $1 \le K_D \le 64$, the TCP transmitter can always transmit $K_D$ MPDUs in a single transmission. Every MPDU contains 7 MSDUs of TCP Data segments. The TCP receiver transmits all the TCP Acks it has in one A-MPDU, up to $178 \cdot 64$ in one transmission.
We also assume that every TCP Ack acknowledges one TCP Data segment. The extension to the case of Delayed Acks is immediate.
We first present a Markov chain for the BE and BK ACs, which are symmetrical in the sense that the AIFS of the AP and that of the station are equal (see Table 1). We later show what changes are needed for the VI and VO ACs, which are asymmetrical.
The Markov chain follows after the channel access state. The set of its states, together with the transitions among the states, is shown in Figure 14. A state, except for the Initial State, represents 3 variables, and is denoted $(X, C_{AP}, C_{STA})$. $X$ denotes the number of $K_D \cdot 7$ TCP Acks that the TCP receiver accumulated to transmit. In group (C) the station has $y \cdot K_D \cdot 7$ TCP Acks to transmit, $1 \le y \le M$. We explain what $M$ is later. Notice that there are three types of transitions from a state in this group: a transition when the AP transmits, a transition when the station transmits and a transition when there is a collision.
The transitions and their corresponding probabilities are straightforward. Notice that after the station transmits it is left without TCP Acks, and the transition is to a state in Group (B). We later explain why we assume that the station is left without TCP Acks.
Notice that in principle the size of the Markov chain is unlimited. We therefore look for a finite size that gives analytical results within, say, 1% of those of the simulation. This seems to be a reasonable error range. We therefore assume that the station cannot accumulate more than $M \cdot K_D \cdot 7$ TCP Acks, and $M = 20$ gives the desired error range.
Therefore, in every state in Group (C) the station can transmit all the TCP Acks it has in one transmission. (The station can accumulate up to $25 \cdot 64 \cdot 7$ TCP Acks and transmit them in one A-MPDU.)
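The claim that the accumulated TCP Acks always fit in one A-MPDU can be checked with a few lines of Python. The sketch uses only the constants quoted above: at most $178 \cdot 64$ TCP Acks per A-MPDU, and 7 TCP Acks per MPDU of TCP Data under the one-Ack-per-segment assumption. The function names are illustrative.

```python
MAX_ACKS_PER_AMPDU = 178 * 64   # upper bound on TCP Acks in one A-MPDU (from the text)
SEGMENTS_PER_MPDU = 7           # TCP Data segments (MSDUs) per MPDU


def accumulated_acks(m, k_d):
    """TCP Acks held by the station after m unanswered A-MPDUs of k_d MPDUs each."""
    return m * k_d * SEGMENTS_PER_MPDU


def fits_in_one_ampdu(m, k_d):
    """True if all accumulated TCP Acks can be sent in a single A-MPDU."""
    return accumulated_acks(m, k_d) <= MAX_ACKS_PER_AMPDU


def largest_m(k_d):
    """Largest m for which m * k_d * 7 TCP Acks still fit in one A-MPDU."""
    return MAX_ACKS_PER_AMPDU // (k_d * SEGMENTS_PER_MPDU)


print(fits_in_one_ampdu(20, 64))  # the truncation level M = 20 at the largest K_D
print(largest_m(64))              # the bound behind the 25 * 64 * 7 figure
```

For the worst case $K_D = 64$ the bound evaluates to 25, which agrees with the $25 \cdot 64 \cdot 7$ figure, so the truncation at $M = 20$ never exceeds the capacity of one A-MPDU.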
Group (D) of states is similar to Group (B), except that the station already has $M \cdot K_D \cdot 7$ TCP Acks to transmit, and every additional A-MPDU that the AP transmits is dropped by the station.
We attach a Time metric to every state. The Time metric denotes the time elapsed on the channel in this state. The Time metric of the Initial State is 0. A state in which the AP transmits has a Time metric equal to $AIFS + BO + Preamble + T(DATA) + SIFS + BAck$. We also attach a Goodput metric to every state. Recall that we consider a transmission of a TCP Data segment to be successful only when a TCP Ack segment is received for this segment. Thus, the Goodput metrics of the Initial State, of every state in which the AP transmits and of every state that denotes a collision are all 0. For any other state the Goodput metric is the number of bits in $X \cdot K_D \cdot 7$ TCP Data segments, where $X$ is the number of $K_D \cdot 7$ TCP Acks that are transmitted in the state, divided by the Time metric of the state.
We denote by $G_s$ the Goodput metric of state $s$.
The Goodput $G$ of the system is
$$G = \frac{\sum_{s \in states} \pi_s \cdot T_s \cdot G_s}{\sum_{s \in states} \pi_s \cdot T_s}$$
where $\pi_s$, $T_s$ and $G_s$ are the stationary probability, Time metric and Goodput metric, respectively, of every state $s$ in the Markov chain.
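The Goodput formula is a time-weighted average: $T_s \cdot G_s$ is the number of bits acknowledged in state $s$, so $G$ is the expected number of acknowledged bits per visit divided by the expected time per visit. The following Python sketch shows the computation on a toy three-state chain. The transition matrix, Time metrics and Goodput metrics are hypothetical placeholders, not the chain of Figure 14, and power iteration is just one simple way to obtain the stationary probabilities.

```python
def stationary_distribution(P, iterations=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes the chain is irreducible and aperiodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def system_goodput(pi, time_metric, goodput_metric):
    """G = sum(pi_s * T_s * G_s) / sum(pi_s * T_s) over all states s."""
    numerator = sum(p * t * g for p, t, g in zip(pi, time_metric, goodput_metric))
    denominator = sum(p * t for p, t in zip(pi, time_metric))
    return numerator / denominator


# Hypothetical toy chain: state 0 = AP transmits, 1 = station transmits, 2 = collision.
P = [
    [0.5, 0.4, 0.1],
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
]
T = [2.0e-3, 0.5e-3, 2.0e-3]   # Time metrics in seconds (placeholders)
G = [0.0, 4.0e9, 0.0]          # Goodput metrics in bits/s (placeholders)

pi = stationary_distribution(P)
print(f"stationary probabilities: {[round(p, 4) for p in pi]}")
print(f"system Goodput: {system_goodput(pi, T, G) / 1e6:.1f} Mbps")
```

Only the state in which the station transmits TCP Acks has a nonzero Goodput metric in this toy example, mirroring the rule that AP-transmission and collision states contribute time but no Goodput.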
Concerning the VI and VO ACs, the AIFS of the AP is shorter than that of the station by one SlotTime (see Table 1). Therefore, a collision occurs when $C_{AP} = C_{STA} + 1$, the AP transmits when $C_{AP} < C_{STA} + 1$ and the station transmits when $C_{AP} > C_{STA} + 1$. The modified Markov chain for these ACs is shown in Figure 15.
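The transmit/collision rule for the two families of ACs can be sketched as a small Python function. The function name and the `ap_aifs_advantage_slots` parameter are illustrative. The zero-offset case for BE and BK is an assumption that follows from their equal AIFS values rather than a rule quoted from the text.

```python
def channel_outcome(c_ap, c_sta, ap_aifs_advantage_slots=0):
    """Decide who gains the channel from the two backoff counters.

    ap_aifs_advantage_slots is the number of SlotTimes by which the AP's AIFS
    is shorter than the station's: 1 for VI and VO (from the text), and 0 for
    BE and BK (assumed here from the symmetry of their AIFS values).
    """
    threshold = c_sta + ap_aifs_advantage_slots
    if c_ap == threshold:
        return "collision"
    if c_ap < threshold:
        return "ap"
    return "station"


# VI/VO: the AP wins ties on the raw counters because it starts one slot earlier.
print(channel_outcome(2, 2, ap_aifs_advantage_slots=1))  # ap
print(channel_outcome(3, 2, ap_aifs_advantage_slots=1))  # collision
print(channel_outcome(4, 2, ap_aifs_advantage_slots=1))  # station

# BE/BK (assumed symmetric rule): equal counters collide.
print(channel_outcome(2, 2))                             # collision
```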