On Cross-Layer Design of AMC Based on Rate Compatible Punctured Turbo Codes



Introduction
The success of current standards such as 3GPP HSPA and IEEE 802.11/16 in providing high data rates and satisfying quality of service (QoS) requirements is owed principally to Adaptive Modulation and Coding (AMC), hybrid automatic repeat request (HARQ) and fast scheduling [1,2]. AMC uses different constellation orders and coding rates according to the signal strength [3]. In this way, when the instantaneous channel conditions are favorable, link adaptation offers high data rates at the physical layer. The proper usage of each combination of constellation order and coding rate, i.e., each mode, is specified by the SNR region in which that mode is active.
AMC performance can be enhanced by using different channel coding techniques. In particular, with turbo coding an AMC scheme can achieve the highest spectral efficiency even in low SNR regions [4]. The original rate of a turbo code is typically 1/3; nevertheless, by using puncturing techniques, higher code rates can be used for each modulated symbol. By also incorporating rate compatibility in punctured turbo codes, by which all of the code symbols of a high-rate punctured code are used by the lower-rate codes, an enhanced spectral efficiency is reached. This gain is provided by Rate-Compatible Punctured Turbo (RCPT) codes [5]. RCPT codes have been employed in HARQ implementations because no received information is discarded [6]. Such ARQ schemes are known as Incremental Redundancy (IR) HARQ schemes; they improve the channel use efficiency since parity bits for error correction are transmitted only when required [7].
The scheme described above is essentially a cross-layer combination of AMC at the physical layer and HARQ at the data link layer for QoS provisioning in wireless communication networks [8,9]. It has been shown that such a cross-layer design outperforms, in terms of spectral efficiency, both the use of AMC only at the physical layer and the combination of typical ARQ with a single modulation and coding scheme [8]. Moreover, it has been shown that IR HARQ based on convolutional codes achieves considerably higher spectral efficiency than type-I HARQ [9]. In this direction, and since a lot of research has been done on turbo codes and turbo coding and decoding is applied in the known standards of wireless communications [2,10], we extend this study by employing the aforementioned cross-layer design (CLD), which combines AMC and HARQ based on RCPT codes.
The rest of this paper is organized as follows. Section 2 presents the system model and its components in detail. In Section 3, the cross-layer design of the system is presented with its assumptions and constraints. In Section 4, system performance is evaluated for turbo and convolutional rate-compatible codes as well as for LDPC codes. In addition, the decoding complexity is evaluated for each coded system. Finally, Section 5 provides the concluding remarks and gives some directions for further investigation of this topic.

System Model
The model of the adopted cross-layer design system is illustrated in Figure 1. It shows the layer structure of the system as well as the implementation details of the AMC scheme (i.e. the physical layer). In the following text, we first describe concisely the functionality of each layer and then go into the details of each layer's components.

Turbo Encoding and Decoding
Turbo coding and decoding achieve error-probability performance close to the Shannon limit [11]. In its main form, turbo coding is a type of channel coding that combines two simple convolutional codes in parallel, linked by an interleaver (i.e. Parallel Concatenated Convolutional Codes, PCCC) [12]. It has been shown that recursive systematic convolutional (RSC) codes are superior to their nonrecursive counterparts in concatenated implementations [13]. The codewords of such schemes consist of one information bit followed by two parity-check bits, one produced by each parallel encoder. Thus, the code rate of a PCCC scheme with two RSC constituent codes is $R_c = 1/3$. On the other hand, the decoding of concatenated codes is performed by a suboptimum decoding scheme that uses a posteriori probability (APP) algorithms instead of Viterbi algorithms. Such a scheme is constructed from "soft-in/soft-out" decoders that exchange bit-by-bit or symbol-by-symbol APPs as soft information, depending on whether a bit or symbol decoding technique is used [14]. The input soft information represents the log-likelihoods of the encoder input bits and code bits. This is the input of the Soft-Input Soft-Output (SISO) "Maximum A Posteriori" (MAP) module presented in [15]. The output soft information of this module consists of updated versions of the inputs, based on the information of the constituent RSC code of the turbo encoder.
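As a rough illustration of the parallel concatenation described above, the following Python sketch encodes a bit sequence with two identical RSC encoders linked by a pseudo-random interleaver, which gives $R_c = 1/3$. The octal generator pair (15, 13) is the one quoted later in this paper for encoder B of [6]. Treating 13 as the feedback polynomial and 15 as the feedforward polynomial, the seeded random interleaver, and the absence of trellis termination are all assumptions made for illustration only.

```python
import random

def rsc_parity(bits, g_ff=0o15, g_fb=0o13, memory=3):
    """Parity stream of one recursive systematic convolutional (RSC) encoder.

    g_ff / g_fb are octal generators; which of (15, 13) is the feedback
    polynomial is an assumption of this sketch.
    """
    def taps(g):
        # taps(g)[k] is the coefficient of D^k (most significant bit is D^0)
        return [(g >> (memory - k)) & 1 for k in range(memory + 1)]

    ff, fb = taps(g_ff), taps(g_fb)
    state = [0] * memory          # state[k-1] holds the register value a_{t-k}
    parity = []
    for u in bits:
        a = u                     # recursive part: input XOR feedback taps
        for k in range(1, memory + 1):
            a ^= fb[k] & state[k - 1]
        p = ff[0] & a             # feedforward part produces the parity bit
        for k in range(1, memory + 1):
            p ^= ff[k] & state[k - 1]
        parity.append(p)
        state = [a] + state[:-1]  # shift the register
    return parity

def pccc_encode(bits, seed=0):
    """Rate-1/3 PCCC: systematic bits plus one parity stream per RSC encoder."""
    perm = list(range(len(bits)))
    random.Random(seed).shuffle(perm)        # hypothetical interleaver
    interleaved = [bits[i] for i in perm]
    return list(bits), rsc_parity(bits), rsc_parity(interleaved)

systematic, parity1, parity2 = pccc_encode([1, 0, 1, 1, 0, 0, 1, 0])
```

Each information bit yields three coded bits (one systematic and two parity bits), so the unpunctured output is three times as long as the input.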
More specifically, a turbo decoder for a PCCC scheme is constructed from two SISO modules that are linked by a deinterleaver (Figure 2). In addition, iterative decoding is performed in order to improve the decoding performance. To this end, a feedback loop between the two constituent SISO decoders is established, which embodies the turbo decoding principle [11]. This feedback loop contains an interleaver that provides the interleaved inputs required by the first parallel decoder for the first iteration, and so on. Multiple iterations between these two decoders, exchanging soft information, give results close to the Shannon limit. Turbo codes and iterative turbo decoders have been extensively studied for implementation in current standards such as 3GPP HSDPA (High Speed Downlink Packet Access) [16].

RCPT Codes
In general, an RCPT encoder consists of a turbo encoder as described above, followed by a puncturing block with puncturing matrix P. The puncturing matrix P is known as the puncturing rule or pattern and indicates the coded bits that should be punctured [17]. Puncturing can be applied to information bits, parity bits, or both. However, the way of puncturing affects the performance of the coding scheme and of the coded-modulation scheme in general [18]. Considering only the impact of puncturing on the turbo coding scheme, the loss in code performance remains moderate as long as the systematic bits are not punctured. In addition, by periodically puncturing the parity bits produced by the two RSC codes, a better performance of the coding scheme can be achieved.
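To make the role of the puncturing matrix concrete, the sketch below applies a periodic pattern to the three output streams of a rate-1/3 turbo encoder. The pattern shown is hypothetical (it keeps every systematic bit and alternates between the two parity streams, giving rate 1/2); it is not one of the optimal patterns of [6].

```python
def puncture(streams, pattern):
    """Apply a periodic puncturing pattern to parallel coded streams.

    streams: [systematic, parity1, parity2], equal-length bit lists.
    pattern: one row per stream; a 1 keeps the bit, a 0 punctures it.
             The number of columns is the puncturing period.
    """
    period = len(pattern[0])
    out = []
    for t in range(len(streams[0])):
        for row, stream in zip(pattern, streams):
            if row[t % period]:
                out.append(stream[t])
    return out

# Hypothetical rate-1/2 pattern: systematic bits untouched,
# parity bits of the two RSC encoders sent alternately.
P_HALF = [
    [1, 1],  # systematic
    [1, 0],  # parity of encoder 1
    [0, 1],  # parity of encoder 2
]

sys_bits = [1, 0, 1, 1]
par1 = [0, 0, 1, 1]
par2 = [1, 1, 0, 0]
tx = puncture([sys_bits, par1, par2], P_HALF)
```

With four information bits, eight coded bits are transmitted, which corresponds to a code rate of 1/2.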
The rate compatibility offered by an RCPT code has been considered the enabling technique for incremental redundancy (IR) HARQ schemes [6]. IR HARQ based protocols are major components of HSDPA, offering rate matching capabilities [19]. During the rate matching process, the transmitter sends only the supplemental coded bits indicated by the aforementioned puncturing rule. A representative example of an IR HARQ scheme for HSDPA with a turbo encoder as the mother code is presented in [20]. The RCPT encoder in particular is constructed from a turbo mother code whose code rate results from its RSC encoders. The puncturing matrix indicates the puncturing period and the bits being punctured during the HARQ scheme operation [6,18], which yields a family of code rates. An example of an RCPT encoder dedicated to ARQ mechanisms with M = 3 and P = 3 is illustrated in Figure 3; it is constructed from two constituent RSC encoders with rate 1/2 and offers a family of RCPT code rates that includes the rates 1, 1/2 and 1/3 used later in this paper.
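For orientation, the family of rates can be enumerated with a few lines of Python. The expression $R_l = P/(P + l)$, $l = 0, \ldots, (M-1)P$, for a rate-$1/M$ mother code with puncturing period $P$ is the construction commonly associated with RCPT codes; it is used here as an assumption and should be checked against [6].

```python
from fractions import Fraction

def rcpt_rate_family(M, P):
    """Code rates P/(P + l), l = 0..(M-1)P, for a rate-1/M mother code
    and puncturing period P (assumed construction, see lead-in)."""
    return [Fraction(P, P + l) for l in range((M - 1) * P + 1)]

family = rcpt_rate_family(M=3, P=3)
print([str(r) for r in family])
```

For M = 3 and P = 3 the family runs from rate 1 (only systematic bits sent) down to the mother code rate 1/3, and it contains the three rates 1, 1/2 and 1/3 used for the successive transmissions.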

RCPT-ARQ Protocol
By puncturing the bits that will be transmitted in the current and future transmission attempts, the HARQ scheme (i.e. RCPT-ARQ) breaks the packet unit with size $L$ into blocks of bits with size $L_i$. The number of bits transmitted by the RCPT-ARQ protocol at the $i$-th transmission attempt can be expressed as

$$L_i = L\left(\frac{1}{Rc_i} - \frac{1}{Rc_{i-1}}\right)$$

where $Rc_i$, $i = 1, \ldots, N_t^{max}$, denotes the rates produced by the RCPT encoder, with the convention $1/Rc_0 = 0$. Going into further detail, we assume a single stop-and-wait ARQ strategy for the RCPT-ARQ protocol (i.e. hybrid ARQ), described by the following step-by-step functionality:
• Depending on the previous channel condition, the adaptive scheme operates on mode n.
• The L-long packet is encoded by the turbo mother code. The coded packet is stored at the transmitter and is broken into blocks of size $L\{1/Rc_i - 1/Rc_{i-1}\}$. Bit selection is performed for each transmission attempt according to the puncturing rule.
• Transmission of the constituent blocks with size $L_i$ is initialized according to the puncturing matrix.
• At the receiver, iterative decoding is performed for each separate transmission block $L_i$. If decoding is not successful after the maximum number of transmissions $N_t^{max}$ is reached, then a NAK is sent to the transmitter and the adaptive scheme updates to the mode corresponding to the channel condition.
• Otherwise, an ACK is sent to the transmitter and the adaptive scheme continues in the current mode n.
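The steps above can be sketched as a short stop-and-wait loop. The block-size rule $L_i = L(1/Rc_i - 1/Rc_{i-1})$ with $1/Rc_0 = 0$ is assumed here, and `decode_ok` is a hypothetical stand-in for the iterative turbo decoder; neither the function names nor the callback interface come from the paper.

```python
from fractions import Fraction

def block_sizes(L, rates):
    """Bits sent at each attempt: L * (1/Rc_i - 1/Rc_{i-1}), with 1/Rc_0 = 0."""
    sizes, prev_inv = [], Fraction(0)
    for rc in rates:
        inv = 1 / Fraction(rc)
        sizes.append(int(L * (inv - prev_inv)))
        prev_inv = inv
    return sizes

def rcpt_arq(L, rates, decode_ok):
    """Stop-and-wait incremental-redundancy loop for one packet.

    decode_ok(attempt, received_bits) -> bool is a hypothetical decoder hook.
    Returns (feedback, attempts_used, total_bits_sent).
    """
    received = 0
    for attempt, size in enumerate(block_sizes(L, rates), start=1):
        received += size              # receiver combines all blocks so far
        if decode_ok(attempt, received):
            return "ACK", attempt, received
    # Maximum number of transmissions reached without success
    return "NAK", len(rates), received

rates = [Fraction(1), Fraction(1, 2), Fraction(1, 3)]
result = rcpt_arq(1000, rates, decode_ok=lambda attempt, bits: attempt == 2)
```

With the rates 1, 1/2 and 1/3, each of the three attempts carries the same number of bits as the information packet, so a packet decoded at the second attempt has cost twice its length in transmitted bits.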

Cross-layer Design
The cross-layer system structure described above relies on the following assumptions:
• Channel SNR estimation is perfect, and consequently so is the channel state information (CSI) available at the receiver; in any case, the impact of errors in SNR estimation on adaptive modulation is negligible [21]. In our implementation, the channel estimator uses the Error Vector Magnitude (EVM) algorithm for the AWGN channel [22].
• The feedback channel dedicated to the mode selection process is error free and without latency. The mode selection is performed on a packet-by-packet basis, i.e. the AMC scheme is updated after $N_t^{max}$ transmission attempts. Alternative update policies, e.g. updates for every transmission attempt (i.e. on a block-by-block basis), are left for further investigation [10].
• System updates are based on the received SNR, denoted as $\gamma$, which is the estimated channel SNR at the receiver. It is assumed that the received SNR values $\gamma$ per packet are i.i.d. random variables whose probability density function (pdf) under Rayleigh fading is

$$p_\gamma(\gamma) = \frac{1}{\bar{\gamma}} \exp\left(-\frac{\gamma}{\bar{\gamma}}\right)$$

where $\bar{\gamma} = E\{\gamma\}$ is the average received SNR.
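The last assumption can be illustrated by drawing per-packet SNR values. Under Rayleigh fading the received SNR is exponentially distributed with mean $\bar{\gamma}$, so the sketch below uses `random.expovariate`; SNR values are taken in linear scale (not dB), and the average SNR of 10 and the sample count are arbitrary illustrative choices.

```python
import random

def sample_received_snr(avg_snr, n_packets, seed=0):
    """Draw i.i.d. per-packet SNR values (linear scale) for Rayleigh fading.

    The SNR of a Rayleigh-faded signal is exponentially distributed;
    expovariate takes the rate parameter, i.e. 1 / mean.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / avg_snr) for _ in range(n_packets)]

samples = sample_received_snr(avg_snr=10.0, n_packets=20000)
empirical_mean = sum(samples) / len(samples)
```

The empirical mean of the samples should be close to the chosen average SNR, which is a quick sanity check before the samples are used to drive mode selection.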

Cross-Layer Design of AMC and HARQ
A cross-layer design approach that combines AMC at the physical layer with hybrid ARQ at the data link layer can follow the procedure presented in [9] and [10]. Applying this method, the following constraints must be imposed in order to maintain a particular QoS level at the application layer:
Constraint 1 (C1): The maximum allowable number of transmissions per information packet is $N_t^{max}$.
C1 is calculated by dividing the maximum allowable delay at the application layer by the round trip time required for each transmission at the physical layer. For example, assuming the QoS concept of 3GPP, the audio and video media streams for MPEG-4 video payload allow a maximum delay of 400 ms [23]. In addition, the round trip delay between the terminal and the Node B for retransmissions in the case of HSDPA can be approximated as less than 100 ms [23]. In such a context, $N_t^{max}$ should therefore be 4. On the other hand, C2, which bounds the probability of unsuccessful reception after $N_t^{max}$ transmissions by $P_{loss}$, is related to the bit-error rate (BER) at the physical layer and the packet size at the data link layer. Hence, once the BER imposed by the QoS requirements at the application layer is given and the information packet size is L = 1000, the value of $P_{loss}$ follows; $P_{loss} = 0.01$ is used in the performance evaluation.
The aforementioned cross-layer design (CLD) dictates the code rates that will be used for each transmission at the data link layer and therefore specifies the AMC switching thresholds at the physical layer. Moreover, the proposed CLD scheme is affected by the constituent encoders (i.e. RSC encoders) of the turbo code as well as by the puncturing rules [6,18]. In the current investigation, however, we present the results derived using one of the optimal RCPT codes and puncturing rules presented in [6]; the impact of the RCPT codes and of puncturing on our CLD is left for future work.
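The two constraints reduce to simple arithmetic, sketched below. The delay figures (400 ms and 100 ms) and the packet size L = 1000 are taken from the text; the target BER of $10^{-5}$ is an assumption chosen for illustration, since it reproduces a loss probability of about 0.01 under the independent-bit-error relation $1 - (1 - BER)^L$.

```python
def max_transmissions(max_delay_ms, round_trip_ms):
    """C1: how many transmission attempts fit into the delay budget."""
    return int(max_delay_ms // round_trip_ms)

def packet_loss_target(ber, packet_bits):
    """C2: packet loss probability implied by a target BER,
    assuming independent bit errors with identical BER."""
    return 1.0 - (1.0 - ber) ** packet_bits

n_t_max = max_transmissions(400, 100)      # delay values from the text
p_loss = packet_loss_target(1e-5, 1000)    # BER value is an assumption
print(n_t_max, round(p_loss, 4))
```

Under these assumptions the delay budget allows four attempts and the loss target is close to 1%.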

AMC Schemes
The design of AMC schemes is the process by which the switching thresholds are specified. The switching thresholds of an AMC scheme at the physical layer are specified by a given target BER ($BER_{target}$) [3,4]. The switching thresholds are the boundary points $\gamma_n$ of the total SNR range. Each mode is then selected in accordance with the switching thresholds derived from $BER_{target}$.
However, in a combined system in which the unit of interest is the packet at the data link layer, the AMC design follows the $P_{loss}$ value. More specifically, in order to satisfy the aforementioned constraints of the proposed combination, the switching thresholds should be derived from the following inequality:

$$PER^{(N_t^{max})} \le P_{loss}$$

where $PER^{(N_t^{max})}$ is the packet error probability (i.e. packet error rate) after $N_t^{max}$ transmissions at the data link layer. In the following paragraphs, we derive the boundary points for each modulation and coding scheme (MCS).
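Once the switching thresholds are known, mode selection is a lookup over the SNR regions. The sketch below assumes ascending thresholds in linear scale, a "mode 0" meaning that no mode is eligible below the first threshold, and exponentially distributed SNR (Rayleigh fading) for the region probabilities. The threshold values are hypothetical placeholders, not those of Table 2.

```python
from bisect import bisect_right
from math import exp, inf

def select_mode(snr, thresholds):
    """Return n if thresholds[n-1] <= snr < thresholds[n]; 0 if below all."""
    return bisect_right(thresholds, snr)

def mode_probabilities(avg_snr, thresholds):
    """Probability that the SNR falls in each region, for exponentially
    distributed SNR with mean avg_snr (index 0 = below first threshold)."""
    edges = [0.0] + list(thresholds) + [inf]
    return [exp(-lo / avg_snr) - exp(-hi / avg_snr)
            for lo, hi in zip(edges, edges[1:])]

thresholds = [1.0, 4.0, 10.0]     # hypothetical boundaries, linear scale
mode = select_mode(5.0, thresholds)
probs = mode_probabilities(avg_snr=6.0, thresholds=thresholds)
```

The region probabilities sum to one, and they are the weights needed later when per-mode quantities are averaged over the fading distribution.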
The packet error probability can be expressed in relation to the BER by the following equation

$$PER = 1 - (1 - BER)^{L} \quad (5)$$

only if each demodulated and decoded bit inside the packet has the same BER and bit errors are uncorrelated [9,10]. On the other hand, closed-form expressions for the PER and BER are not available in the literature, and closed-form expressions for the BER of turbo-coded modulations in the AWGN channel are not available either [8]. Nevertheless, one can use the union bound for a turbo-coded modulation system, following the bounding technique introduced in [24]. However, this technique is applied to a 16QAM system and needs more investigation in the case of turbo-coded AMC schemes with multiple modulation modes. Since further investigation of union bounds for turbo-coded modulation is not the aim of our current work, we obtain the BER and PER values through simulations. Finally, the simulated PER values are compared with those derived from fitting the curves and those derived from Equation (5).
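The conditions attached to Equation (5) can be checked numerically: when bit errors are independent and identically distributed, a Monte Carlo estimate of the PER agrees with $1 - (1 - BER)^L$. The packet length, BER and number of packets below are small illustrative values chosen to keep the run short; they are not the paper's simulation settings.

```python
import random

def per_from_ber(ber, packet_bits):
    """Equation (5): PER for independent, identically distributed bit errors."""
    return 1.0 - (1.0 - ber) ** packet_bits

def simulate_per(ber, packet_bits, n_packets, seed=0):
    """Monte Carlo PER: a packet is in error if any of its bits is in error."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_packets):
        if any(rng.random() < ber for _ in range(packet_bits)):
            errors += 1
    return errors / n_packets

analytic = per_from_ber(0.005, 100)
simulated = simulate_per(0.005, 100, 5000)
```

For decoders that produce bursty errors, as turbo decoders tend to, the independence condition does not hold, which is one reason to compare against simulated PER curves rather than rely on Equation (5) alone.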
Figure 4 shows the PER values versus the received SNR of each mode at each coding step with rate $Rc_i$, where $i = 1, 2, 3$ is the number of transmissions. We use the 1/2 QPSK, 3/4 QPSK, 1/2 16QAM and 3/4 16QAM modulations with RCPT code rates 1, 1/2 and 1/3 for each transmission respectively. The packet size is $L$ and the puncturing follows the optimal rules according to [6] (Table 1). The constituent RSC encoder of the PCCC turbo codec is the optimal encoder B proposed in [6], whose generators have octal representations $(15)_{octal}$ and $(13)_{octal}$ respectively. The number of iterations is 8. The figures depict the simulated PER, the fitting curves and the values derived from Equation (5).
In order to have a clearer view of RCPT performance in combination with AMC, we compare it with other types of rate compatible codes. To this end, we also implement the aforementioned CLD first using RCPC (Rate-Compatible Punctured Convolutional) codes and second using RC-LDPC (Rate-Compatible Low-Density Parity-Check) codes. We use the same rates for both RC codes. Specifically, the RCPC code is based on a convolutional encoder with rate 1/2, generator polynomials (171, 133) and constraint length 7 [9]. For LDPC, we employ the same codes as in [25] with rate 1/2 (1008, 504) and a variable node degree equal to 3. The corresponding performance of these modulation and coding schemes (MCS) is depicted separately for each code in

System Performance
In the case of a general type-II HARQ that uses punctured codes, the probability of unsuccessful reception after $N_t$ transmissions represents the event of decoding failure with code $Rc_i$ after $i$ transmissions [10]. In the case of limited transmissions, the packet error probability of this scheme using AMC mode $n$, $n = 1, \ldots, N$, under the channel states $(\gamma^{(1)}, \ldots, \gamma^{(N_t)})$ is given by Equation (6) [26]. By using (6) for each retransmission and for each mode, the packet error rates after $N_t$ transmissions result. $\gamma_n$ denotes the region boundaries for each MCS, which are obtained such that $P_{loss}$ is reached with the corresponding decoder when the imposed number of transmission attempts is reached. Assuming $N_t^{max} = 3$ and $P_{loss} = 0.01$, the derived switching thresholds are listed in Table 2. Table 2 also includes the parameters of the MCSs for convolutional and LDPC codes.
We next evaluate the system performance in terms of spectral efficiency when the AMC scheme is combined with type-II HARQ (i.e. IR HARQ). In each transmission attempt, the number of transmitted bits is specified according to the RCPT code rates, and a Nyquist pulse shaping filter with bandwidth $B = 1/T_s$ is assumed, where $T_s$ is the symbol period. The spectral efficiency then gives the bit rate, in bits per symbol, that can be transmitted per unit bandwidth and is given by

$$S_e = \frac{L}{\bar{L}} \quad (7)$$

In (7), $L$ is the input information packet size and $\bar{L}$ is the average number of transmitted symbols required to transmit an information packet. The average number of transmitted symbols is first obtained for each mode $n$ as $\bar{L}_n$. For cross-layer designed AMC schemes with n = 1,...,N modes, the average spectral efficiency needs to be calculated in order to evaluate system performance. By averaging the $\bar{L}_n$ values over the SNR range of each mode and over all modes, the average number of transmitted symbols required to transmit an information packet, $\bar{L}$, is obtained.
Figure 5 shows the performance of AMC at the physical layer when rate compatible punctured codes are employed under the constraints of the previously described cross-layer design. The parameters of each MCS are those listed in Table 2, considering a channel with Rayleigh fading as described above.
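The relation between the average number of transmitted symbols and the spectral efficiency can be sketched for a single mode as follows. This is an illustrative accounting under stated assumptions, not the paper's Equations (6) and (7): block $i$ is sent only if decoding failed after attempt $i-1$, coded bits are mapped to `bits_per_symbol` bits per channel symbol, and the failure probabilities are hypothetical placeholders rather than simulated values.

```python
from fractions import Fraction

def average_symbols(L, rates, bits_per_symbol, fail_after):
    """Expected channel symbols spent on one L-bit information packet.

    rates:      RCPT code rates Rc_1..Rc_Nt of the successive attempts.
    fail_after: fail_after[i] = probability that decoding has still failed
                after attempt i+1 (placeholder values in this sketch).
    """
    total, prev_inv, p_needed = 0.0, Fraction(0), 1.0
    for rc, p_fail in zip(rates, fail_after):
        inv = 1 / Fraction(rc)
        block_bits = L * (inv - prev_inv)   # incremental bits of this attempt
        total += p_needed * float(block_bits) / bits_per_symbol
        prev_inv = inv
        p_needed = p_fail                   # next block only sent on failure
    return total

def spectral_efficiency(L, rates, bits_per_symbol, fail_after):
    """Information bits delivered per transmitted symbol."""
    return L / average_symbols(L, rates, bits_per_symbol, fail_after)

rates = [Fraction(1), Fraction(1, 2), Fraction(1, 3)]
avg = average_symbols(1000, rates, bits_per_symbol=2, fail_after=[0.5, 0.1, 0.01])
se = spectral_efficiency(1000, rates, bits_per_symbol=2, fail_after=[0.5, 0.1, 0.01])
```

Averaging such per-mode values with the probability of each SNR region would then give the overall figure for an AMC scheme with several modes.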
In Figure 6, we compare the average spectral efficiency obtained with each rate compatible punctured code. We illustrate the values for the third transmission (i.e. $N_t = 3$). Figure 6 shows the performance advantage of RCPT over RCPC. This corroborates the well-known benefit of turbo schemes over convolutional ones in terms of communication performance. This performance benefit is more evident at low average SNR than at high average SNR. Moreover, RC-LDPC achieves performance close to that of the RCPT code. This is a useful outcome for these two families of codes, since LDPC codes are used in several standards and especially in space communications. The fact that turbo and LDPC codes show identical performance has also been reported in both [27] and [28]. The work in [27] focused on performance in terms of PER values at the physical layer in both AWGN and multipath Rayleigh fading channels. The work in [28] proposed the PEG (Progressive Edge-Growth) construction method for LDPC codes and concluded that turbo coding is identical to LDPC in terms of bit-level performance. In this direction, we have evaluated the system performance under the aforementioned cross-layer design and have reached the same conclusion.

Comparison Complexity
However, the comparison between different codes should not be considered only in terms of performance related to communication efficiency. It should also be studied in terms of complexity, even when the achieved system performance is identical between different codes (e.g. turbo and LDPC). Most code complexity issues are related to computational complexity, which measures the additional operations required by each code. Another important aspect of code complexity concerns the architectural issues introduced by the code design. The work in [29] studies the complexity of decoding algorithms, measured in terms of computational operations such as multiplications, divisions and additions. Table 3 lists the number of operations (i.e. additions, divisions, etc.) needed for each decoding procedure using the max-log MAP (Maximum A Posteriori) algorithm and the Viterbi algorithm for the turbo and convolutional decoders respectively. These are the decoding algorithms that we have implemented in the RCPT and RCPC decoding procedures. In Table 3, M is the constraint length used by each encoder at the transmitter side.
Figure 7 shows the complexity of each decoding procedure (i.e. turbo and convolutional) in terms of the number of operations versus the number of iterations and the code constraint length respectively. It is evident from this figure that the decoding complexity of the convolutional scheme is noticeably lower than that of the turbo scheme. In our case, the convolutional decoding procedure uses a Viterbi decoder with constraint length equal to seven, whereas turbo decoding uses max-log MAP with eight iterations. The turbo decoding complexity is close to twice that of the convolutional one, since the convolutional decoding scheme requires 1200 operations while the turbo one requires approximately 2400 operations. On the other hand, comparisons between turbo and LDPC codes in terms of decoding complexity have shown that when both codes achieve identical performance, the decoding complexity remains approximately the same. For instance, [28] claims that 80 iterations of the belief propagation (BP) algorithm produce the same decoding complexity as a turbo code with 12 iterations of the BCJR decoding algorithm. The work in [27] has studied the performance comparison between turbo and LDPC codes in more detail, considering computational complexity. The authors measure the computational complexity in terms of the number of operations per iteration per information bit, where the operations may be additions or comparisons. Table 4 shows the computational complexity per information bit of the sub-optimum decoding algorithm for code rate R = 1/3. The complexity is expressed in relation to the number of iterations $N_{itr}$ and is illustrated in Figure 8.
Assuming the same configuration as in [27], turbo decoding with 8 iterations of the max-log-MAP algorithm exhibits approximately the same complexity, in terms of the number of additions, as the LDPC decoding scheme that uses the BP algorithm. In our comparative study, we use the decoding schemes from [28], which consist of a turbo decoder with max-log-MAP and 8 iterations and an LDPC decoder with PEG decoding graphs and 80 iterations. Hence, it can be claimed that both turbo and LDPC decoders show the same computational complexity.

Conclusions and Future Work
In this paper, we have extended the cross-layer design combining AMC with HARQ by using RCPT codes. To this end, a hybrid FEC/ARQ scheme based on RCPT codes has been assumed. In previous works, the CLD was introduced with uncoded modulations, convolutional coded modulations and rate-compatible convolutional coded modulations dedicated to AMC schemes. In addition, we have implemented a CLD approach using puncturing techniques for rate compatibility purposes. The system performance has been evaluated for a type-II hybrid ARQ mechanism. Moreover, we have presented comparative results of system performance for other rate compatible codes, namely convolutional and LDPC codes. In order to give a more comprehensive view of the coding and decoding schemes, we have also discussed the computational complexity of each code separately, in terms of the required number of operations either per iteration or per memory length. However, since turbo codes, and punctured turbo codes in particular, can achieve better performance with different RSC encoders and puncturing rules, namely optimal encoding and puncturing [26], future work should evaluate the performance of the AMC and HARQ combination with different encoders and puncturing rules using RCPT-ARQ.

Figure 1. Cross-layer system model combining AMC and HARQ based on RCPT codes.
Constraint 2 (C2): The probability of unsuccessful reception after $N_t^{max}$ transmissions is no greater than $P_{loss}$.

In addition to that, when mode $n$ is used, each transmitted symbol carries $R_n$ information bits. As in [9], we assume a Nyquist pulse shaping filter with bandwidth $B = 1/T_s$.

Figure 5. Average spectral efficiency of the combination of AMC and type-II HARQ under the constraints $N_t = 3$ and $P_{loss} = 0.01$, showing the performance of AMC at the physical layer.

Figure 6. Comparison of RC codes in terms of average spectral efficiency under the constraints of the CLD.