Complexity Reduced MIMO Interleaved SC-FDMA Receiver with Iterative Detection

In this paper, we propose a receiver structure for Multiple Input Multiple Output (MIMO) interleaved Single Carrier-Frequency Division Multiple Access (SC-FDMA). Frequency Domain Equalization (FDE) is first applied to obtain tentative decision results. Using these results, the Inter-Symbol Interference (ISI) is removed by an ISI canceller, and Maximum Likelihood Detection (MLD) is then used to separate the spatially multiplexed signals. Furthermore, the MLD output is fed back to the ISI canceller repeatedly. To reduce the complexity, we replace the MLD with QR Decomposition with M-Algorithm (QRD-M) or Sphere Decoding (SD). Moreover, we add a soft-output function to SD using the Repeated Tree Search (RTS) algorithm to generate a soft replica for ISI cancellation. We also employ the Single Tree Search (STS) algorithm to further reduce the complexity of RTS. By examining the bit error rate (BER) characteristics and the complexity reduction through computer simulations, we verify the effectiveness of the proposed receiver structure.


Introduction
Recently, MIMO transmission techniques with multiple transmit and receive antennas have been widely used to achieve spatially multiplexed transmission and to increase the transmission rate in wireless communications. For MIMO spatially multiplexed transmission, MLD is known as the optimum signal separation method at the receiver side, attaining the minimum BER. However, when the number of transmit antennas and the number of modulation levels increase, MLD requires very high computational complexity, and reducing this complexity becomes a problem [1]-[3].

SC-FDMA is used as the uplink wireless scheme in LTE (Long Term Evolution). Its low Peak to Average Power Ratio (PAPR) relaxes the amplifier linearity requirement in the User Equipment (UE), which makes SC-FDMA more suitable for uplink transmission than Orthogonal Frequency Division Multiplexing (OFDM) [4]. Moreover, by employing interleaved SC-FDMA, in which the subcarriers of each UE are arranged like the teeth of a comb, the PAPR of SC-FDMA is further reduced and the frequency diversity effect becomes larger.

We have already proposed MIMO SC-FDE and MIMO interleaved SC-FDMA receivers with iterative detection, in which the receive signal is first detected by FDE, i.e., MMSE nulling, to obtain tentative decision results. The ISI is then cancelled from the receive signal using these tentative decisions, followed by MLD for separating the spatially multiplexed signals [5]. However, because the complexity of MLD grows as the number of modulation levels raised to the power of the number of transmit antennas, reducing the complexity of MLD becomes an important issue. QRD-M has been proposed for this purpose [6], but it is a quasi-Maximum Likelihood (ML) method: although the complexity is greatly reduced, it is not guaranteed to obtain the ML solution. On the other hand, SD [1] [2] can obtain the ML solution, like MLD, with reduced complexity.

In this paper, we propose a novel receiver structure in which the MLD of [5] is replaced by QRD-M or SD to reduce the complexity. In addition, by using the RTS algorithm [7], we add a bit log-likelihood ratio (LLR) output function to SD, which enables the proposed SD receiver to cancel the ISI with a soft replica, resulting in more accurate ISI cancellation. Moreover, we replace RTS with the STS algorithm [8] to further reduce the complexity of RTS. Through computer simulations, we examine the BER characteristics and the complexity reduction of the proposed MIMO interleaved SC-FDMA iterative receiver with an ISI canceller and MLD, QRD-M, or SD with RTS or STS. Consequently, we verify that the receiver structure using STS improves the BER and the complexity the most.

Proposed Transmitter and Receiver Structure
Figure 1 shows the block diagram of the transmitter and receiver for the uplink. At the transmitter of each UE, the Quadrature Amplitude Modulation (QAM) signal is converted to the frequency domain by an $N$-point Fast Fourier Transform (FFT). The FFT outputs are then mapped onto interleaved frequency points arranged like the teeth of a comb. After that, the frequency points are converted back to a time-domain signal by an $M$-point Inverse FFT (IFFT), where $M = 4N$ in the case of 4 UEs, for example. The Cyclic Prefix (CP) is inserted and the signal is transmitted over the channel.

At the receiver in the Base Station (BS), after the CP is removed, FDE, i.e., MMSE nulling, is performed first. The receive signal is converted by an $M$-point FFT to obtain the frequency-domain signal. The frequency points are de-mapped to the subcarrier arrangement of each user, and the subcarriers are multiplied by the MMSE weight

$$W_u(n) = \left( H_u(n)^H H_u(n) + \sigma^2 I_{n_T} \right)^{-1} H_u(n)^H \qquad (1)$$

where $H_u(n)$ denotes the MIMO channel matrix at the frequency point $n$ assigned to user $u$, $\sigma^2$ the variance of the noise at each frequency point, and $I_{n_T}$ the identity matrix of size $n_T$, the number of transmit antennas. After that, the frequency points are transformed by an $N$-point IFFT, and the time-domain signal to be detected is obtained as $\hat{x}_u$.
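The transmitter chain described above (N-point FFT, comb-like mapping, M-point IFFT, CP insertion) can be sketched in a few lines. This is only an illustrative sketch for a single transmit antenna: the function name `ifdma_modulate`, the QPSK bit mapping, and the CP length of 16 samples are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def ifdma_modulate(symbols, num_users, user_index, cp_len):
    """N-point FFT, comb-like subcarrier mapping, M-point IFFT, cyclic prefix.

    All names and the CP length used below are illustrative assumptions.
    """
    n = len(symbols)                       # N subcarriers assigned to this user
    m = num_users * n                      # M total subcarriers (M = 4N for 4 UEs)
    freq = np.fft.fft(symbols)             # N-point FFT of the modulated block
    grid = np.zeros(m, dtype=complex)
    grid[user_index::num_users] = freq     # interleaved (comb-tooth) mapping
    time = np.fft.ifft(grid)               # M-point IFFT back to the time domain
    return np.concatenate([time[-cp_len:], time])  # prepend the cyclic prefix

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(64, 2))
# Hypothetical QPSK mapping: bit 0 -> +1, bit 1 -> -1 on each axis.
qpsk = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
tx = ifdma_modulate(qpsk, num_users=4, user_index=0, cp_len=16)
```

For `user_index=0`, the samples after the CP are simply the input block repeated four times and scaled by 1/4, which is one way to see why the interleaved mapping keeps the PAPR of the single-carrier signal low.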
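$$\hat{x}_u = \left[ \hat{x}_{u1}, \ldots, \hat{x}_{uk}, \ldots, \hat{x}_{uN} \right] \qquad (2)$$

We call $\hat{x}_u$ the tentative decision through FDE. Using $\hat{x}_u$, the receive signal replica of the ISI caused by the transmit signals other than the signal at time $k$ to be detected is generated and subtracted from the receive signal. Starting from the tentative decision result of (2) and setting the transmit signal of user $u$ at time $k$ to 0, the tentative decision result for ISI cancellation is obtained as

$$\hat{x}_u^{\bar{k}} = \left[ \hat{x}_{u1}, \ldots, \hat{x}_{u,k-1}, 0, \hat{x}_{u,k+1}, \ldots, \hat{x}_{uN} \right] \qquad (3)$$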
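The tentative decision for ISI cancellation in (3) is transformed by an $N$-point FFT into $\hat{X}_u^{\bar{k}}(n)$, multiplied by the channel matrix, and subtracted from the receive signal:

$$\tilde{Y}_u(n) = Y_u(n) - H_u(n)\, \hat{X}_u^{\bar{k}}(n) \qquad (4)$$

where $Y_u(n)$ is the receive signal of user $u$ in the frequency domain. Next, for $\tilde{Y}_u(n)$ in (4), the spatially multiplexed signals are separated using MLD. The total number of candidate receive replicas for MLD is $K^{n_T}$, where $K$ is the number of modulation levels. The candidate signal for MLD in the time domain, $\hat{x}_u^{k}$, is obtained by setting all transmit signals to 0 except the one at time $k$ to be detected:

$$\hat{x}_u^{k} = \left[ 0, \ldots, 0, \hat{x}_{uk}, 0, \ldots, 0 \right] \qquad (5)$$

where the matrix size is $n_T \times N$. $\hat{x}_u^{k}$ in (5) is then transformed by an $N$-point FFT, resulting in $\hat{X}_u^{k}(n)$. This is multiplied by $H_u(n)$, which is assigned to user $u$ at frequency point $n$, and the candidate receive replica for MLD is obtained as $H_u(n)\hat{X}_u^{k}(n)$. Then the squared distance between the ISI-cancelled receive signal and the candidate MLD replica in the frequency domain is calculated as

$$\sum_{n=1}^{N} \left\| \tilde{Y}_u(n) - H_u(n)\, \hat{X}_u^{k}(n) \right\|^2 \qquad (6)$$

where $\|\cdot\|$ denotes the Euclidean norm. (6) is minimized over the total of $K^{n_T}$ MLD candidates, and the MLD output $\hat{x}_{uk}$ in (5) that minimizes (6) is obtained. The tentative decision result $\hat{x}_{uk}$ in (2) is then replaced by the obtained $\hat{x}_{uk}$, and the procedure proceeds from time $k$ to time $k+1$, where the initial value of $k$ is 1. This ISI cancellation with MLD is carried out sequentially from time 1 to $N$. Accordingly, the residual ISI components in the tentative FDE decision results are removed more precisely and the spatially multiplexed signals are separated more accurately.

After the processing for one FFT block is completed, the decision results obtained for that block are regarded as the evolved tentative decision results. The MLD outputs are then fed back to the ISI canceller for each FFT block, and this feedback is repeated iteratively to lower the final BER.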
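The per-symbol procedure of this section (zero the symbol at time k, subtract the ISI replica in the frequency domain, then search all candidates) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `isi_cancel_and_mld`, the array layouts, the unnormalised FFT convention, and the noiseless toy channel are all assumptions chosen to keep the example short.

```python
import itertools
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def isi_cancel_and_mld(Y, H, x_hat, k, alphabet=QPSK):
    """Return the MLD decision for time k after ISI cancellation.

    Y: (N, n_R) received frequency samples of one user (assumed layout).
    H: (N, n_R, n_T) channel matrix per frequency point (assumed layout).
    x_hat: (n_T, N) tentative time-domain decisions.
    """
    n_t, n = x_hat.shape
    x_isi = x_hat.copy()
    x_isi[:, k] = 0                                      # zero the symbol at time k
    X_isi = np.fft.fft(x_isi, axis=1)                    # N-point FFT, shape (n_T, N)
    Y_tilde = Y - np.einsum("nrt,tn->nr", H, X_isi)      # subtract the ISI replica
    # Frequency response of a unit impulse at time k (unnormalised FFT).
    phase = np.exp(-2j * np.pi * np.arange(n) * k / n)
    best, best_metric = None, np.inf
    for cand in itertools.product(alphabet, repeat=n_t):  # K**n_T candidates
        c = np.array(cand)
        replica = np.einsum("nrt,t->nr", H, c) * phase[:, None]
        metric = np.sum(np.abs(Y_tilde - replica) ** 2)   # squared distance over all n
        if metric < best_metric:
            best, best_metric = c, metric
    return best

rng = np.random.default_rng(0)
n_t, n_r, n = 2, 2, 8
H = rng.standard_normal((n, n_r, n_t)) + 1j * rng.standard_normal((n, n_r, n_t))
x_true = rng.choice(QPSK, size=(n_t, n))
Y = np.einsum("nrt,tn->nr", H, np.fft.fft(x_true, axis=1))  # noiseless toy example
x_hat = x_true.copy()
x_hat[:, 3] = -x_true[:, 3]            # pretend the FDE decision at k = 3 is wrong
x_k = isi_cancel_and_mld(Y, H, x_hat, k=3)
```

In this toy case the other tentative decisions are error-free, so the ISI is removed exactly and the candidate search restores the correct symbol vector at time 3. In the receiver described above, the same step would be repeated for k = 1, ..., N and over several feedback iterations.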

Complexity Reduction of MLD by QRD-M or SD
The number of candidate replicas in MLD increases exponentially as $K^{n_T}$. As a method for reducing the complexity of MLD, we illustrate the tree search of MLD with QR decomposition [6]. The receive signal vector is written as

$$y = Hx + n \qquad (7)$$

where $y$ is the $n_R \times 1$ receive signal vector, $H$ the $n_R \times n_T$ frequency-flat channel matrix, $x$ the $n_T \times 1$ transmit signal vector, and $n$ the $n_R \times 1$ receive noise vector. Using the QR decomposition, the channel matrix is decomposed into $H = QR$, where $Q$ is a unitary matrix and $R$ is an upper triangular matrix. By multiplying $y$ by the Hermitian transpose $Q^H$ from the left-hand side, we obtain

$$\tilde{y} = Q^H y = Rx + \tilde{n} \qquad (9)$$

where $\tilde{n} = Q^H n$. As $R$ is upper triangular, the detection of the transmit signal can be regarded as a tree search problem starting from $\hat{x}_{n_T}$, where $\hat{x}_{n_T}$ denotes the transmit signal candidate from antenna $n_T$. The tree structure is shown in Figure 2 for $K = 2$ (BPSK) and $n_T = 4$, where the branching factor at each node and the depth of the tree are $K = 2$ and $n_T = 4$, respectively. Equation (9) is also expressed in elements as

$$\begin{bmatrix} \tilde{y}_1 \\ \tilde{y}_2 \\ \vdots \\ \tilde{y}_{n_T} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1 n_T} \\ 0 & r_{22} & \cdots & r_{2 n_T} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & r_{n_T n_T} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n_T} \end{bmatrix} + \begin{bmatrix} \tilde{n}_1 \\ \tilde{n}_2 \\ \vdots \\ \tilde{n}_{n_T} \end{bmatrix} \qquad (10)$$

The M algorithm is widely known as a method that searches the tree of Figure 2 in the width direction. At each step, the squared distance for every branch is calculated, and the $m$ surviving paths with the smallest cumulative squared distance metrics are retained. The complexity of the M algorithm is constant once the value of $m$ is fixed, and the QRD-M algorithm reduces the complexity of MLD considerably, especially when $m = 1$. However, it is a quasi-ML method and is not guaranteed to obtain the ML solution.

On the other hand, the SD algorithm searches the tree of Figure 2 in the depth direction. The SD first determines the initial sphere radius $C = \|\tilde{y} - R\hat{x}\|$ for some transmit candidate $\hat{x}$. Next, the SD searches in the depth direction for the transmit signal vector that falls within the radius $C$. When the cumulative distance metric exceeds the initial radius, the subsequent search along that path is no longer needed, so the amount of calculation is saved. Therefore, the smaller the initial sphere radius, the more effective the complexity reduction becomes. In other words, the higher the $E_b/N_0$ and the smaller the initial sphere radius, the more effectively the complexity is reduced. If the cumulative distance metric does not exceed the radius down to the bottom of the tree, the radius is replaced by this cumulative metric and a new radius is set. The tree search is carried out in the same manner for every path in the tree, so the SD can obtain the ML solution.
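The width-direction search can be illustrated with a short sketch of the M algorithm after QR decomposition. It assumes a flat real-valued 4 x 4 channel with BPSK purely for brevity; the function name `qrd_m` and all parameter values are hypothetical, and the brute-force search is included only to compare against the ML solution.

```python
import itertools
import numpy as np

def qrd_m(H, y, alphabet, m):
    """Breadth-first M-algorithm detection for y = Hx + n (illustrative sketch)."""
    Q, R = np.linalg.qr(H)                # H = QR with R upper triangular
    y_t = Q.conj().T @ y                  # Q^H y = R x + Q^H n
    n_t = H.shape[1]
    paths = [((), 0.0)]                   # (symbols of antennas i..n_T, cumulative metric)
    for i in range(n_t - 1, -1, -1):      # start from the last row of R (antenna n_T)
        extended = []
        for tail, metric in paths:
            for s in alphabet:
                cand = (s,) + tail        # symbols of antennas i, i+1, ..., n_T
                err = y_t[i] - R[i, i:] @ np.array(cand)
                extended.append((cand, metric + abs(err) ** 2))
        extended.sort(key=lambda p: p[1])
        paths = extended[:m]              # retain the m surviving paths
    return np.array(paths[0][0]), paths[0][1]

rng = np.random.default_rng(0)
bpsk = (1.0, -1.0)
H = rng.standard_normal((4, 4))
x_true = rng.choice(bpsk, size=4)
y = H @ x_true + 0.3 * rng.standard_normal(4)

x_m1, _ = qrd_m(H, y, bpsk, m=1)          # cheapest setting, quasi-ML
x_full, _ = qrd_m(H, y, bpsk, m=16)       # m = K**n_T keeps every path
x_ml = min((np.array(c) for c in itertools.product(bpsk, repeat=4)),
           key=lambda c: np.linalg.norm(y - H @ c))
```

With `m` equal to the total number of candidates the search degenerates into exhaustive MLD, whereas `m=1` evaluates only K branches per layer and may miss the ML solution in noise, which is the quasi-ML behaviour noted above.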

Receiver Structure When Using QRD-M Algorithm
By using QRD-M instead of MLD in the receiver structure in Figure 1, we reduce the complexity of MLD. The same signal processing procedure as described in Section 2 is carried out to cancel the residual ISI and to create the condition as if only the transmit signal at time $k$ were transmitted. After the ISI cancellation, QRD-M is applied instead of MLD. The number of transmit signal candidates $\hat{x}_u^{k}$ equals $K^{n_T}$. As in (5), the time-domain transmit signal vector with $N$ points is generated, in which the candidate transmit signal is located at time $k$ and the transmit signals at all other time instants are set to 0. Then the time-domain signal vector is transformed to the frequency-domain signal vector $\hat{X}_u^{k}(n)$ using an $N$-point FFT. Then the channel matrix $H_u(n)$ assigned to user $u$ at subcarrier number $n$ is QR decomposed:

$$H_u(n) = Q_u(n) R_u(n) \qquad (11)$$

The Hermitian transpose $Q_u(n)^H$ is multiplied from the left-hand side by the output of the ISI canceller in (4), denoted $\tilde{Y}_u(n)$:

$$\tilde{Z}_u(n) = Q_u(n)^H\, \tilde{Y}_u(n) \qquad (12)$$

The squared metric for minimization using $R_u(n)$ is

$$\sum_{n=1}^{N} \left\| \tilde{Z}_u(n) - R_u(n)\, \hat{X}_u^{k}(n) \right\|^2 \qquad (13)$$

Using the M algorithm, (13) is calculated step by step from the bottom to the top. The $m$ surviving paths with the smallest cumulative metrics are retained at each step from the bottom. The path that minimizes (13) is finally selected from the $m$ surviving paths that reach the top. The path obtained by QRD-M determines the output $\hat{x}_{uk}$. The subsequent signal processing is the same as with MLD.

Receiver Structure When Using SD Algorithm
By using SD instead of MLD in the receiver structure in Figure 1, we reduce the complexity of MLD. The same signal processing procedure as described in Section 2 is carried out to cancel the residual ISI and to create the condition as if only the transmit signal at time $k$ were transmitted. In the proposed SD, the initial radius is set using QRD-M, with (13) as the search metric. A path with a small cumulative metric is first found in the tree using QRD-M, and this cumulative metric is set as the initial radius. Next, the transmit signal candidate that satisfies the initial radius is searched for in the depth direction of the tree. When the cumulative distance metric exceeds the initial radius, the subsequent search along that path is no longer needed, so the amount of calculation is saved. If the cumulative distance metric does not exceed the radius down to the bottom of the tree, the radius is replaced by this cumulative metric and a new radius is set. The tree search is carried out in the same manner for every path in the tree, so the SD can obtain the ML solution. In (13), the search proceeds in the upward direction exhaustively to obtain the ML solution of $\hat{x}_{uk}$. The smaller the initial radius found by QRD-M, the more efficient the tree search becomes.
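The depth-direction search with a QRD-M based initial radius can be sketched as below. This is a simplified illustration under stated assumptions: a flat 4 x 4 complex channel with QPSK is used instead of the per-subcarrier metric (13), the greedy initial path plays the role of QRD-M with m = 1, and the names `sphere_decode`, `best`, and `visited` are hypothetical.

```python
import itertools
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def sphere_decode(R, y_t, alphabet):
    """Depth-first search for min ||y_t - R x||^2 with R upper triangular."""
    n_t = R.shape[0]
    # Initial radius from a greedy path (equivalent to the M algorithm with m = 1).
    x0 = np.zeros(n_t, dtype=complex)
    for i in range(n_t - 1, -1, -1):
        errs = [abs(y_t[i] - R[i, i + 1:] @ x0[i + 1:] - R[i, i] * s) for s in alphabet]
        x0[i] = alphabet[int(np.argmin(errs))]
    best = {"x": x0.copy(), "radius": np.linalg.norm(y_t - R @ x0) ** 2, "visited": 0}
    x = np.zeros(n_t, dtype=complex)

    def search(i, metric):
        for s in alphabet:
            x[i] = s
            d = metric + abs(y_t[i] - R[i, i:] @ x[i:]) ** 2
            best["visited"] += 1
            if d >= best["radius"]:
                continue                                   # outside the sphere: prune
            if i == 0:
                best["x"], best["radius"] = x.copy(), d    # leaf reached: shrink radius
            else:
                search(i - 1, d)

    search(n_t - 1, 0.0)
    return best["x"], best["radius"], best["visited"]

rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
x_true = rng.choice(QPSK, size=4)
y = H @ x_true + 0.1 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
Q, R = np.linalg.qr(H)
x_sd, radius, visited = sphere_decode(R, Q.conj().T @ y, QPSK)
x_ml = min((np.array(c) for c in itertools.product(QPSK, repeat=4)),
           key=lambda c: np.linalg.norm(y - H @ c))
```

The counter `visited` gives a rough, implementation-dependent view of how much of the tree is explored. It can never exceed the 4 + 16 + 64 + 256 nodes of the full tree, and it typically shrinks as the noise level drops, in line with the dependence on the initial radius described above.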

Realization of Soft Output in SD
We aim to obtain a soft output from the SD in Figure 1. In the case of QPSK, the bit LLRs for the 1st bit and the 2nd bit of the transmit signal from antenna $i$ are given by (14) and (15), respectively. In (14), the two candidate symbols denote the transmit signal in the frequency domain of the $u$-th user at time $k$ and frequency point $n$ from transmit antenna $i$ with the 1st bit being 0 and 1, respectively. The candidates in (15) follow the same notation, but with the 2nd bit being 0 and 1, respectively. In SD, some paths in the tree are not searched. In order to calculate the bit LLR, however, both the path for bit "0" and the path for bit "1" with the minimum path metrics have to be evaluated. For this evaluation, we use the RTS [7] and STS [8] algorithms.

RTS Algorithm
In RTS, the path with the minimum path metric is first obtained using (13) and the M algorithm, and its metric is regarded as the initial radius of SD. Then, the path metrics that have not yet been searched are calculated through the Schnorr-Euchner (SE) algorithm [1] [2]. In RTS, to evaluate the bit LLR, the SE algorithm is applied repeatedly to calculate the path metrics that are not searched in SD. Figures 3(a)-(d) show the tree structure for BPSK when the number of transmit antennas is 3, for example. In Figure 3(a), the red line shows the path [101] with the minimum path metric obtained from the M algorithm with $m = 1$. In this case, in order to obtain the bit LLR of $x_3$, we have to find the minimum path metric for which $\hat{x}_3 = 0$, as illustrated in Figure 3(b). Likewise, in order to obtain the bit LLRs of $x_2$ and $x_1$, we have to find the minimum path metrics for which $\hat{x}_2 = 1$ and $\hat{x}_1 = 0$, as illustrated in Figure 3(c) and Figure 3(d), respectively. To find the minimum path metrics with the counter bits, we use the SE algorithm repeatedly.
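The repeated search can be sketched for the three-antenna BPSK case of Figure 3. The sketch below is a simplification made under explicit assumptions: a plain depth-first search with an infinite starting radius replaces the SE enumeration, the symbols are +1/-1 on a real-valued flat channel, and the LLR is taken in the max-log form scaled by 1/noise_var with a positive sign for +1. None of these choices, nor the names `constrained_sd` and `rts_llr`, are taken from the paper.

```python
import itertools
import numpy as np

def constrained_sd(R, y_t, fixed=None):
    """Depth-first search for min ||y_t - R x||^2 over x in {+1, -1}^n.

    fixed = (antenna index, value) forces one symbol (a counter hypothesis).
    """
    n_t = R.shape[0]
    best = {"x": None, "metric": np.inf}
    x = np.zeros(n_t)

    def search(i, metric):
        choices = (1.0, -1.0)
        if fixed is not None and fixed[0] == i:
            choices = (fixed[1],)                     # only the forced symbol
        for s in choices:
            x[i] = s
            d = metric + (y_t[i] - R[i, i:] @ x[i:]) ** 2
            if d >= best["metric"]:
                continue                              # prune: cannot improve
            if i == 0:
                best["x"], best["metric"] = x.copy(), d
            else:
                search(i - 1, d)

    search(n_t - 1, 0.0)
    return best["x"], best["metric"]

def rts_llr(R, y_t, noise_var):
    """One search for the ML path, then one constrained search per bit."""
    x_ml, lam_ml = constrained_sd(R, y_t)
    llr = np.zeros(len(x_ml))
    for i, s in enumerate(x_ml):
        _, lam_counter = constrained_sd(R, y_t, fixed=(i, -s))
        llr[i] = s * (lam_counter - lam_ml) / noise_var   # assumed max-log scaling
    return x_ml, llr

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 3))
x_true = rng.choice((1.0, -1.0), size=3)
noise_var = 0.5
y = H @ x_true + np.sqrt(noise_var) * rng.standard_normal(3)
Q, R = np.linalg.qr(H)
y_t = Q.T @ y
x_ml, llr = rts_llr(R, y_t, noise_var)

# Brute-force reference over all 2**3 paths, for comparison only.
cands = [np.array(c) for c in itertools.product((1.0, -1.0), repeat=3)]
ref = np.array([
    (min(np.sum((y_t - R @ c) ** 2) for c in cands if c[i] == -1.0)
     - min(np.sum((y_t - R @ c) ** 2) for c in cands if c[i] == 1.0)) / noise_var
    for i in range(3)
])
```

Each bit triggers one additional tree search, so the cost grows with the number of bits per symbol vector. This repeated effort is what a single-pass search aims to avoid.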

STS Algorithm
In STS, the path metrics for calculating the bit LLRs are evaluated with a single search of the tree. The basic idea of STS is that every path metric and its search depth are stored in a list and monitored. When no entry in the list can be updated by continuing the tree search along a specific branch, the search of that branch is skipped, which reduces the complexity. At the initial stage, the values in the list are all set to infinity. Figure 4 shows the STS algorithm, where the number of transmit antennas is 4, $K$ is the number of modulation levels, $\lambda_{s,b}$ is the accumulated norm at search depth $s$ and symbol number $b$, and $\lambda_{ML}$ is the accumulated norm of the ML sequence. An example of the list is illustrated in Figure 5, where the number of transmit antennas is 4 and QPSK modulation is used.

In the STS algorithm, the list in Figure 5 is filled up using the algorithm in Figure 4. When $\lambda_{3,2}$ is calculated, for example, its value is compared with all $\lambda_{1\sim2,1\sim4}$ already stored in the list. At this stage, when $\lambda_{3,2}$ exceeds the stored values $\lambda_{1\sim2,1\sim4}$, we find that further search of this branch cannot lead to an update of the path metrics. Accordingly, we stop the search and move on to the calculation of $\lambda_{3,3}$. When $\lambda_{ML}$ is finally obtained, the needed norms are read from the list and the bit LLRs are calculated using (14) and (15).
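The single-pass idea can be sketched for BPSK as follows. The list layout here differs from the one in Figure 5: as an assumption, the sketch stores one ML metric plus one counter-hypothesis metric per bit, which is a common way of formulating a single tree search, and it prunes a node only when its accumulated norm exceeds every list entry that a leaf below it could still update. The names `sts_bpsk`, `lam_ml`, and `lam_c` are hypothetical.

```python
import itertools
import numpy as np

def sts_bpsk(R, y_t):
    """Single tree search returning the ML path and per-bit counter metrics."""
    n_t = R.shape[0]
    st = {"lam_ml": np.inf, "x_ml": None, "lam_c": np.full(n_t, np.inf)}
    x = np.zeros(n_t)

    def bound(i):
        """Largest list entry that a leaf below the current node could update."""
        if st["x_ml"] is None:
            return np.inf
        cands = [st["lam_c"][j] for j in range(n_t)
                 if j < i or x[j] != st["x_ml"][j]]
        return max(cands) if cands else st["lam_ml"]

    def search(i, metric):
        for s in (1.0, -1.0):
            x[i] = s
            d = metric + (y_t[i] - R[i, i:] @ x[i:]) ** 2
            if d > bound(i):
                continue                               # no list entry can improve
            if i > 0:
                search(i - 1, d)
            elif d < st["lam_ml"]:                     # new ML path found
                if st["x_ml"] is not None:
                    flipped = x != st["x_ml"]
                    st["lam_c"][flipped] = st["lam_ml"]  # old ML becomes the counter
                st["lam_ml"], st["x_ml"] = d, x.copy()
            else:                                      # leaf updates counter metrics
                differs = x != st["x_ml"]
                st["lam_c"][differs] = np.minimum(st["lam_c"][differs], d)

    search(n_t - 1, 0.0)
    return st["x_ml"], st["lam_ml"], st["lam_c"]

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4))
x_true = rng.choice((1.0, -1.0), size=4)
y = H @ x_true + 0.7 * rng.standard_normal(4)
Q, R = np.linalg.qr(H)
y_t = Q.T @ y
x_ml, lam_ml, lam_c = sts_bpsk(R, y_t)

# Brute-force reference over all 2**4 paths, for comparison only.
cands = [np.array(c) for c in itertools.product((1.0, -1.0), repeat=4)]
dist = [np.sum((y_t - R @ c) ** 2) for c in cands]
ref_ml = cands[int(np.argmin(dist))]
ref_c = np.array([min(d for c, d in zip(cands, dist) if c[j] != ref_ml[j])
                  for j in range(4)])
```

Under the sign convention assumed here, a max-log LLR for bit i would follow as `x_ml[i] * (lam_c[i] - lam_ml) / noise_var`, so all soft outputs are available after one traversal of the tree.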

Computer Simulation Results
Computer simulations are carried out for the system in Figure 1. The simulation conditions are listed in Table 1. Figure 6 shows the BER characteristics when the hard decision replica is used to cancel the ISI, with MLD, QRD-M or SD for spatial de-multiplexing. In Figure 6, "#4", for example, denotes the number of times MLD, QRD-M or SD is iterated. Figure 7 shows the BER characteristics when the soft decision replica is used to cancel the ISI, with RTS or STS for spatial de-multiplexing. In Figure 8, we compare the BER characteristics of hard replica cancellation and soft replica cancellation with iteration. In Figures 6-8, we also show the lower bound of the BER, where the ISI cancellation is perfect, which means that the demodulated bits used for ISI cancellation are error-free. In Figure 9, we compare the complexity of MLD, QRD-M, SD, RTS and STS on a $4 \times 4$ flat fading channel. The complexity is measured using the "tic" and "toc" functions in MATLAB, and the computation time needed for MLD is normalized to unity.
From Figure 6, compared with the conventional FDE receiver, the proposed receiver using MLD with no iterative feedback achieves a gain of about 7 dB at $\mathrm{BER} = 10^{-5}$. By increasing the number of iterative feedbacks, the proposed receiver further improves the BER and obtains a gain of more than 10 dB, which is close to the lower bound of the BER. This is because the highly reliable MLD outputs are used as improved decision results for generating accurate ISI replicas. Accordingly, more exact ISI cancellation becomes possible, which in turn improves the MLD performance. We observe that the BER performance of QRD-M is inferior to that of MLD, whereas the BER of SD coincides with that of MLD, which confirms that the SD obtains the ML solution.
From Figure 7, we see that the BER with soft ISI cancellation behaves basically the same as that of SD with hard ISI cancellation in Figure 6, but approaches the lower bound more rapidly than with hard ISI cancellation. We find that at average

From Figure 8, we see that soft ISI cancellation with iterative feedback performs better than hard replica cancellation. This is because a more accurate ISI replica can be generated from the soft decision than from the hard decision.
From Figure 9, QRD-M, SD, RTS and STS all reduce the computation time compared with MLD. QRD-M is the most effective in reducing the computation time: its computation time is almost 1/100 of that of MLD and is constant over the average $E_b/N_0$. However, QRD-M is sub-optimal and does not obtain the ML solution. The SD can obtain the ML solution, but for low average $E_b/N_0$ region less than 10 dB the computation

Conclusion
In this paper, we have proposed a low-BER receiver structure for interleaved SC-FDMA on uplink MIMO frequency-selective fading channels. In the proposed receiver, the ISI components are cancelled using the tentative decision results obtained from MMSE nulling (FDE), and MLD is then used to separate the spatially multiplexed signal streams. The reliable output from MLD is fed back to the ISI canceller to reduce the residual ISI. Furthermore, we reduce the complexity of MLD by replacing it with QRD-M or SD. We have verified the BER characteristics of the proposed receiver with MLD, QRD-M or SD through computer simulations. The receiver with SD achieves the same BER as the one with MLD, i.e., the ML solution, whereas QRD-M has an inferior BER because of its quasi-ML solution. We have also verified that the complexity of SD is greatly reduced compared with MLD, especially in the high $E_b/N_0$ region. To cancel the ISI more effectively using a soft replica, we have further replaced SD with the RTS or STS algorithm, in which the soft output from SD is available. As a result, the BER approaches the lower bound more rapidly. The complexity of STS is lower than that of RTS and almost coincides with that of SD in the low $E_b/N_0$ region. The proposed receiver structure will be useful for extending the uplink coverage.
4), the signal separation of spatially multiplexed transmission is done using MLD. The total number of candidate receive replicas for MLD is $K^{n_T}$, where $K$ is the number of modulation levels. The candidate signal for MLD in the time domain, $\hat{x}_u^{k}$, is obtained by setting all transmit signals to 0 except the one at time $k$ to be detected.

$n_R \times 1$, $H$ the frequency-flat channel matrix with $n_R \times n_T$, $x$ the transmit signal vector with $n_T \times 1$, and $n$ the receive noise vector with $n_R \times 1$. Using the QR decomposition, the channel matrix is decomposed into $H = QR$, where $Q$ is the unitary matrix and $R$ is the upper triangular matrix. By multiplying the Hermitian transpose $Q^H$

Figure 2. Tree structure of MLD when using QR decomposition.
14) denotes the transmit signal in the frequency domain of the $u$-th user at time $k$ and frequency point $n$ from transmit antenna $i$ with the 1st bit being 0. has the same notation but with the 1st bit being 1.

Figure 3. Tree search algorithm in RTS. (a) Path with minimum path metric; (b) Paths for calculating the minimum counter path metrics for antenna 3; (c) Paths for calculating the minimum counter path metrics for antenna 2; (d) Paths for calculating the minimum counter path metrics for antenna 1.

Figure 5. Example of list with 4 transmit antennas and the modulation level K.

Figure 6. Comparison of BER characteristics of MIMO interleaved SC-FDMA receiver with hard replica cancellation of ISI.

Table 1. Simulation condition.
Number of UE: 4
Number of transmit antennas in each UE: 4
Number of receive antennas at BS: 4
Modulation formats: QPSK
Number of total subcarriers: M = 256
Number of subcarriers assigned to each user: N = 64
Symbol length of QPSK: T
Cyclic prefix length: 4T/
Channel model between each transmit and receive antenna: Equal power 16 paths quasi-static Rayleigh fading channel
Interval of delay time: T/4
Subcarrier assignment: IFDMA
Channel estimation: Perfect at BS
FDE: Nulling (MMSE)
Initial radius setting for SD (SE algorithm): QRD-M (m = 1)
Number of iterative feedbacks in the receiver: 0, 1, 3 (= # − 1)
# denotes the repetition number of MLD, QRD-M, SD, RTS or STS

Figure 8. Comparison of BER characteristics of MIMO interleaved SC-FDMA receiver between hard decision and soft decision with iterative feedback.
the BER coincides with the lower bound and this means the perfect ISI cancellation is possible at this receive SNR value.

Figure 9. Comparison of computation time among MLD, QRD-M, SD, RTS and STS on flat fading channel.
The MLD criterion is then expressed as