A Minimum Partial Euclidean Distance (MPED) based K-best algorithm is proposed to detect the best signal in a MIMO (Multiple Input Multiple Output) detector. It is based on the breadth-first search method. The proposed algorithm is independent of the number of transmitting/receiving antennas and of the constellation size. It provides high throughput and a reduced Bit Error Rate (BER), with performance close to that of Maximum Likelihood Detection (MLD). The main innovations are that nodes are expanded and visited according to the MPED criterion and that the algorithm keeps track of the best candidates selected at each cycle. This allows its complexity to scale linearly with the modulation order. The complex-domain input signals are modulated using Quadrature Amplitude Modulation (QAM), converted into wavelet packets, and transmitted over an Additive White Gaussian Noise (AWGN) channel. The best signal is then detected from the received signals using the MPED based K-best algorithm, which provides the exact best node solution with reduced complexity. A pipelined VLSI architecture is best suited for implementation because the expansion and sorting cores are data driven. The proposed method is implemented targeting a Xilinx Virtex 5 device for a 4 × 4, 64-QAM system and achieves a throughput of 1.1 Gbps. The resource utilization results are tabulated and compared with existing algorithms.

Today, MIMO is one of the wireless communication technologies that provide increased data throughput and link range without any additional bandwidth. MIMO plays a key role in every new wireless standard, such as HSDPA (High Speed Downlink Packet Access), IEEE 802.11n [

Among the large variety of the MIMO detection techniques Maximum Likelihood (ML) detection [

Finally, tree search algorithms are introduced to resolve the trade-off between complexity and performance loss. Depth-first and breadth-first search are the two main categories of tree search algorithms. In the depth-first search algorithm, the tree is traversed in both the forward and backward directions with variable throughput, which results in extra hardware overhead. But in the breadth-first search algorithm [

In this paper, to reduce the computational complexity of the K-best algorithm, an MPED based K-best algorithm is designed and implemented on an FPGA board (a reduced-complexity system), which significantly reduces the overall hardware/software complexity of the system. The K value is chosen as the square root of the QAM constellation order, so K = 8 is chosen for a 4 × 4, 64-QAM MIMO detector. All the results presented on system performance were first tested in Matlab and then translated into hardware blocks in Simulink using Xilinx System Generator (SysGen). Once the hardware designs were completed, the bit streams required for the FPGA implementation were generated using Xilinx synthesis tools. An efficient VLSI implementation is the key to enabling real-time wireless communication.

Assume a MIMO system with N_{t} transmitting and N_{r} receiving antennas, described by the N_{t} × N_{r} channel matrix H [

The complex-valued baseband received signal is expressed as

$$Y = Hx + e \qquad (1)$$

where x is the N_{t}-dimensional complex transmit signal vector, each element of which is drawn independently from the complex QAM constellation, Y is the N_{r}-dimensional complex received signal vector, and e is the noise vector with a variance of σ^{2} per dimension. A complex-domain framework is developed here; for the received signal, the real-valued decomposition can also be derived [

Due to the intrinsic challenges in the implementation in the complex domain, most of the MIMO detection algorithms in the literature have been proposed for the real domain [
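The real-valued decomposition mentioned above can be illustrated with a short Python sketch. It uses the textbook mapping in which real and imaginary parts are stacked, so an N_{r} × N_{t} complex system becomes a 2N_{r} × 2N_{t} real one. The function names and the small 2 × 2 test matrix are illustrative assumptions and are not taken from the paper.

```python
# Sketch of the standard real-valued decomposition of y = H x (noise omitted).
# All names and the example values are assumptions made for illustration.

def real_decompose_matrix(H):
    """Map an Nr x Nt complex matrix to the 2Nr x 2Nt real matrix
    [[Re(H), -Im(H)], [Im(H), Re(H)]]."""
    top = [[h.real for h in row] + [-h.imag for h in row] for row in H]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in H]
    return top + bottom


def real_decompose_vector(v):
    """Stack real parts over imaginary parts: [Re(v); Im(v)]."""
    return [c.real for c in v] + [c.imag for c in v]


def matvec(A, x):
    """Plain matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


H = [[1 + 2j, 0.5 - 1j],
     [-1j, 2 + 0j]]
x = [3 - 1j, -1 + 3j]

y = matvec(H, x)  # complex-domain product
y_real = matvec(real_decompose_matrix(H), real_decompose_vector(x))
```

Here `y_real` agrees with `real_decompose_vector(y)` up to rounding, which shows that both formulations describe the same system.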

In the K-best algorithm, each level of the tree is expanded from the root to the leaves, and the best candidates, those with the lowest path metric, are selected at each level. The path at the last level of the tree with the lowest Partial Euclidean Distance (PED) is the hard-decision [

The objective of the MIMO detection method is to find the closest lattice points [

$$\hat{x} = \arg\min_{x \in O^{N_t}} \lVert Y - Hx \rVert^{2} \qquad (2)$$

where O is the set of vectors from the real entries in the constellation.

The channel matrix H is QR decomposed as H = QR, where Q is the unitary N_{r} × N_{t} matrix and R is the upper triangular N_{t} × N_{t} matrix. The nulling operation is performed by multiplying with the Hermitian transpose of Q (Q^{H}), which gives Z = Q^{H}Y = Rx + w, where w = Q^{H}e. Because Q is unitary, the noise w after nulling remains spatially white. Since R is upper triangular, Equation (2) can be represented as

$$\hat{x} = \arg\min_{x \in O^{N_t}} \lVert Z - Rx \rVert^{2} = \arg\min_{x \in O^{N_t}} \sum_{i=1}^{N_t} \Bigl| z_{i} - \sum_{j=i}^{N_t} r_{ij} x_{j} \Bigr|^{2} \qquad (3)$$

Equation (3) can be treated as a tree-search problem with N_{t} levels: starting from the last row, one symbol is detected and, based on that, the next symbol in the row above is detected, and so on.
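The row-by-row accumulation behind this tree search can be sketched in Python. The sketch assumes the usual notation in which z_{i} is the i-th entry of Z and r_{ij} is an entry of the upper triangular matrix R; the function names and the 2 × 2 example are hypothetical and only illustrate how the partial Euclidean distance grows from the last row upwards.

```python
# Sketch: PED accumulation for an upper triangular system Z = R x + w.
# Function names and example values are illustrative assumptions.

def ped_increment(R, z, level, partial):
    """Distance contributed by row `level` (0-based).

    `partial` holds the symbols already fixed for rows level, level+1, ...
    so partial[0] is the candidate symbol of the current row.
    """
    acc = z[level]
    for offset, symbol in enumerate(partial):
        acc -= R[level][level + offset] * symbol
    return abs(acc) ** 2


def path_metric(R, z, x):
    """Total distance ||z - R x||^2, accumulated from the last row up."""
    total = 0.0
    for level in range(len(z) - 1, -1, -1):
        total += ped_increment(R, z, level, x[level:])
    return total


R = [[2, 1],
     [0, 1]]
z = [1, -1]  # equals R applied to x = [1, -1], so that path has zero distance

metric_true = path_metric(R, z, [1, -1])
metric_other = path_metric(R, z, [1, 1])
```

Because each row only depends on symbols from its own row downwards, a wrong decision in the last row raises the metric of every path that extends it, which is what lets a tree search discard candidates early.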

The two computing procedures in the K-best algorithms are

1) Expansion: In the complex domain, the K-best algorithm has K parents at each level and M children per parent, so KM children must be enumerated, which results in high computational complexity. The relaxed K-best algorithm and base-centric search methodology [

2) Sorting: In the K-best algorithm, for each level of the complex-domain, KM children should be sorted. In [

To overcome these two challenges, the MPED based K-best algorithm is proposed, in which the node with the minimum PED is taken as the parent node at each level. Its computational complexity and performance are better than those of the on-demand expansion scheme, and it works for any values of K and M without performance loss.

The proposed MPED based K-best algorithm is based on the breadth-first tree search method. The algorithm is initialized by considering level l of the tree and assuming that the candidate nodes in level l + 1 of the tree are known. The individual nodes in the level K_{l+1} will be having

The main objective of the proposed scheme is to find the First Best Child (FBC) of the initial parent node, based on the minimum PED of the received first numerical value. In other words, the key innovation behind the proposed MPED based K-best algorithm is to find the FBC of each initial parent node in the level K_{l+1}; among these children, the best candidate at level K_{l+1} is the one with the minimum PED value. The selected best candidate acts as a parent node for the next level. The children of the second-level parents are then generated and replace the first-level siblings. To find the best path, the process is repeated K times. The same procedure is repeated for each level of the tree until the best path is found.
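A minimal Python sketch of one level of this selection is given below. It takes the first best child of every parent, picks the child with the minimum PED, and replaces it with the next sibling of the same parent, repeating K times. The function name, the tuple layout and the PED numbers are assumptions made for illustration. For clarity the sketch sorts each parent's children up front, whereas a hardware design would enumerate siblings only when they are needed.

```python
import heapq

# Sketch of minimum-PED selection for one tree level (illustrative names).

def select_k_best(children_by_parent, k):
    """children_by_parent[i] is a list of (ped, symbol) pairs for parent i.

    Returns up to k tuples (ped, parent_index, symbol) in ascending PED order.
    """
    ordered = [sorted(children) for children in children_by_parent]
    # Seed the pool with the first best child (FBC) of every parent.
    pool = [(kids[0][0], parent, 0) for parent, kids in enumerate(ordered) if kids]
    heapq.heapify(pool)

    selected = []
    while pool and len(selected) < k:
        ped, parent, rank = heapq.heappop(pool)  # child with the minimum PED
        selected.append((ped, parent, ordered[parent][rank][1]))
        # The next sibling of the chosen child takes its place in the pool.
        if rank + 1 < len(ordered[parent]):
            heapq.heappush(pool, (ordered[parent][rank + 1][0], parent, rank + 1))
    return selected


# Two parents with four children each; the PED values are made up.
children = [
    [(0.9, 3), (0.1, -1), (0.5, 1), (2.0, -3)],
    [(0.3, 1), (0.4, -1), (1.5, 3), (1.7, -3)],
]
best_three = select_k_best(children, 3)
```

With these numbers the three survivors are the children with PED 0.1 (parent 0), 0.3 (parent 1) and 0.4 (parent 1). Only the pool of current first children is compared at each step, so the full set of KM children never has to be sorted together.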

The proposed MPED based K-best scheme is represented diagrammatically for the level K_{l} derived from the level K_{l+1}. The input to the algorithm is initially applied with a zero PED value. The parent node at level K_{l+1} has four children, and the corresponding PED values of the four children are shown in

Here the parent can find its own children without visiting all the nodes in the tree. Let S_{l} consist of the best selected child of the first parent, and let P_{T} represent the corresponding PED values (the symbol with subscript ij denotes the j^{th} child of the i^{th} parent node in the first level of the algorithm). From

The proposed scheme involves the following features:

1) It can be easily adapted to real domain.

2) The K value is chosen based on the QAM constellation size, so the scheme has lower computational complexity than existing algorithms, in which the K value is chosen randomly.

3) It can be applied to infinite lattices and be jointly applied with lattice reduction.

4) Increased performance is obtained by using Wavelet Packet Transformation (WPT) with the AWGN channel.

5) It has reduced BER.

6) It is easily implemented in a VLSI architecture.

In a VLSI architecture, one of the key challenges is to achieve high throughput with the minimum number of levels in the architecture. To address this challenge, a pipelined structure is used, which performs the child expansion and minimization in a pipelined fashion and implements the sorting in a distributed way. The pipelined architecture includes a sorter block, which sorts all the signals, and a Processing Element (PE) block, which generates the best signal from the sorted signals.

The proposed pipelined VLSI architecture for a 4 × 4, 64-QAM hard output MIMO detector is shown in _{i}, r_{ij} and the K parents [_{t} = 8 stages), from L1 to L8, corresponding to the 8-level detection tree.

From the MPED based K-best algorithm, the best signal is detected and this signal is taken as an input to the 8th level of the tree, which opens up all the possible values in O = {−3, −1, 1, 3} and calculates their corresponding PED [

Using the sorter block, the FC is sorted and the child with the lowest PED is determined from it. This is represented by

The PE block contains a data register file and three computation units: an arithmetic/logic unit, a multiplier and a shifter. The PE I block takes the FC of each level as an input and generates the K-best candidates of that level one by one. The node with the lowest PED is definitely one of the K-best candidates in L7. This value is passed to the PE II block in L6. After the first child is removed, its next sibling is calculated by the PE I block. The PED of this sibling is compared with those of the other FCs already present in that stage, and the next K-best candidate, the one with the lowest PED in this new set, is found. This process is repeated 8 times (taking 8 cycles) until all the K-best values of the second level of the tree are generated and passed to the PE II block.

The PE II block receives the K-best candidates of L7, one after the other, generates the FC of each received K-best candidate one by one, and sorts them as they arrive. It finally transfers them to the following PE I block. This process repeats for all the levels down to the first level. At the first level only the FC with the lowest PED is of concern; its solution S is the hard-decision output of the detector.
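The level-by-level data flow can be modelled in software with a short Python sketch. This is a behavioural model under stated assumptions, not the PE I / PE II hardware: it works on a real-valued upper triangular system, uses the alphabet O = {−3, −1, 1, 3} quoted above, and replaces the distributed sorter with `heapq.nsmallest`. The function name and the 3 × 3 example are hypothetical.

```python
import heapq

SYMBOLS = (-3, -1, 1, 3)  # real alphabet O quoted in the text


def k_best_detect(R, z, k, symbols=SYMBOLS):
    """Breadth-first K-best search on z = R x + w with R upper triangular.

    Returns (detected symbols, accumulated PED) of the best surviving path.
    """
    n = len(z)
    survivors = [(0.0, ())]  # (PED so far, symbols fixed for the lower rows)
    for level in range(n - 1, -1, -1):  # last row first
        candidates = []
        for ped, path in survivors:
            # Contribution of the symbols that are already decided.
            interference = sum(
                R[level][level + 1 + j] * s for j, s in enumerate(path)
            )
            for s in symbols:  # expand every child of this parent
                residual = z[level] - interference - R[level][level] * s
                candidates.append((ped + residual * residual, (s,) + path))
        survivors = heapq.nsmallest(k, candidates)  # keep the K best children
    best_ped, best_path = survivors[0]
    return list(best_path), best_ped


R = [[2.0, 0.5, -1.0],
     [0.0, 1.5, 0.25],
     [0.0, 0.0, 1.0]]
x_true = [1, -3, 3]
z = [sum(r * s for r, s in zip(row, x_true)) for row in R]  # noiseless

x_hat, ped = k_best_detect(R, z, k=4)
```

In this noiseless example the detector returns `x_true` with a PED of zero. With noise, the result depends on K: a larger K keeps more candidates per level and moves the decision closer to the ML solution at the cost of more comparisons.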

A 4 × 4, 64-QAM MIMO system with K = 8 is considered in our simulation, which is carried out using Matlab. The input message signal is chosen and plotted in random bit form. The proposed method is applied to the input message signal, and the best candidates are identified for each cycle.

The best candidates for the proposed MPED based K-best algorithm obtained as a result of simulation for the given input message stream are listed below.

1) The best candidate of cycle 1 is 3.283351e+003.

2) The best candidate of cycle 2 is −1.083299e+002.

3) The best candidate of cycle 3 is 3.279795e+003.

4) The best candidate of cycle 4 is −2.166598e+002.

5) The best candidate of cycle 5 is 3.393075e+003.

6) The best candidate of cycle 6 is −3.249897e+002.

7) The best candidate of cycle 7 is 3.389520e+003.

8) The best candidate of cycle 8 is −4.333195e+002.

This section presents the simulation results of the MIMO detection schemes in terms of BER versus SNR. BER is a key parameter for assessing systems that transmit digital data from one location to another. It is defined as

$$\mathrm{BER} = \frac{\text{number of bits received in error}}{\text{total number of bits transmitted}}$$

If the medium between the transmitter and receiver is good and the signal-to-noise ratio is high, the bit error rate will be very small, possibly insignificant, with no noticeable effect on the overall system. However, if noise can be detected, the bit error rate may need to be considered. The BER is compared for the Rayleigh fading channel as well as the AWGN channel scheme as shown in
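As a small illustration, the BER of a received bit stream can be computed in Python as the fraction of bit positions that differ from the transmitted stream. The function name and the two short bit sequences are made up for the example and do not come from the simulations reported here.

```python
# Sketch: bit error rate as errors divided by transmitted bits.
# Names and bit patterns are illustrative assumptions.

def bit_error_rate(sent_bits, received_bits):
    """Return the fraction of positions where the two bit sequences differ."""
    if len(sent_bits) != len(received_bits):
        raise ValueError("bit sequences must have the same length")
    if not sent_bits:
        raise ValueError("at least one bit is required")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)


sent = [0, 1, 1, 0, 1, 0, 0, 1]
received = [0, 1, 0, 0, 1, 0, 1, 1]  # bits at positions 2 and 6 are flipped
ber = bit_error_rate(sent, received)  # 2 errors out of 8 bits
```

A BER versus SNR curve is obtained by repeating this measurement over many transmitted bits at each noise level.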

In this section we compare the proposed method (the MPED based K-best algorithm) with existing MIMO detection algorithms such as ZF, MMSE-SIC and ML. The MPED based K-best algorithm gives a reduced BER compared to the ZF and MMSE-SIC detector algorithms as shown in

The Fast Fourier Transform (FFT) is a powerful tool for analyzing the components of a stationary signal (one whose properties do not change), but it is less useful for analyzing a non-stationary signal (one whose properties change). Wavelet Packet Transforms allow the components of both stationary and non-stationary signals to be analyzed. The main difference is that wavelets are well localized in both the time and frequency domains, whereas the Fourier transform is localized only in the frequency domain. The BER is compared for both FFT as well as WPT [

A comparison of the proposed complex MIMO detector with recently reported MIMO detectors in the real and complex domains is shown in

This design has a larger core area than the one in [

References | TCAS-II 2010 [ | DATE 2009 [ | TVLSI 2010 [ | JSSC 2010 [ | JSSC 2011 [ | TVLSI 2011 [ | TVLSI 2011 [ | TVLSI 2013 [ | Proposed Work |
---|---|---|---|---|---|---|---|---|---|
Modulation | 16-QAM | 64-QAM | 64-QAM | (4-64) QAM | 64-QAM | (16-64) QAM | 64-QAM | 64-QAM | 64-QAM |
Antenna | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 - 8 × 8 | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 | 4 × 4 |
Method | SISO-SD | Sys. Like Detection | K-Best | MBF-FD(SD) | SISO MMSE PIC | MMF-LSD | K-Best | Modified K-Best | MPED based K-Best |
Domain | Complex | Complex | Real | Complex | Complex | Real | Real | Complex | Complex |
K-value | N/A | N/A | 5-64 | N/A | N/A | N/A | 10 | 10 | 8 |
Process size | 90 nm | 45 nm | 65 nm | 0.13 µm | 90 nm | 0.18 µm | 0.13 µm | 0.13 µm | 0.13 µm |
f_{max} (MHz) | 250 | 574.7 | 158 | 198 | 568 | 250 | 282 | 417 | 435 |
Throughput (Mb/s) | 90 | 215 | 732 - 100 | 285 - 431 | 757 | 31.7 - 146.3 | 675 | 1000 | 1100 |
Gate count (kG) | 96 | 33.1 | 1760 | 350 | 410 | 25.4 - 48.2 | 114 | 340 | 328 |
NHE | 1.6 | 0.45 | 4.81 - 35.2 | 1.23 - 0.81 | 0.78 | 0.58 - 0.24 | 0.17 | 0.34 | 0.3 |
Energy/bit | N/A | N/A | N/A | N/A | 250 pJ/b | N/A | 200 pJ/b | 110 pJ/b | 100 pJ/b |
Power (mW) | N/A | N/A | 165 | 57-74 | 189.1 | 57-90 | 135 | 1700 | 1684 |
Latency (µs) | N/A | N/A | N/A | N/A | N/A | N/A | 0.6 | 0.36 | 0.27 |
Hard/soft | Soft | Soft | Hard | Soft | Soft | Soft | Hard | Hard | Hard |

mented using a feed forward architecture. According to the proposed algorithm, the K-best candidates of each layer of the architecture are generated in K clock cycles, which increases the throughput of the system.

The throughput of the system is the number of packets produced per unit time. It is measured in units of whatever is being produced (I/O samples, memory words, iterations) per unit time. The latency is the number of cycles required before the system can accept the next input, and the gate count reflects the total core area of the design.

The Normalized Hardware Efficiency (NHE) is calculated, which is given by the gate count and the corresponding scaled throughput [
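The exact NHE expression is not reproduced in this text, so the Python sketch below rests on an assumption: NHE is taken as the gate count in kG divided by the throughput in Mb/s after the throughput has been scaled linearly with feature size to the 0.13 µm process. The function name, its parameters and the linear scaling rule are all assumptions. The rule reproduces several entries of the comparison table (for example 0.3, 0.34, 0.17 and 0.78), but not every column exactly.

```python
# Sketch under an assumed definition:
#   NHE = gate count (kG) / throughput scaled to a 0.13 um process (Mb/s)
# The linear scaling with feature size is an assumption, not the paper's formula.

def normalized_hardware_efficiency(gate_count_kg, throughput_mbps,
                                   process_nm, reference_nm=130.0):
    """Gate count per unit of technology-scaled throughput (lower is better)."""
    scaled_throughput = throughput_mbps * process_nm / reference_nm
    return gate_count_kg / scaled_throughput


# Values taken from the comparison table.
nhe_proposed = normalized_hardware_efficiency(328, 1100, 130)  # proposed work
nhe_jssc_2011 = normalized_hardware_efficiency(410, 757, 90)   # 90 nm design
```

Rounded to two decimals these give 0.30 and 0.78, in line with the NHE row of the table for those two designs.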

Moreover, the proposed scheme is implemented on the FPGA platform. The synthesis results and the required resources for the 4 × 4, 64-QAM MIMO detector using the proposed scheme are shown in

To detect the best signal in a high-performance MIMO detector, an MPED based K-best algorithm has been proposed. The algorithm is scalable in terms of both the number of transmitting/receiving antennas and the constellation size. It gives a reduced BER and low computational complexity compared to existing algorithms, as shown by Matlab simulations of both the FFT scheme and the wavelet scheme. The proposed design was implemented on a Xilinx Virtex-5 FPGA; it provides a high throughput of 1.1 gigabits per second (Gb/s) at 435 MHz with an area of 328 K gates in a 0.13-µm VLSI process. This algorithm is applicable to real-time wireless communication.

Poornima Ramasamy, Mahabub Basha Ahmedkhan, Mounika Rangasamy (2016) Design and FPGA-Implementation of Minimum PED Based K-Best Algorithm in MIMO Detector. Circuits and Systems, 07, 612-621. doi: 10.4236/cs.2016.76052