Advanced communication systems use wireless broadband access technologies that provide high-speed data connectivity to subscribers. One of the most popular wireless access technologies is Worldwide Interoperability for Microwave Access (WiMAX), which is based on the IEEE 802.16 standard. WiMAX uses Orthogonal Frequency Division Multiplexing (OFDM), an effective modulation technique that requires accurate timing synchronization. Channel performance degrades when synchronization between the transmitter and the receiver is mismatched. To achieve timing synchronization in IEEE 802.16 systems, a cross correlator is used to synchronize the received signal with a known signal. In this paper, two high-speed correlators based on the Q1.15 format are proposed to address the timing synchronization problem. The proposed designs are mapped onto an XC6VCX75T FPGA, and simulations are carried out on the Xilinx ISIM platform. The implementation results show a power delay product reduction of 40.81% and a delay reduction of 39.59% over the conventional multiplierless correlators.

Nowadays, wireless technology is part of everyone’s life. In consumer devices from phones to computers, Broadband Wireless Access (BWA) is widely used for internet access without cable modems or Digital Subscriber Line (DSL) connections. In wireless techniques, the signals are transmitted through radio waves. This permits long-range communication that is not possible with wires. One of the wireless techniques is the Metropolitan Area Network (MAN), which is based on the IEEE 802.16 standard. It provides fixed broadband wireless access for rural as well as remote areas. Other wireless access technologies include Wi-Fi and WiMAX. Wi-Fi and WiMAX operate similarly, but WiMAX operates at high speed and serves a large number of users. It can overcome the physical limitations of wired infrastructure. The IEEE 802.16 working group has developed two types of standards: a fixed usage model and a portable usage model. For both fixed and portable applications, WiMAX offers a capacity of 40 Mbps per wireless channel. The important features of IEEE 802.16/WiMAX technologies are a frequency < 11 GHz, a data rate of up to 100 MHz and a distance of up to 20 km. The WiMAX physical layer is based on OFDM [

Much research has been done to improve frequency synchronization. A Cyclic Prefix (CP) based method is used to determine the frequency offset and symbol timing [

To overcome the drawback of the autocorrelation technique, the cross correlation method is used. It accurately determines the start of the frame. Since it has a low signal-to-noise ratio (SNR), it requires complex computation. Kim et al. proposed a synchronization method that has two separate computation processes. One is autocorrelation for coarse Symbol Time Offset (STO) and Carrier Frequency Offset (CFO), for reduced hardware cost and reliable frequency synchronization [

Today, it is increasingly common for one person to send several pieces of information, or for several people to send information, at the same time. Synchronization problems therefore occur and create frequency offsets. Correlators are designed to overcome this synchronization problem. There are two types of correlation: autocorrelation and cross correlation. The function of the cross correlator is similar to the convolution operation. Cross correlation is a standard method for estimating the degree to which two series of signals are correlated. It is mainly used to determine the timing delay between two signals.

The structure of downlink preamble in IEEE 802.16d standard is shown in

The short training symbol follows the CP. It contains four identical 64-sample fragments. The long training symbol has two consecutive 128-sample fragments. It also follows a CP guard interval, which minimizes the inter-symbol interference that causes synchronization errors. For timing synchronization, the cross correlation is performed on the 64 samples in the short training symbol. Hence the correlators are designed to perform the cross correlation of 64 coefficients with the received signal. Previously, parallel multiplier operations were incorporated into correlator architectures to improve the speed of multiplier-based architectures [

The correlator is designed to receive complex coefficients. It receives the samples in a fractional fixed-point format known as the Q format, where the number after the point indicates how many bits are used for the fraction. Here the 16 samples are represented in the Q1.15 fixed-point format, in which 1 bit represents the integer part and 15 bits represent the fractional part of the sample.
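The Q1.15 representation described above can be illustrated with a minimal sketch. The helper names (`float_to_q15`, `q15_to_float`, `q15_mul`) and the choice to saturate out-of-range values are assumptions made for illustration; the source does not state how overflow or rounding is handled.

```python
# Q1.15: 1 integer (sign) bit and 15 fractional bits in a 16-bit word.
Q15_FRAC_BITS = 15
Q15_SCALE = 1 << Q15_FRAC_BITS          # 32768
Q15_MIN, Q15_MAX = -Q15_SCALE, Q15_SCALE - 1


def saturate(value: int) -> int:
    """Clamp an integer to the signed 16-bit range (assumed overflow policy)."""
    return max(Q15_MIN, min(Q15_MAX, value))


def float_to_q15(x: float) -> int:
    """Quantize a real value to a Q1.15 integer."""
    return saturate(int(round(x * Q15_SCALE)))


def q15_to_float(q: int) -> float:
    """Convert a Q1.15 integer back to a real value."""
    return q / Q15_SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q1.15 values; the Q2.30 product is shifted back to Q1.15."""
    return saturate((a * b) >> Q15_FRAC_BITS)
```

The representable range is [-1, 1 - 2^-15], so a value of exactly 1.0 saturates to the largest positive code in this sketch.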

The correlator correlates the received signals with the short OFDM symbol for timing synchronization. The length of short OFDM symbol is 800 ns and the rate of received samples is 20 MHz. The 16 samples are multiplied with 64 coefficients to generate the proper output. Generally the correlator will perform 320 million complex multiplications per second [

where G_{m} is the mth correlator coefficient, R_{n} is the received sample and C_{n} is the output of the correlator. The coefficients can be computed from Equation (2)
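The correlation of 64 coefficients with the received samples can be sketched in software as a sliding sum of products. The exact indexing and conjugation convention of Equation (1) is not recoverable here, so this sketch assumes the common form C[n] = Σ_m conj(G[m]) · R[n + m]; the function names are hypothetical.

```python
def cross_correlate(received, coeffs):
    """Return C[n] = sum_m conj(coeffs[m]) * received[n + m] for every full window.

    Assumed convention: the known sequence is conjugated and slid forward
    over the received samples.
    """
    taps = len(coeffs)
    outputs = []
    for n in range(len(received) - taps + 1):
        acc = 0j
        for m in range(taps):
            acc += coeffs[m].conjugate() * received[n + m]
        outputs.append(acc)
    return outputs


def estimate_start(received, coeffs):
    """Return the window index with the largest squared correlation magnitude."""
    corr = cross_correlate(received, coeffs)
    energy = [abs(c) ** 2 for c in corr]
    return max(range(len(energy)), key=energy.__getitem__)
```

For example, if a known four-sample sequence is embedded after five zero samples, `estimate_start` returns 5, the offset at which the known sequence begins.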

Here T_{sam} = 50 ns is the sampling interval, B_{k} is the pilot symbol, and ∆f = 312.5 kHz is the subcarrier spacing. The pilot symbol that is used to generate the short OFDM symbols can be computed by Equation (3)
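As a quick consistency check on the stated parameters (plain arithmetic on the values above; f_s and N are shorthand introduced here for the sampling rate and the number of subcarrier slots):

$$
f_s = \frac{1}{T_{sam}} = \frac{1}{50\ \text{ns}} = 20\ \text{MHz}, \qquad
N = \frac{f_s}{\Delta f} = \frac{20\ \text{MHz}}{312.5\ \text{kHz}} = 64 .
$$

The 20 MHz figure matches the received sample rate quoted earlier, and 64 matches the number of correlator coefficients.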

The real and imaginary part of the coefficients should satisfy

The complex coefficients can also be computed by

The correlator coefficients are computed by (2) and should satisfy (3), (4) and (5). The coefficients are selected from the sequence of samples of the short OFDM symbol and are tabulated in

m | G_{m} |
---|---|
1 | 1.18 + 0.02j |
2 | 0.12 + 0.07j |
3 | 1.27 + 0.11j |
4 | 0.82 |
5 | 1.27 + 0.11j |
6 | 0.12 + 0.7j |
7 | 1.18 + 0.02j |
8 | 0.41 + 0.41j |
9 | 0.02 + 1.18j |
10 | 0.7 + 0.12j |
11 | 0.11 + 1.27j |
12 | 0.82j |
13 | 0.11 + 1.27j |
14 | 0.7 + 0.12j |
15 | 0.02 + 1.18j |
16 | 0.41 + 0.41j |
… | … |
64 | 0.41 + 0.41j |

The real and imaginary parts of the coefficients are computed using sum-of-powers-of-two operations. Shift and add operations are the best replacement for multipliers in the correlator design: when they are used for the correlator implementation, no multiplier is needed. The architecture of the non-pipelined direct-form correlator is shown in
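The idea of replacing a multiplier with shifts and additions can be sketched as follows. This is only an illustration under stated assumptions: the coefficient is quantized to `frac_bits` fractional bits (6 here, chosen arbitrarily) and expanded in plain binary, whereas the source does not specify which power-of-two terms its design uses. The function name `shift_add_multiply` is hypothetical.

```python
def shift_add_multiply(sample: int, coeff: float, frac_bits: int = 6) -> int:
    """Approximate sample * coeff using only shifts and additions.

    The coefficient magnitude is quantized to an integer k = round(|coeff| * 2**frac_bits);
    each set bit of k contributes one left-shifted copy of the sample.
    """
    k = round(abs(coeff) * (1 << frac_bits))   # quantized coefficient magnitude
    acc = 0
    bit = 0
    while k:
        if k & 1:
            acc += sample << bit               # add sample * 2**bit
        k >>= 1
        bit += 1
    acc >>= frac_bits                          # undo the coefficient scaling
    return -acc if coeff < 0 else acc
```

With the tabulated real coefficient 0.82 and 6 fractional bits, the quantized value is 52/64 = 2^-1 + 2^-2 + 2^-4, so the product needs three shifted terms and two additions.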

The product of the received sample with the correlator coefficient is obtained by the complex multiplication as
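Writing the received sample as R_{n} = R_{r} + jR_{i} and the coefficient as G_{m} = G_{r} + jG_{i} (the r and i subscripts are notation assumed here, not taken from the source), the standard complex product expands to

$$
R_n G_m = (R_r G_r - R_i G_i) + j\,(R_r G_i + R_i G_r).
$$

In this direct form, each complex multiplication costs four real multiplications and two real additions/subtractions, which is the cost that a shift and add realization aims to avoid.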

The multiplierless correlator is designed to process the complex coefficients as sums of powers of two, and the correlator rounds off the coefficients appropriately. Hence the correlator design uses the shift and add technique instead of multipliers. Thus the multiplierless correlator is more efficient than the multiplier-based correlator. The architecture of the multiplierless correlator is shown in

Several multiplierless correlators have been designed and analyzed, and their performance has been compared in terms of timing synchronization and resource utilization. Among these existing correlators, the shift-add-based correlator is the best, because it requires only 26 addition/subtraction operations per correlator output. This architecture mainly reduces the complexity of the receiver implementation. Rin is the correlator input and Pr is the selection signal that selects the output from the shift-add block. The shift-add block contains a shifter and an adder, and it pre-computes the values for the possible correlator coefficients. The shifter mainly performs left shifts, and the shifted values are then added according to the correlator coefficient values. The pre-computed values are selected based on Pr[n]. The multiplexed outputs are added to obtain the final correlator output. Thus timing synchronization is achieved using the shift-add block instead of multipliers.

In OFDM, the overlapping of sub-channels provides good spectral efficiency. However, this overlapping leads to channel interference, which is caused by frequency offset. Timing errors also create inter-symbol interference, which degrades OFDM performance. To improve OFDM timing synchronization, two correlator architectures are proposed.

The first proposed correlator uses an efficient computation sharing technique, which reduces the hardware overhead. The block diagram of the proposed sharing-technique-based correlator is shown in

The proposed architecture consists of a pre-compute unit, a multiplexer and an adder. The received signal is the input to the pre-compute unit. The values are pre-computed based on the complex-valued coefficient samples.

The product of the received sample and the correlator coefficient is then selected according to the selection input of the multiplexer. The select line depends on the quantization set that is used for OFDM synchronization. The architecture of the sharing-technique-based correlator is shown in

The pre-computation unit computes the products of small bit sequences with the received input samples. Once the products of the received sample with the complex correlator coefficients have been computed, the computed values are shared among the multiplexers. For example, only the eight-bit alphabets of the multiplications are shown in the pre-computation unit. Instead of multipliers, the pre-computation unit computes the products of the received samples and the correlator coefficients. The main advantages of the computation sharing technique are time efficiency and reduced area utilization.

The output of the pre-computation block is the input to the multiplexer. Based on the quantization value, the multiplexer selects the product of the received sample and the complex coefficient computed by the pre-computation unit. The multiplexed pre-computed values are then added to obtain the final output.
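The pre-compute, select and add flow described above can be sketched behaviorally. This sketch assumes a transposed-form structure in which each incoming sample is multiplied once per distinct coefficient value and the products are shared by all taps; the source does not confirm this exact arrangement, and the function name `shared_correlator` is hypothetical. It assumes a non-empty coefficient list.

```python
def shared_correlator(samples, coeffs):
    """Compute y[n] = sum_m coeffs[m] * samples[n - m] with shared products.

    Assumed transposed-form arrangement: one product per distinct
    coefficient value is computed for each incoming sample.
    """
    alphabet = set(coeffs)                   # distinct coefficient values
    state = [0] * (len(coeffs) - 1)          # delay registers between the adders
    outputs = []
    for x in samples:
        # Pre-compute unit: one multiplication per distinct coefficient value.
        products = {value: x * value for value in alphabet}
        # Multiplexers: every tap selects its product from the shared set.
        selected = [products[c] for c in coeffs]
        # Adders: combine the selected products with the delayed partial sums.
        padded = state + [0]
        outputs.append(selected[0] + padded[0])
        state = [selected[i + 1] + padded[i + 1] for i in range(len(state))]
    return outputs
```

In this sketch the number of multiplications per input sample equals the number of distinct coefficient values rather than the number of taps, which is where the saving comes from when coefficient values repeat, as several do in the coefficient table.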

The architecture of parallel pre-compute correlator is shown in

The size of the correlator and the number of registers depend on the input samples. The coefficients are selected based on the preamble samples of the short OFDM signal. The preamble signal is used for timing synchronization. The parallel architecture is also based on the computation sharing technique. The product of the received sample with the complex coefficients is obtained by the pre-compute and selector units, and the pre-computed values are selected by the multiplexers. Finally, the additions are performed in parallel. This parallel processing reduces the delay at the cost of some area overhead.
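One common way to perform the final additions in parallel is a balanced adder tree; the source states only that the addition is done in parallel, so the tree structure below is an assumption. The sketch also reports the number of adder levels, which indicates the depth of the critical path.

```python
def adder_tree_sum(values):
    """Sum values pairwise, level by level, as a balanced adder tree would.

    Returns (total, depth), where depth is the number of adder levels.
    """
    level = list(values)
    depth = 0
    while len(level) > 1:
        # All pair additions within one level are independent of each other.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])            # odd element passes to the next level
        level = nxt
        depth += 1
    return (level[0] if level else 0), depth
```

For 64 selected products, such a tree needs 6 adder levels, whereas a sequential chain would pass through 63 adders; the additional adders working side by side correspond to the area overhead mentioned above.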

The proposed sharing-technique-based correlator and the parallel pre-compute architecture are synthesized using Xilinx ISE 13.2 and mapped onto a Virtex 6 FPGA (device XC6VCX75T, package FF484, speed grade -2) in 40 nm CMOS technology. The behavioral simulation was done in the ISIM simulator. The performance of the correlators is compared in terms of area, power and delay. The comparison of area in terms of the number of adders/subtractors and the number of slice LUTs is shown in

Type of correlator | Number of adders/subtractors |
---|---|
Multiplierless correlator [ | 210 |
Pre-compute correlator | 326 |
Parallel pre-compute correlator | 396 |

Type of correlator | Delay (ns) | Power (W) | PDP (nJ) |
---|---|---|---|
Multiplierless correlator [ | 32.476 | 1.344 | 43.647 |
Pre-compute correlator | 23.484 | 1.317 | 30.928 |
Parallel pre-compute correlator | 19.616 | 1.317 | 25.834 |

Compared to the multiplierless correlator, the proposed pre-compute and parallel pre-compute correlators have some area overhead in terms of the number of adders/subtractors. Both the pre-compute and the parallel pre-compute correlators use the sharing technique, but the parallel pre-compute correlator needs more add/sub units because of its parallel processing.

Area analysis in terms of Slice Registers, slice LUTs and number of occupied slices of various types of correlators is shown in

From the comparison results, it is observed that the proposed parallel pre-compute and pre-compute correlator architectures reduce the delay and the power delay product (PDP). The parallel pre-compute and pre-compute correlator architectures reduce the delay by 39.59% and 27.68% respectively, and reduce the power delay product by 40.81% and 29.14% respectively.
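The quoted reductions can be reproduced from the delay and power figures in the comparison table with a short calculation. The sketch below takes PDP as delay multiplied by power (ns × W = nJ) and measures each reduction relative to the multiplierless correlator; the dictionary keys and function names are illustrative only.

```python
# (delay in ns, power in W), taken from the comparison table
designs = {
    "multiplierless": (32.476, 1.344),
    "pre-compute": (23.484, 1.317),
    "parallel pre-compute": (19.616, 1.317),
}


def pdp(delay_ns: float, power_w: float) -> float:
    """Power delay product in nJ (ns * W)."""
    return delay_ns * power_w


def reduction_percent(baseline: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


base_delay, base_power = designs["multiplierless"]
for name, (delay, power) in designs.items():
    if name == "multiplierless":
        continue
    delay_gain = reduction_percent(base_delay, delay)
    pdp_gain = reduction_percent(pdp(base_delay, base_power), pdp(delay, power))
    print(f"{name}: delay reduction {delay_gain:.2f}%, PDP reduction {pdp_gain:.2f}%")
```

The computed values agree with the percentages quoted in the text to within about 0.01 percentage points; the small differences come from rounding.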

Cross correlator architectures are proposed for flexible timing synchronization in WiMAX applications. The proposed pre-compute and parallel pre-compute correlator architectures reduce the area by 9.3% and 10.43% respectively and reduce the delay by 27.68% and 39.59% respectively. The proposed schemes reduce the power delay product by 29.14% and 40.81% respectively. Thus the proposed cross correlation techniques provide proper synchronization at reduced cost for communication systems, mainly in IEEE 802.16 standard applications.

As future work, an area-, delay- and power-efficient adder will be used to obtain optimized cross correlator architectures. An ASIC implementation could also further reduce the cost.

B. Sivasankari, P. Poongodi (2016) Design of Low Power and High Speed Correlators for IEEE 802.16 WiMAX Systems. Circuits and Systems, 07, 1352-1360. doi: 10.4236/cs.2016.78118