A Reduced Complexity Quasi-1D Viterbi Detector

This paper develops a reduced-complexity quasi-1D detector for optical storage devices and digital communication systems. Simulation results demonstrate the superior performance of the proposed detector.


Introduction
Recent literature reports many improvements in multi-user detection for digital communication and optical storage systems. Examples include turbo encoding/decoding algorithms [1] for digital communication and non-coherent ultra-wideband (UWB) detectors in the context of distributed wireless sensor networks [2]. In this paper, however, we focus on optical storage systems.
The perpetual push for higher track density requires two-dimensional optical storage (Two-DOS) systems to have a large number of tracks in a single group. At the current stage, the number of tracks within a group is chosen to be 11 [3]. The complexity of the two-dimensional (2D) Viterbi detector (VD) grows exponentially with both the target length $N_g$ and the number of tracks $N_r$ in a single group. Hence, truncating the channel memory by means of pre-filtering techniques does not sufficiently reduce the complexity of the 2D VD for the current Two-DOS system. For example, even if the channel memory is shortened by setting $N_g = 3$, the full-fledged 2D VD remains far from practical because its number of states reaches $2^{22}$ for $N_r = 11$. For this reason, in this paper we develop a quasi-one-dimensional (quasi-1D) VD, which exploits the cross-track decisions as feedback to facilitate the implementation of reduced-complexity 2D Viterbi-like detectors for systems with a large number of tracks per group.

Decision Feedback Equalization
Decision feedback equalization is a nonlinear detection technique that is quite popular in digital communication systems [4,5]. Figure 1 shows the block diagram of a discrete-time decision feedback equalizer (DFE). In the figure, $h_k$ is the discrete-time channel symbol response, the channel noise is additive white Gaussian noise (AWGN) with variance $\sigma^2$, and $w_k$ and $f_k$ represent the taps of the forward and feedback equalizers, respectively. The forward equalizer shapes the channel into a prescribed target $g_k$, which is constrained to be causal and to have its first tap $g_0$ equal to one. The feedback equalizer has a strictly causal impulse response $f_k$ that should match $g_k$ for all $k \ge 1$ in order to cancel the causal intersymbol interference (ISI), i.e. the ISI due to the symbols that have already been detected. After the causal ISI has been removed, the DFE uses a threshold comparator (slicer) to make the bit decision based on the slicer input. Although the DFE is the optimum detector among those with no detection delay [6], its performance lags behind that of the VD for the following two main reasons.
• Error propagation: Any decision error at the output of the slicer corrupts the estimate of the causal ISI that is generated by the feedback equalizer. As a result, a single error makes the detector less tolerant of noise for a number of future decisions. This phenomenon is referred to as error propagation and degrades the performance of the detector.
• Energy reduction: Even in the absence of error propagation, the DFE is still sub-optimum compared to the VD in terms of performance. This is because, in the decision process, the DFE subtracts the causal ISI and thus ignores the signal energy embedded in this causal ISI component. In other words, some signal energy that is beneficial for optimum detection is neglected. The adverse effect on the detection performance is referred to as energy reduction. To minimize the energy reduction caused by neglecting the energy of the causal ISI, the target is designed to have minimum-phase characteristics, i.e. the energy of the target is optimally concentrated near the time origin.
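The feedback loop described above can be illustrated with a minimal Python sketch. It assumes that the forward equalizer has already shaped the channel into the target, so the input samples are the target output; the function names `dfe_detect` and `target_output` and the target values are hypothetical and chosen only for illustration.

```python
def target_output(bits, g):
    """Noiseless equalized samples: sum_i g[i] * a(n - i), bits before time 0 count as 0."""
    return [sum(g[i] * bits[n - i] for i in range(len(g)) if n - i >= 0)
            for n in range(len(bits))]


def dfe_detect(z, g):
    """Detect +/-1 bits from equalized samples z with a causal monic target g.

    The feedback taps are taken as f_k = g_k for k >= 1 (assumption: perfect match).
    """
    assert g[0] == 1, "the target is assumed to be monic (g_0 = 1)"
    decisions = []
    for n, sample in enumerate(z):
        # Causal ISI estimated from bits that have already been decided.
        isi = sum(g[k] * decisions[n - k]
                  for k in range(1, len(g)) if n - k >= 0)
        slicer_input = sample - isi
        # Threshold comparator (slicer): decide +1 or -1 with no delay.
        decisions.append(1 if slicer_input >= 0 else -1)
    return decisions


bits = [1, -1, -1, 1, 1, -1, 1, -1]
g = [1, 0.5, 0.2]             # hypothetical target with energy near the origin
z = target_output(bits, g)    # noise-free samples, for demonstration only
detected = dfe_detect(z, g)
print(detected == bits)
```

In this noise-free setting every decision is correct, so the feedback cancels the causal ISI exactly. If one entry of `decisions` were wrong, the ISI estimate for the following `len(g) - 1` samples would be corrupted, which is the error propagation mechanism described in the first bullet.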

Fixed-Delay Tree Search
Unlike the DFE, which makes the bit decision instantly, the fixed-delay tree search (FDTS) detection technique makes the bit decision after a delay of $D$ [7,8]. In this technique, the bit decision is based on a sequence of $D + 1$ detector input samples and uses the maximum-likelihood (ML) decision rule with a delay of $D$. The ML decision exploits part or all of the signal energy embedded in the causal ISI components, and thus suffers less energy reduction than the DFE. The choice of the parameter $D$ is a compromise between performance and complexity. If $D + 1$ is smaller than the target length $N_g$, the FDTS is referred to as the fixed-delay tree search with decision feedback (FDTS/DF) [8]. In fact, the FDTS can be considered a generalization of the DFE, since the FDTS is essentially equivalent to the DFE when $D = 0$. Similar to the DFE, the FDTS first uses the forward equalizer to shape the channel into a known target.
Then, the noiseless input of the detector is $d(n) = \sum_{i=0}^{N_g-1} g_i\, a(n-i)$, where $g_i$ represent the coefficients of the target whose length is $N_g$, and $a(n)$ is the channel input bit at time index $n$. The FDTS uses a fixed-depth ML decision rule implemented as a tree search algorithm. The tree representation with depth $D = 2$ is shown in Figure 2 for illustration. Each branch corresponds to one input bit at a particular time. A sequence of branches through the tree diagram is referred to as a path. Each possible path corresponds to one input sequence and vice versa. At time index $n$, the tree diagram spans $D + 1$ bits. Thus, at each time index, the tree contains $2^{D+1}$ paths that represent all the $2^{D+1}$ possible input sequences. Detection based on the smallest Euclidean distance between the detector input $z(n)$ and the desired noiseless detector input $d(n)$ is optimum in the ML sense when the noise component of the detector input is white and Gaussian.
Thus, similar to the trellis diagram that corresponds to the VD, the squared Euclidean distance $(z(n) - d(n))^2$ is defined as the branch metric for each branch, and the sum of the branch metrics associated with each path is called the path metric. Since the FDTS performs ML detection based on a sequence of samples, it chooses the path whose path metric is minimum as the most likely transmitted sequence and releases the first bit associated with this path as the detected bit. More specifically, the FDTS operates recursively as follows [8]:
• At the start of the $n$th step, the tree structure has a depth of $D - 1$. Each path retains the path metric obtained from the previous iteration.
• Path extension: At the $n$th step, the tree structure is extended such that the depth is increased to $D$. The new input sample $z(n)$ is used to compute the branch metrics of the new branches, which are added to the retained path metrics.
• Path selection: After computing all the path metrics for the extended paths, the first bit of the path that has the smallest path metric is selected and released as the detected bit. Then, the half of the paths that are incompatible with the detected bit are discarded. As a result, the tree structure that remains has a depth of $D - 1$.
As time progresses, the root node moves along the ML path and a fixed-size identical tree structure is maintained at each time index. Therefore, the complexity of the FDTS is constant for each time index. Similar to the VD, the ML decision rule makes the FDTS unduly complicated if $D$ is large. An efficient and simple realization of the FDTS for systems using run-length-limited (RLL) $(1, k)$ codes can be found in [9,10].
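The recursion above (path extension followed by path selection) can be sketched in Python as follows. This is a minimal illustration rather than the realization of [8]: it assumes binary ±1 inputs, treats bits before time 0 as zero, uses the squared distance $(z(n) - d(n))^2$ as the branch metric, and releases the remaining bits of the best path at the end of the block. The names `fdts_detect` and `target_output` and the target values are hypothetical.

```python
def target_output(bits, g):
    """Noiseless detector input d(n) = sum_i g[i] * a(n - i)."""
    return [sum(g[i] * bits[n - i] for i in range(len(g)) if n - i >= 0)
            for n in range(len(bits))]


def fdts_detect(z, g, depth):
    """Fixed-delay tree search over +/-1 bits with decision delay `depth`."""
    decided = []          # bits already released (they act as decision feedback)
    paths = {(): 0.0}     # open path (tuple of undecided bits) -> path metric
    for sample in z:
        # Path extension: every retained path grows by one branch per bit value.
        extended = {}
        for path, metric in paths.items():
            for bit in (1, -1):
                new_path = path + (bit,)
                history = decided + list(new_path)        # a(0) .. a(n)
                d = sum(g[i] * history[-1 - i]
                        for i in range(len(g)) if i < len(history))
                extended[new_path] = metric + (sample - d) ** 2
        if len(next(iter(extended))) == depth + 1:
            # Path selection: release the first bit of the best path ...
            best = min(extended, key=extended.get)
            decided.append(best[0])
            # ... and discard the half of the paths that disagree with it.
            paths = {p[1:]: m for p, m in extended.items() if p[0] == best[0]}
        else:
            paths = extended      # the tree is still growing to its full depth
    # End of block: release the remaining bits of the best surviving path.
    best = min(paths, key=paths.get)
    return decided + list(best)


bits = [1, -1, -1, 1, 1, -1, 1, -1]
g = [1, 0.5, 0.2]                 # hypothetical target, N_g = 3
z = target_output(bits, g)        # noise-free samples, for demonstration only
detected = fdts_detect(z, g, depth=2)
print(detected == bits)
```

With `depth=0` the search keeps a single path and reduces to a slicer with decision feedback, in line with the remark that the FDTS is equivalent to the DFE when $D = 0$. With `depth + 1` smaller than `len(g)`, the `decided` list supplies the older bits, which corresponds to the FDTS/DF case.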

Sequence Detection with Local Feedback
Many detection techniques with sequence feedback, such as the DFE and FDTS/DF, use the detected bits as the input of the feedback equalizer, which results in the error propagation problem. This problem can be reduced by resorting to local feedback [11,12]. Local feedback is based on the trellis structure and uses the path memory associated with the current state, instead of the past decisions, to estimate the causal ISI. Local feedback guarantees that the branch metric of the correct path is the ML metric as long as the correct path has not been discarded in favor of some incorrect path [11]. As a result, it improves the performance of detectors with sequence feedback, at the price of requiring a large memory to store the paths associated with each state.

Complexity of 2D VD
2D partial-response (PR) equalization is used to shape the 2D channel into a known 2D target with controlled ISI and intertrack interference (ITI). The controlled ISI and ITI are left to be handled by the 2D VD. The noiseless input of the 2D VD is given by $\mathbf{d}(n) = \sum_{i=0}^{N_g-1} \mathbf{G}_i\, \mathbf{a}(n-i)$, where $\mathbf{G}_i$ is the $i$th target matrix of a target whose length is $N_g$, and $\mathbf{a}(n)$ is the channel input vector at time index $n$. As indicated earlier, the complexity of the 2D VD grows exponentially with both the target length $N_g$ and the number of tracks $N_r$ in a single group. For a better understanding, the trellis structure for the case of target length $N_g = 3$ and number of tracks per group $N_r = 2$ is shown in Figure 3. In this figure, '+' and '−' represent the bits '+1' and '−1', respectively. The trellis is assumed to start at the node S0 and becomes steady at instant $n = 3$ (i.e. $n = N_g$). Here, the labels of the states represent the channel memory, for all the tracks in the group, associated with the paths that pass through these states. At time index $n$, the trellis consists of $2^{N_r(N_g-1)}$ states. Each branch specifies the channel memory associated with the state that the branch originates from and one possible channel input vector $\mathbf{a}(n)$. Therefore, each branch corresponds to one possible noiseless detector input $\mathbf{d}(n)$. For binary channel input bits, each state possesses $2^{N_r}$ incoming and $2^{N_r}$ outgoing branches, and thus there are in total $2^{N_r N_g}$ incoming and $2^{N_r N_g}$ outgoing branches for each time index of the trellis.
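The state and branch counts above can be checked numerically with a short Python sketch. The function name `trellis_size` is hypothetical; the formulas are the ones stated in the text for binary inputs.

```python
def trellis_size(n_tracks, target_length, alphabet=2):
    """Return (states, branches per state, branches per time index) of the full 2D VD."""
    states = alphabet ** (n_tracks * (target_length - 1))
    branches_per_state = alphabet ** n_tracks
    # states * branches_per_state equals alphabet ** (n_tracks * target_length)
    total_branches = states * branches_per_state
    return states, branches_per_state, total_branches


print(trellis_size(2, 3))     # case of Figure 3: (16, 4, 64)
print(trellis_size(11, 3))    # Two-DOS case: (4194304, 2048, 8589934592)
```

For $N_r = 11$ and $N_g = 3$ the number of states is $2^{22} = 4194304$, which matches the figure quoted in the Introduction.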
Figure 3 makes clear that even in this simple case the trellis of the 2D detector is much more complicated than that of the one-dimensional (1D) case, although the target length is the same. Thus, the practical implementation of a 2D Viterbi-like detector for large $N_r$ also requires a significant reduction of the complexity arising from the cross-track direction. In [13], a technique was proposed that applies the Viterbi detector track by track and uses decision feedback to estimate the ITI between tracks. We call this detector the DFE-VD. It uses a set of sub-2D VDs, each corresponding to one track. In the bit decision process for a given track, the known bits just above (or below) the current track are used as feedback to calculate part of the ITI. These known bits can be previously detected bits, or can be zeros if the upper (or lower) track is the guard band.
The branch metric is then computed by subtracting the effect of these known bits. However, in this track-by-track technique, only the ITI from the upper track(s) or only the ITI from the lower track(s) is estimated by feedback, and the remaining ITI estimates still depend on the trellis states. As a result, the number of states has to be larger than that of a 1D VD with the same target length. Moreover, this additional complexity does not benefit performance much, since the detector still bases its decisions only on the input samples from the current single track.
An improved detector is the stripe-wise Viterbi detector (SWVD) [3,14]. This detector consists of a set of sub-2D VDs, each dealing with one stripe that consists of a limited number of tracks. The number of stripes is equal to the number of tracks in a single group. The preliminary decisions from one sub-2D VD are used for estimating the ITI in the next sub-2D VD, which is shifted up (or down) by one track. This procedure is continued for all the stripes, and the full procedure from bottom to top (or top to bottom) of the group is considered to be one iteration. Note that at least two iterations are required in order to estimate the ITI from both upper and lower tracks. Unlike the DFE-VD, which resorts to the trellis states to estimate the ITI from the lower (or upper) track(s), the SWVD uses the preliminary decisions from the previous iteration to estimate the ITI from the lower (or upper) track(s). This additional decision feedback not only reduces the complexity but also improves the performance compared with the DFE-VD, since its decisions exploit the input information from both the upper and lower track(s) as well as that from the current track. However, the use of iterations increases complexity as well as latency. Our proposal, in contrast, is a non-iterative reduced-complexity detector that is applicable to any 2D system.

Causal ITI Target
In this subsection, we introduce the causal ITI target as a starting point for the development of our reduced-complexity 2D Viterbi-like detectors. Conventionally, the causal and anticausal ISI are referred to as the ISI from the past and future bit decisions, respectively [6]. Similarly, we refer to the causal and anticausal ITI as the ITI resulting from the lower and upper tracks, respectively.
The concept of causal ITI was first used in the multi-channel DFE [15]. Similar to the structure shown in Figure 1, this multi-channel DFE consists of a multi-channel forward filter, a multi-channel feedback filter, and a decision block. The multi-channel forward filter is designed to constrain the channel to have only causal ISI and ITI. The multi-channel feedback filter is designed to remove the causal ISI based on the previous bit decisions. The causal ITI is left to be handled by the decision block. Motivated by this, we propose the causal ITI target, in which the 2D target matrices are constrained to be right triangular matrices. This target is the basis for the development of our reduced-complexity 2D Viterbi-like detectors.
As a starting point for our development, we first examine the suitability of the causal ITI target in Two-DOS. Figure 4 shows the performance of the full-fledged 2D VD for four different targets when $N_r = 5$ and the target length is $N_g = 3$. In the figure, the diagonal elements of $\mathbf{G}_0$ in the causal ITI target are constrained to be 1s to avoid trivial solutions of the target and equalizer. As reference targets, we use a fixed 2D target with elements 1 and 2 and the 2D monic constrained target, which are reasonable targets described in the last chapter for Two-DOS. Note that in the design of the 2D monic constrained target we impose a symmetry constraint, which forces all the tracks within the same group to suffer the same amount of ITI. In other words, after the finite-length equalizer, all the tracks within the same group ideally suffer the same amount of ITI. However, due to the presence of guard bands serving as boundaries of the group, not all the tracks suffer the same amount of ITI before the finite-length equalizer. In addition, the 2D monic constrained target only allows ITI from adjacent tracks. Therefore, the symmetry constraint burdens the design of the finite-length equalizer and results in residual ISI and ITI.
The causal ITI target does not have this symmetry constraint, and allows ITI from tracks other than the adjacent ones. Therefore, compared with the 2D monic constrained target, the causal ITI target burdens the finite-length equalizer less and is expected to achieve better performance. Figure 4 shows that the causal ITI target outperforms all the other targets at every SNR. This result indicates that it is reasonable to use the causal ITI target for Two-DOS. More importantly, based on this target, we propose reduced-complexity 2D Viterbi-like detectors that are quite different from the DFE-VD and SWVD, since the latter two detectors suffer ITI from both lower and upper tracks.
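To make the right triangular constraint concrete, the following worked form writes out the noiseless detector input for a hypothetical group of $N_r = 3$ tracks. It assumes that the tracks are indexed from top (1) to bottom (3), and the element notation $g_i^{(j,k)}$ is introduced here only for illustration.

```latex
\mathbf{d}(n) = \sum_{i=0}^{N_g-1} \mathbf{G}_i\,\mathbf{a}(n-i),
\qquad
\mathbf{G}_i =
\begin{bmatrix}
g_i^{(1,1)} & g_i^{(1,2)} & g_i^{(1,3)} \\
0           & g_i^{(2,2)} & g_i^{(2,3)} \\
0           & 0           & g_i^{(3,3)}
\end{bmatrix},
\qquad
g_0^{(j,j)} = 1 .
```

Under this indexing, the last row shows that the lowest track receives no ITI at all, the middle track receives ITI only from the lowest track, and the top track receives ITI from both tracks below it. No row contains a contribution from a track above it, which is the causal ITI property.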

Principle of Quasi-1D VD
Since the causal ITI target contains ITI only from the lower tracks, the bits in the upper tracks do not affect the desired output. Based on this idea, a set of 1D VDs is used to detect the bits, each dealing with one track. More specifically, as shown in Figure 5, the first 1D VD, which deals with the lowest track, is processed with no delay, and its bits are detected after a delay $D$. The second 1D VD, which deals with the second lowest track, is processed with the delay $D$ in order to use the detected bits from the lowest track to estimate all the ITI in the second lowest track. The third 1D VD, which deals with the third lowest track, is processed with a delay $D$ after the second 1D VD, and the detected bits from the lowest two tracks are used to estimate the ITI in the third lowest track. This procedure continues for all the tracks. Since the bit detection does not need to consider the interference from the upper tracks, this detector is distinct from the DFE-VD and SWVD. Compared with the DFE-VD, this detector has lower computational complexity, since fewer states are needed for bit detection. More importantly, the quasi-1D VD has better BER performance, since it uses all of the input information that is needed in the cross-track direction, whereas the DFE-VD uses only part of it. As illustrated in Figure 6, the quasi-1D VD significantly outperforms the DFE-VD no matter what target is chosen for the DFE-VD. Compared with the SWVD, as mentioned previously, it has much lower complexity since it has no iterative procedure.
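The principle above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used for the simulations: tracks are processed one after another over a whole block instead of being pipelined with delay $D$, inputs are binary ±1, bits before time 0 are treated as zero, and the last track index is taken as the lowest track. The names `viterbi_1d`, `quasi_1d_vd`, `noiseless_output` and the target values are hypothetical.

```python
def viterbi_1d(z, g):
    """ML sequence detection of +/-1 bits for the 1D target g."""
    memory = len(g) - 1
    # state (the `memory` most recent bits, newest last) -> (metric, bit history)
    survivors = {(0,) * memory: (0.0, [])}
    for sample in z:
        candidates = {}
        for state, (metric, bits) in survivors.items():
            for bit in (1, -1):
                window = state + (bit,)                  # a(n - memory) .. a(n)
                d = sum(g[i] * window[-1 - i] for i in range(len(g)))
                new_metric = metric + (sample - d) ** 2
                new_state = window[1:]
                if (new_state not in candidates
                        or new_metric < candidates[new_state][0]):
                    candidates[new_state] = (new_metric, bits + [bit])
        survivors = candidates
    return min(survivors.values(), key=lambda s: s[0])[1]


def quasi_1d_vd(z, G):
    """z[j][n]: sample of track j; G[i][j][k]: right triangular target matrices."""
    n_tracks, length = len(z), len(z[0])
    detected = [None] * n_tracks
    for j in reversed(range(n_tracks)):                  # lowest track first
        cleaned = []
        for n in range(length):
            # ITI estimated from the already detected lower tracks k > j.
            iti = sum(G[i][j][k] * detected[k][n - i]
                      for i in range(len(G)) if n - i >= 0
                      for k in range(j + 1, n_tracks))
            cleaned.append(z[j][n] - iti)
        # The remaining along-track ISI is handled by a 1D VD on the diagonal taps.
        detected[j] = viterbi_1d(cleaned, [Gi[j][j] for Gi in G])
    return detected


def noiseless_output(bits, G):
    """d_j(n) = sum_i sum_k G[i][j][k] * a_k(n - i)."""
    n_tracks, length = len(bits), len(bits[0])
    return [[sum(G[i][j][k] * bits[k][n - i]
                 for i in range(len(G)) if n - i >= 0
                 for k in range(n_tracks))
             for n in range(length)]
            for j in range(n_tracks)]


# Hypothetical causal ITI target with N_r = 3 and N_g = 2 (unit diagonal in G[0]).
G = [[[1, 0.5, 0.25], [0, 1, 0.5], [0, 0, 1]],
     [[0.5, 0.25, 0], [0, 0.5, 0.25], [0, 0, 0.5]]]
bits = [[1, -1, -1, 1, 1, -1],
        [-1, -1, 1, 1, -1, 1],
        [1, 1, -1, 1, -1, -1]]
detected = quasi_1d_vd(noiseless_output(bits, G), G)
print(detected == bits)
```

In this sketch each 1D VD has $2^{N_g-1}$ states, so the total number of states is $N_r\,2^{N_g-1}$ (44 for $N_r = 11$ and $N_g = 3$), compared with $2^{22}$ for the full 2D VD. A wrong decision in a lower track would corrupt the `iti` estimate of every track above it, which is how error propagation enters this detector.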

Link with QR Detector
Our quasi-1D VD is developed for the Two-DOS system, which is a multiple-input multiple-output system having a large temporal span of the channel. Clearly, this quasi-1D VD is applicable to multiple-input multiple-output systems having an arbitrary temporal span of the channel. In many wireless communication systems, the multiple-input multiple-output channel is assumed to be flat-fading [16,17], i.e. the temporal span is $N_h = 1$. In such systems, the channel is characterized by a single matrix $\mathbf{H}$, so that the channel output is $\mathbf{z} = \mathbf{H}\mathbf{a}$ plus noise (Equation (1)), where $\mathbf{z}$ and $\mathbf{a}$ are the ($N_2 \times 1$) channel output vector and the ($N_1 \times 1$) channel input vector, respectively, $\mathbf{H}$ is the ($N_2 \times N_1$) flat-fading channel matrix, and $N_1$ and $N_2$ are the numbers of transmit and receive antennas. For the sake of simplicity, the time index is omitted here. Then, QR decomposition of the channel matrix yields $\mathbf{H} = \mathbf{Q}\mathbf{R}$, where $\mathbf{Q}$ is an ($N_2 \times N_1$) orthonormal matrix constructed to make the ($N_1 \times N_1$) matrix $\mathbf{R}$ right triangular [19]. Pre-multiplying the channel output vector $\mathbf{z}$ with $\mathbf{Q}^H$, the resulting vector $\hat{\mathbf{z}}$ is given by $\hat{\mathbf{z}} = \mathbf{Q}^H\mathbf{z} = \mathbf{R}\mathbf{a} + \mathbf{Q}^H\mathbf{v}$, where $\mathbf{v}$ denotes the noise in $\mathbf{z}$. Note that if the noise in $\mathbf{z}$ is additive white Gaussian noise (AWGN), the noise in $\hat{\mathbf{z}}$ remains AWGN since $\mathbf{Q}^H\mathbf{Q} = \mathbf{I}$, where $\mathbf{I}$ is the ($N_1 \times N_1$) identity matrix. Comparing $\mathbf{R}$ with the causal ITI target discussed in the previous subsection, we find that $\mathbf{R}$ can be seen as a special case of causal ITI targets. Then, as in the quasi-1D VD, the first element from the bottom of the channel input vector $\mathbf{a}$ is detected first. The detected element is used to estimate the interference for the detection of the second element from the bottom of $\mathbf{a}$. This procedure continues until all the elements in $\mathbf{a}$ are detected.
This detector is commonly referred to as the QR detector and has been investigated for multiple-input multiple-output flat-fading channels [19-21]. The QR detector is also applicable in multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems [20,22], since the channel at each sub-carrier of a MIMO-OFDM system is considered to be a multiple-input multiple-output flat-fading channel. Note that our proposed quasi-1D VD is suitable for any multiple-input multiple-output channel with arbitrary positive $N_h$, while the QR detector is only applicable to multiple-input multiple-output flat-fading channels, i.e. $N_h = 1$. Therefore, the QR detector can be considered a special case of our proposed quasi-1D VD.
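The bottom-up detection order of the QR detector can be illustrated with a short Python sketch. It makes several simplifying assumptions: the channel matrix is real-valued (so that $\mathbf{Q}^H$ becomes a transpose), its columns are linearly independent, the symbols are binary ±1, and the QR factors are obtained with a plain Gram-Schmidt procedure. The names `qr_gram_schmidt` and `qr_detect` and the channel values are hypothetical.

```python
def qr_gram_schmidt(H):
    """Reduced QR of a real N2 x N1 matrix (list of rows) with independent columns."""
    n_rows, n_cols = len(H), len(H[0])
    Q_cols = []                                   # orthonormal columns of Q
    R = [[0.0] * n_cols for _ in range(n_cols)]   # right (upper) triangular factor
    for k in range(n_cols):
        v = [H[r][k] for r in range(n_rows)]
        for j, q in enumerate(Q_cols):
            R[j][k] = sum(q[r] * H[r][k] for r in range(n_rows))
            v = [v[r] - R[j][k] * q[r] for r in range(n_rows)]
        R[k][k] = sum(x * x for x in v) ** 0.5
        Q_cols.append([x / R[k][k] for x in v])
    return Q_cols, R


def qr_detect(z, H):
    """Detect +/-1 symbols from z = H a + noise by successive cancellation."""
    Q_cols, R = qr_gram_schmidt(H)
    n = len(R)
    # z_hat = Q^T z, which equals R a plus rotated noise.
    z_hat = [sum(q[r] * z[r] for r in range(len(z))) for q in Q_cols]
    a_hat = [0] * n
    for k in reversed(range(n)):                  # bottom element first
        # Interference from the elements below, which are already detected.
        interference = sum(R[k][j] * a_hat[j] for j in range(k + 1, n))
        a_hat[k] = 1 if (z_hat[k] - interference) / R[k][k] >= 0 else -1
    return a_hat


# Hypothetical flat-fading channel with N_1 = 3 inputs and N_2 = 4 outputs.
H = [[1.0, 0.5, 0.2],
     [0.3, 1.0, 0.4],
     [0.1, 0.2, 1.0],
     [0.2, 0.1, 0.3]]
a = [1, -1, 1]
z = [sum(H[r][k] * a[k] for k in range(len(a))) for r in range(len(H))]  # noise-free
print(qr_detect(z, H) == a)
```

Row $k$ of $\mathbf{R}$ involves only elements $k, k+1, \ldots$ of $\mathbf{a}$, so the loop mirrors the quasi-1D VD with a single-tap target per track: the bottom element is free of interference, and each later decision cancels the contribution of the elements below it.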

Performance of Quasi-1D VD
As shown in Figure 5, although the quasi-1D VD has much lower complexity than the DFE-VD and SWVD, it causes significant detraction from optimality. We consider three factors that affect the performance of the quasi-1D VD: target length, error propagation and energy reduction. In Figure 7, "L4" and "L5" indicate target lengths of four and five, respectively; otherwise, the target length is three. "No EP" denotes detectors that do not suffer from error propagation; in simulation, this is achieved by using the correct input bits to estimate the ITI. The length of the equalizer is 31 in all the simulations. As illustrated in Figure 7, the BER performance is not significantly improved by increasing the target length. Further investigation shows that all the elements in the target matrices $\mathbf{G}_3$ and $\mathbf{G}_4$ approach zero, confirming that there is no need to increase the channel memory beyond two. Figure 7 also shows that error propagation degrades performance by about 1 dB at a BER of $10^{-4}$. Since neither the target length nor error propagation accounts for most of the loss, energy reduction should be the dominant factor that degrades the performance.

Conclusions
In this paper, we have first briefly reviewed prior work on detectors with sequence feedback. Then, by constraining the target to have causal ITI, we have developed a quasi-1D VD, a computationally efficient technique whose complexity grows only linearly with the number of tracks. This is a significant complexity reduction compared to the conventional 2D VD, whose complexity grows exponentially with the number of tracks. We have shown that the quasi-1D VD improves over the DFE-VD and SWVD in terms of complexity. Further, we have shown that the widely known QR detector is a special case of our proposed quasi-1D VD. However, we have found that the quasi-1D VD still causes significant detraction from optimality in the Two-DOS system. Therefore, effective compensation techniques are needed to ensure reliable data recovery. To this end, we have investigated the factors that might degrade the performance. Our simulation results imply that energy reduction is the dominant factor that degrades the performance of the quasi-1D VD. Therefore, in the next chapter, we develop some effective techniques to reduce the effect of this energy reduction problem. In addition, the effect of error propagation still needs to be minimized, since it degrades the performance by roughly 1 dB when the BER is $10^{-4}$.

Figure 1. Block diagram of a discrete-time decision feedback equalizer.

Figure 2. Tree representation with depth D = 2 for the uncoded binary channel input data.

Figure 3. Trellis structure for a channel with Ng = 3 and Nr = 2.

Figure 4. BER performance for different target constraints.

Figure 5. Principle of the quasi-1D VD. The solid lines represent the input and output of the sub-VDs; the dashed lines represent the feedback coming from the output of the previous sub-VDs.

Figure 6. Performance comparison of different detection techniques.

In flat-fading systems, the channel is characterized by a single matrix instead of a sequence of matrices as in the Two-DOS system. Let $N_1$ and $N_2$ represent the number of transmit and receive antennas, respectively, in multiple-input multiple-output wireless communication systems. Then, the channel output vector at a given time is given by
$\mathbf{z} = \mathbf{H}\mathbf{a} + \mathbf{v}$ (1)
where $\mathbf{v}$ denotes the additive noise vector.

Figure 7. BER performance of quasi-1D VD with different target lengths.