FIR System Identification Using Feedback

This paper describes a new approach to finite-impulse-response (FIR) system identification. The method differs from the traditional stochastic-approximation approach used in the least-mean-squares (LMS) family of algorithms in that deconvolution is used as a means of separating the impulse response from the system input data. The technique can be used as a substitute for ordinary LMS, but it has the added advantages that it can be used for constant input data (i.e. data which are not persistently exciting) and that its stability limit is far simpler to calculate. Furthermore, its convergence is much faster than that of LMS under certain types of noise input.


Introduction
Recursive parameter (or weight) estimation has been used since the early days of least-mean squares (LMS) [1]. Along with its variants, such as normalised LMS (NLMS) [2], it has become the adaptive algorithm of choice owing to its simplicity, its good tracking ability and the fact that it applies to finite-impulse-response (FIR) filters, which are always stable. There have been many attempts at adaptive infinite-impulse-response (IIR) filters [3,4] using algorithms such as recursive least-squares (RLS) [2,5], but these are less popular due to stability problems. Besides this, the tracking ability of RLS is rather ad hoc in nature, relying on a forgetting factor, and the algorithm complexity for L weights is O(L^2) operations rather than the O(L) of LMS [6]. Adaptive filters have proven successful in a wide range of signal-processing applications such as noise cancellation [7,8], adaptive arrays [9], time-delay estimation [10], echo cancellation [11] and channel equalization [12].
Despite its popularity, LMS has a few drawbacks. To name a few: LMS will only give unbiased estimates when the driving signal is rich in harmonics (a persistently exciting input signal [13]), and when this driving signal is not white, the convergence of the algorithm degrades according to the eigenvalue spread of the correlation matrix [2,14,15]. Many ad-hoc approaches that employ a variable step-size have been used to try to overcome this problem [16]. Gradient-based algorithms such as LMS rely on steepest descent and do not converge as fast as RLS. The literature on many of these approaches is quite old, indicating that little has changed in many respects, though there have been some more recent approaches to the same problem [17]. This paper addresses these problems by introducing a new concept in weight estimation which is not based on minimisation of any form of least-squares criterion, recursive or otherwise. Instead, classical control theory is used to separate the convolution of the weights with the system input by means of high gain and integrators. Although LMS also uses high gain, integrators and feedback, the LMS approach is always geared towards correlation and the solution of the Wiener-Hopf equation; this is in fact the reason for some of its limitations in special cases. The approach used here is entirely deterministic in nature and based purely on control theory. The control loop used results in deconvolution, that is, the separation of the two convolved signals (the impulse response and the driving signal). The novelty of the solution lies in the fact that a special lower-triangular convolution matrix is used in the feedback path of this control system.

Feedback Deconvolution Loop 1
Consider an unknown system driven by a known input signal which can be either random or deterministic. Although the system to be identified is single-input single-output, the method here works on a vector of n + 1 consecutive outputs.
We define z as the z-transform operator and z^{-1} as the backward-shift operator, defined for a scalar discrete signal at sample instant k as z^{-1} u_k = u_{k-1}, or for a vector as z^{-1} u_k = u_{k-1}. A vector output is obtained from the block convolution "*" of the system input and the impulse-response vector, y_k = w * u_k, where the weight and input vectors, each of order n + 1, are defined respectively as w = [w_0, w_1, ..., w_n]^T and u_k = [u_{k-n}, ..., u_{k-1}, u_k]^T, giving the scalar convolution

y_k = sum_{i=0}^{n} w_i u_{k-i}.   (3)

From the convolution Equation (3) we consider only the first n + 1 terms, so that y_k becomes a vector of dimension n + 1 written in terms of past values as y_k = [y_{k-n}, ..., y_{k-1}, y_k]^T.
A matrix format of the convolution now follows accordingly:

y_k = T(u_k) w   (4)

where T(u_k) is an (n + 1)-square lower-triangular Toeplitz matrix

T(u_k) =
[ u_{k-n}      0         ...   0       ]
[ u_{k-n+1}   u_{k-n}    ...   0       ]
[   ...         ...      ...   ...     ]
[ u_k         u_{k-1}    ...  u_{k-n}  ]   (5)

This will be known as a convolution matrix. It is distinct from the correlation matrix met in least-squares problems. Now consider a time-variant multivariable control system as shown in Figure 1. The forward path of the control loop consists of a matrix of integrators with gain K. Its error vector is defined as

e_k = y_k - T(u_k) ŵ_k   (6)

where ŵ_k represents the estimate of the true weight vector w. Assume that the closed-loop multivariable system is stable. Then, for large gain K, the error becomes small (via the usual action of negative feedback), so that T(u_k) ŵ_k → y_k and hence ŵ_k → w. If we assume the closed-loop system can always be made stable, then for a time-varying system the control-loop output must track any such changes. In algorithm form, the method is quite simple. Furthermore, if y_k and u_k are reversed as inputs to the control system, then the inverse of the system impulse response will be estimated instead, provided of course enough weights are allocated. The above algorithm is not optimal in any least-squares sense, since no cost function is minimised; it is instead an approach to deconvolution using feedback. Note that in Figure 1 integrators are chosen since they have infinite gain at dc (giving zero steady-state error to step changes in the weight vector), but other types of controller are possible. Although not explored here, it is possible to apply loop-shaping to the control loop by adding extra integrators and phase-lead compensation. This was explored elsewhere for the ordinary LMS algorithm [18].
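As a concrete sketch of the convolution matrix of Equation (5), the short Python fragment below (a hypothetical three-weight example, not taken from the paper) builds the lower-triangular Toeplitz matrix from an input buffer and checks that T(u) applied to the weight vector reproduces the first n + 1 terms of the direct convolution:

```python
import numpy as np

def conv_matrix(u):
    """Lower-triangular Toeplitz convolution matrix T(u) built from an
    (n+1)-sample input buffer u (oldest sample first), as in Equation (5)."""
    n1 = len(u)
    T = np.zeros((n1, n1))
    for i in range(n1):
        for j in range(i + 1):
            T[i, j] = u[i - j]
    return T

# Hypothetical 3-weight system: block convolution equals T(u) @ w.
w = np.array([1.0, 0.5, 0.25])     # illustrative impulse response
u = np.array([2.0, -1.0, 3.0])     # first n+1 input samples of a buffer
y = conv_matrix(u) @ w             # first n+1 output samples
# Same result from direct convolution truncated to n+1 terms:
y_direct = np.convolve(u, w)[:3]
print(np.allclose(y, y_direct))    # True
```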

Stability and Convergence of the Loop
For a given batch of n + 1 samples it is fairly straightforward to determine the stability of the system of Section 2.
Figure 1 is a time-variant multivariable system, but for a given constant input vector u_k we can treat the closed-loop system as time-invariant. (A more complete explanation is given in the Appendix for the case when the input vector is time-varying.) Calculating an expression for the error vector in the z-domain,

e(z) = y(z) - T(u_k) ŵ(z)   (7)

and for the integrator forward path,

ŵ(z) = (K/(z - 1)) e(z).   (8)

Substitute (8) into (7) and re-arrange:

e(z) = y(z) - (K/(z - 1)) T(u_k) e(z).   (9)

Simplifying further,

e(z) = (z - 1) [(z - 1) I + K T(u_k)]^{-1} y(z).   (10)

The roots for z of the determinant of the return-difference matrix are the closed-loop poles of the system, satisfying [19]

det[(z - 1) I + K T(u_k)] = 0.   (11)

That is, the closed-loop characteristic polynomial must have roots (eigenvalues) which lie within the unit circle on the z-plane. Now, since T(u_k) is a lower-triangular Toeplitz matrix, it follows that (z - 1) I + K T(u_k) is also a lower-triangular Toeplitz matrix. Furthermore, it is well established that the eigenvalues of such a matrix are just its diagonal elements, which in this case are the n + 1 repeated roots according to

(z - 1 + K u_{k-n})^{n+1} = 0   (12)

giving

z = 1 - K u_{k-n}.   (13)

For stability all roots must lie within the unit circle in the z-plane. This gives us

|1 - K u_{k-n}| < 1.   (14)

Excluding the special case u_{k-n} = 0, the gain K must satisfy

0 < K < 2/u_{k-n}.   (15)

Equation (15) clearly poses no problem provided u_{k-n} and K are always positive. However, if u_{k-n} is negative, then K (from (14)) must also be negative for stability. This requires a time-varying K whose sign tracks the sign of u_{k-n}. It is interesting to see that stability depends only on the single input sample u_{k-n}, the first sample of the current buffer, and is independent of the system order. This is different from the LMS case, where the step-size bound depends on the variance of the input signal and on the number of weights. Of course u_{k-n} is renewed with every new buffer of data, so the stability does not rest on just one value. Now assume that the true weight vector is a constant but unknown set of weights, say w; this is modelled as a step input in the z-domain. From Figure 1, the weight-vector estimate is found from the multivariable system according to

ŵ(z) = (K/(z - 1)) e(z)   (16)

and the error vector is found from

e(z) = T(u_k) [w(z) - ŵ(z)].   (17)

Substituting (17) into (16) and re-arranging gives the multivariable transfer-function matrix relating the estimated weight vector to the true weight vector:

ŵ(z) = [(z - 1) I + K T(u_k)]^{-1} K T(u_k) w(z)   (18)

which, as previously, requires all eigenvalues of the closed-loop matrix to lie within the unit circle on the z-plane. With stability satisfied via (15), we can now examine the convergence by applying the step input vector w(z) = (z/(z - 1)) w in (18), giving

ŵ(z) = [(z - 1) I + K T(u_k)]^{-1} K T(u_k) (z/(z - 1)) w.   (19)

By applying the z-transform final-value theorem to (19) we get

lim_{k→∞} ŵ_k = lim_{z→1} (z - 1) ŵ(z) = [K T(u_k)]^{-1} K T(u_k) w = w   (20)

so that the weights converge to the true values provided the closed-loop system is stable.
Algorithm 3.1: Deconvolution Loop 1.
Select the magnitude of the loop gain K_0 > 0.
Loop: {
Fetch recent input and output data vectors u_k, y_k, using at least n + 1 samples. Monitor the first sample u_{k-n} within the input vector and set K = K_0 sign(u_{k-n}).
1) Update vector error: e_k = y_k - T(u_k) ŵ_k.
2) Update weight vector: ŵ_{k+1} = ŵ_k + K e_k.
}
where T(u_k) in the above is formed from Equation (5). For L weights, the algorithm has O(L^2) operations.
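A minimal simulation sketch of Algorithm 3.1, assuming a hypothetical three-weight system, a sliding buffer of n + 1 samples and Gaussian input (none of these specific choices come from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_matrix(u):
    # Lower-triangular Toeplitz convolution matrix of Equation (5),
    # built from a buffer u ordered oldest sample first.
    n1 = len(u)
    return np.array([[u[i - j] if j <= i else 0.0 for j in range(n1)]
                     for i in range(n1)])

# Hypothetical unknown 3-weight FIR system (illustrative values)
w_true = np.array([0.8, -0.4, 0.2])
n1 = len(w_true)
K0 = 0.2                              # magnitude of the loop gain
w_hat = np.zeros(n1)

u_all = rng.standard_normal(2000)
for k in range(n1 - 1, len(u_all)):
    u = u_all[k - n1 + 1:k + 1]       # buffer of n+1 samples, oldest first
    T = conv_matrix(u)
    y = T @ w_true                    # vector of n+1 noiseless outputs
    K = K0 if u[0] >= 0 else -K0      # switch gain sign with first sample
    e = y - T @ w_hat                 # 1) update vector error
    w_hat = w_hat + K * e             # 2) integrate: weight-vector update

print(w_hat)                          # approaches w_true
```

The sign switch on K implements the stability requirement of Equation (15) that K and the first buffer sample have the same sign.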

Feedback Deconvolution Loop 2
The problem with the control loop discussed above is that stability requires (15), which makes the loop gain dependent on the sign of u_{k-n}, the first sample of each vector of input data fed to the unknown system. This means that the input sign must be monitored and the sign of K switched accordingly. A slight modification can overcome this problem. Consider Figure 2.
It can be seen that the convolution matrix T(u_k) has now been added to the forward path as well as the feedback path. We now have

ŵ(z) = (K/(z - 1)) T(u_k) e(z)   (22)

e(z) = y(z) - T(u_k) ŵ(z).   (23)

Substituting the error vector (23) into (22) and following a similar approach to before, we find the relationship

ŵ(z) = [(z - 1) I + K T^2(u_k)]^{-1} K T(u_k) y(z)   (24)

and, using the z-transform of (4), y(z) = T(u_k) w(z),

ŵ(z) = [(z - 1) I + K T^2(u_k)]^{-1} K T^2(u_k) w(z).   (25)

The stability becomes the solution of the roots of the polynomial found from the return-difference matrix

det[(z - 1) I + K T^2(u_k)] = 0.   (26)

Here, however, we have the product of two lower-triangular matrices. The product of two such identical lower-triangular Toeplitz matrices is another lower-triangular matrix, with cross-terms below the diagonal and diagonal elements u^2_{k-n}. Following the same arguments as before, we can easily show that the condition for stability of the multivariable loop is now

0 < K < 2/u^2_{k-n}   (27)

so that a fixed positive K suffices regardless of the sign of the input. This may be compared with the LMS mean-square stability condition 0 < μ < 2/((n + 1)σ_u^2), where σ_u^2 is the variance of the input driving noise and μ is the step-size. Clearly (27), when taken over a large number of batches of data with a random input driving signal, becomes on average 0 < K < 2/σ_u^2, which is not dependent on the system order n.

Algorithm 4.1: Deconvolution Loop 2.
Select the magnitude of the loop gain K > 0.
Loop: {
Fetch recent input and output data vectors u_k, y_k, using at least n + 1 samples.
1) Update vector error: e_k = y_k - T(u_k) ŵ_k.
2) Update weight vector: ŵ_{k+1} = ŵ_k + K T(u_k) e_k.
}
where T(u_k) in the above is formed from Equation (5). For L weights, the algorithm also has O(L^2) operations.
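A corresponding sketch of Algorithm 4.1 under the same hypothetical three-weight setup used for Algorithm 3.1 (illustrative values, not the paper's); note the fixed positive gain and the extra T(u_k) in the forward-path update:

```python
import numpy as np

rng = np.random.default_rng(1)

def conv_matrix(u):
    # Lower-triangular Toeplitz convolution matrix of Equation (5),
    # buffer u ordered oldest sample first.
    n1 = len(u)
    return np.array([[u[i - j] if j <= i else 0.0 for j in range(n1)]
                     for i in range(n1)])

w_true = np.array([0.8, -0.4, 0.2])   # hypothetical unknown weights
n1 = len(w_true)
K = 0.2                               # fixed positive gain: no sign switching
w_hat = np.zeros(n1)

u_all = rng.standard_normal(2000)
for k in range(n1 - 1, len(u_all)):
    u = u_all[k - n1 + 1:k + 1]
    T = conv_matrix(u)
    y = T @ w_true
    e = y - T @ w_hat                 # 1) update vector error
    w_hat = w_hat + K * (T @ e)       # 2) T(u_k) in the forward path too

print(w_hat)                          # approaches w_true
```

Because the loop matrix now involves T^2(u_k), the repeated diagonal element is u^2 and the gain condition (27) holds for positive K whatever the input sign.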

Illustrative Examples
Example 1: Consider an FIR system with three unknown weights. Let the system be driven by unity-variance white noise which is dc-free. Select a forward-path gain (and LMS step-size) of K = 0.2. It can be seen that the new algorithms exhibit overshoot, but the convergence time is very similar. Algorithm 4.1 has around 90% more overshoot than Algorithm 3.1. If we compare a norm of the weight error for each case, we see the comparison in Figure 6. LMS gives the fastest performance of the three algorithms. The norm is taken rather than the mean-square error because the new algorithms have vector-based errors, as opposed to the scalar error of LMS.
If we add zero-mean uncorrelated white measurement noise for an SNR of 12 dB and repeat the above simulation, we get some interesting results. In Figures 7-9 it is seen that the LMS estimates are not as smooth as those of the new algorithms.
The smooth convergence is illustrated in Figure 10 by comparing the weight-error norms. The LMS norm shows large fluctuations compared with the feedback algorithms, which give similarly much smoother performance. Of course, the LMS case can be much improved by lowering the gain; Figure 11 compares the weight-error norms for a gain of K = 0.05.
LMS is the superior of the three if convergence rate is sacrificed, but it still has noisy fluctuations in the weight-error norm.
We can conclude from this example that the new algorithms give smoother weight estimates than LMS, but LMS can outperform the new algorithms when the measurement noise is zero, provided the loop gain is not too high.
Example 2: It is well established that ordinary LMS has problems when the eigenvalue spread of the correlation matrix is large [2]. This leads to slow convergence, and no amount of increasing the step-size will help, since instability will result. Consider a system with zero measurement noise and two weights to be estimated, driven by coloured noise u_k obtained by passing a white-noise signal ε_k of variance 0.003025 through an autoregressive filter of order two.
The parameters a_1 and a_2 of the autoregressive model are chosen to make the eigenvalue spread of the correlation matrix χ(R_uu) = 100 [2]. (For the previous white-noise example the spread was unity.) The driving white noise ε_k is suitably scaled so that the variance of the filtered signal u_k is unity. The step-size (gain) of the loop was initially set to K = 0.2, which was the maximum step-size that the LMS algorithm could tolerate without becoming unstable. Figure 12 shows the LMS estimates at this maximum step-size.
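The eigenvalue-spread calculation can be sketched as follows. The AR(2) coefficients below are illustrative placeholders (the paper's values are not recoverable from the text), so the resulting spread is not the paper's χ(R_uu) = 100; for a two-weight filter the spread is (1 + ρ_1)/(1 − ρ_1), where ρ_1 is the lag-one correlation coefficient of the input:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical AR(2) colouring filter: u[k] = 1.6 u[k-1] - 0.8 u[k-2] + e[k].
# These coefficients are illustrative, not the ones used in the paper.
a1, a2 = -1.6, 0.8
N = 200_000
e = rng.standard_normal(N)
u = np.zeros(N)
for k in range(2, N):
    u[k] = -a1 * u[k - 1] - a2 * u[k - 2] + e[k]
u /= u.std()                       # scale the filtered signal to unit variance

# 2x2 input correlation matrix for a two-weight adaptive filter
r0 = np.mean(u * u)
r1 = np.mean(u[1:] * u[:-1])
R = np.array([[r0, r1], [r1, r0]])
eigs = np.linalg.eigvalsh(R)
print(eigs.max() / eigs.min())     # eigenvalue spread chi(R_uu)
```

For these placeholder coefficients the theoretical lag-one correlation is ρ_1 = 1.6/1.8 ≈ 0.889, giving a spread of about 17; choosing ρ_1 closer to 1 drives the spread towards 100.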

The LMS algorithm takes about 1000 steps to converge. If the step-size is increased further, the LMS algorithm becomes unstable, whereas Figure 13 shows that Algorithm 3.1 is perfectly stable with gain to spare.
The weight-error norms are compared for the same step-size in Figure 14. In order to make a fair comparison with LMS, it should be pointed out that Algorithm 3.1 in Figure 14 does not use its maximum step-size (gain). A comparison of Algorithms 3.1 and 4.1 is shown in Figure 15 for this example when the gain (step-size) of the loop is increased to K = 10.
The LMS case is not shown in Figure 15 since it becomes unstable. Although Algorithm 4.1 has a much slower convergence rate than Algorithm 3.1, it is still at least five times as fast as the fastest achievable LMS (by comparison with Figure 12). The reason for the significant increase in convergence rate is that the new methods do not use a correlation matrix at all, and hence there is no equivalent inversion of such a matrix as in ordinary least-squares problems. Algorithm 3.1 is around 100 times faster than LMS for the large eigenvalue-spread case.
It is worth comparing with the RLS algorithm for this case (Figure 16). An initial error-covariance matrix of diag{1000, 1000} (for two parameters) was used in order to obtain fast convergence. Unlike LMS, RLS is known not to be sensitive to the correlation-matrix eigenvalue spread [2].
Algorithm 4.1 and LMS (not shown) are much slower than RLS, but Algorithm 3.1 is about twice as fast as RLS, as shown in Figure 16. The gain of Algorithm 3.1 was adjusted in order to achieve the fastest convergence.
Example 3: The problem of an input signal which is not persistently exciting.
Consider the non-minimum phase system

 
The system has an FIR equivalent obtained from a Taylor-series expansion of its transfer function, from which we find the first six weights of approximation. A steady dc (unit-step) signal is fed to the input of the system, and in steady state a comparison of various recursive estimation schemes was made. Note that the input here is essentially a step input and not dc per se, but nevertheless some algorithms have difficulty with this type of driving signal.
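The Taylor-series (long-division) construction of an FIR equivalent can be sketched with a stand-in transfer function, since the paper's system is not recoverable from this text. The hypothetical H(z) = (1 − 2z^{-1})/(1 − 0.5z^{-1}) is non-minimum phase (its zero lies outside the unit circle):

```python
import numpy as np

# Hypothetical non-minimum-phase system standing in for the paper's
# unspecified transfer function: H(z) = (1 - 2 z^-1) / (1 - 0.5 z^-1).
b = np.array([1.0, -2.0])     # numerator coefficients in powers of z^-1
a = np.array([1.0, -0.5])     # denominator coefficients in powers of z^-1

def fir_equivalent(b, a, n_weights):
    """First n_weights terms of the impulse response of b/a, i.e. the
    power-series (Taylor) expansion obtained by polynomial long division."""
    h = np.zeros(n_weights)
    for k in range(n_weights):
        acc = b[k] if k < len(b) else 0.0
        for j in range(1, min(k, len(a) - 1) + 1):
            acc -= a[j] * h[k - j]
        h[k] = acc / a[0]
    return h

h = fir_equivalent(b, a, 6)
print(h)   # [ 1.  -1.5  -0.75  -0.375  -0.1875  -0.09375]
```

Truncating this expansion to a finite number of weights gives the FIR model that the recursive estimators are asked to identify.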
It was found, for four estimated weights, that the new methods of Algorithms 3.1 and 4.1 both converged to the exact values.

Conclusion
Two new algorithms have been demonstrated which use feedback instead of correlation methods to estimate the weights of an unknown FIR system. It has been shown that the new algorithms give much smoother estimates of the weights than ordinary LMS. Beyond that, there is little difference until a driving signal is used whose correlation matrix has widely dispersed eigenvalues. Under such conditions, Algorithm 4.1 has at least five-times-faster convergence, and Algorithm 3.1 is about one hundred times faster than ordinary LMS. The disadvantage of Algorithm 3.1 is that the gain needs to be switched in sign as the sign of the first input sample changes. Both of the new algorithms have the property that the gain limit is not dependent on the number of estimated weights.

Appendix. State-Space Description
We can examine the more general problem, when the input is time-varying but the weights are constant, by writing the algorithm in state-space form. For Algorithm 3.1 we have

ŵ_{k+1} = ŵ_k + K (y_k - T(u_k) ŵ_k)   (1)

y_k = T(u_k) w.   (2)

The time-varying matrix I - K T(u_k) must have eigenvalues which lie within the unit circle on the z-plane, giving

|1 - K u_{k-n}| < 1   (3)

which is the same stability limit on the gain K as found in the main text.
Equation (3) clearly poses no problem provided u_{k-n} and K are always positive, but when u_{k-n} becomes negative we must change the sign of K, making K = K_0 sign(u_{k-n}), where K_0 > 0 is always positive. If the true weights are constant and there is zero measurement noise, then using (2) we can define the weight-error vector w̃_k = w - ŵ_k and write (1) in weight-error format as the homogeneous vector difference equation

w̃_{k+1} = (I - K T(u_k)) w̃_k.   (6)

Now, for some initial condition error w̃_0, (6) has the solution

w̃_k = [ prod_{j=0}^{k-1} (I - K T(u_j)) ] w̃_0.   (7)

Now write the lower-triangular matrix as I - K T(u_j) = D_j + N_j, where D_j = (1 - K u_{j-n}) I holds the repeated diagonal elements and N_j is strictly lower-triangular. Due to its sparsity, such a strictly lower-triangular matrix, when raised to the same power as its dimension, is always zero, i.e. N^{n+1} = 0.
Expanding (7) using the binomial theorem, every surviving term contains a power of the repeated diagonal factor (1 - K u_{j-n}), which, provided (3) holds, dies out for large k, taking the weight error with it. Hence ŵ_k → w.
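The nilpotency argument can be checked numerically. For a hypothetical 3-sample buffer (illustrative values), the strictly lower-triangular part N of I − KT(u) satisfies N^{n+1} = 0, and when the repeated diagonal factor satisfies the stability condition the matrix power decays to zero:

```python
import numpy as np

u = np.array([1.2, 0.7, -0.3])            # hypothetical 3-sample buffer
K = 0.5
n1 = len(u)
# Lower-triangular Toeplitz convolution matrix of Equation (5)
T = np.array([[u[i - j] if j <= i else 0.0 for j in range(n1)]
              for i in range(n1)])
A = np.eye(n1) - K * T                    # closed-loop error matrix of (6)
N = np.tril(A, -1)                        # strictly lower-triangular part
print(np.linalg.matrix_power(N, n1))      # the zero matrix: N^(n+1) = 0
# Here |1 - K*u[0]| = 0.4 < 1, so the repeated diagonal eigenvalue decays
# and the weight error A^k w0 vanishes for large k:
print(np.linalg.matrix_power(A, 50))      # effectively zero
```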
The same argument carries over to Algorithm 4.1, where the product T^2(u_j) keeps the effective gain always positive. Furthermore, from (25), for constant weights we can also show that the weights converge asymptotically to their true values, ŵ_k → w as k → ∞. This contrasts with the LMS algorithm, whose condition for convergence in the mean square depends on the input variance and the number of weights.

Figures 3-5 show the weight convergence of Algorithms 3.1, 4.1 and LMS respectively for zero additive measurement noise.

Figure 5. Weight convergence for the LMS algorithm. No measurement noise.

Figure 6. Comparison of weight-error norm for zero measurement noise.

Figure 11. Comparison of weight-error norm for 12 dB measurement noise and gain reduction K = 0.05.

Figure 12. Weight convergence for the LMS algorithm with χ(R_uu) = 100.

Figure 15. Weight-error norm for Algorithm 4.1 and Algorithm 3.1, K = 10 and χ(R_uu) = 100.

Figure 16. Weight-error norm comparison with RLS.