Iterative Reweighted l1 Penalty Regression Approach for Line Spectral Estimation

In this paper, we propose an iterative reweighted l1 penalty regression approach to solve the line spectral estimation problem. In each iteration, we first use the idea of the Bayesian lasso to update the sparse vector; the derivative of the penalty function forms the regularization weight. We choose an inverse trigonometric (arctangent) function as the penalty to approximate the l0 norm. We then use the gradient descent method to update the dictionary parameters. Theoretical analysis and simulation results demonstrate the effectiveness of the method and show that the proposed algorithm outperforms other state-of-the-art methods in many practical cases.


Introduction
Spectral estimation technology is widely used in electronic countermeasures, radar, sonar, and mobile communication. In this paper, we mainly consider line spectral estimation in the compressed sensing setting. When the line spectral estimation problem is solved with a pre-specified discrete Fourier transform matrix, the sparse solution obtained may not be close to the true sparse vector if the true frequency components do not lie on the pre-specified frequency grid. This error, referred to as grid mismatch, results in performance degradation or even recovery failure. Therefore, in this paper, we treat the dictionary parameters as unknown variables along with the sparse signal, and optimize the dictionary parameters while estimating the sparse vector in an iterative fashion.
Rather than applying traditional compressed sensing theory, an increasing number of scholars have concentrated on the grid mismatch problem instead. For example, work [1] focused on the impact of basis mismatch on the reconstruction error by treating the error as a perturbation between the presumed and actual dictionaries. In work [2], to handle the grid mismatch, the true dictionary is approximated as the sum of a presumed dictionary and a structured parametrized matrix via a Taylor expansion. A highly coherent dictionary was used to approximate the real dictionary in [3], and a class of greedy algorithms using the technique of band exclusion was proposed. On the other hand, in [4] [5] [6], the grid mismatch problem was studied by proposing an atomic norm minimization approach that handles an infinite dictionary with continuous atoms. Bayesian statistics has also been applied to the grid mismatch problem. In works [7] and [8], Bayesian approaches were proposed to iteratively refine the dictionary by treating the sparse signals as hidden variables.
Work [8] used a generalized expectation-maximization (EM) algorithm to estimate the dictionary parameters and determine the sparse vector. Work [9] proved that the compressed sensing problem with a logarithmic penalty can be transformed into an iterative reweighted l2 norm regression problem by constructing a suitable surrogate function.
In this paper, we analyze the first-order optimality condition of the original problem and prove that the problem can be transformed into a series of reweighted lasso [10] problems solved iteratively. In each iteration, the derivative of the inverse trigonometric penalty function forms the weight of the l1 norm. Compared with the algorithms proposed in [9] and [11], our method is more flexible in the choice of penalty function and in how the weights are computed; in addition, the l1 norm promotes sparsity better than the l2 norm. However, it is well known that there is no explicit solution to the l1 penalty regression problem. In this study, we use a Bayesian lasso approach to determine the optimal solution at each step.
The remainder of the paper is organized as follows. Section 2 describes the line spectral estimation problem, which we formulate as a penalized least squares problem with unknown dictionary parameters. In Section 3, we provide a theoretical analysis and propose the iterative reweighted l1 algorithm. In Section 4, we present several sets of numerical experiments demonstrating that the iterative reweighted l1 method outperforms other state-of-the-art algorithms in many cases. Section 5 concludes the paper and outlines ideas for future work.

Line Spectral Estimation
Consider the line spectral estimation problem in which the observed signal is a summation of a number of complex sinusoids:

yₙ = Σₖ₌₁ᴷ βₖ e^{jθₖn} + wₙ, n = 0, …, N − 1,   (1)

or, in matrix form,

y = A(θ)β + w,   (2)

where y ∈ Cᴺ is the observed signal, w is additive noise, β ∈ Cᴷ is the sparse coefficient vector whose non-zero entries correspond to the active frequencies, and A(θ) = [a(θ₁), …, a(θ_K)] is the dictionary whose k-th column a(θₖ) = [1, e^{jθₖ}, …, e^{j(N−1)θₖ}]ᵀ is a complex sinusoid at frequency θₖ.
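As a concrete illustration, the model above can be simulated as follows. This is a minimal sketch with hypothetical names (nothing here is from the paper's own code), using the sizes from the simulation section (K = 64, N = 20, S = 3):

```python
import numpy as np

def steering_matrix(thetas, N):
    """Dictionary A(theta): column k is a complex sinusoid at frequency thetas[k]."""
    n = np.arange(N)[:, None]                 # sample indices 0..N-1
    return np.exp(1j * n * thetas[None, :])   # shape (N, K)

rng = np.random.default_rng(0)
N, K, S = 20, 64, 3                           # measurements, grid size, sparsity
grid = 2 * np.pi * np.arange(K) / K           # pre-specified frequency grid
beta = np.zeros(K, dtype=complex)
support = rng.choice(K, size=S, replace=False)
beta[support] = rng.standard_normal(S) + 1j * rng.standard_normal(S)
A = steering_matrix(grid, N)
w = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = A @ beta + w                              # observation, Equation (2)
```

When the true frequencies fall off this grid, the grid mismatch discussed in the introduction appears, which is why θ is later treated as an unknown to be refined.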

Penalty Least Squares Regression
In the process of signal reconstruction, the number of measurements N is much smaller than the dictionary size K (N ≪ K). Since the signal is sparse, Equation (2) can be transformed into the optimization problem

min_{β,θ} ||β||₀  subject to  ||y − A(θ)β||₂ ≤ ε,   (3)

where ||β||₀ stands for the number of non-zero components of β. The optimization (3), however, is an NP-hard problem (no solution can be found in polynomial time in general). We can transform optimization (3) into a penalized least squares problem,

min_{β,θ} ||y − A(θ)β||₂²  subject to  ||β||₀ ≤ S.   (4)

The optimization (4) can be formulated as an unconstrained optimization problem by removing the constraint and adding a penalty term to the objective function,

min_{β,θ} ||y − A(θ)β||₂² + λ Σᵢ F(|βᵢ|),   (5)

where λ represents the adjustable penalty parameter and F is a sparsity-promoting penalty. Different penalty functions form different regularization weights in the iterative process. We find that the inverse trigonometric penalty function has better properties than other common penalty functions such as the logarithmic penalty.
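The sense in which an inverse trigonometric penalty approximates the l0 norm can be seen numerically. The sketch below assumes the normalized form (2/π)·arctan(x/φ), one common choice (the paper's exact normalization is not printed here): the penalty is 0 at x = 0 and tends to 1 for every non-zero x as φ → 0, so its sum over components tends to ||β||₀.

```python
import numpy as np

def arctan_penalty(x, phi):
    # Inverse trigonometric penalty: 0 at x = 0, and -> 1 for x != 0 as phi -> 0,
    # so sum_i arctan_penalty(beta_i, phi) approximates ||beta||_0.
    return (2 / np.pi) * np.arctan(np.abs(x) / phi)

x = np.array([0.0, 0.5, 1.0, 5.0])
for phi in (1.0, 0.1, 0.01):
    print(phi, np.round(arctan_penalty(x, phi), 3))
```

Unlike the logarithmic penalty, this function is bounded, which keeps the reweighting in later sections well behaved for large coefficients.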
In the next section, we propose an iterative reweighted l1 sparse algorithm with inverse trigonometric penalties.

Algorithm Description
We now develop an iterative reweighted algorithm for joint dictionary parameter learning and sparse signal recovery. Consider the line spectral estimation problem with the inverse trigonometric penalty:

min_{β,θ} G(β, θ) = ||y − A(θ)β||₂² + λ Σᵢ F(|βᵢ|),   (6)

where F(x) = arctan(x/φ) with an adjustable parameter φ > 0. We consider the first derivative of problem (6). Since the absolute value is involved, the penalty F(|βᵢ|) cannot be differentiated at zero; there its sub-gradient Cᵢ is a set of real numbers, and away from zero the first-order condition reads

2[A(θ)ᴴ(A(θ)β − y)]ᵢ + λ F′(|βᵢ|) sign(βᵢ) = 0, i = 1, …, K.   (7)

Given the iterate (βᵗ, θᵗ) and combining with (7), we estimate βᵗ⁺¹ by solving the following weighted lasso problem:

βᵗ⁺¹ = argmin_β ||y − A(θᵗ)β||₂² + λ Σᵢ F′(|βᵢᵗ|) |βᵢ|.   (10)

Here we use the idea of the Bayesian lasso [11] (which we briefly introduce later in this article) to find βᵗ⁺¹; the first-order condition of (10) can be summarized similarly to (7). The next step is to find θᵗ⁺¹. In this situation we do not need to calculate the optimal solution; instead we take a gradient descent step on the residual with β fixed at βᵗ⁺¹. The stopping condition of the algorithm is controlled by tolerance values ω₁, ω₂ on the changes in β and θ; in this paper we set both tolerances equal to 0.02 in the numerical simulations.
Based on the discussion above, we summarise our algorithm as follows: starting from the initial grid θ⁰ and β⁰ = 0, alternate the weighted lasso update (10) for β (solved via the Bayesian lasso) with a gradient descent update for θ, until the changes in β and θ fall below the tolerances ω₁, ω₂.

Theoretical Analysis
First, we want to prove that the objective function (6) is guaranteed to be non-increasing at each iteration:

G(βᵗ⁺¹, θᵗ⁺¹) ≤ G(βᵗ, θᵗ).

Since we obtain θᵗ⁺¹ by the gradient descent method, it is obvious that G(βᵗ⁺¹, θᵗ⁺¹) ≤ G(βᵗ⁺¹, θᵗ), so it suffices to show G(βᵗ⁺¹, θᵗ) ≤ G(βᵗ, θᵗ).

LEMMA: Given that the adjustable parameter φ > 0, for all x₁, x₂ ≥ 0 we have the following inequality:

F(x₁) ≤ F(x₂) + F′(x₂)(x₁ − x₂).   (15)

Proof. Let x₁, x₂ ≥ 0. By the mean value theorem, F(x₁) − F(x₂) = F′(ζ)(x₁ − x₂), where ζ lies between x₁ and x₂. Since F(x) is an increasing function and F′(x) is decreasing on [0, ∞) (F is concave there), the inequality F′(ζ)(x₁ − x₂) ≤ F′(x₂)(x₁ − x₂) always holds for any non-negative values x₁ and x₂, which proves (15). Letting x₁ = |βᵢᵗ⁺¹| and x₂ = |βᵢᵗ| gives the form we need.

Applying the lemma componentwise,

G(βᵗ⁺¹, θᵗ) ≤ ||y − A(θᵗ)βᵗ⁺¹||₂² + λ Σᵢ [ F(|βᵢᵗ|) + F′(|βᵢᵗ|)(|βᵢᵗ⁺¹| − |βᵢᵗ|) ].

Since βᵗ⁺¹ is the optimal solution of the weighted lasso problem (10), it satisfies

||y − A(θᵗ)βᵗ⁺¹||₂² + λ Σᵢ F′(|βᵢᵗ|) |βᵢᵗ⁺¹| ≤ ||y − A(θᵗ)βᵗ||₂² + λ Σᵢ F′(|βᵢᵗ|) |βᵢᵗ|,

where optimality at a component with βᵢᵗ⁺¹ = 0 is understood through the sub-gradient, whose set must contain 0 at the critical point. Substituting the second display into the first yields G(βᵗ⁺¹, θᵗ) ≤ G(βᵗ, θᵗ) in all cases. The discussion above proves that the function value is non-increasing at each iteration.

In addition, we want to illustrate that the limit of the iteration is a stationary point of the original problem. Comparing the first-order condition of the weighted lasso problem (10) with the first-order condition (7) of problem (6), the two coincide at a fixed point βᵗ⁺¹ = βᵗ, both when a component is non-zero and, in the sub-gradient sense, when it is zero. We can therefore summarize that the fixed point β̂ always satisfies the first-order condition (7), and thus the iterates approach a stationary point of (6).
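The tangent-line bound in the lemma is easy to verify numerically. The sketch below assumes the arctan form of the penalty from Section 3 and spot-checks the inequality at random non-negative points:

```python
import numpy as np

# Spot-check of the lemma: for the concave penalty F(x) = arctan(x / phi)
# on x >= 0, F(x1) <= F(x2) + F'(x2) * (x1 - x2) for all x1, x2 >= 0.
phi = 1.0
F = lambda x: np.arctan(x / phi)
dF = lambda x: (1 / phi) / (1 + (x / phi) ** 2)

rng = np.random.default_rng(2)
x1 = rng.uniform(0, 10, 1000)
x2 = rng.uniform(0, 10, 1000)
gap = F(x2) + dF(x2) * (x1 - x2) - F(x1)   # should be >= 0 everywhere
print(gap.min() >= -1e-12)                  # tangent lines lie above F
```

This concavity is exactly what lets the weighted lasso objective (10) majorize the penalized objective (6) and drive the descent argument above.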

The Bayesian Lasso
In this article we use the idea of the Bayesian lasso to estimate the optimal solution of problem (10).
Assume that the prior distribution of the parameter β is the Laplace distribution:

π(β) ∝ Πᵢ (λᵢ/2) exp(−λᵢ|βᵢ|),   (31)

with the weights F′(|βᵢᵗ|) of (10) absorbed into the per-component rates λᵢ. Combined with the likelihood function, we obtain the posterior probability

p(β | y) ∝ exp( −||y − A(θ)β||₂² / (2σ²) ) · Πᵢ exp(−λᵢ|βᵢ|).   (32)

Solving problem (10) is equivalent to maximizing this posterior probability, which we can do via Gibbs sampling [13].
Since the Laplace distribution does not directly yield tractable full conditional posterior distributions, the following integral representation (33) provides an effective solution:

(λ/2) exp(−λ|βᵢ|) = ∫₀^∞ N(βᵢ; 0, vᵢ) · (λ²/2) exp(−λ²vᵢ/2) dvᵢ.   (33)

Using this integral, we can rewrite the Laplace prior as a Gaussian scale mixture, which motivates the following hierarchical Bayesian lasso model:

y | β ~ N(A(θ)β, σ²I),  βᵢ | vᵢ ~ N(0, vᵢ),  vᵢ ~ Exp(λ²/2).

For Bayesian inference, the full conditional distribution of β given v and y is multivariate normal, and the full conditional distribution of each 1/vᵢ given βᵢ is inverse Gaussian. By repeated sampling, we form a Markov chain containing a series of points (β⁽¹⁾, v⁽¹⁾), (β⁽²⁾, v⁽²⁾), …; since each outer iteration runs such a chain, we obtain a long sample of β through the whole algorithm, from which the maximum a posteriori estimate of (10) is taken.
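A compact sketch of this Gibbs sampler in the Park-and-Casella style is given below. All names are hypothetical and the hierarchy is simplified (fixed noise variance, a single global rate λ rather than per-component weights), so it illustrates the sampling scheme rather than reproducing the paper's exact model:

```python
import numpy as np

def bayesian_lasso_gibbs(A, y, lam=1.0, sigma2=0.01, n_iter=200, rng=None):
    """Gibbs sampler for the hierarchical Bayesian lasso:
    beta_j | tau2_j ~ N(0, sigma2 * tau2_j), tau2_j ~ Exp(lam^2 / 2)."""
    rng = rng or np.random.default_rng(0)
    n, k = A.shape
    AtA, Aty = A.T @ A, A.T @ y
    inv_tau2 = np.ones(k)
    samples = []
    for _ in range(n_iter):
        # beta | tau2, y ~ N(M^{-1} A^T y, sigma2 * M^{-1}), M = A^T A + D^{-1}
        M = AtA + np.diag(inv_tau2)
        cov = sigma2 * np.linalg.inv(M)
        mean = np.linalg.solve(M, Aty)
        beta = rng.multivariate_normal(mean, cov)
        # 1/tau2_j | beta ~ Inverse-Gaussian(sqrt(lam^2 sigma2 / beta_j^2), lam^2)
        mu = np.sqrt(lam ** 2 * sigma2 / np.maximum(beta ** 2, 1e-12))
        inv_tau2 = rng.wald(mu, lam ** 2)
        samples.append(beta)
    return np.array(samples)

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 10))
b_true = np.zeros(10)
b_true[[1, 6]] = [1.5, -2.0]
y = A @ b_true + 0.1 * rng.standard_normal(30)
chain = bayesian_lasso_gibbs(A, y, lam=1.0, sigma2=0.01, rng=rng)
est = chain[100:].mean(axis=0)   # posterior summary after burn-in
```

The chain of β samples is what the outer algorithm accumulates across iterations, as described above.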

Simulation Results
In this section, we carry out a series of experiments to illustrate the performance of our proposed l1 iterative reweighted algorithm (denoted as l1-IR). In our simulations, we compare our proposed algorithm with other existing state-of-the-art methods, including sparse Bayesian learning with dictionary refinement (denoted as DicRefCS) [7], sparse Bayesian learning with dictionary estimation (denoted as SBL-DE) [8], and especially the super-resolution iterative reweighted algorithm (denoted as SR-IR) [11].
In order to control the noise level in some of the experiments, we first define the observation quality via the peak-signal-to-noise ratio (PSNR),

PSNR = 10 log₁₀( maxₙ |yₙ|² / σ² ),

where σ² represents the variance of the noise. To compare recovery performance, we compute the reconstruction signal-to-noise ratio

10 log₁₀( ||β*||₂² / ||β̂ − β*||₂² ),

where β* represents the original sparse signal and β̂ represents the signal recovered by the algorithm. The parameters λ and φ have the same effect as regularization parameters; we choose φ = 1 and select the optimal λ by cross-validation. In the following, we examine the behaviour of the respective algorithms under different scenarios.

First, we control the noise level at PSNR = 20 and examine how the recovery SNR of the various algorithms changes with the sparseness S (the number of non-zero values in the original sparse signal). We keep K = 64, let S range from 1 to 10, and set the number of measurements to N = 20 in order to control the variables; each data point is averaged over repeated tests. Figure 1 indicates that the performance of the l1 iterative reweighted algorithm is better than that of the other state-of-the-art methods. With a low sampling number (N = 20), the performance of the DicRefCS and SBL-DE algorithms deteriorated quickly, and both failed when the sparseness S was higher than 5. On the other hand, our proposed algorithm still performed well, even when S reached 8. It is worth mentioning that the SR-IR algorithm also performed better than the other two Bayesian algorithms, but not as well as the l1-IR algorithm. The reasons are that the l1 norm promotes sparsity more strongly than the l2 norm and that we chose a better weight function.
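The two quality measures, as read here, can be sketched as follows (helper names hypothetical; the paper's exact PSNR normalization may differ):

```python
import numpy as np

def psnr_observation(y, sigma2):
    # Observation quality: peak signal power over noise variance, in dB.
    return 10 * np.log10(np.max(np.abs(y)) ** 2 / sigma2)

def recovery_snr(beta_true, beta_hat):
    # Reconstruction quality in dB; a trial counts as successful above 15 dB.
    err = np.linalg.norm(beta_hat - beta_true) ** 2
    return 10 * np.log10(np.linalg.norm(beta_true) ** 2 / err)

b = np.array([0.0, 2.0, 0.0, -1.0])
print(recovery_snr(b, b + 1e-2))   # small recovery error -> high SNR
```

The 15 dB success threshold used in the failure-rate experiment below corresponds to the squared recovery error being below about 3% of the signal energy.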
Next, we illustrate the influence of the sample size N on the recovery effect using another set of experiments. We keep K = 64 and the sparseness S = 3, and maintain the PSNR at 20. We vary the number of measurements N from 6 to 32. For each N, Figure 2 shows the performance of the respective algorithms.
It can be observed from Figure 2 that, as N increases, the recovery performance of all algorithms rises to a relatively high level, but our proposed algorithm outperforms the other methods when the number of measurements is small. In the last set of experiments, we fixed the other parameters and gradually increased the noise level. Relatively speaking, the l1-IR and SR-IR algorithms were more stable than the other algorithms as the noise level increased. Figure 3 indicates that the recovery SNR of all algorithms deteriorated quickly when the noise was very strong. In the case of relatively strong noise, some trials failed, and the failure rate increases with increasing noise level. Here, a trial was considered successful if the recovery SNR was higher than 15 dB. Table 1 shows the success rate of the algorithms for different levels of PSNR (at each level, we repeated 20 trials) in the last experiment. The success rates of DicRefCS and SBL-DE are clearly lower than those of l1-IR and SR-IR when σ² is larger than 0.2.
The validity, superiority, and stability of the l1-IR algorithm are illustrated by these experiments, indicating that the algorithm is worth applying in some practical cases.

Conclusion
In this paper, we treated the real dictionary parameters as unknown variables and studied the line spectral estimation problem with unknown dictionary parameters. Based on the idea of the Bayesian lasso and an analysis of the first-order condition of the optimal solution, we proposed an iterative reweighted l1 penalty regression algorithm. We proved that at each step of the iterative process the function value is non-increasing, until an approximate solution of the real sparse vector is obtained. The numerical results in Section 4 illustrated that the performance of our algorithm is better than that of other state-of-the-art algorithms in many cases. A disadvantage is that our method is more time-consuming, partly because convergence must be ensured at each sampling step, so the sampling length cannot be effectively reduced. Future studies will focus on this problem.

F. Ye et al., DOI: 10.4236/apm.2018.82008, Advances in Pure Mathematics


Table 1. Success rate of different algorithms.