An Efficient Adaptive Iteratively Reweighted $\ell_1$ Algorithm for Elastic $\ell_q$ Regularization

In this paper, we propose an efficient adaptive iteratively reweighted $\ell_1$ algorithm (A-IRL1 algorithm) for solving the elastic $\ell_q$ regularization problem. We prove that the sequence generated by the A-IRL1 algorithm is convergent for any rational $q \in (0,1)$ and that the limit is a critical point of the elastic $\ell_q$ regularization problem. Under certain conditions, we present an error bound for the limit point of the convergent sequence.


Introduction
Compressed sensing (CS) has emerged as a very active research field and has brought about great changes in signal processing in recent years [1] [2]. The main task of CS is the recovery of a sparse signal from a small number of linear measurements. It can be mathematically modeled as the following optimization problem,

$$\min_x \|x\|_0 \quad \text{subject to} \quad Ax = b, \qquad (1)$$

where $A \in \mathbb{R}^{m \times N}$ (commonly $m < N$) is a measurement matrix and $\|x\|_0$, formally called the $\ell_0$ quasi-norm, denotes the number of nonzero components of $x = (x_1, x_2, \ldots, x_N)^{\mathrm{T}}$. In general, it is difficult to tackle problem (1) due to its nonsmooth and nonconvex nature. In recent years, some researchers have proposed the $\ell_q$ norm regularization problem [3]-[5] with $0 < q \le 1$, that is, the constrained $\ell_q$ regularization problem

$$\min_x \|x\|_q^q \quad \text{subject to} \quad Ax = b, \qquad (2)$$

or the unconstrained $\ell_q$ regularization problem

$$\min_x \frac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_q^q, \qquad (3)$$

where $\|x\|_q^q = \sum_{i=1}^N |x_i|^q$ for $0 < q \le 1$ and $\lambda > 0$ is a regularization parameter.
When $q = 1$, it is well known that problems (2) and (3) are both convex optimization problems and can therefore be solved efficiently [6] [7]. On the other hand, when $0 < q < 1$, problems (2) and (3) are nonconvex, nonsmooth and even non-Lipschitz, and it is difficult to solve them quickly and efficiently. Iteratively reweighted algorithms, which include the iteratively reweighted $\ell_1$ algorithm [8] and iteratively reweighted least squares [9], are very effective for solving the nonconvex $\ell_q$ regularization problem.
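For the convex case $q = 1$, problem (3) is $\ell_1$-regularized least squares, which proximal-gradient (ISTA-type) methods solve via soft-thresholding. The following minimal sketch is our own illustration, not an algorithm from this paper; all function names are ours.

```python
import numpy as np

def soft_threshold(v, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_l1(A, b, lam, num_iters=500):
    """Solve min_x 0.5*||Ax - b||_2^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth gradient
    t = 1.0 / L                        # step size guaranteeing monotone descent
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)       # gradient of the least-squares term
        x = soft_threshold(x - t * grad, t * lam)
    return x

if __name__ == "__main__":
    # Small demo on a random compressed-sensing instance (dimensions are ours).
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 50))
    x_true = np.zeros(50)
    x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]   # 3-sparse ground truth
    b = A @ x_true
    x_hat = ista_l1(A, b, lam=0.01, num_iters=5000)
    print("residual:", np.linalg.norm(A @ x_hat - b))
```

With $A = I$ the lasso solution is exactly the soft-thresholded data, which makes the iteration easy to sanity-check.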
In this paper, we consider the following elastic $\ell_q$ regularization problem,

$$\min_x \frac{1}{2}\|Ax - b\|_2^2 + \lambda_1 \|x\|_q^q + \lambda_2 \|x\|_2^2, \qquad (4)$$

where $\lambda_1, \lambda_2 > 0$ are two parameters. When $q = 1$, problem (4) reduces to the well-known elastic-net regularization proposed by Zou and Hastie [10], which is an effective method for variable selection. In [10], Zou and Hastie showed that this method outperforms the Lasso [11] in terms of prediction accuracy in both simulation studies and real-data applications of variable selection. For further statistical properties of elastic-net regularization, we refer to [12] [13]. When $0 < q < 1$, problem (4) is an extension of elastic-net regularization from the $\ell_1$ penalty to the $\ell_q$ penalty. In statistics, elastic $\ell_q$ regularization is usually very effective for group variable selection.
Obviously, for $0 < q < 1$, the $\ell_q$ norm term in (4) is not differentiable at zero. Therefore, in this paper, we study the following relaxed elastic $\ell_q$ minimization problem with $0 < q < 1$ and smoothing parameter $\varepsilon > 0$,

$$\min_x F_\varepsilon(x) := \frac{1}{2}\|Ax - b\|_2^2 + \lambda_1 \sum_{i=1}^{N} (|x_i| + \varepsilon)^q + \lambda_2 \|x\|_2^2. \qquad (5)$$

The model (5) can be considered as an approximation of the model (4) as $\varepsilon \to 0$. In order to solve problem (5), we propose the following adaptive iteratively reweighted $\ell_1$ minimization algorithm (A-IRL1 algorithm),

$$x^{k+1} = \arg\min_x \frac{1}{2}\|Ax - b\|_2^2 + \lambda_1 \sum_{i=1}^{N} w_i^k |x_i| + \lambda_2 \|x\|_2^2, \qquad (6)$$

where the weight $w_i^k = q(|x_i^k| + \varepsilon_k)^{q-1}$ is defined by the previous iterate and is updated in each iteration. The adaptive update of $\varepsilon_k$ in the proposed algorithm is the same as the one in [9], which is also adopted in [14]. The A-IRL1 algorithm (6) solves a convex $\ell_2$-$\ell_1$ minimization problem, which can be handled by many efficient algorithms [6] [7] [15].
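The subproblem (6) is a weighted elastic-net problem whose proximal step is available in closed form: a gradient step on the least-squares term, then entrywise weighted soft-thresholding combined with shrinkage from the $\ell_2$ term. Below is a hedged sketch of one such solve, assuming the weight formula $w_i^k = q(|x_i^k| + \varepsilon_k)^{q-1}$ (our reading of the linearization of $(|x_i| + \varepsilon)^q$); function names and parameters are ours.

```python
import numpy as np

def solve_weighted_subproblem(A, b, w, lam1, lam2, x0, num_iters=300):
    """Proximal-gradient sketch of one A-IRL1 subproblem of the form (6):
       min_x 0.5*||Ax - b||_2^2 + lam1*sum_i w_i*|x_i| + lam2*||x||_2^2."""
    L = np.linalg.norm(A, 2) ** 2
    t = 1.0 / L
    x = x0.copy()
    for _ in range(num_iters):
        v = x - t * (A.T @ (A @ x - b))   # gradient step on the least-squares term
        # Closed-form prox of t*(lam1*w_i*|z| + lam2*z^2), applied entrywise:
        # weighted soft-threshold, then shrink by the quadratic term.
        x = np.sign(v) * np.maximum(np.abs(v) - t * lam1 * w, 0.0) / (1.0 + 2.0 * t * lam2)
    return x

def weights(x, q, eps):
    """Reweighting w_i = q*(|x_i| + eps)^(q-1), the derivative of (|x_i|+eps)^q."""
    return q * (np.abs(x) + eps) ** (q - 1.0)
```

The prox formula follows from the scalar optimality condition: for step $t$, the minimizer of $\frac{1}{2t}(z - v)^2 + \lambda_1 w |z| + \lambda_2 z^2$ is $\mathrm{soft}(v, t\lambda_1 w)/(1 + 2t\lambda_2)$.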
The relaxed elastic $\ell_q$ regularization problem (5) can be solved by the A-IRL1 algorithm (6). In this paper, we prove that any sequence generated by the A-IRL1 algorithm (6) is convergent for any rational $q \in (0,1)$. Moreover, we present an error bound between the limit point and the sparse solution of problem (1).
The rest of this paper is organized as follows. In Section 2, we summarize the A-IRL1 algorithm for solving the elastic $\ell_q$ regularization problem (5). In Section 3, we present a detailed convergence analysis for the A-IRL1 algorithm (6): we prove that the A-IRL1 algorithm is convergent for any rational $q \in (0,1)$, based on an algebraic method, provided $\varepsilon^* > 0$. Furthermore, under certain conditions, we present an error bound between the limit point and the sparse solution of problem (1). Finally, a conclusion is given in Section 4.

A-IRL1 Algorithm for Solving Elastic $\ell_q$ Regularization
We give a detailed implementation of the A-IRL1 algorithm (6) for solving the elastic $\ell_q$ regularization problem (5). The algorithm is summarized as Algorithm 1.
In Algorithm 1, $r(x)$ is the nonincreasing rearrangement of the absolute values of the components of $x$, and the smoothing parameter is updated by $\varepsilon_{k+1} = \min\{\varepsilon_k, r(x^{k+1})_{s+1}/N\}$. If $\varepsilon_{k+1} = 0$, then $x^{k+1}$ is $s$-sparse; we take $x^{k+1}$ to be the approximate sparse solution and stop the iteration. Otherwise, we stop the algorithm within a reasonable time and return the last iterate. By construction, $\{\varepsilon_k\}$ is a nonincreasing sequence which converges to some nonnegative number $\varepsilon^*$. In the next section, we prove that the sequence $\{x^k\}$ is convergent when $\varepsilon^* > 0$, and that the limit is a critical point of problem (5) with $\varepsilon = \varepsilon^*$. Furthermore, we also present an error bound for the limit point.
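Algorithm 1 can be sketched as an outer reweighting loop around the convex subproblem, together with the adaptive nonincreasing update of $\varepsilon_k$. This is an illustrative sketch, not the authors' code: the update $\varepsilon_{k+1} = \min\{\varepsilon_k, r(x^{k+1})_{s+1}/N\}$ is our reading of the rule of [9], and all parameter defaults and names are ours.

```python
import numpy as np

def a_irl1(A, b, q=0.5, lam1=1e-3, lam2=1e-4, s=3, num_outer=30, num_inner=200):
    """Sketch of the A-IRL1 outer loop: reweight, solve the convex subproblem
    by proximal gradient, then shrink the smoothing parameter eps."""
    m, N = A.shape
    L = np.linalg.norm(A, 2) ** 2
    t = 1.0 / L
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # feasible start with A x^0 = b
    eps = 1.0                                  # eps_0 = 1, as assumed in the paper
    eps_hist = [eps]
    for _ in range(num_outer):
        w = q * (np.abs(x) + eps) ** (q - 1.0)     # weights from previous iterate
        for _ in range(num_inner):                 # proximal gradient on subproblem (6)
            v = x - t * (A.T @ (A @ x - b))
            x = np.sign(v) * np.maximum(np.abs(v) - t * lam1 * w, 0.0) / (1.0 + 2.0 * t * lam2)
        r = np.sort(np.abs(x))[::-1]               # nonincreasing rearrangement r(x)
        eps = min(eps, r[s] / N)                   # adaptive update; r[s] is the (s+1)-st largest
        eps_hist.append(eps)
        if eps == 0.0:                             # x is s-sparse: stop with the sparse solution
            break
    return x, eps_hist
```

By construction the recorded sequence `eps_hist` is nonincreasing, mirroring the monotonicity of $\{\varepsilon_k\}$ used in the convergence analysis.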

Convergence of Algorithm 1
In this section, we first prove that the sequence $\{x^k\}$ generated by Algorithm 1 is bounded and asymptotically regular. Then, based on an algebraic method, we prove that Algorithm 1 is convergent for any rational $q \in (0,1)$ provided $\varepsilon^* > 0$. We begin with the following inequality.
Lemma 1. Given $0 < q < 1$ and $\varepsilon > 0$, for any $t_1, t_2 \ge 0$ we have

$$(t_1 + \varepsilon)^q \le (t_2 + \varepsilon)^q + q(t_2 + \varepsilon)^{q-1}(t_1 - t_2). \qquad (7)$$

Proof. We first define $h(t) = (t + \varepsilon)^q$ for $t \ge 0$. For any $t_1, t_2 > 0$, by the mean value theorem, we have

$$h(t_1) - h(t_2) = q(\xi + \varepsilon)^{q-1}(t_1 - t_2), \qquad (8)$$

where $\xi$ lies between $t_1$ and $t_2$. Since $0 < q < 1$, the map $t \mapsto q(t + \varepsilon)^{q-1}$ is decreasing, so the inequality

$$q(\xi + \varepsilon)^{q-1}(t_1 - t_2) \le q(t_2 + \varepsilon)^{q-1}(t_1 - t_2) \qquad (9)$$

always holds, whether $t_1 \ge t_2$ or $t_1 < t_2$. After rearranging the terms of (9), we thus get the desired inequality (7).
Our next result shows the monotonicity of the objective values $F_{\varepsilon_k}(x^k)$ along the iterates, and that the sequence $\{x^k\}$ is asymptotically regular.

Lemma 2. Let $\{x^k\}$ be the sequence generated by Algorithm 1. Then

$$F_{\varepsilon_{k+1}}(x^{k+1}) \le F_{\varepsilon_k}(x^k), \qquad (10)$$

and

$$\lim_{k \to \infty} \|x^{k+1} - x^k\|_2 = 0. \qquad (11)$$

Proof. Since $x^{k+1}$ is a solution of problem (6), we have

$$\frac{1}{2}\|Ax^{k+1} - b\|_2^2 + \lambda_1 \sum_{i=1}^N w_i^k |x_i^{k+1}| + \lambda_2 \|x^{k+1}\|_2^2 \le \frac{1}{2}\|Ax^k - b\|_2^2 + \lambda_1 \sum_{i=1}^N w_i^k |x_i^k| + \lambda_2 \|x^k\|_2^2.$$

Besides, applying Lemma 1 with $t_1 = |x_i^{k+1}|$ and $t_2 = |x_i^k|$ shows that the weighted $\ell_1$ terms majorize the differences of the smoothed $\ell_q$ terms. Combining the two bounds with the fact that $\{\varepsilon_k\}$ is nonincreasing yields (10). Moreover, since the subproblem objective in (6) is strongly convex due to the $\lambda_2$ term, the decrease at each step is at least $\lambda_2 \|x^{k+1} - x^k\|_2^2$; summing this over $k$ and using the boundedness from below of $F_{\varepsilon_k}(x^k)$ yields (11).
From Lemma 2 and (10), we know that the sequence $\{F_{\varepsilon_k}(x^k)\}$ is monotonically decreasing and bounded below, and hence the sequence $\{x^k\}$ is also bounded. On the other hand, from (11) we obtain that the sequence $\{x^k\}$ is asymptotically regular. In order to prove that the whole sequence generated by Algorithm 1 is convergent, we need the following lemma, which plays an important role in the proof of convergence. It states that for almost every system of $n$ polynomial equations in $n$ complex variables, if the corresponding highest-order system of equations has only the trivial solution, then the $n$ polynomial equations have only finitely many solutions. For a detailed proof, refer to Theorem 3.1 in [16].
Lemma 3 ([16]). Let $n$ polynomial equations in $n$ complex variables, $P_i(z) = a_i$, $i = 1, \ldots, n$, be given, and let $Q(z) = c$ be the corresponding highest-order system of equations. If $Q(z) = 0$ has only the trivial solution $z = 0$, then the system $P_i(z) = a_i$, $i = 1, \ldots, n$, has at most $\prod_{i=1}^n q_i$ solutions, where $q_i$ is the degree of $P_i$.

With the above lemmas, we are now in a position to present the convergence of Algorithm 1 for any rational number $q \in (0,1)$. Let $\{x^{k_j}\}$ be any convergent subsequence of $\{x^k\}$ with limit $x^*$. By (11), we know that the sequence $\{x^{k_j+1}\}$ converges to the same limit. Writing down the optimality conditions of (6) along the subsequence and letting $j \to \infty$ yields

$$0 \in A^{\mathrm{T}}(Ax^* - b) + \lambda_1 q \sum_{i=1}^N (|x_i^*| + \varepsilon^*)^{q-1} \partial |x_i^*| + 2\lambda_2 x^*, \qquad (18)$$

where $x^* = (x_1^*, \ldots, x_N^*)^{\mathrm{T}}$. Equation (18) demonstrates that the limit of any convergent subsequence of $\{x^k\}$ is a critical point of (5) with $\varepsilon = \varepsilon^*$ and, after a suitable reordering of the components, has the form $(x_1^*, \ldots, x_s^*, 0, \ldots, 0)$.
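As a toy illustration of how Lemma 3 operates (our own example, not taken from [16]), consider two quadratic equations in two complex variables:

```latex
% Two degree-2 polynomial equations in two complex variables:
\[
  P_1(z_1, z_2) = z_1^2 + z_2 = a_1, \qquad
  P_2(z_1, z_2) = z_2^2 - z_1 = a_2 .
\]
% The corresponding highest-order system keeps only the degree-2 terms:
\[
  Q_1(z_1, z_2) = z_1^2 = 0, \qquad
  Q_2(z_1, z_2) = z_2^2 = 0 ,
\]
% which has only the trivial solution z_1 = z_2 = 0. Lemma 3 then
% guarantees at most q_1 q_2 = 2 * 2 = 4 solutions, the Bezout bound.
```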

Here $x_S^*$ denotes the subvector of $x^*$ with components restricted to $S$, and $A_S$ denotes the submatrix of $A$ with columns restricted to $S$. Next, if we prove that Equation (21) has finitely many solutions, then the set $\Omega_s$ is a finite set. On the support $S$, the criticality condition reads

$$A_S^{\mathrm{T}}(A_S x_S^* - b) + \lambda_1 q\, \mathrm{sign}(x_S^*) \circ (|x_S^*| + \varepsilon^*)^{q-1} + 2\lambda_2 x_S^* = 0, \qquad (21)$$

where $\circ$ denotes the entrywise product. It is clear that (21) can be rewritten as

$$(A_S^{\mathrm{T}} A_S + 2\lambda_2 I_S)\, x_S^* + \lambda_1 q\, \mathrm{sign}(x_S^*) \circ (|x_S^*| + \varepsilon^*)^{q-1} = A_S^{\mathrm{T}} b, \qquad (22)$$

where $I_S$ is the $s \times s$ identity matrix. We note that (22) can further be rewritten as

$$D \left[ (A_S^{\mathrm{T}} A_S + 2\lambda_2 I_S)\, x_S^* - A_S^{\mathrm{T}} b \right] + \lambda_1 q\, \mathrm{sign}(x_S^*) = 0, \qquad (24)$$

where $D$ is an $s \times s$ diagonal matrix with diagonal entries $(|x_i^*| + \varepsilon^*)^{1-q}$, $i \in S$. Since all the solutions of Equation (21) satisfy (24), Equation (21) has finitely many solutions as long as we can prove that (24) has finitely many solutions. To do that, we write the rational exponent as $q = a/b$ with positive integers $a < b$; substituting new variables with $u_i^b = |x_i^*| + \varepsilon^*$ turns (24) into a system of polynomial equations, and it suffices to show that this polynomial system (25) has finitely many solutions. Now, we extract the highest-order terms from system (25) to get the corresponding system (26). To prove that system (26) has only the trivial solution, we argue by contradiction: without loss of generality, we assume that $z = (z_1, \ldots, z_s, 0, \ldots, 0) \ne 0$ is a nonzero solution of (26).

Note that the matrix $A_S^{\mathrm{T}} A_S + 2\lambda_2 I_S$ is positive definite; this implies that the matrix $B$ is also positive definite, and thus $z^{\mathrm{H}} B z > 0$ for the assumed nonzero solution $z$. This contradicts the assumption that $z \ne 0$ solves (26). Therefore, the system (26) has only the trivial solution. According to Lemma 3, we deduce that the system (25) has finitely many solutions, which further implies that Equation (21) also has finitely many solutions, that is, the set $\Omega_s$ is a finite set. Theorem 1 gives a detailed convergence proof of Algorithm 1 based on an algebraic approach. Next, we present an error bound between the convergent limit and the sparse solution of problem (1).
Under the Restricted Isometry Property (RIP) of the matrix A, we present an error bound between the convergent limit and the sparse solution of problem (1). First of all, we give the definition of the RIP of the matrix A.
Definition. For every integer $1 \le s \le N$, the $s$-restricted isometry constant $\delta_s$ of $A$ is defined as the smallest positive quantity such that

$$(1 - \delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s)\|x\|_2^2$$

for all subsets $T \subset \{1, 2, \ldots, N\}$ of cardinality at most $s$ and all vectors $x$ supported on $T$. Under the RIP assumption, we can ensure that the limit $x^*$ is a reasonable approximation of the sparse solution if $x^*$ has a very small tail in the sense that $\sigma_s(x^*)_1$ is small. (2) When $\varepsilon^* = 0$, there must exist a convergent subsequence of $\{x^k\}$ whose limit is an $s$-sparse vector. Proof. (1) In Theorem 1, we have proved that the limit $x^*$ is a critical point of problem (5) with $\varepsilon = \varepsilon^*$; here we use the assumption that the initial value $x^0$ satisfies $Ax^0 = b$ and $\varepsilon_0 = 1$ in Algorithm 1. By (31), we can bound the residual of $x^*$. Let $S$ be the index set of the $s$-sparse solution $x$, and let $S^*$ be the index set of the $s$ largest entries in absolute value of $x^*$; combining the RIP of order $2s$ with these bounds yields the claimed error bound. (2) When $\varepsilon^* = 0$, the limit $\tilde{x}$ of the corresponding subsequence is an $s$-sparse vector. Therefore, in both cases, we have an $s$-sparse limit point $\tilde{x}$. This completes the proof.
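The restricted isometry constant in the definition above can be computed by brute force for tiny matrices by checking every column subset. The sketch below is our own illustration (exponential cost, usable only for very small $A$), mirroring the definition directly.

```python
import itertools
import numpy as np

def rip_constant(A, s):
    """Brute-force the s-restricted isometry constant of A: the smallest delta
    with (1-delta)||x||^2 <= ||A_T x||^2 <= (1+delta)||x||^2 over all column
    subsets T of size s. Subsets of size < s are covered automatically, since
    eigenvalues of a principal submatrix of the Gram matrix interlace those of
    the larger one."""
    N = A.shape[1]
    delta = 0.0
    for T in itertools.combinations(range(N), s):
        G = A[:, T].T @ A[:, T]            # Gram matrix of the selected columns
        eigs = np.linalg.eigvalsh(G)       # ascending eigenvalues
        delta = max(delta, abs(eigs[-1] - 1.0), abs(1.0 - eigs[0]))
    return delta
```

For an orthonormal $A$ (e.g. the identity) every constant is zero, and for a matrix with unit-norm columns $\delta_1 = 0$, which gives two easy sanity checks.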
Under the RIP condition on the matrix A, when $\varepsilon^* > 0$, Theorem 2 provides an error bound between the convergent limit and the sparse solution of problem (1). When $\varepsilon^* = 0$, we present an error bound for the limit point of any convergent subsequence; in this case, the limit point of any convergent subsequence is an $s$-sparse vector.

Conclusion
The iteratively reweighted $\ell_1$ algorithm has been widely used for solving nonconvex optimization problems. In this paper, we propose an efficient adaptive iteratively reweighted $\ell_1$ algorithm (6) for solving the elastic $\ell_q$ regularization problem (5), and we prove the convergence of the algorithm. In particular, we first prove that the sequence generated by Algorithm 1 is bounded and asymptotically regular. When $\varepsilon^* > 0$, based on an algebraic method, we prove that the sequence generated by Algorithm 1 is convergent for any rational $q \in (0,1)$ and that the limit is a critical point of problem (5) with $\varepsilon = \varepsilon^*$. Furthermore, under the RIP condition on the matrix A, when $\varepsilon^* > 0$, we present an error bound between the convergent limit and the sparse solution of problem (1). When $\varepsilon^* = 0$, we present an error bound for the limit point of any convergent subsequence. Our convergence results provide a theoretical guarantee for a wide range of applications of the adaptive iteratively reweighted $\ell_1$ algorithm.
Theorem 1. Suppose that $q \in (0,1)$ is rational and $\varepsilon^* > 0$; then the sequence $\{x^k\}$ generated by Algorithm 1 is convergent. Denoting the limit by $x^*$, i.e., $\lim_{k \to \infty} x^k = x^*$, the limit $x^*$ is a critical point of problem (5) with $\varepsilon = \varepsilon^*$.

Proof. From (10), we know that the sequence $\{F_{\varepsilon_k}(x^k)\}$ is monotonically decreasing and bounded below. Thus, we can infer that the sequence $\{x^k\}$ is bounded. The boundedness of $\{x^k\}$ implies that there exists at least one convergent subsequence. In order to prove the convergence of the whole sequence $\{x^k\}$, one first needs to prove that the limit point set, denoted by $Y_{\{x^k\}}$, which contains all the limit points of convergent subsequences of $\{x^k\}$, is a finite set. We classify the limit point set $Y_{\{x^k\}}$ by sparsity: for each $s$ with $1 \le s \le N$, let $\Omega_s$ be the set of limit points with sparsity $s$. If we prove that each set $\Omega_s$ is finite, then the limit point set $Y_{\{x^k\}}$ is also finite. Without loss of generality, write the rational exponent as $q = a/b$ with positive integers $a < b$. By a simple calculation starting from Equation (23), we get the following system:

Since each set $\Omega_s$, $1 \le s \le N$, is finite, the limit point set $Y_{\{x^k\}}$ is a finite set. Combining this with the asymptotic regularity $\|x^{k+1} - x^k\|_2 \to 0$ as $k \to \infty$, we obtain that the sequence $\{x^k\}$ is convergent. By the convergence of the sequence $\{x^k\}$ and (18), we obtain that the limit $x^*$ is a critical point of problem (5) with $\varepsilon = \varepsilon^*$.

For $1 \le s \le N$, define

$$\sigma_s(x^*)_p = \min_{\|z\|_0 \le s} \|x^* - z\|_p,$$

which is the error of the best $s$-term approximation of $x^*$ in the $\ell_p$ norm. With the concept of the RIP, we are able to prove the following theorem.

Theorem 2. Suppose that $x$ is an $s$-sparse solution of (1) satisfying $Ax = b$. Assume that $A$ satisfies the RIP of order $2s$ with $\delta_{2s}$ sufficiently small, and that the initial point $x^0$ satisfies $Ax^0 = b$. When $\varepsilon^* = 0$, either $\varepsilon_k = 0$ for all large $k$, or $\varepsilon_{m_k} \to 0$ along a subsequence for some integers $m_k \le k$. In the former case, $x^k$ is an $s$-sparse vector, and we denote $\tilde{x} = x^k$. In the latter case, by the boundedness of $\{x^{m_k}\}$, we can extract a further convergent subsequence.