Application of Linearized Alternating Direction Multiplier Method in Dictionary Learning
X. L. Yu

The alternating direction multiplier method (ADMM) is widely used in various fields, and different variants have been customized in the literature for different application scenarios [1] [2] [3] [4]. Among them, the linearized alternating direction multiplier method (LADMM) has received extensive attention because of its effectiveness and ease of implementation. This paper mainly discusses the application of ADMM to dictionary learning, a non-convex problem. Many numerical experiments show that ADMM converges slowly when high accuracy is required, especially near the optimal solution. Therefore, we introduce LADMM to accelerate the convergence of ADMM. Specifically, the problem is solved by linearizing the quadratic term of the subproblem, and the convergence of the algorithm is proved. Finally, a brief summary of the paper is given.


Introduction
With the development of technology, data collection and processing have become easier, and many areas involve high-dimensional data, such as information technology, economics and finance, and data modeling. Faced with such huge amounts of data, researchers have proposed different solutions. Compressed sensing and sparsity have become effective tools, because sparsity reduces the dimensionality of the data in a certain sense. The alternating direction multiplier method (ADMM) [5] is a typical divide-and-conquer approach: it transforms the original high-dimensional problem into two or more low-dimensional subproblems, which is in line with the processing requirements of

Introduction to the Method
The ADMM algorithm was first proposed by Gabay, Mercier and Glowinski in the mid-1970s [6] [7] [8], although a similar idea originated in the mid-1950s. A large number of articles analyzed the properties of this method, but at that time ADMM was used to solve partial differential equations. Now ADMM is mainly used to solve optimization problems with separable variables, a structure that the augmented Lagrangian method, despite its good properties, cannot exploit. ADMM can be parallelized, which speeds up the solution. For convex optimization problems with two separable variables, there is a mature theoretical analysis of the convergence and convergence rate of ADMM. However, the convergence for convex problems with three or more separable variables has not been satisfactorily resolved. The convergence of ADMM for non-convex optimization problems is also an open problem, although many applications have shown the effectiveness of ADMM for non-convex problems. Can ADMM be applied to more optimization problems, including more non-convex ones, and how well does it work? This article introduces the application of ADMM to non-convex optimization problems.
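First consider the convex optimization problem with equality constraints

$$\min_{x} f(x) \quad \text{s.t.} \quad Ax = b,$$

where $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{p \times n}$, $b \in \mathbb{R}^p$, and $f: \mathbb{R}^n \to \mathbb{R}$ is convex. We first introduce an optimization algorithm with good properties, the augmented Lagrangian multiplier method. The augmented Lagrangian function is defined as

$$L_\rho(x, \lambda) = f(x) + \lambda^T (Ax - b) + \frac{\rho}{2} \|Ax - b\|_2^2,$$

where $\rho > 0$ is called the penalty parameter. When $\rho = 0$, $L_0$ is the Lagrangian function. The iterative steps of the augmented Lagrangian multiplier method are

$$x^{k+1} = \arg\min_{x} L_\rho(x, \lambda^k), \qquad \lambda^{k+1} = \lambda^k + \rho (Ax^{k+1} - b),$$

where $\lambda$ is the Lagrangian multiplier, i.e. the dual variable.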
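As an illustration of the two iterative steps, the following minimal sketch applies the augmented Lagrangian multiplier method to a small quadratic problem. The choice $f(x) = \frac{1}{2}\|x - a\|_2^2$, the function name `augmented_lagrangian_qp`, the random test data, and the parameter values are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def augmented_lagrangian_qp(a, A, b, rho=10.0, n_iter=200):
    """Sketch: min 0.5*||x - a||^2  s.t.  A x = b (illustrative f only)."""
    n = a.size
    lam = np.zeros(A.shape[0])  # dual variable lambda
    # The x-subproblem is an unconstrained quadratic whose system matrix
    # I + rho * A^T A stays the same in every iteration.
    M = np.eye(n) + rho * A.T @ A
    x = np.zeros(n)
    for _ in range(n_iter):
        # x^{k+1} = argmin_x L_rho(x, lambda^k)
        x = np.linalg.solve(M, a - A.T @ lam + rho * A.T @ b)
        # lambda^{k+1} = lambda^k + rho * (A x^{k+1} - b)
        lam = lam + rho * (A @ x - b)
    return x, lam

# Hypothetical test data with a fixed seed for reproducibility.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))
a = rng.standard_normal(6)
b = rng.standard_normal(3)
x_alm, lam_alm = augmented_lagrangian_qp(a, A, b)
```

Note that the penalty parameter `rho` stays fixed throughout, in line with the remark that it need not grow to infinity.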
The advantage of this algorithm is that the convergence of the iterative sequence can be guaranteed without overly strong conditions. For example, the penalty parameter is not required to increase to infinity during the iteration, and a fixed value can be used. The disadvantage of this algorithm appears when the objective function is separable, in which case the model becomes

$$\min_{x, y} f(x) + g(y) \quad \text{s.t.} \quad Ax + By = c, \tag{4}$$

where $g$ is also a convex function. In the minimization step, the augmented Lagrangian function $L_\rho$ is not separable, so the separated variables cannot be solved for in parallel. This leads to the ADMM algorithm we will discuss in the next section.
The alternating direction multiplier method (ADMM) is mainly used to solve optimization problems with separable variables such as (4), where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, $A \in \mathbb{R}^{p \times n}$, $B \in \mathbb{R}^{p \times m}$, and $c \in \mathbb{R}^p$. We assume that both $f$ and $g$ are convex functions; further assumptions are made later. Similar to the definition in the previous section, the augmented Lagrangian function of (4) is

$$L_\rho(x, y, \lambda) = f(x) + g(y) + \lambda^T (Ax + By - c) + \frac{\rho}{2} \|Ax + By - c\|_2^2.$$

The steps of the ADMM iteration are as follows:

$$x^{k+1} = \arg\min_{x} L_\rho(x, y^k, \lambda^k),$$
$$y^{k+1} = \arg\min_{y} L_\rho(x^{k+1}, y, \lambda^k),$$
$$\lambda^{k+1} = \lambda^k + \rho (Ax^{k+1} + By^{k+1} - c),$$

where $\rho > 0$. The similarity between this algorithm and the augmented Lagrangian multiplier method is that both first solve for the primal variables $x$ and $y$ and then update the dual variable.
If the augmented Lagrangian multiplier method is used instead, the iteration is

$$(x^{k+1}, y^{k+1}) = \arg\min_{x, y} L_\rho(x, y, \lambda^k), \qquad \lambda^{k+1} = \lambda^k + \rho (Ax^{k+1} + By^{k+1} - c).$$

As mentioned in the previous section, the augmented Lagrangian multiplier method minimizes over the two separated variables at the same time, whereas ADMM updates the variables alternately, which is the origin of the algorithm's name. This can be regarded as applying Gauss-Seidel iterations to the two variables. For details, please refer to the literature. It is clear from the algorithm framework that ADMM is better suited to problems with separable variables, because the objective functions $f$ and $g$ are also separated.
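To get a simpler form of ADMM, scale the dual variable by setting $u = \lambda / \rho$. Then the ADMM iteration becomes

$$x^{k+1} = \arg\min_{x} \left( f(x) + \frac{\rho}{2} \|Ax + By^k - c + u^k\|_2^2 \right),$$
$$y^{k+1} = \arg\min_{y} \left( g(y) + \frac{\rho}{2} \|Ax^{k+1} + By - c + u^k\|_2^2 \right),$$
$$u^{k+1} = u^k + Ax^{k+1} + By^{k+1} - c.$$

ADMM convergence: regarding the convergence of ADMM, please refer to the literature.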
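The scaled iteration can be sketched on a standard sparse regression (lasso) problem, chosen here only as an illustration. Assumptions for this example: $f(x) = \frac{1}{2}\|Dx - s\|_2^2$, $g(z) = \lambda_1 \|z\|_1$, and the constraint $x - z = 0$, which corresponds to $A = I$, $B = -I$, $c = 0$ in (4). The second block is called `z` and the signal `s` to avoid clashing with other symbols; the function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(D, s, lam, rho=1.0, n_iter=300):
    """Scaled ADMM sketch for min 0.5*||D x - s||^2 + lam*||z||_1, x = z."""
    n = D.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable u = lambda / rho
    M = D.T @ D + rho * np.eye(n)
    Dts = D.T @ s
    for _ in range(n_iter):
        # x-update: quadratic subproblem, solved as a linear system
        x = np.linalg.solve(M, Dts + rho * (z - u))
        # z-update: closed-form soft thresholding
        z = soft_threshold(x + u, lam / rho)
        # scaled dual update: u^{k+1} = u^k + x^{k+1} - z^{k+1}
        u = u + x - z
    return z

# With D = I the lasso solution is soft_threshold(s, lam), a handy check.
s_demo = np.array([3.0, 0.5, -2.0, 0.0])
z_demo = admm_lasso(np.eye(4), s_demo, lam=1.0)
```

Both subproblems have closed-form solutions here because the constraint matrices are identities; this is exactly the property that is lost when a general matrix multiplies the variable.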

Application of ADMM in Dictionary Learning
As we all know, the alternating direction method (ADMM) is one of the effective algorithms for solving large-scale sparse optimization problems.

If $m < n$ and the dictionary $D$ is full rank, then the underdetermined system $y = Dx$ has infinitely many solutions. The solution with the fewest non-zero coefficients is one of them, and it is the solution we hope to find. The sparse representation can be expressed mathematically as

$$(P_0): \quad \min_{x} \|x\|_0 \quad \text{s.t.} \quad y = Dx$$

or

$$(P_{0,\varepsilon}): \quad \min_{x} \|x\|_0 \quad \text{s.t.} \quad \|y - Dx\|_p \le \varepsilon,$$

where $\|\cdot\|_0$ is the $\ell_0$-norm, which counts the number of non-zero entries of the corresponding vector.
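The following small sketch illustrates why the underdetermined system has many solutions and how the $\ell_0$ count distinguishes them. The random dictionary, the helper name `l0_norm`, and the numerical tolerance are assumptions of this example.

```python
import numpy as np

def l0_norm(x, tol=1e-10):
    """Number of entries whose magnitude exceeds a small tolerance."""
    return int(np.sum(np.abs(x) > tol))

rng = np.random.default_rng(0)
m, n = 3, 6                       # m < n: underdetermined system
D = rng.standard_normal((m, n))   # hypothetical full-rank dictionary

# A signal generated from a single atom has a 1-sparse representation.
x_sparse = np.zeros(n)
x_sparse[2] = 1.5
y = D @ x_sparse

# The minimum l2-norm solution also satisfies y = D x but is dense.
x_min_norm = np.linalg.pinv(D) @ y

print(l0_norm(x_sparse), l0_norm(x_min_norm))
```

Both vectors reproduce $y$ exactly, but only the first is the sparse solution sought by the $\ell_0$ formulation.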

Dictionary Design
The dictionary is learned from the signal set. First, given a data set $Y = \{y_i\}_{i=1}^{N}$, assume that there is a dictionary $D$ such that each signal can be sparsely represented over the dictionary, i.e. for a given signal $y_i$, the model $(P_0)$ or $(P_{0,\varepsilon})$ yields the sparse coefficient $x_i$. The question then is how to find such a dictionary $D$. A detailed reference can be found in the literature [9].
The model of the problem can be written as

$$\min_{D, X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \|x_i\|_0 \le \tau_0 \ \text{ for all } i,$$

where $Y$ is the matrix whose columns are the signals $y_i$, $\tau_0$ is the upper bound of the coefficient sparsity, and $x_i$ is the ith column of the coefficient matrix $X$. Another model for dictionary learning, corresponding to the above model, is

$$\min_{D, X} \sum_{i} \|x_i\|_0 \quad \text{s.t.} \quad \|Y - DX\|_F^2 \le \varepsilon,$$

where $\varepsilon$ is a fixed error value.
Before applying ADMM, we first transform the model by letting $Z = DX$. In the augmented Lagrangian function of the transformed problem, $\Lambda$ is the Lagrange multiplier matrix and $\Lambda_i$ is the ith column of $\Lambda$. Applying the ADMM algorithm to this model yields an X subproblem and a Z subproblem, and the Z subproblem has a closed-form solution.
However, ADL (ADMM for Dictionary Learning) is prone to getting trapped in local optima of the problem. Using linearization techniques, we extend LADMM to address this difficulty of ADL and prove the convergence of the algorithm. Numerical experiments are used to illustrate the effectiveness of the proposed algorithm.

Application of Linearized ADMM in Dictionary Learning
In order to apply ADMM, we rewrite (11) in the form (20) and construct the augmented Lagrangian function of model (20); the ADMM iteration is then given by (22). We now solve the subproblems in (22), starting with the X subproblem.
Because the matrix $D$ is not the identity, this subproblem has no closed-form solution. Inspired by [9], we linearize the quadratic term, where the parameter $\rho > 0$ controls the degree of approximation between $X$ and the current iterate $X^k$. The solution of the resulting problem is known from [9]. Furthermore, for the Y subproblem, Equation (10) in [9] gives the closed-form solution. As can be seen from the above discussion, the LADMM iterative algorithm can be described by the following table.
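To make the linearization idea concrete, here is a minimal sketch of a linearized ADMM iteration on a simple vector model. It is not the paper's model (20): the objective $\lambda_1 \|x\|_1 + \frac{1}{2}\|z - s\|_2^2$ with constraint $Dx = z$, the step size `tau`, and all names are assumptions chosen so that the x subproblem has no closed form when $D$ is not the identity, while its linearized version reduces to soft thresholding.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def linearized_admm(D, s, lam, rho=1.0, n_iter=3000):
    """Sketch: min lam*||x||_1 + 0.5*||z - s||^2  s.t.  D x = z."""
    m, n = D.shape
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)  # u: scaled dual
    # Proximal step size; tau < 1 / ||D||_2^2 keeps the added
    # proximal term positive definite.
    tau = 0.99 / np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        # Linearize (rho/2)*||D x - z + u||^2 around the current x and
        # add (rho / (2*tau)) * ||x - x^k||^2: closed-form shrinkage.
        grad = D.T @ (D @ x - z + u)
        x = soft_threshold(x - tau * grad, tau * lam / rho)
        # z-update: exact minimizer of a simple quadratic
        z = (s + rho * (D @ x + u)) / (1.0 + rho)
        # scaled dual update
        u = u + D @ x - z
    return x, z

def lasso_objective(D, s, lam, x):
    return lam * np.sum(np.abs(x)) + 0.5 * np.sum((D @ x - s) ** 2)

# Hypothetical test problem with a 3-sparse ground truth.
rng = np.random.default_rng(0)
D = rng.standard_normal((10, 20)) / np.sqrt(10)
x_true = np.zeros(20)
x_true[[1, 7, 15]] = 2.0
s = D @ x_true
x_hat, z_hat = linearized_admm(D, s, lam=0.05)
```

The design choice is the usual one for linearized schemes: each x-update costs only two matrix-vector products instead of a linear solve, at the price of a step size tied to the spectral norm of `D`.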

Convergence Proof
In this section, we prove that the LADMM algorithm is convergent.
Let $S^*$ denote the set of elements of $S$ that satisfy the above formula. The KKT condition of the above formula can be written in the form of a variational inequality (VI). To prove these conclusions, as well as the convergence of LADMM, we need to introduce some lemmas. For details, please refer to the literature [9].

Numerical Experiments
In this section, we discuss the application of the algorithm to image deblurring to demonstrate its effectiveness. All experiments were carried out on a four-core notebook computer with an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz and 4 GB of memory. The procedures and test images for this experiment follow [10].
The noise levels are $\delta = 0.256$ and $\delta = 0.511$, respectively. For comparison, we also include the results of FTVd v4.1 in [11], one of the most advanced image deblurring algorithms. It can be seen from the pictures that our proposed algorithm and the FTVd algorithm achieve the same quality in terms of PSNR (Figure 2), and our algorithm does not need a regularization operator.
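For reference, PSNR is commonly computed from the mean squared error per pixel as $10 \log_{10}(\text{peak}^2 / \text{MSE})$. The sketch below uses this standard definition; the function name `psnr` and the default peak value of 1.0 (images scaled to $[0, 1]$) are assumptions, since the paper's exact settings are not stated here.

```python
import numpy as np

def psnr(reference, restored, peak=1.0):
    """Peak signal-to-noise ratio in dB (standard definition)."""
    reference = np.asarray(reference, dtype=float)
    restored = np.asarray(restored, dtype=float)
    # Mean squared error over all pixels
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Usage: a constant error of 0.1 on a [0, 1] image gives MSE = 0.01.
ref_img = np.zeros((8, 8))
noisy_img = np.full((8, 8), 0.1)
value_db = psnr(ref_img, noisy_img)
```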

Conclusions
In this paper, we propose a linearized alternating direction multiplier method. It is solved by splitting the problem into a number of low-dimensional subproblems through the construction of an augmented Lagrangian function.
In recent years, a large body of work has focused on the sparse representation of signals. Sparse representation refers to expressing a signal $y \in \mathbb{R}^m$ as a sparse linear combination of the atoms of a dictionary $D \in \mathbb{R}^{m \times n}$. Here, sparse means that the number of non-zero coefficients is much smaller than $n$. Such a sparse representation may be exact, $y = Dx$, or approximate with an error term, $\|y - Dx\|_p \le \varepsilon$, where $x$ is the sparse representation coefficient of the signal $y$. In practice, $p$ often takes a value of 1, 2, or $\infty$.

$\|\cdot\|_F^2$ is the squared Frobenius norm of the matrix, i.e. the sum of the squares of the elements of the matrix.
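As a quick numerical check of this definition, the sketch below computes the squared Frobenius norm directly and uses it for a reconstruction residual of the form $\|Y - DX\|_F^2$. The helper names are hypothetical.

```python
import numpy as np

def squared_frobenius(M):
    """Sum of the squares of all matrix elements."""
    return float(np.sum(np.asarray(M, dtype=float) ** 2))

def dictionary_residual(Y, D, X):
    """Reconstruction error ||Y - D X||_F^2 for data Y and dictionary D."""
    return squared_frobenius(Y - D @ X)

M_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
value = squared_frobenius(M_demo)  # 1 + 4 + 9 + 16
```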
the following problem and use the solution of this problem to approximate the solution of the subproblem generated by ADMM.

Compute
that satisfies the KKT condition of the model, i.e.
average square error of each pixel.