A Symmetric Alternating Direction Method of Multipliers with Two Different Relaxation Factors for Solving Non-Separable Nonconvex Minimization Problems

Abstract

This paper proposes a symmetric alternating direction method of multipliers (ADMM) with two different relaxation factors for solving nonconvex optimization problems with linear constraints and a non-separable structure. Although many studies have proposed variants of symmetric ADMM with a single relaxation factor, incorporating techniques such as Bregman distances, inertial terms, regularization terms, or linearization, the most basic form of symmetric ADMM with two different relaxation factors for non-separable problems has not yet been investigated. Introducing two different relaxation factors yields a broader admissible parameter range, makes the method applicable to more practical problems, and provides a basic theoretical foundation for accelerating the algorithm with other techniques. Based on the Kurdyka-Łojasiewicz property, we establish the convergence of the sequence generated by the proposed algorithm and analyze its convergence rate.


1. Introduction

In many practical problems arising in image processing, machine learning, and statistical modeling [1]-[3], the objective function often contains a coupling term $H(x,y)$. Accordingly, the nonconvex and non-separable problem considered in this paper is the following:

$$\min_{x,y}\; f(x)+g(y)+H(x,y),\quad \text{s.t.}\; Ax+y=b, \qquad (1)$$

where $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ is a proper lower semicontinuous function, possibly nonsmooth and nonconvex; $g:\mathbb{R}^m\to\mathbb{R}$ and $H:\mathbb{R}^n\times\mathbb{R}^m\to\mathbb{R}$ are continuously differentiable and possibly nonconvex; the gradient $\nabla g$ is $L_g$-Lipschitz continuous and the gradient $\nabla H$ is $L_h$-Lipschitz continuous; $A\in\mathbb{R}^{m\times n}$ is a matrix and $b\in\mathbb{R}^m$ is a vector. In particular, when $H(x,y)\equiv 0$, problem (1) reduces to a separable optimization problem of the following form:

$$\min_{x,y}\; f(x)+g(y),\quad \text{s.t.}\; Ax+y=b. \qquad (2)$$

The alternating direction method of multipliers (ADMM) is one of the most effective approaches for solving problem (2). The convergence of ADMM has been extensively studied [4]-[9], and the convergence rate analysis for problem (2) is relatively well established. However, when $H(x,y)\not\equiv 0$, as in problem (1), convergence results for ADMM are comparatively limited. Gao et al. [10] proved the convergence of the proximal ADMM in the setting where $H$ is smooth and $f$ and $g$ are convex, under the additional assumption that $\nabla H$ is Lipschitz continuous. Chen et al. [11] proved the convergence of the extended ADMM when the coupling term $H$ is a quadratic function. In recent years, researchers have begun to study (1) with ADMM in nonconvex settings, employing tools such as the Kurdyka-Łojasiewicz (KL) inequality to prove convergence. Guo et al. [12] [13] proved the convergence of the classical ADMM and extended it to the generalized ADMM (GADMM). Liu et al. [14] proposed a linearized ADMM and proved its convergence.
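As a concrete illustration of the non-separable setting (1) (an example of our own, not drawn from the references), consider a sparse regression model with a smooth bilinear coupling:

$$f(x)=\mu\|x\|_1,\qquad g(y)=\frac{1}{2}\|y-c\|^2,\qquad H(x,y)=\gamma\,x^{\mathsf T}Qy,\qquad Q\in\mathbb{R}^{n\times m}.$$

Here $f$ is proper, lower semicontinuous and nonsmooth, $g$ has a $1$-Lipschitz gradient ($L_g=1$), and $\nabla H(x,y)=(\gamma Qy,\ \gamma Q^{\mathsf T}x)$ is Lipschitz with $L_h=\gamma\|Q\|$, so all the standing assumptions on (1) are met.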

Regarding problem (2), it has been observed in the literature that, whenever convergence is guaranteed, the symmetric ADMM often converges faster than the classical ADMM. Wu et al. [15] proposed a symmetric ADMM with one relaxation factor and verified its convergence. On this basis, some scholars have studied the symmetric ADMM with a relaxation factor and its variants for solving the non-separable problem (1). For example, Dang et al. [16] proposed a linearized proximal symmetric ADMM and verified that the sequence generated by the algorithm converges to a stationary point of the problem. The iterative scheme is as follows:

$$\begin{cases} x^{k+1}\in\arg\min_x\Big\{ f(x)+\big\langle x-x^k,\ \nabla_x H(x^k,y^k)+\beta A^{\mathsf T}(Ax^k+By^k-b)\big\rangle-\langle\lambda^k,Ax\rangle+\frac{\alpha_x}{2}\|x-x^k\|^2 \Big\}, \\[2pt] \lambda^{k+\frac12}=\lambda^k-\tau\beta(Ax^{k+1}+By^k-b), \\[2pt] y^{k+1}\in\arg\min_y\Big\{ g(y)+\big\langle y-y^k,\ \nabla_y H(x^{k+1},y^k)\big\rangle+\frac{\beta}{2}\Big\|Ax^{k+1}+By-b-\frac{\lambda^{k+\frac12}}{\beta}\Big\|^2+\frac{\alpha_y}{2}\|y-y^k\|^2 \Big\}, \\[2pt] \lambda^{k+1}=\lambda^{k+\frac12}-\beta(Ax^{k+1}+By^{k+1}-b). \end{cases}$$

The parameters are selected as follows:

$$\tau\in(-1,0),\qquad \alpha_y\ge l_h,\qquad \alpha_x\ge\frac{8\mu^2 l_h^2}{(\tau+1)\beta}+\beta\lambda_{\max}(A^{\mathsf T}A)+l_h,$$

together with a lower bound on the penalty parameter $\beta$ determined by $\mu$, $\tau$, and the problem constants.

Dang et al. [17] proposed an inertial Bregman symmetric ADMM algorithm and established its global convergence. The iterative scheme is as follows:

$$\begin{cases} x^{k+1}\in\arg\min_x\big\{ \mathcal{L}_\rho(x,y^k,\lambda^k)+\Delta_{\phi_1}(x,x^k)+\theta_{1k}\langle x,x^{k-1}-x^k\rangle+\theta_{2k}\langle x,x^{k-2}-x^{k-1}\rangle \big\}, \\[2pt] \lambda^{k+\frac12}=\lambda^k-r\rho(Ax^{k+1}+By^k-b)+\theta_{1k}B(y^{k-1}-y^k)+\theta_{2k}B(y^{k-2}-y^{k-1}), \\[2pt] y^{k+1}\in\arg\min_y\big\{ \mathcal{L}_\rho(x^{k+1},y,\lambda^{k+\frac12})+\Delta_{\phi_2}(y,y^k)+\theta_{1k}\langle By,B(y^{k-1}-y^k)\rangle+\theta_{2k}\langle By,B(y^{k-2}-y^{k-1})\rangle \big\}, \\[2pt] \lambda^{k+1}=\lambda^{k+\frac12}-\rho(Ax^{k+1}+By^{k+1}-b)+\theta_{1k}B(y^k-y^{k-1})+\theta_{2k}B(y^{k-1}-y^{k-2}), \end{cases}$$

where $\Delta_{\phi_i}(s,t)$, $i=1,2$, denote the Bregman distances and $r\in(-1,1)$ is a relaxation factor. However, the convergence of the sequence generated by the algorithm depends on the strong convexity of the kernel functions associated with the Bregman distances.

In recent studies on solving problem (2), we observe that Lu et al. [18] proposed a symmetric ADMM with two different relaxation factors. Without introducing the Bregman distance, they proved convergence under a broader range of parameters, and numerical experiments verified that their algorithm converges faster than the one proposed in [15]. Its iterative scheme is as follows:

$$\begin{cases} x^{k+1}\in\arg\min_x \mathcal{L}_\beta^s(x,y^k,\lambda^k), \\[2pt] \lambda^{k+\frac12}=\lambda^k-\alpha\beta(Ax^{k+1}+y^k-b), \\[2pt] y^{k+1}\in\arg\min_y \mathcal{L}_\beta^s(x^{k+1},y,\lambda^{k+\frac12}), \\[2pt] \lambda^{k+1}=\lambda^{k+\frac12}-s\beta(Ax^{k+1}+y^{k+1}-b). \end{cases} \qquad (3)$$

Among them, $\mathcal{L}_\beta^s$ is the augmented Lagrangian function associated with problem (2), defined as follows:

$$\mathcal{L}_\beta^s(x,y,\lambda)=f(x)+g(y)-\langle\lambda,Ax+y-b\rangle+\frac{s\beta}{2}\|Ax+y-b\|^2, \qquad (4)$$

where $\beta>0$ is the penalty parameter and $\alpha$, $s$ are two different relaxation factors.

The research summarized above on symmetric ADMM algorithms for problem (1) is limited to algorithms with a single relaxation factor. Inspired by the work of Lu et al. [18], we introduce two different relaxation factors in order to obtain wider applicability, faster convergence, and a more concise, unified algorithmic framework. Accordingly, this paper proposes a symmetric ADMM with two different relaxation factors for solving the non-separable problem (1), with the following iterative scheme:

$$\begin{cases} x^{k+1}\in\arg\min_x\big\{ f(x)+H(x,y^k)-\langle\lambda^k,Ax\rangle+\frac{s\beta}{2}\|Ax+y^k-b\|^2 \big\}, \\[2pt] \lambda^{k+\frac12}=\lambda^k-\alpha\beta(Ax^{k+1}+y^k-b), \\[2pt] y^{k+1}\in\arg\min_y\big\{ g(y)+H(x^{k+1},y)-\langle\lambda^{k+\frac12},y\rangle+\frac{s\beta}{2}\|Ax^{k+1}+y-b\|^2 \big\}, \\[2pt] \lambda^{k+1}=\lambda^{k+\frac12}-s\beta(Ax^{k+1}+y^{k+1}-b). \end{cases} \qquad (5)$$

Compared with the algorithm proposed by Dang et al. [16], our algorithm has a wider range of parameter values, allowing it to be applied to more practical scenarios through appropriate parameter tuning. Moreover, unlike the algorithm proposed by Dang et al. [17], our method does not require the introduction of the Bregman distance, thereby providing a unified iterative framework for solving non-separable problems via ADMM. The optimality conditions of the subproblems in (5) are as follows:

$$\begin{cases} 0\in\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^k)-A^{\mathsf T}\lambda^k+s\beta A^{\mathsf T}(Ax^{k+1}+y^k-b), \\[2pt] \lambda^{k+\frac12}=\lambda^k-\alpha\beta(Ax^{k+1}+y^k-b), \\[2pt] 0=\nabla g(y^{k+1})+\nabla_y H(x^{k+1},y^{k+1})-\lambda^{k+\frac12}+s\beta(Ax^{k+1}+y^{k+1}-b), \\[2pt] \lambda^{k+1}=\lambda^{k+\frac12}-s\beta(Ax^{k+1}+y^{k+1}-b). \end{cases} \qquad (6)$$

In the case where $H\equiv 0$: when $\alpha=0$ and $s=1$, the proposed algorithm reduces to the classical ADMM; when $\alpha\neq 0$ and $s=1$, it reduces to the symmetric ADMM with a single relaxation factor. This demonstrates that our method provides an improved basic iterative framework for symmetric ADMM with relaxation factors.
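To make the scheme (5) concrete, the following is a minimal numpy sketch on a toy instance of our own choosing (not taken from the paper): $A=I$, $f(x)=\rho\|x\|_1$, $g(y)=\frac12\|y-c\|^2$, and $H(x,y)=\gamma\langle x,y\rangle$. For this instance the $x$-subproblem reduces to soft-thresholding and the $y$-subproblem has a closed form; the parameter values $(\alpha,s,\beta)$ are purely illustrative and are not claimed to satisfy the conditions of Assumption A below.

```python
import numpy as np

def soft(v, t):
    # componentwise soft-thresholding: the proximal operator of t*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def symmetric_admm(b, c, rho=0.1, gamma=0.1, alpha=-0.5, s=1.0, beta=2.0,
                   iters=500):
    """Scheme (5) on a toy instance: A = I, f(x) = rho*||x||_1,
    g(y) = 0.5*||y - c||^2, H(x, y) = gamma*<x, y>.
    Instance and parameter values are illustrative only."""
    n = b.size
    x, y, lam = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(iters):
        # x-step: argmin rho*||x||_1 + gamma*<y,x> - <lam,x> + (s*beta/2)||x+y-b||^2,
        # solved by completing the square and soft-thresholding
        x = soft(b - y + (lam - gamma * y) / (s * beta), rho / (s * beta))
        # first dual update with relaxation factor alpha
        lam_half = lam - alpha * beta * (x + y - b)
        # y-step: the objective is quadratic in y, so set its gradient to zero
        y = (c - gamma * x + lam_half + s * beta * (b - x)) / (1.0 + s * beta)
        # second dual update with relaxation factor s
        lam = lam_half - s * beta * (x + y - b)
    return x, y, lam

rng = np.random.default_rng(0)
b, c = rng.standard_normal(20), rng.standard_normal(20)
x, y, lam = symmetric_admm(b, c)
print("primal residual ||x + y - b||:", np.linalg.norm(x + y - b))
```

Note how $\alpha$ and $s$ enter only the two dual updates; setting $\alpha=0$ and $s=1$ in the sketch recovers the classical ADMM updates, in line with the discussion above.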

2. Preliminaries

In this section, we provide the necessary definitions and properties required for the following study.

Notations:

  • $\mathbb{R}$, $\mathbb{R}^n$, $\mathbb{R}^{m\times n}$: the real numbers, the $n$-dimensional real vectors, and the $m\times n$ real matrices.

  • $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$: the inner product and the associated norm.

Set-valued mapping $F:\mathbb{R}^n\rightrightarrows\mathbb{R}^m$:

  • Domain: $\operatorname{dom}F:=\{x\in\mathbb{R}^n \mid F(x)\neq\varnothing\}$.

  • Graph: $\operatorname{Gra}F:=\{(x,y)\in\mathbb{R}^n\times\mathbb{R}^m : y\in F(x)\}$.

Distance: for $S\subseteq\mathbb{R}^n$ and $x\in\mathbb{R}^n$, $d(x,S):=\inf\{\|y-x\| \mid y\in S\}$, with $d(x,S):=+\infty$ when $S=\varnothing$.

Definition 1. [19] The domain of a function $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ is denoted by

$$\operatorname{dom}f=\{x\in\mathbb{R}^n : f(x)<+\infty\}.$$

If $\operatorname{dom}f\neq\varnothing$, then $f$ is called a proper function.

Definition 2. [19] A function $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ is said to be lower semicontinuous at $\bar{x}\in\mathbb{R}^n$ if it satisfies

$$f(\bar{x})\le\liminf_{x\to\bar{x}}f(x).$$

If this holds at every point of $\operatorname{dom}f$, then $f$ is said to be a lower semicontinuous function.

Definition 3. [20] Let $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ be a proper lower semicontinuous function.

(i) The Fréchet subdifferential of $f$ at $x\in\operatorname{dom}f$, written $\hat{\partial}f(x)$, is the set of vectors $x^*\in\mathbb{R}^n$ that satisfy

$$\liminf_{y\to x,\ y\neq x}\frac{f(y)-f(x)-\langle x^*,y-x\rangle}{\|y-x\|}\ge 0.$$

When $x\notin\operatorname{dom}f$, we set $\hat{\partial}f(x)=\varnothing$.

(ii) The limiting subdifferential of $f$ at $x\in\operatorname{dom}f$, written $\partial f(x)$, is defined as follows:

$$\partial f(x)=\big\{x^*\in\mathbb{R}^n : \exists\,x_n\to x,\ f(x_n)\to f(x),\ x_n^*\in\hat{\partial}f(x_n)\ \text{with}\ x_n^*\to x^*\big\}.$$

Lemma 1. [21] Let $g:\mathbb{R}^m\to\mathbb{R}$ be a continuously differentiable function and suppose that $\nabla g$ is $L$-Lipschitz continuous. Then

$$\big|g(u)-g(v)-\langle u-v,\nabla g(v)\rangle\big|\le\frac{L}{2}\|u-v\|^2,\qquad\forall u,v\in\mathbb{R}^m.$$
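As a quick numerical sanity check of Lemma 1 (our own illustration, with an assumed quadratic), take $g(y)=\frac12 y^{\mathsf T}Qy$ with symmetric $Q$, so that $\nabla g(y)=Qy$ is Lipschitz with constant $L=\|Q\|_2$:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
B = rng.standard_normal((m, m))
Q = (B + B.T) / 2                  # symmetric, possibly indefinite
L = np.linalg.norm(Q, 2)           # Lipschitz constant of grad g(y) = Q y
g = lambda y: 0.5 * y @ Q @ y

for _ in range(100):
    u, v = rng.standard_normal(m), rng.standard_normal(m)
    lhs = abs(g(u) - g(v) - (u - v) @ (Q @ v))  # |g(u) - g(v) - <u-v, grad g(v)>|
    assert lhs <= 0.5 * L * np.linalg.norm(u - v) ** 2 + 1e-10
print("descent lemma verified on 100 random pairs")
```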

Definition 4. ([22], Kurdyka-Łojasiewicz inequality) Let $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ be a proper lower semicontinuous function. For $-\infty<\eta_1<\eta_2\le+\infty$, set

$$[\eta_1<f<\eta_2]=\{x\in\mathbb{R}^n : \eta_1<f(x)<\eta_2\}.$$

We say that the function $f$ has the KL property at $x^*\in\operatorname{dom}(\partial f)$ if there exist $\eta\in(0,+\infty]$, a neighbourhood $U$ of $x^*$, and a continuous concave function $\varphi:[0,\eta)\to\mathbb{R}_+$ such that

(i) $\varphi(0)=0$;

(ii) $\varphi$ is $C^1$ on $(0,\eta)$ and continuous at $0$;

(iii) $\varphi'(t)>0$ for all $t\in(0,\eta)$;

(iv) for all $x\in U\cap[f(x^*)<f<f(x^*)+\eta]$, the Kurdyka-Łojasiewicz inequality holds:

$$\varphi'\big(f(x)-f(x^*)\big)\,d\big(0,\partial f(x)\big)\ge 1,$$

where $d\big(0,\partial f(x)\big)=\inf_{y\in\partial f(x)}\|y\|$ is the distance from the origin to $\partial f(x)$.

Lemma 2. ([23], Uniformized KL property) Let $\Omega$ be a compact set and $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ a proper lower semicontinuous function. Assume that $f$ is constant on $\Omega$ and satisfies the KL property at each point of $\Omega$. Then there exist $\epsilon>0$, $\eta>0$, and $\varphi\in\Phi_\eta$ such that for all $\bar{x}\in\Omega$ and all $x$ in the intersection

$$\{x\in\mathbb{R}^n : d(x,\Omega)<\epsilon\}\cap[f(\bar{x})<f<f(\bar{x})+\eta],$$

one has

$$\varphi'\big(f(x)-f(\bar{x})\big)\,d\big(0,\partial f(x)\big)\ge 1.$$

Lemma 3. [24] Suppose that $F(x,y)=f(x)+g(y)$, where $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ and $g:\mathbb{R}^m\to\mathbb{R}\cup\{+\infty\}$ are proper lower semicontinuous functions. Then for all $(x,y)\in\operatorname{dom}F=\operatorname{dom}f\times\operatorname{dom}g$, we have

$$\partial F(x,y)=\partial_x F(x,y)\times\partial_y F(x,y).$$

Definition 5. ([24], Kurdyka-Łojasiewicz function) If $f$ satisfies the KL property at each point of $\operatorname{dom}(\partial f)$, then $f$ is called a KL function.

Definition 6. A point $(x^*,y^*,\lambda^*)$ is a stationary point of the augmented Lagrangian function $\mathcal{L}_\beta^s(\cdot)$ of problem (1), i.e., of (4) with the coupling term included, $\mathcal{L}_\beta^s(x,y,\lambda)=f(x)+g(y)+H(x,y)-\langle\lambda,Ax+y-b\rangle+\frac{s\beta}{2}\|Ax+y-b\|^2$, if and only if

$$\begin{cases} A^{\mathsf T}\lambda^*-\nabla_x H(x^*,y^*)\in\partial f(x^*), \\[2pt] \lambda^*-\nabla_y H(x^*,y^*)=\nabla g(y^*), \\[2pt] Ax^*+y^*=b. \end{cases} \qquad (7)$$
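The three conditions in (7) can be checked numerically once $\partial f$ is available. The sketch below does this for the illustrative toy instance used in the sketch in Section 1 ($A=I$, $f(x)=\rho\|x\|_1$, $g(y)=\frac12\|y-c\|^2$, $H(x,y)=\gamma\langle x,y\rangle$; all assumptions of ours), using the componentwise description of $\rho\,\partial\|\cdot\|_1$:

```python
import numpy as np

def stationarity_residual(x, y, lam, rho, gamma, c, b):
    """Largest violation of the three conditions in (7) for the toy instance
    A = I, f = rho*||.||_1, g = 0.5*||. - c||^2, H(x, y) = gamma*<x, y>."""
    # third condition: primal feasibility A x + y = b
    r3 = np.linalg.norm(x + y - b, np.inf)
    # second condition: lam - grad_y H = grad g, i.e. lam - gamma*x = y - c
    r2 = np.linalg.norm(lam - gamma * x - (y - c), np.inf)
    # first condition: A^T lam - grad_x H = lam - gamma*y must lie in rho*subdiff(||x||_1)
    v = lam - gamma * y
    r1 = 0.0
    for vi, xi in zip(v, x):
        if xi != 0.0:
            r1 = max(r1, abs(vi - rho * np.sign(xi)))  # must equal rho*sign(x_i)
        else:
            r1 = max(r1, max(abs(vi) - rho, 0.0))      # must lie in [-rho, rho]
    return max(r1, r2, r3)
```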

Lemma 4. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Then the following holds:

$$\begin{cases} 0\in\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^k)-A^{\mathsf T}\lambda^{k+1}+\frac{\alpha}{\alpha+s}A^{\mathsf T}(\lambda^{k+1}-\lambda^k)-\frac{s^2\beta}{\alpha+s}A^{\mathsf T}(y^{k+1}-y^k), \\[2pt] \nabla g(y^{k+1})=\lambda^{k+1}-\nabla_y H(x^{k+1},y^{k+1}), \\[2pt] \lambda^{k+1}=\lambda^k-\beta\big[(\alpha+s)(Ax^{k+1}+y^k-b)+s(y^{k+1}-y^k)\big]. \end{cases}$$

Proof. Combining the second and fourth equations in (6) yields

$$\lambda^{k+1}=\lambda^k-\alpha\beta(Ax^{k+1}+y^k-b)-s\beta(Ax^{k+1}+y^{k+1}-b),$$

which can be rewritten as

$$\lambda^{k+1}=\lambda^k-(\alpha+s)\beta(Ax^{k+1}+y^k-b)-s\beta(y^{k+1}-y^k);$$

rearranging, we obtain

$$\lambda^{k+1}-\lambda^k=-(\alpha+s)\beta(Ax^{k+1}+y^k-b)-s\beta(y^{k+1}-y^k), \qquad (8)$$

and thus

$$Ax^{k+1}+y^k-b=-\frac{1}{(\alpha+s)\beta}(\lambda^{k+1}-\lambda^k)-\frac{s}{\alpha+s}(y^{k+1}-y^k), \qquad (9)$$

$$Ax^{k+1}+y^{k+1}-b=-\frac{1}{(\alpha+s)\beta}(\lambda^{k+1}-\lambda^k)+\frac{\alpha}{\alpha+s}(y^{k+1}-y^k). \qquad (10)$$

According to the optimality condition (6) of the x -subproblem, we have

$$\begin{aligned} 0&\in\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^k)-A^{\mathsf T}\lambda^k+s\beta A^{\mathsf T}(Ax^{k+1}+y^k-b) \\ &=\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^k)-A^{\mathsf T}\lambda^{k+1}+A^{\mathsf T}(\lambda^{k+1}-\lambda^k)+s\beta A^{\mathsf T}(Ax^{k+1}+y^k-b); \end{aligned} \qquad (11)$$

substituting (9) into (11), we get

$$0\in\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^k)-A^{\mathsf T}\lambda^{k+1}+\frac{\alpha}{\alpha+s}A^{\mathsf T}(\lambda^{k+1}-\lambda^k)-\frac{s^2\beta}{\alpha+s}A^{\mathsf T}(y^{k+1}-y^k). \qquad (12)$$

According to the third and fourth equations in (6), we obtain

$$\nabla g(y^{k+1})=\lambda^{k+1}-\nabla_y H(x^{k+1},y^{k+1}). \qquad (13)$$

This completes the proof. □

3. Convergence Analysis

Assumption A. Let $f:\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ be a proper lower semicontinuous function, let $g:\mathbb{R}^m\to\mathbb{R}$ be a continuously differentiable function with an $L_g$-Lipschitz continuous gradient $\nabla g$, and let $H:\mathbb{R}^n\times\mathbb{R}^m\to\mathbb{R}$ be a continuously differentiable function with an $L_h$-Lipschitz continuous gradient $\nabla H$. Also assume that

  • $(\alpha,s)\in\{(\alpha,s)\in\mathbb{R}^2 \mid \alpha<-s<0\}$;

  • $\beta\in\{\beta \mid \beta>\frac{1}{s}(L_g+L_h)\}$.

Lemma 5. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded and that Assumption A holds. Then the following conclusion holds:

$$\mathcal{L}_\beta^s(w^{k+1})\le\mathcal{L}_\beta^s(w^k)-\eta\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big), \qquad (14)$$

where $\eta>0$ (see Remark 3.1).

Proof. The polarization identity implies

$$\frac{s\beta}{2}\|Ax^{k+1}+y^{k+1}-b\|^2-\frac{s\beta}{2}\|Ax^{k+1}+y^k-b\|^2=-\frac{s\beta}{2}\|y^k-y^{k+1}\|^2+s\beta\langle Ax^{k+1}+y^{k+1}-b,\ y^{k+1}-y^k\rangle. \qquad (15)$$

Since $\nabla g$ is $L_g$-Lipschitz continuous and $\nabla H$ is $L_h$-Lipschitz continuous, combining Lemma 1 we have

$$g(y^{k+1})-g(y^k)\le\langle\nabla g(y^{k+1}),\ y^{k+1}-y^k\rangle+\frac{L_g}{2}\|y^k-y^{k+1}\|^2, \qquad (16)$$

$$H(x^{k+1},y^{k+1})-H(x^{k+1},y^k)\le\langle\nabla_y H(x^{k+1},y^{k+1}),\ y^{k+1}-y^k\rangle+\frac{L_h}{2}\|y^k-y^{k+1}\|^2. \qquad (17)$$

From the third equation in the optimality conditions (6), we obtain

$$\nabla g(y^{k+1})=\lambda^{k+\frac12}-\nabla_y H(x^{k+1},y^{k+1})-s\beta(Ax^{k+1}+y^{k+1}-b). \qquad (18)$$

Recalling from (13) that $\lambda^{k+1}=\nabla g(y^{k+1})+\nabla_y H(x^{k+1},y^{k+1})$ and combining the Lipschitz continuity of $\nabla g$ and $\nabla H$, we have

$$\begin{aligned}\|\lambda^{k+1}-\lambda^k\|^2&=\|\nabla g(y^{k+1})-\nabla g(y^k)+\nabla_y H(x^{k+1},y^{k+1})-\nabla_y H(x^k,y^k)\|^2 \\ &\le 2\|\nabla g(y^{k+1})-\nabla g(y^k)\|^2+2\|\nabla_y H(x^{k+1},y^{k+1})-\nabla_y H(x^k,y^k)\|^2 \\ &\le 2L_g^2\|y^{k+1}-y^k\|^2+2L_h^2\|(x^{k+1},y^{k+1})-(x^k,y^k)\|^2 \\ &=2(L_g^2+L_h^2)\|y^{k+1}-y^k\|^2+2L_h^2\|x^{k+1}-x^k\|^2. \end{aligned} \qquad (19)$$

Recalling (9) and (10), we get

$$\alpha\beta\|Ax^{k+1}+y^k-b\|^2+s\beta\|Ax^{k+1}+y^{k+1}-b\|^2=\frac{1}{(\alpha+s)\beta}\|\lambda^k-\lambda^{k+1}\|^2+\frac{\alpha s\beta}{\alpha+s}\|y^k-y^{k+1}\|^2. \qquad (20)$$

From the definition of the augmented Lagrangian function $\mathcal{L}_\beta^s(\cdot)$, and in combination with (15), we have

$$\begin{aligned}&\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12}) \\ &\quad=g(y^{k+1})-g(y^k)-\langle\lambda^{k+\frac12},y^{k+1}-y^k\rangle+H(x^{k+1},y^{k+1})-H(x^{k+1},y^k) \\ &\qquad+\frac{s\beta}{2}\|Ax^{k+1}+y^{k+1}-b\|^2-\frac{s\beta}{2}\|Ax^{k+1}+y^k-b\|^2 \\ &\quad=g(y^{k+1})-g(y^k)-\langle\lambda^{k+\frac12},y^{k+1}-y^k\rangle+H(x^{k+1},y^{k+1})-H(x^{k+1},y^k) \\ &\qquad-\frac{s\beta}{2}\|y^k-y^{k+1}\|^2+s\beta\langle Ax^{k+1}+y^{k+1}-b,\ y^{k+1}-y^k\rangle. \end{aligned} \qquad (21)$$

Then, combining it with (16) and (17) yields

$$\begin{aligned}&\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12}) \\ &\quad\le\langle\nabla g(y^{k+1}),y^{k+1}-y^k\rangle+\frac{L_g}{2}\|y^k-y^{k+1}\|^2-\langle\lambda^{k+\frac12},y^{k+1}-y^k\rangle \\ &\qquad+\langle\nabla_y H(x^{k+1},y^{k+1}),y^{k+1}-y^k\rangle+\frac{L_h}{2}\|y^k-y^{k+1}\|^2 \\ &\qquad-\frac{s\beta}{2}\|y^k-y^{k+1}\|^2+s\beta\langle Ax^{k+1}+y^{k+1}-b,\ y^{k+1}-y^k\rangle. \end{aligned} \qquad (22)$$

Substituting (18) into the above expression and simplifying, we obtain

$$\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12})\le\frac{L_g+L_h-s\beta}{2}\|y^{k+1}-y^k\|^2. \qquad (23)$$

Note that, by using (4), (6), (20) and (19), we have

$$\begin{aligned}&\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+1})-\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})+\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^k) \\ &\quad=-\langle\lambda^{k+1}-\lambda^{k+\frac12},\ Ax^{k+1}+y^{k+1}-b\rangle-\langle\lambda^{k+\frac12}-\lambda^k,\ Ax^{k+1}+y^k-b\rangle \\ &\quad=s\beta\|Ax^{k+1}+y^{k+1}-b\|^2+\alpha\beta\|Ax^{k+1}+y^k-b\|^2 \\ &\quad=\frac{1}{(\alpha+s)\beta}\|\lambda^{k+1}-\lambda^k\|^2+\frac{\alpha s\beta}{\alpha+s}\|y^{k+1}-y^k\|^2 \\ &\quad\le\frac{2(L_g^2+L_h^2)+\alpha s\beta^2}{(\alpha+s)\beta}\|y^{k+1}-y^k\|^2+\frac{2L_h^2}{(\alpha+s)\beta}\|x^{k+1}-x^k\|^2. \end{aligned} \qquad (24)$$

Summing the inequalities (23) and (24), and using the fact that $x^{k+1}$ minimizes the first subproblem in (5) (so that $\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^k)\le\mathcal{L}_\beta^s(x^k,y^k,\lambda^k)$), we obtain

$$\begin{aligned}&\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^k) \\ &\quad=\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+1})-\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})+\mathcal{L}_\beta^s(x^{k+1},y^{k+1},\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12}) \\ &\qquad+\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^{k+\frac12})-\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^k)+\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^k)-\mathcal{L}_\beta^s(x^k,y^k,\lambda^k) \\ &\quad\le\Big[\frac{L_g+L_h-s\beta}{2}+\frac{2(L_g^2+L_h^2)+\alpha s\beta^2}{(\alpha+s)\beta}\Big]\|y^{k+1}-y^k\|^2+\frac{2L_h^2}{(\alpha+s)\beta}\|x^{k+1}-x^k\|^2. \end{aligned} \qquad (25)$$

Thus

$$\mathcal{L}_\beta^s(w^{k+1})\le\mathcal{L}_\beta^s(w^k)-\eta\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big), \qquad (26)$$

where $\eta:=\min\Big\{\frac{s\beta-L_g-L_h}{2}-\frac{2(L_g^2+L_h^2)+\alpha s\beta^2}{(\alpha+s)\beta},\ -\frac{2L_h^2}{(\alpha+s)\beta}\Big\}$. This completes the proof. □

Remark 3.1. As long as $\eta>0$, the sequence $\{\mathcal{L}_\beta^s(w^k)\}$ has the sufficient descent property (14); in particular, it is monotonically nonincreasing. Therefore, we can achieve $\eta>0$ by appropriately choosing the parameters $(\alpha,s,\beta)$. The constraints on these parameters are as follows (a small numerical screen of $\eta$ is sketched after the list):

  • $(\alpha,s)\in\{(\alpha,s)\in\mathbb{R}^2 \mid \alpha<-s<0\}$;

  • $\beta\in\{\beta \mid \beta>\frac{1}{s}(L_g+L_h)\}$.
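For concreteness, the quantity $\eta$ defined in the proof of Lemma 5 can be evaluated directly, which gives a simple way to screen candidate parameters before running the algorithm. The snippet is a sketch of ours; the sample values are merely one combination we found to give $\eta>0$ for $L_g=1$, $L_h=0.1$, not a recommendation from the paper:

```python
def eta(alpha, s, beta, Lg, Lh):
    # eta from Lemma 5: the minimum of the negated coefficients in (25)
    D = (alpha + s) * beta
    t_y = (s * beta - Lg - Lh) / 2.0 \
          - (2.0 * (Lg ** 2 + Lh ** 2) + alpha * s * beta ** 2) / D
    t_x = -2.0 * Lh ** 2 / D
    return min(t_y, t_x)

# example screen with alpha < -s < 0 and s*beta > Lg + Lh:
print(eta(alpha=-1.5, s=1.0, beta=1.15, Lg=1.0, Lh=0.1))  # approx. 0.035 > 0
```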

Lemma 6. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded, that Assumption A holds, and that $\eta>0$. Then we have

$$\sum_{k=0}^{+\infty}\|w^{k+1}-w^k\|^2<+\infty.$$

Proof. The boundedness of $\{w^k\}$ implies the existence of a subsequence $\{w^{k_j}\}$ converging to some $w^*$. Moreover, the lower semicontinuity of $f$ together with the continuity of $g$ and $H$ ensures that $\mathcal{L}_\beta^s(\cdot)$ is lower semicontinuous. Hence,

$$\mathcal{L}_\beta^s(w^*)\le\liminf_{j\to+\infty}\mathcal{L}_\beta^s(w^{k_j}).$$

Hence $\{\mathcal{L}_\beta^s(w^{k_j})\}$ is bounded below. The fact that it is also nonincreasing implies its convergence. Since $\{\mathcal{L}_\beta^s(w^k)\}$ is monotonic and contains a convergent subsequence, it follows that $\{\mathcal{L}_\beta^s(w^k)\}$ itself converges, satisfying $\mathcal{L}_\beta^s(w^k)\ge\mathcal{L}_\beta^s(w^*)$. Finally, invoking (14) yields

$$\eta\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big)\le\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1}),\qquad\forall k.$$

By summing over $k=0,\dots,n$, and observing that $\mathcal{L}_\beta^s(w^0)<+\infty$, we arrive at

$$\eta\sum_{k=0}^{n}\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big)\le\mathcal{L}_\beta^s(w^0)-\mathcal{L}_\beta^s(w^{n+1})\le\mathcal{L}_\beta^s(w^0)-\mathcal{L}_\beta^s(w^*)<+\infty.$$

The condition $\eta>0$ yields $\sum_{k=0}^{\infty}\|y^{k+1}-y^k\|^2<+\infty$ and $\sum_{k=0}^{\infty}\|x^{k+1}-x^k\|^2<+\infty$. Using (19), we further obtain $\sum_{k=0}^{\infty}\|\lambda^{k+1}-\lambda^k\|^2<+\infty$. Hence $\sum_{k=0}^{+\infty}\|w^{k+1}-w^k\|^2<+\infty$. This completes the proof. □

Lemma 7. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded and that Assumption A holds. Define

$$\begin{cases}\varepsilon_1^{k+1}=A^{\mathsf T}(\lambda^k-\lambda^{k+1})+\nabla_x H(x^{k+1},y^{k+1})-\nabla_x H(x^{k+1},y^k)+s\beta A^{\mathsf T}(y^{k+1}-y^k), \\[2pt] \varepsilon_2^{k+1}=-\frac{s}{\alpha+s}(\lambda^{k+1}-\lambda^k)+\frac{\alpha s\beta}{\alpha+s}(y^{k+1}-y^k), \\[2pt] \varepsilon_3^{k+1}=\frac{1}{(\alpha+s)\beta}(\lambda^{k+1}-\lambda^k)-\frac{\alpha}{\alpha+s}(y^{k+1}-y^k). \end{cases} \qquad (27)$$

Then it holds that $\varepsilon^{k+1}:=(\varepsilon_1^{k+1},\varepsilon_2^{k+1},\varepsilon_3^{k+1})\in\partial\mathcal{L}_\beta^s(w^{k+1})$, and there exists a constant $\delta>0$ such that

$$d\big(0,\partial\mathcal{L}_\beta^s(w^{k+1})\big)\le\delta\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big). \qquad (28)$$

Proof. From the definition of the function $\mathcal{L}_\beta^s(\cdot)$, the following system holds:

$$\begin{cases}\partial_x\mathcal{L}_\beta^s(w^{k+1})=\partial f(x^{k+1})+\nabla_x H(x^{k+1},y^{k+1})-A^{\mathsf T}\lambda^{k+1}+s\beta A^{\mathsf T}(Ax^{k+1}+y^{k+1}-b), \\[2pt] \nabla_y\mathcal{L}_\beta^s(w^{k+1})=\nabla g(y^{k+1})+\nabla_y H(x^{k+1},y^{k+1})-\lambda^{k+1}+s\beta(Ax^{k+1}+y^{k+1}-b), \\[2pt] \nabla_\lambda\mathcal{L}_\beta^s(w^{k+1})=-(Ax^{k+1}+y^{k+1}-b). \end{cases} \qquad (29)$$

From the optimality conditions (6), after rearrangement, we have

$$\begin{cases}A^{\mathsf T}\lambda^k-\nabla_x H(x^{k+1},y^k)-s\beta A^{\mathsf T}(Ax^{k+1}+y^k-b)\in\partial f(x^{k+1}), \\[2pt] \lambda^{k+\frac12}-\nabla_y H(x^{k+1},y^{k+1})-s\beta(Ax^{k+1}+y^{k+1}-b)=\nabla g(y^{k+1}), \\[2pt] \frac{1}{(\alpha+s)\beta}(\lambda^{k+1}-\lambda^k)-\frac{\alpha}{\alpha+s}(y^{k+1}-y^k)=-(Ax^{k+1}+y^{k+1}-b). \end{cases}$$

Then, by substituting the above into (29), we obtain

$$\begin{cases}A^{\mathsf T}(\lambda^k-\lambda^{k+1})+\nabla_x H(x^{k+1},y^{k+1})-\nabla_x H(x^{k+1},y^k)+s\beta A^{\mathsf T}(y^{k+1}-y^k)\in\partial_x\mathcal{L}_\beta^s(w^{k+1}), \\[2pt] -\frac{s}{\alpha+s}(\lambda^{k+1}-\lambda^k)+\frac{\alpha s\beta}{\alpha+s}(y^{k+1}-y^k)\in\partial_y\mathcal{L}_\beta^s(w^{k+1}), \\[2pt] \frac{1}{(\alpha+s)\beta}(\lambda^{k+1}-\lambda^k)-\frac{\alpha}{\alpha+s}(y^{k+1}-y^k)\in\partial_\lambda\mathcal{L}_\beta^s(w^{k+1}). \end{cases}$$

Consequently, applying Lemma 3 yields $(\varepsilon_1^{k+1},\varepsilon_2^{k+1},\varepsilon_3^{k+1})\in\partial\mathcal{L}_\beta^s(w^{k+1})$. Based on the preceding relations, we can find $\delta_1,\delta_2>0$ such that

$$\|\varepsilon^{k+1}\|^2=\|(\varepsilon_1^{k+1},\varepsilon_2^{k+1},\varepsilon_3^{k+1})\|^2\le\|\varepsilon_1^{k+1}\|^2+\|\varepsilon_2^{k+1}\|^2+\|\varepsilon_3^{k+1}\|^2\le\delta_1^2\|y^{k+1}-y^k\|^2+\delta_2^2\|\lambda^{k+1}-\lambda^k\|^2. \qquad (30)$$

Applying (19), there exists $\delta>0$ for which the following holds:

$$d^2\big(0,\partial\mathcal{L}_\beta^s(w^{k+1})\big)\le\|\varepsilon^{k+1}\|^2\le\delta^2\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big),$$

and then

$$d\big(0,\partial\mathcal{L}_\beta^s(w^{k+1})\big)\le\delta\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big). \qquad (31)$$

This completes the proof. □

Lemma 8. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded and that Assumption A holds. Let $\Omega$ denote the set of all cluster points of the sequence $\{w^k\}$. Then the following statements are true:

(i) $\Omega$ is a nonempty compact set, and

$$d(w^k,\Omega)\to 0,\quad\text{as } k\to+\infty;$$

(ii) $\Omega\subseteq\operatorname{crit}\mathcal{L}_\beta^s$, where $\operatorname{crit}\mathcal{L}_\beta^s$ denotes the set of all stationary points of $\mathcal{L}_\beta^s$;

(iii) $\mathcal{L}_\beta^s(\cdot)$ is finite and constant on $\Omega$, and equals

$$\inf_k\mathcal{L}_\beta^s(w^k)=\lim_{k\to+\infty}\mathcal{L}_\beta^s(w^k).$$

Proof. We now verify the above results one by one.

(i) This is immediate from the definition of limit points.

(ii) Suppose $w^*\in\Omega$. Then there exists a subsequence $\{w^{k_j}\}$ of $\{w^k\}$ such that $w^{k_j}\to w^*$. Applying Lemma 6 yields

$$\lim_{k\to+\infty}\|w^{k+1}-w^k\|=0, \qquad (32)$$

and thus $w^{k_j+1}\to w^*$. Noting that $x^{k+1}$ minimizes $\mathcal{L}_\beta^s(x,y^k,\lambda^k)$ with respect to $x$, we get

$$\mathcal{L}_\beta^s(x^{k+1},y^k,\lambda^k)\le\mathcal{L}_\beta^s(x^*,y^k,\lambda^k). \qquad (33)$$

Since $\mathcal{L}_\beta^s$ is continuous with respect to the variables $y$ and $\lambda$, and $(y^{k_j},\lambda^{k_j})\to(y^*,\lambda^*)$, it follows from (32) that

$$\limsup_{j\to+\infty}\mathcal{L}_\beta^s(x^{k_j+1},y^{k_j},\lambda^{k_j})=\limsup_{j\to+\infty}\mathcal{L}_\beta^s(x^{k_j+1},y^{k_j+1},\lambda^{k_j+1}). \qquad (34)$$

Furthermore, applying (33) yields

$$\limsup_{j\to+\infty}\mathcal{L}_\beta^s(x^{k_j+1},y^{k_j+1},\lambda^{k_j+1})\le\mathcal{L}_\beta^s(x^*,y^*,\lambda^*). \qquad (35)$$

Since $\mathcal{L}_\beta^s(\cdot)$ is lower semicontinuous, we also know that

$$\liminf_{j\to+\infty}\mathcal{L}_\beta^s(x^{k_j+1},y^{k_j+1},\lambda^{k_j+1})\ge\mathcal{L}_\beta^s(x^*,y^*,\lambda^*). \qquad (36)$$

From (34), (35) and (36), we obtain

$$\lim_{j\to+\infty}f(x^{k_j+1})=f(x^*).$$

Taking the limit in (6) along the subsequence $\{(x^{k_j+1},y^{k_j+1},\lambda^{k_j+1})\}$ and using (32) again, we obtain

$$\begin{cases}A^{\mathsf T}\lambda^*-\nabla_x H(x^*,y^*)\in\partial f(x^*), \\[2pt] \lambda^*-\nabla_y H(x^*,y^*)=\nabla g(y^*), \\[2pt] Ax^*+y^*-b=0. \end{cases}$$

Therefore, $(x^*,y^*,\lambda^*)$ satisfies the stationary-point condition (7), which implies that $w^*\in\operatorname{crit}\mathcal{L}_\beta^s$. Thus $\Omega\subseteq\operatorname{crit}\mathcal{L}_\beta^s$.

(iii) Take any $(x^*,y^*,\lambda^*)\in\Omega$. Then there exists a subsequence $\{(x^{k_j},y^{k_j},\lambda^{k_j})\}$ converging to $(x^*,y^*,\lambda^*)$. Since $\{\mathcal{L}_\beta^s(w^k)\}$ is nonincreasing and has a convergent subsequence, the entire sequence converges, and we have

$$\lim_{k\to+\infty}\mathcal{L}_\beta^s(x^k,y^k,\lambda^k)=\mathcal{L}_\beta^s(x^*,y^*,\lambda^*).$$

That is, $\mathcal{L}_\beta^s(\cdot)$ takes a constant value on $\Omega$. Clearly,

$$\inf_k\mathcal{L}_\beta^s(w^k)=\lim_{k\to+\infty}\mathcal{L}_\beta^s(w^k).$$

This completes the proof. □

Theorem 9. Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded, that Assumption A holds, and that $\eta>0$. If $\mathcal{L}_\beta^s(\cdot)$ is a KL function, then $\{w^k\}$ has finite length, that is,

$$\sum_{k=0}^{+\infty}\|w^{k+1}-w^k\|<+\infty.$$

Moreover, it converges to a critical point of $\mathcal{L}_\beta^s(\cdot)$.

Proof. Lemma 8 implies that $\mathcal{L}_\beta^s(w^k)\to\mathcal{L}_\beta^s(w^*)$ for any $w^*\in\Omega$. Next, we examine two cases.

(i) Suppose there exists $k_0$ with $\mathcal{L}_\beta^s(w^{k_0})=\mathcal{L}_\beta^s(w^*)$. Then, using (14) and Remark 3.1, for every $k>k_0$ we have

$$\eta\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big)\le\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1})\le\mathcal{L}_\beta^s(w^{k_0})-\mathcal{L}_\beta^s(w^*)=0.$$

Hence $y^{k+1}=y^k$ and $x^{k+1}=x^k$ for any $k>k_0$. Then, by (19), we further obtain $\lambda^{k+1}=\lambda^k$ for any $k>k_0+1$, which means $w^{k+1}=w^k$.

(ii) If $\mathcal{L}_\beta^s(w^k)>\mathcal{L}_\beta^s(w^*)$ holds for all $k$, then the following convergence properties hold:

  • Since $d(w^k,\Omega)\to 0$, for any $\varepsilon_1>0$ there exists $k_1>0$ such that $d(w^k,\Omega)<\varepsilon_1$ for all $k>k_1$.

  • Since $\mathcal{L}_\beta^s(w^k)\to\mathcal{L}_\beta^s(w^*)$, for any $\varepsilon_2>0$ there exists $k_2>0$ such that $\mathcal{L}_\beta^s(w^k)<\mathcal{L}_\beta^s(w^*)+\varepsilon_2$ for all $k>k_2$.

Now, for any $\varepsilon_1,\varepsilon_2>0$ and all $k>\tilde{k}:=\max\{k_1,k_2\}$, we have

$$d(w^k,\Omega)<\varepsilon_1,\qquad \mathcal{L}_\beta^s(w^*)<\mathcal{L}_\beta^s(w^k)<\mathcal{L}_\beta^s(w^*)+\varepsilon_2.$$

By Lemma 8, we have established that $\Omega$ is a nonempty compact set and that $\mathcal{L}_\beta^s(\cdot)$ is constant on $\Omega$. Consequently, Lemma 2 implies that

$$\varphi'\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)\,d\big(0,\partial\mathcal{L}_\beta^s(w^k)\big)\ge 1,\qquad\forall k>\tilde{k}. \qquad (37)$$

From the identity

$$\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1})=\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)-\big(\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^*)\big)$$

and the concavity of $\varphi(\cdot)$, it follows that

$$\varphi\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)-\varphi\big(\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^*)\big)\ge\varphi'\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1})\big).$$

Now, taking the above inequality together with (28), i.e.

$$d\big(0,\partial\mathcal{L}_\beta^s(w^k)\big)\le\delta\big(\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|\big),\qquad \varphi'\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)>0,$$

and relation (37), it follows that

$$\begin{aligned}\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1})&\le\frac{\varphi\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)-\varphi\big(\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^*)\big)}{\varphi'\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)} \\ &\le d\big(0,\partial\mathcal{L}_\beta^s(w^k)\big)\Big[\varphi\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)-\varphi\big(\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^*)\big)\Big] \\ &\le\delta\big(\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|\big)\Big[\varphi\big(\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^*)\big)-\varphi\big(\mathcal{L}_\beta^s(w^{k+1})-\mathcal{L}_\beta^s(w^*)\big)\Big]. \end{aligned}$$

For convenience, we define

$$\Delta_{p,q}:=\varphi\big(\mathcal{L}_\beta^s(w^p)-\mathcal{L}_\beta^s(w^*)\big)-\varphi\big(\mathcal{L}_\beta^s(w^q)-\mathcal{L}_\beta^s(w^*)\big).$$

Then the above inequality can be written as

$$\mathcal{L}_\beta^s(w^k)-\mathcal{L}_\beta^s(w^{k+1})\le\delta\big(\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|\big)\Delta_{k,k+1}. \qquad (38)$$

Together, Lemma 5 and (38) imply that

$$\eta\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2\big)\le\delta\big(\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|\big)\Delta_{k,k+1},\qquad\forall k>\tilde{k},$$

together with

$$\frac{1}{2}\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big)^2\le\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2,$$

and thereby

$$2\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big)\le 2\sqrt{\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|}\cdot\sqrt{\frac{2\delta}{\eta}\Delta_{k,k+1}},\qquad\forall k>\tilde{k}.$$

Using the fact that $2\sqrt{uv}\le u+v$, we have

$$2\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big)\le\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|+\frac{2\delta}{\eta}\Delta_{k,k+1}. \qquad (39)$$

Taking the sum of (39) over $k=\tilde{k}+1,\dots,N$ gives

$$2\sum_{k=\tilde{k}+1}^{N}\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big)\le\sum_{k=\tilde{k}+1}^{N}\big(\|x^k-x^{k-1}\|+\|y^k-y^{k-1}\|\big)+\frac{2\delta}{\eta}\Delta_{\tilde{k}+1,N+1}.$$

From Definition 4, we have $\varphi\big(\mathcal{L}_\beta^s(w^{N+1})-\mathcal{L}_\beta^s(w^*)\big)>0$. Rearranging terms and letting $N\to+\infty$ gives

$$\sum_{k=\tilde{k}+1}^{+\infty}\big(\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|\big)\le\|x^{\tilde{k}+1}-x^{\tilde{k}}\|+\|y^{\tilde{k}+1}-y^{\tilde{k}}\|+\frac{2\delta}{\eta}\varphi\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big). \qquad (40)$$

Thus,

$$\sum_{k=0}^{+\infty}\|y^{k+1}-y^k\|<+\infty, \qquad (41)$$

and

$$\sum_{k=0}^{+\infty}\|x^{k+1}-x^k\|<+\infty. \qquad (42)$$

Substituting into (19) yields

$$\sum_{k=0}^{+\infty}\|\lambda^{k+1}-\lambda^k\|<+\infty. \qquad (43)$$

Additionally, we note that

$$\|w^{k+1}-w^k\|=\big(\|x^{k+1}-x^k\|^2+\|y^{k+1}-y^k\|^2+\|\lambda^{k+1}-\lambda^k\|^2\big)^{1/2}\le\|x^{k+1}-x^k\|+\|y^{k+1}-y^k\|+\|\lambda^{k+1}-\lambda^k\|.$$

From (41), (42) and (43), we arrive at

$$\sum_{k=0}^{+\infty}\|w^{k+1}-w^k\|<+\infty.$$

Finally, $\{w^k\}$ converges to a critical point of $\mathcal{L}_\beta^s(\cdot)$ by Lemma 8. This completes the proof. □

Lemma 10. Let $\{w^k:=(x^k,y^k,\lambda^k)\}$ be the sequence generated by the symmetric ADMM (5). If at least one of the following statements holds:

(i) $\liminf_{\|x\|\to+\infty}f(x)=+\infty$;

(ii) $\inf_x f(x)>-\infty$ and $\liminf_{\|y\|\to+\infty}g(y)=+\infty$;

then we can conclude that the sequence $\{w^k\}$ is bounded.

Proof. In the first place, suppose that condition (i) holds. Based on Lemma 5, we arrive at

$$\mathcal{L}_\beta^s(x^k,y^k,\lambda^k)\le\mathcal{L}_\beta^s(x^1,y^1,\lambda^1).$$

By combining the definition of $\mathcal{L}_\beta^s$ with $\nabla g(y^k)=\lambda^k-\nabla_y H(x^k,y^k)$, we get

$$\begin{aligned}\mathcal{L}_\beta^s(x^1,y^1,\lambda^1)&\ge f(x^k)+g(y^k)+H(x^k,y^k)-\langle\lambda^k,Ax^k+y^k-b\rangle+\frac{s\beta}{2}\|Ax^k+y^k-b\|^2 \\ &=f(x^k)+g(y^k)+H(x^k,y^k)-\langle\nabla g(y^k)+\nabla_y H(x^k,y^k),\ Ax^k+y^k-b\rangle+\frac{s\beta}{2}\|Ax^k+y^k-b\|^2 \\ &\ge f(x^k)+\frac{s\beta-L_g-L_h}{2}\|Ax^k+y^k-b\|^2. \end{aligned}$$

Note that (i) implies that $\inf_x f(x)>-\infty$. Based on Assumption A, we have $\frac{s\beta-L_g-L_h}{2}>0$. Thus, we can deduce that $\{x^k\}$ and $\{y^k\}$ are bounded. Hence $\{\lambda^k\}$ is also bounded, and thus $\{w^k\}$ is bounded. The argument under condition (ii) is analogous. This completes the proof. □

Theorem 11. (Convergence rate) Let the iterative sequence generated by algorithm (5) be denoted as $\{w^k:=(x^k,y^k,\lambda^k)\}$. Suppose that this sequence is bounded, that Assumption A holds with $\eta>0$, and that $w^k\to w^*=(x^*,y^*,\lambda^*)$. Assume that $\mathcal{L}_\beta^s(\cdot)$ possesses the KL property at $(x^*,y^*,\lambda^*)$ with the corresponding function $\varphi(t)=ct^{1-\theta}$, where $\theta\in[0,1)$ and $c>0$. Then the following three statements hold:

(i) If $\theta=0$, the sequence $\{w^k\}$ converges in finitely many steps; that is, we can find an index $k$ with $w^k=w^*$.

(ii) If $\theta\in(0,\frac12]$, then there exist constants $c_1>0$ and $\tau\in[0,1)$ such that

$$\|(x^k,y^k,\lambda^k)-(x^*,y^*,\lambda^*)\|\le c_1\tau^k.$$

(iii) If $\theta\in(\frac12,1)$, then there exists a constant $c_2>0$ for which

$$\|(x^k,y^k,\lambda^k)-(x^*,y^*,\lambda^*)\|\le c_2 k^{(\theta-1)/(2\theta-1)}.$$

Proof. For $\theta=0$, we have $\varphi(t)=ct$ and $\varphi'(t)=c$. Suppose, contrary to the claim, that $\{w^k\}$ does not terminate in finitely many steps. Then, for large enough $k$, the KL property yields $c\,d\big(0,\partial\mathcal{L}_\beta^s(w^k)\big)\ge 1$, which contradicts Lemma 7, since by (28) and Lemma 6 we have $d\big(0,\partial\mathcal{L}_\beta^s(w^k)\big)\to 0$. Now let $\theta>0$ and set $\Delta_k=\sum_{i=k}^{+\infty}\big(\|x^{i+1}-x^i\|+\|y^{i+1}-y^i\|\big)$ for $k\ge 0$. From (40) we deduce

$$\Delta_{\tilde{k}+1}\le\Delta_{\tilde{k}}-\Delta_{\tilde{k}+1}+\frac{2\delta}{\eta}\varphi\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big). \qquad (44)$$

Because $\mathcal{L}_\beta^s(\cdot)$ has the KL property at $w^*$, we have

$$\varphi'\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big)\,d\big(0,\partial\mathcal{L}_\beta^s(w^{\tilde{k}+1})\big)\ge 1.$$

This inequality can be rearranged as

$$\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big)^{\theta}\le c(1-\theta)\,d\big(0,\partial\mathcal{L}_\beta^s(w^{\tilde{k}+1})\big). \qquad (45)$$

Applying Lemma 7 yields

$$d\big(0,\partial\mathcal{L}_\beta^s(w^{\tilde{k}+1})\big)\le\delta\big(\|x^{\tilde{k}+1}-x^{\tilde{k}}\|+\|y^{\tilde{k}+1}-y^{\tilde{k}}\|\big)=\delta\big(\Delta_{\tilde{k}}-\Delta_{\tilde{k}+1}\big). \qquad (46)$$

From (45) and (46), there exists $\gamma:=c\big[c(1-\theta)\delta\big]^{(1-\theta)/\theta}>0$ satisfying

$$\varphi\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big)=c\big(\mathcal{L}_\beta^s(w^{\tilde{k}+1})-\mathcal{L}_\beta^s(w^*)\big)^{1-\theta}\le\gamma\big(\Delta_{\tilde{k}}-\Delta_{\tilde{k}+1}\big)^{(1-\theta)/\theta}.$$

Inserting this into (44) gives

$$\Delta_{\tilde{k}+1}\le\Delta_{\tilde{k}}-\Delta_{\tilde{k}+1}+\frac{2\delta}{\eta}\gamma\big(\Delta_{\tilde{k}}-\Delta_{\tilde{k}+1}\big)^{(1-\theta)/\theta}. \qquad (47)$$

Now, by (47) and the results of Attouch and Bolte [25], we obtain:

  • Case 1: if $\theta\in(0,\frac12]$, then there exist $c_1>0$ and $\tau\in[0,1)$ such that

$$\|x^k-x^*\|+\|y^k-y^*\|\le c_1\tau^k. \qquad (48)$$

  • Case 2: if $\theta\in(\frac12,1)$, then there exists $c_2>0$ such that

$$\|x^k-x^*\|+\|y^k-y^*\|\le c_2 k^{\frac{\theta-1}{2\theta-1}}. \qquad (49)$$

Next, applying (19) yields

$$\|\lambda^k-\lambda^*\|\le\sqrt{2}\big[L_g^2\|y^k-y^*\|^2+L_h^2\|y^k-y^*\|^2+L_h^2\|x^k-x^*\|^2\big]^{1/2}\le\sqrt{2}\big[L_g\|y^k-y^*\|+L_h\|y^k-y^*\|+L_h\|x^k-x^*\|\big]. \qquad (50)$$

Thus, (ii) and (iii) follow from (48)-(50). This completes the proof. □
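In practice, statement (ii) can be checked empirically: if $\|w^k-w^*\|\le c_1\tau^k$, then $\log\|w^k-w^*\|$ is bounded by an affine function of $k$ with slope $\log\tau$. The sketch below (ours; it assumes an array errs of iterate errors recorded from a run of (5)) estimates $\tau$ by a least-squares fit:

```python
import numpy as np

def estimate_linear_rate(errs):
    """Fit log(errs[k]) ~ a + k*log(tau) and return the estimated tau,
    where errs[k] approximates ||w^k - w^*||."""
    errs = np.asarray(errs, dtype=float)
    k = np.arange(len(errs))
    mask = errs > 1e-14                 # discard numerically-zero errors
    slope, _ = np.polyfit(k[mask], np.log(errs[mask]), 1)
    return np.exp(slope)                # estimated tau; < 1 indicates a linear rate
```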

4. Conclusion

In this paper, we propose a symmetric alternating direction method of multipliers with two different relaxation factors for minimizing the sum of two nonconvex functions coupled by a non-separable term. For problems whose objective contains a coupling term, i.e., where $f$ and $g$ are not separable, research remains limited in both the convex and the nonconvex settings. Reviewing the development of symmetric ADMM for non-separable problems, we find that although many existing works propose symmetric ADMM variants incorporating techniques such as Bregman distances, inertial terms, regularization terms, or linearization, they typically introduce only one relaxation factor. Motivated by this observation, we introduce two different relaxation factors and apply the resulting algorithm to non-separable nonconvex problems, thereby refining the basic form of symmetric ADMM for this problem class. This makes the admissible parameter range broader, allowing the algorithm to be adapted to more practical problems by tuning the parameters, and it provides fundamental theoretical support for further integration with other techniques. Finally, based on the Kurdyka-Łojasiewicz (KL) property, we prove that the sequence generated by the algorithm converges to a stationary point of the problem and further analyze its finite-step, linear, and sublinear convergence.

Acknowledgements

Sincere thanks to the members of JAMP for their professional work, and special thanks to managing editor Hellen XU for her exceptional dedication to quality.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Shang, F., Xu, T., Liu, Y., Liu, H., Shen, L. and Gong, M. (2021) Differentially Private ADMM Algorithms for Machine Learning. IEEE Transactions on Information Forensics and Security, 16, 4733-4745.
[2] Chan, S.H. (2019) Performance Analysis of Plug-and-Play ADMM: A Graph Signal Processing Perspective. IEEE Transactions on Computational Imaging, 5, 274-286.
[3] Xu, H., Caramanis, C. and Mannor, S. (2013) Outlier-Robust PCA: The High-Dimensional Case. IEEE Transactions on Information Theory, 59, 546-572.
[4] Glowinski, R. and Marroco, A. (1975) Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires. Revue française d'automatique, informatique, recherche opérationnelle. Analyse numérique, 9, 41-76.
[5] Douglas, J. and Rachford, H.H. (1956) On the Numerical Solution of Heat Conduction Problems in Two and Three Space Variables. Transactions of the American Mathematical Society, 82, 421-439.
[6] He, B., Liu, H., Wang, Z. and Yuan, X. (2014) A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming. SIAM Journal on Optimization, 24, 1011-1040.
[7] He, B., Ma, F. and Yuan, X. (2016) Convergence Study on the Symmetric Version of ADMM with Larger Step Sizes. SIAM Journal on Imaging Sciences, 9, 1467-1501.
[8] Jia, Z., Gao, X., Cai, X. and Han, D. (2021) The Convergence Rate Analysis of the Symmetric ADMM for the Nonconvex Separable Optimization Problems. Journal of Industrial & Management Optimization, 17, 1943-1971.
[9] Liu, P., Jian, J., He, B. and Jiang, X. (2022) Convergence of Bregman Peaceman-Rachford Splitting Method for Nonconvex Nonseparable Optimization. Journal of the Operations Research Society of China, 11, 707-733.
[10] Gao, X. and Zhang, S. (2016) First-Order Algorithms for Convex Optimization with Nonseparable Objective and Coupled Constraints. Journal of the Operations Research Society of China, 5, 131-159.
[11] Chen, C., Li, M., Liu, X. and Ye, Y. (2017) Extended ADMM and BCD for Nonseparable Convex Minimization Models with Quadratic Coupling Terms: Convergence Analysis and Insights. Mathematical Programming, 173, 37-77.
[12] Guo, K., Han, D.R. and Wu, T.T. (2018) Convergence Analysis for Optimization Problems with Nonseparable Nonconvex Objective and Linear Constraints. Pacific Journal of Optimization, 14, 489-506.
https://webofscience.clarivate.cn/wos/woscc/full-record/WOS:000461411400008
[13] Guo, K. and Wang, X. (2018) Convergence of Generalized Alternating Direction Method of Multipliers for Nonseparable Nonconvex Objective with Linear Constraints. Journal of Mathematical Research with Applications, 38, 18.
https://d.wanfangdata.com.cn/periodical/sxyjypl201805010
[14] Liu, Q., Shen, X. and Gu, Y. (2019) Linearized ADMM for Nonconvex Nonsmooth Optimization with Convergence Analysis. IEEE Access, 7, 76131-76144.
[15] Wu, Z., Li, M., Wang, D.Z.W. and Han, D. (2017) A Symmetric Alternating Direction Method of Multipliers for Separable Nonconvex Minimization Problems. Asia-Pacific Journal of Operational Research, 34, Article ID: 1750030.
[16] Dang, Y.Z. and Cui, T.T. (2023) Convergence Analysis of Linear Symmetric Proximal ADMM for Nonconvex Nonsmooth Nonseparable Optimization. Journal of Systems Science and Mathematical Sciences, 43, 2949-2969. (In Chinese)
http://dx.chinadoi.cn/10.12341/jssms23295
[17] Dang, Y. and Liu, K. (2026) A Two-Step Inertial Bregman Symmetric ADMM-Type Algorithm with KL-Property for Nonconvex Nonsmooth Nonseparable Optimization Problems with Application. Journal of Computational and Applied Mathematics, 472, Article ID: 116815.
[18] Mei, L. and Ke, G. (2026) Convergence Analysis of the Symmetric Alternating Direction Method of Multipliers for Two-Block Separable Nonconvex Optimization Problems with Linear Constraints. Pacific Journal of Optimization.
[19] Beck, A. (2017) First-Order Methods in Optimization. Society for Industrial and Applied Mathematics.
[20] Guo, K., Han, D.R. and Wu, T.T. (2016) Convergence of Alternating Direction Method for Minimizing Sum of Two Nonconvex Functions with Linear Constraints. International Journal of Computer Mathematics, 94, 1653-1669.
[21] Nesterov, Y. (2013) Introductory Lectures on Convex Optimization: A Basic Course. Springer Science and Business Media.
[22] Attouch, H., Bolte, J., Redont, P. and Soubeyran, A. (2010) Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Łojasiewicz Inequality. Mathematics of Operations Research, 35, 438-457.
[23] Bolte, J., Sabach, S. and Teboulle, M. (2013) Proximal Alternating Linearized Minimization for Nonconvex and Nonsmooth Problems. Mathematical Programming, 146, 459-494.
[24] Attouch, H., Bolte, J. and Svaiter, B.F. (2011) Convergence of Descent Methods for Semi-Algebraic and Tame Problems: Proximal Algorithms, Forward-Backward Splitting, and Regularized Gauss-Seidel Methods. Mathematical Programming, 137, 91-129.
[25] Attouch, H. and Bolte, J. (2007) On the Convergence of the Proximal Algorithm for Nonsmooth Functions Involving Analytic Features. Mathematical Programming, 116, 5-16.
