A Symmetric Alternating Direction Method of Multipliers with Two Different Relaxation Factors for Solving Non-Separable Nonconvex Minimization Problems
1. Introduction
In many practical problems such as image processing, machine learning, and statistical modeling [1]-[3], the objective function often contains a coupling term
. That is, the nonconvex and nonseparable problem considered in this paper takes the following form:
(1)
where
is a proper lower semicontinuous function, possibly nonsmooth and nonconvex, both
and
are continuously differentiable and possibly nonconvex functions, the gradient
is
-Lipschitz continuous and the gradient
is
-Lipschitz continuous.
is a matrix and
is a vector. In particular, when
, problem (1) reduces to a separable optimization problem of the following form:
(2)
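For orientation, a generic model consistent with the description above can be written as follows; the symbols f, g, H, A, B, b here are our illustrative notation, and the exact constraint of (1) may differ from this sketch:

```latex
\min_{x \in \mathbb{R}^{n},\, y \in \mathbb{R}^{m}} \; f(x) + g(y) + H(x, y)
\quad \text{s.t.} \quad Ax + By = b .
```

When the coupling term H vanishes identically, the objective splits and the model reduces to the separable two-block form (2), i.e. minimizing f(x) + g(y) subject to the same linear constraint.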
The alternating direction method of multipliers (ADMM) is one of the most effective approaches for solving problem (2). The convergence of ADMM has been extensively studied [4]-[9], and the convergence rate analysis is relatively well-established for problem (2). However, when
, for problem (1), the convergence results for ADMM are relatively limited. Gao et al. [10] proved the convergence of the proximal ADMM under the setting where
is smooth,
and
are convex, combined with the assumption that
is Lipschitz continuous. Chen et al. [11] proved the convergence of the extended ADMM when the coupling term
is a quadratic function. In recent years, researchers have begun to focus on solving (1) by using ADMM in non-convex settings, and have employed tools such as the Kurdyka-Łojasiewicz (KL) inequality to prove the convergence of the algorithm. Guo et al. [12] [13] proved the convergence of the classic ADMM and extended it to the generalized ADMM (GADMM). Liu et al. [14] proposed a linearized ADMM and proved the convergence of the algorithm.
Regarding problem (2), it has been established in the literature that, when convergence occurs, the symmetric ADMM often converges faster than the classical ADMM. Wu et al. [15] proposed a symmetric ADMM with one relaxation factor and verified its convergence. On this basis, some scholars have studied the symmetric ADMM with a relaxation factor and its variants for solving the nonseparable problem (1). For example, Dang et al. [16] proposed a linear proximal symmetric ADMM and verified that the sequence generated by the algorithm converges to a stationary point of the problem. The iterative scheme is as follows:
The parameters are selected as follows:
Dang et al. [17] proposed an inertial Bregman symmetric ADMM algorithm and established its global convergence. The iterative scheme is as follows:
where
,
represents the Bregman distances, and
is a relaxation factor. However, the convergence of the sequence generated by the algorithm depends on the strong convexity of the kernel function associated with the Bregman distance.
In recent studies on solving problem (2), we observe that Lu et al. [18] proposed a symmetric ADMM with two different relaxation factors. Without introducing the Bregman distance, they proved its convergence over a broader range of parameters, and numerical experiments verified that their algorithm converges faster than the algorithm proposed in [15]. Its iterative scheme is as follows:
(3)
Among them,
is the augmented Lagrangian function associated with problem (2), defined as follows:
(4)
where
is the penalty parameter,
and s are two different relaxation factors.
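For reference, with a generic linear constraint Ax + By = b and penalty parameter β > 0, an augmented Lagrangian of the form (4) typically reads as follows; the multiplier sign convention and the symbols used here are our assumptions, not necessarily the paper's exact display:

```latex
\mathcal{L}_{\beta}(x, y, \lambda) \;=\; f(x) + g(y)
\;-\; \langle \lambda,\; Ax + By - b \rangle
\;+\; \frac{\beta}{2}\,\| Ax + By - b \|^{2} .
```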
The above research on symmetric ADMM algorithms for solving problem (1) is limited to algorithms containing only one relaxation factor. Inspired by the algorithm proposed in the work of Lu et al. [18], we consider introducing two relaxation factors to achieve wider applicability, faster convergence, and a more concise and unified algorithmic framework. Accordingly, this paper proposes a symmetric ADMM with two different relaxation factors to solve the non-separable problem (1); the iterative scheme is as follows:
(5)
Compared with the algorithm proposed by Dang et al. [16], our algorithm has a wider range of parameter values, allowing it to be applied to more practical scenarios through appropriate parameter tuning. Moreover, unlike the algorithm proposed by Dang et al. [17], our method does not require the introduction of the Bregman distance, thereby providing a unified iterative framework for solving non-separable problems via ADMM. The optimality conditions are as follows:
(6)
In the case where
, when
and
, the proposed algorithm reduces to the classical ADMM. When
and
, the algorithm reduces to the symmetric ADMM with a single relaxation factor. This demonstrates that our method provides an improved basic iterative framework for symmetric ADMM with relaxation factors.
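To make the scheme concrete, here is a minimal, self-contained sketch of a symmetric ADMM sweep with two different relaxation factors on a toy nonseparable instance. Every specific choice below (the functions, the constraint x + y = b, and the parameter values mu, beta, r, s) is an illustrative assumption of ours, not the paper's experiment:

```python
import numpy as np

# Toy nonseparable model (all specifics are illustrative assumptions):
# minimize f(x) + g(y) + H(x, y) subject to x + y = b, with
#   f(x) = ||x||_1            (proper, lower semicontinuous, nonsmooth),
#   g(y) = 0.5 * ||y||^2      (smooth),
#   H(x, y) = mu * <x, y>     (the nonseparable coupling term).

def soft(v, t):
    """Soft-thresholding operator, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def symmetric_admm(b, mu=0.3, beta=5.0, r=0.8, s=0.8, iters=500):
    n = b.size
    x, y, lam = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(iters):
        # x-step: argmin_x f(x) + H(x, y) - <lam, x> + (beta/2)||x + y - b||^2
        x = soft(b - y + (lam - mu * y) / beta, 1.0 / beta)
        # first (half) dual step, relaxation factor r
        lam_half = lam - r * beta * (x + y - b)
        # y-step: argmin_y g(y) + H(x, y) - <lam_half, y> + (beta/2)||x + y - b||^2
        y = (lam_half - mu * x + beta * (b - x)) / (1.0 + beta)
        # second dual step, relaxation factor s
        lam = lam_half - s * beta * (x + y - b)
    return x, y, lam

b = np.array([1.5, -0.2, 0.8, 2.0])
x, y, lam = symmetric_admm(b)
print(np.linalg.norm(x + y - b))  # primal residual; tiny after 500 sweeps
```

In this sketch, setting r = 0 collapses the two dual steps into the single classical update, while choosing r = s recovers a symmetric ADMM with one relaxation factor.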
2. Preliminaries
In this section, we provide the necessary definitions and properties required for the following study.
Notations:
,
,
: real numbers,
-dimensional real vectors,
-real matrices.
and
: inner product and associated norm.
Set-valued mapping
:
Distance: For
and
,
, with
.
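In our rendering of the usual definition, the distance function reads:

```latex
\operatorname{dist}(x, S) \;=\; \inf_{y \in S} \| x - y \|,
\qquad \operatorname{dist}(x, \varnothing) \;=\; +\infty .
```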
Definition 1. [19] The domain of function
is denoted by
then
is called a proper function.
Definition 2. [19] A function
is said to be lower semicontinuous at
if it satisfies
If this holds for every point in
, then
is said to be a lower semicontinuous function.
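The defining inequality of lower semicontinuity, in the usual form (our notation), is:

```latex
\liminf_{y \to \bar{x}} f(y) \;\ge\; f(\bar{x}) .
```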
Definition 3. [20] Let
be a proper lower semicontinuous function.
(i) The Fréchet subdifferential of
at
, written
, is the set of vectors
that satisfy
When
, we set
.
(ii) The limiting-subdifferential of
at
, written
, is defined as follows
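In standard variational-analysis notation (our rendering of the usual definitions, e.g. those of Rockafellar and Wets), the two sets in Definition 3 read:

```latex
\hat{\partial} f(x) \;=\; \Big\{ v :
\liminf_{y \to x,\; y \neq x}
\frac{f(y) - f(x) - \langle v,\, y - x \rangle}{\| y - x \|} \;\ge\; 0 \Big\},
\qquad
\partial f(x) \;=\; \Big\{ v : \exists\, x^{k} \to x,\; f(x^{k}) \to f(x),\;
v^{k} \in \hat{\partial} f(x^{k}),\; v^{k} \to v \Big\}.
```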
Lemma 1. [21] Let
be a continuously differentiable function and suppose that
is
-Lipschitz continuous. Then
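The conclusion of Lemma 1 is the standard descent lemma; in generic notation (h continuously differentiable with an L-Lipschitz gradient, our symbols) it states:

```latex
h(y) \;\le\; h(x) + \langle \nabla h(x),\, y - x \rangle
+ \frac{L}{2}\, \| y - x \|^{2}
\qquad \text{for all } x, y .
```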
Definition 4. ([22], Kurdyka-Łojasiewicz inequality) Let
be a proper lower semicontinuous function. For
, set
We say that function
has the KL property at
if there exist
, a neighbourhood
of
, and a continuous concave function
, such that
(i)
;
(ii)
is
on
and continuous at 0;
(iii)
;
(iv) for all
in
, the Kurdyka-Łojasiewicz inequality holds
where
is the distance from
to
.
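As a sanity check of the inequality in (iv), here is a small numerical illustration on a toy function of our own choosing (not from the paper): f(x) = x^2 satisfies the KL property at xbar = 0 with desingularizing function phi(t) = sqrt(t), and the KL quantity evaluates to exactly 1 off the critical point:

```python
import numpy as np

# Toy KL check (our own example): for f(x) = x^2 and phi(t) = sqrt(t),
# phi'(t) = 0.5 / sqrt(t) and |f'(x)| = 2|x|, so the KL quantity
# phi'(f(x) - f(0)) * dist(0, grad f(x)) = (0.5 / |x|) * 2|x| = 1 >= 1.
xs = np.concatenate([np.linspace(-1.0, -0.01, 50), np.linspace(0.01, 1.0, 50)])
kl = (0.5 / np.abs(xs)) * np.abs(2.0 * xs)
print(kl.min(), kl.max())  # both equal 1 up to floating-point rounding
```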
Lemma 2. ([23], Uniformized KL property) Let
be a compact set and
be a proper and lower semicontinuous function. Assume that
is constant on
and satisfies the KL property at each point of
. Then, there exist
, and
such that for all
and for all
in the following intersection:
one has
Lemma 3. [24] Suppose that
, where
and
are proper lower semicontinuous functions. Then for all
, we have
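A commonly used form of this calculus rule, in our rendering and under the assumption that the second summand is continuously differentiable, is:

```latex
\partial (g + h)(x) \;=\; \partial g(x) + \nabla h(x)
\qquad \text{whenever } h \text{ is continuously differentiable at } x .
```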
Definition 5. ([24], Kurdyka-Łojasiewicz function) If
satisfies the KL property at each point of
, then
is called a KL function.
Definition 6.
is a stationary point of the augmented Lagrangian function
for problem (1), if and only if
(7)
Lemma 4. Let the iterative sequence generated by the algorithm (5) be denoted as
. Then, the following holds
Proof. Combining the second and fourth equations in (6) yields
subtracting the above two equations, we obtain
(8)
and thus
(9)
(10)
According to the optimality condition (6) of the
-subproblem, we have
(11)
putting (9) into (11), we get
(12)
According to the third and fourth equations in (6), we obtain
(13)
This completes the proof. □
3. Convergence Analysis
Assumption A. Let
be a proper lower semicontinuous function, and let
be a continuously differentiable function with an
-Lipschitz continuous gradient
, and
be a continuously differentiable function with an
-Lipschitz continuous gradient
. Also assume that
,
Lemma 5. Let the iterative sequence generated by the algorithm (5) be denoted
as
. Suppose that this sequence is bounded and that Assumption A holds. Then the following conclusion holds
(14)
where
(see Remark 3.1).
Proof. The polarization identity implies
(15)
Since
is
-Lipschitz continuous and
is
-Lipschitz continuous, combining this with Lemma 1 we have
(16)
(17)
From the fourth equation in the optimality condition (6) of the problem, we obtain
(18)
We obtain
in (13); combining this with the Lipschitz continuity of
and
, we have
(19)
Recalling (9) and (10), we get
(20)
From the definition of the augmented Lagrangian function
in (4), and in combination with (15), we have
(21)
Then, combining it with (16) and (17) yields
(22)
Substituting (13) into the above expression and simplifying, we obtain
(23)
Note that, by using (4), (6), (20) and (19), we have
(24)
Summing inequalities (23) and (24) together with the first equation in (5), we obtain
(25)
Thus
(26)
where
. This completes the proof. □
Remark 3.1. Thus, as long as
, the sequence
has sufficient descent properties, which means that
is monotonically nonincreasing. Therefore, we can achieve
by appropriately choosing the parameters
. The constraints on these parameters are as follows:
,
Lemma 6. Let the iterative sequence generated by the algorithm (5) be denoted as
. Suppose that this sequence is bounded, that Assumption A holds and
. Then we have
Proof. The boundedness of
implies the existence of a subsequence
converging to
. Moreover, the lower semicontinuity of
together with the continuity of
ensures that
is lower semicontinuous. Hence,
Hence,
is bounded below. The fact that it is also nonincreasing implies its convergence. Since
is monotonic and contains a convergent subsequence, it follows that
itself converges, satisfying
. Finally, invoking Equation (14) yields
By summing over
, and observing that
, we arrive at
The condition
yields
and
. Using (19), we further obtain
. Hence,
. This completes the proof. □
Lemma 7. Let the iterative sequence generated by the algorithm (5) be denoted as
. Suppose that this sequence is bounded and that Assumption A holds. We define
(27)
Hence, it holds that
, and there exists a constant
such that
(28)
Proof. From the definition of the function
in (4), the following system of equations holds
(29)
From the optimality condition (6), after rearrangement, we have
Then, by substituting the above into (29), we obtain
Consequently, applying Lemma 3 yields
. Based on the preceding relation, we can find
such that
(30)
Applying (19), there exists
for which the following holds
then
(31)
This completes the proof. □
Lemma 8. Let the iterative sequence generated by the algorithm (5) be denoted as
. Suppose that this sequence is bounded and that Assumption A holds. Let
represent the set of all cluster points of the sequence
. Then the following statement is true
(i)
is a nonempty compact set, and
(ii)
, where
denotes the set of all stationary points of
;
(iii)
is finite and constant on
, which equals
Proof. We now verify the above results one by one.
(i) This is immediate from the definition of limit points.
(ii) Suppose
. Then there exists a subsequence
of
such that
. Applying Lemma 6 yields
(32)
thus,
. Noting that
minimizes
with respect to
, we get
(33)
with respect to the variables
,
, and
. Hence, it follows that
(34)
Furthermore, applying (33) yields
(35)
Since
is lower semicontinuous, we know that
(36)
From (34), (35) and (36), we obtain
Taking the limit in (6) along the subsequence
and using (32) again, we obtain
Therefore,
satisfies the critical point condition of (4), which implies that
. Thus,
.
(iii) Take any
. Then there exists a subsequence
, such that
. Since
is nonincreasing and has a convergent subsequence, the entire sequence
converges; hence we have
That is,
takes a constant value on
. Clearly,
This completes the proof. □
Theorem 9. Let the iterative sequence generated by the algorithm (5) be denoted as
. Suppose that this sequence is bounded, that Assumption A holds and
If
is a KL function, then
has finite length, that is
Moreover, it converges to a critical point of
.
Proof. Lemma 8 implies that
for any
. Next, we examine two cases.
(i) Suppose there exists
with
. Then, using (14) and Remark 3.1, for every
, we have
Hence,
and
for any
. Then, by (19), we further obtain
for any
, which means
.
(ii) If
holds for all
, then the following convergence properties hold:
Since
, for any
there exists
such that for all
, it holds that
.
Since
, for any
there exists
such that for all
, it holds that
.
Now set
and take any
, we have
By Lemma 8, we have established that
is a nonempty compact set and that
is constant
. Consequently, Lemma 2 implies that
(37)
Using the established fact that
and the concavity of
, it follows that
Now, taking the above inequality together with
and relation (37), it follows that
For convenience, we define
Then the above inequality is equivalent to
(38)
Lemma 5 and (38) together imply that
together with
and thereby
Using the fact that
, we have
(39)
Taking the sum of (39) over
gives
We have
from Definition 4. Rearranging terms and taking
gives
(40)
Thus,
(41)
and
(42)
Substituting into (19) yields
(43)
Additionally, we note that
From (41), (42) and (43), we arrive at
Finally,
converges to a critical point of
by Lemma 8. This completes the proof. □
Lemma 10. Let
be the sequence generated by the symmetric ADMM (5). If at least one of the following statements holds:
(i)
.
(ii)
and
.
Then we can conclude that the sequence
is bounded.
Proof. First, suppose that condition (i) holds. Based on Lemma 5, we arrive at
By combining (4) with
, we get
Note that (i) implies that
. Based on Assumption A, we can obtain
. Thus, we can deduce that
and
are bounded. Hence,
is also bounded, and thus
is bounded. This completes the proof. □
Theorem 11. (Convergence rate) Let the iterative sequence generated by the algorithm (5) be denoted as
. Suppose that this sequence is bounded, that Assumption A holds and
, and
. Assume that
possesses the KL property at
, and that the corresponding function is given by
, with
,
. Then the following three statements hold
(i) If
, the sequence
converges in finitely many steps. This means we can find an index
with
.
(ii) If
, then there is a constant
and
so that
(iii) If
, then there is a constant
for which
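These three cases follow the standard KL-exponent dichotomy; schematically, in our notation, with a desingularizing function of the form phi(t) = c t^(1-theta):

```latex
\theta = 0:\ \text{finite termination};\qquad
\theta \in \Big(0, \tfrac{1}{2}\Big]:\ \| z^{k} - z^{*} \| \le c\,\tau^{k}
\ \ (\tau \in [0,1));\qquad
\theta \in \Big(\tfrac{1}{2}, 1\Big):\ \| z^{k} - z^{*} \| \le
c\, k^{-\frac{1-\theta}{2\theta - 1}} .
```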
Proof. For
, we have
and
. Suppose, contrary to the claim, that
does not terminate in finitely many steps. Then, for large enough
, the KL property yields
, which contradicts Lemma 7. Now let
and set
for
. From (40) we deduce
(44)
Because
has the KL property at
, we have
This inequality can be rearranged as
(45)
Finally, applying Lemma 7 yields
(46)
From (45) and (46), there exists
satisfying
Inserting this into (44) gives
(47)
Now, by (47) and the results of Attouch and Bolte [25], we obtain
(48)
(49)
Next, applying (19) yields
(50)
Thus, (ii) and (iii) follow from (48)-(50). This completes the proof. □
4. Conclusion
In this paper, we propose a symmetric alternating direction method of multipliers with two different relaxation factors for minimizing the sum of two non-separable nonconvex functions. For problems where the objective function contains a coupling term, i.e., the case where
and
are non-separable, research remains limited in both convex and non-convex settings. We review the development of symmetric ADMM for solving non-separable problems and find that although many existing works have proposed symmetric ADMM variants incorporating techniques such as Bregman distances, inertial terms, regularization terms, or linearization, they often introduce only one relaxation factor. Inspired by this, we introduce two different relaxation factors and apply the algorithm to non-separable nonconvex problems, thereby refining the basic form of symmetric ADMM for solving non-separable problems. This makes the parameter range of the algorithm broader, allowing it to be adapted to more practical problems by adjusting the parameters. It also provides fundamental theoretical support for further integration with other techniques. Finally, based on the Kurdyka-Łojasiewicz (KL) property, we prove that the sequence generated by the algorithm converges to a stationary point of the problem and further analyze its finite-step convergence, linear convergence, and sublinear convergence.
Acknowledgements
Sincere thanks to the members of JAMP for their professional work, and special thanks to managing editor Hellen XU for her consistently high standards.