
1. Introduction
With the development of modern computer and information technology, large amounts of data have emerged in application fields such as the mobile Internet, statistics, and psychometrics. These data are not only huge in scale but also exhibit extremely high dimensionality and complex structure. As a natural representation of multi-dimensional data, tensors generalize first-order vectors and second-order matrices to higher orders [1].
In recent years, research interest in tensors has expanded to many fields, e.g., computer vision [2], machine learning [3], pattern recognition [4], signal processing and image processing [5] [6]. For example, in a video recommendation system, user rating data can be expressed as a 3-order tensor of “user × video × time” [7]; a color image is a 3-way object with column, row and color modes. Compared with vectors and matrices, tensors better retain the intrinsic structural information of high-dimensional data in these applications. Moreover, the tensor of interest is frequently low-rank, or approximately so [8], and hence has a much lower-dimensional structure. This motivates the low-rank tensor approximation and recovery problem, whose core is tensor rank minimization.
In matrix processing, rank minimization is NP-hard and difficult to solve; the problem is therefore generally relaxed by minimizing the nuclear norm of the estimated matrix instead, which is a convex relaxation of minimizing the matrix rank [9]. However, since a tensor is a higher-order extension of vectors and matrices, whether there is a counterpart of low-rank matrix approximation for tensors, and how to extend the notion of low rank from matrices to tensors, are still questions worth studying, because the notions of the tensor nuclear norm, and even of tensor rank, remain ambiguous and non-uniform.
The two most popular tensor rank definitions are the CANDECOMP/PARAFAC (CP) rank, which is related to the CANDECOMP/PARAFAC decomposition [10], and the Tucker rank, which corresponds to the Tucker decomposition [11] [12]. The CP rank is defined as the smallest number of rank-one tensors whose sum yields the given tensor; its computation is generally NP-hard [13] and its convex relaxation is intractable. For Tucker rank minimization, Liu et al. [14] first proposed the sum of nuclear norms (SNN) of the unfolding matrices of a tensor to approximate the sum of the Tucker rank for tensor completion. Unfortunately, although the Tucker rank and its convex relaxation are tractable and widely used, the direct unfolding and folding operations tend to damage the intrinsic structural information of the tensor, and SNN is not the convex envelope of the Tucker rank according to [15].
Recently, Kilmer et al. [16] proposed a new tensor decomposition paradigm, the tensor singular value decomposition (T-SVD), which is similar to the singular value decomposition of a matrix. Using T-SVD, Kilmer et al. [16] defined the tensor tubal rank, which is based on the tensor-tensor product (t-product) and its algebraic framework. T-SVD allows matrix analysis to be extended to tensors while avoiding the inherent information loss caused by matricizing or flattening the tensor. The tensor tubal rank characterizes the intrinsic structure of a tensor well, and solving low-tubal-rank problems has therefore attracted increasing research interest in recent years.
This work focuses on the low-tubal-rank tensor recovery problem under the T-SVD framework. An underlying tensor satisfying the low-tubal-rank assumption can be recovered by solving the following optimization problem:

$$
\min_{\mathcal{X}}\; f(\mathcal{X}) + \lambda\,\operatorname{rank}_{t}(\mathcal{X}) \qquad (1)
$$

where $\lambda>0$ is a regularization parameter and $\operatorname{rank}_{t}(\mathcal{X})$ denotes the tensor tubal rank.
It has been proved that solving (1) is NP-hard. The most commonly used convex relaxation of the rank function is the tensor nuclear norm. In recent work [17], Lu et al. first defined the tensor average rank, showed that a tensor with low tubal rank always has low average rank, and then proposed a novel tensor nuclear norm that is proven to be the convex envelope of the tensor average rank within the unit ball of the tensor spectral norm; it performs much better than other types of tensor nuclear norm in many tasks. This methodology is called tensor nuclear norm minimization (TNNM). The nuclear norm of a tensor $\mathcal{X}$ is denoted by $\|\mathcal{X}\|_{*}$. The TNNM approach [17] has been attracting significant attention due to its rapid development in both theory and implementation. A general TNNM problem can be formulated as

$$
\min_{\mathcal{X}}\; f(\mathcal{X}) + \lambda\,\|\mathcal{X}\|_{*} \qquad (2)
$$
However, even though tensor nuclear norm relaxation has become a popular scheme for the low-tubal-rank tensor recovery problem, it has a shortcoming. The tensor nuclear norm suffers from the major limitation that all singular values are minimized simultaneously, so large singular values are penalized more heavily than small ones. Under this circumstance, methods based on the nuclear norm often fail to find a good solution; even worse, the resulting global optimal solution may deviate significantly from the ground truth. To overcome the limitations of tensor nuclear norm relaxation, recent work solves the low-tubal-rank recovery problem with weighted tensor nuclear norms and nonconvex relaxation strategies. The essence of such relaxations is to overcome the unbalanced penalization of different singular values: they keep the larger singular values large and shrink the smaller singular values. Hence, these methods have been well studied and applied in many low-tubal-rank tensor recovery problems, and in real applications weighted tensor nuclear norm relaxation usually performs better than plain tensor nuclear norm relaxation. In addition, for low-rank representation, the Frobenius norm has been shown to perform similarly to the nuclear norm [18] [19].
This paper mainly studies the tensor completion problem. Different from the existing nonconvex tensor completion models, this paper also considers the stability and low rank of the solution of the tensor recovery model, and proposes a relatively stable low-rank nonconvex approximation of tensors. Specifically, our contributions can be summarized in the following three points.
・ First, similar to the definition of the tensor nuclear norm, we show that the square of the tensor Frobenius norm can be defined as the sum of squared singular values. Based on this definition, we assign different weights to the singular values of a tensor and propose a surrogate of the tensor tubal rank, the WTNFN.
・ Second, to optimize the WTNFN-based minimization problem, we derive the weighted tensor nuclear and Frobenius norm singular value thresholding (WTNFNSVT) operator for tensors in the complex field, with careful derivations, and show that it is the exact solution of the WTNFN-based minimization problem. The bounded convergence of the resulting algorithm is also established with a strict proof.
・ Third, we adapt the proposed weighted tensor nuclear and Frobenius norm model to different computer vision tasks. In tensor completion, the model outperforms TNN-based methods in both the PSNR index and visual perception. The proposed WTNFN-TC models also achieve better performance than traditional algorithms, especially the low-rank learning methods designed for these tasks, and the obtained global optimal solution is more stable than that of existing weighted nuclear norm models.
The rest of this paper proceeds as follows. Section II briefly introduces notations and preliminaries; Section III analyzes the optimization of the WTNFNM model, including the basic WTNFN operator, and rigorously analyzes the convergence of the proposed method; Section IV applies the WTNFNM model to tensor completion, and numerical experiments demonstrate the effectiveness of the proposed method compared with other state-of-the-art methods; finally, conclusions and future work are given in Section V.
2. Notations and Preliminaries
2.1. Notations
We denote tensors by boldface Euler script letters, e.g., $\mathcal{A}$; matrices by boldface capital letters, e.g., $\mathbf{A}$; vectors by boldface lowercase letters, e.g., $\mathbf{a}$; and scalars by lowercase letters, e.g., $a$. Real and complex numbers lie in the fields $\mathbb{R}$ and $\mathbb{C}$, respectively. For a 3-way tensor $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3}$, we denote by $\mathcal{A}(i,:,:)$, $\mathcal{A}(:,i,:)$, and $\mathcal{A}(:,:,i)$ the $i$-th horizontal, lateral, and frontal slice, respectively. The frontal slice $\mathcal{A}(:,:,i)$ is also denoted by $\mathbf{A}^{(i)}$. The $(i,j,k)$-th entry is written as $\mathcal{A}_{ijk}$ or $\mathcal{A}(i,j,k)$, and the $(i,j)$-th tube is denoted by $\mathcal{A}(i,j,:)$. The inner product between $\mathbf{A}$ and $\mathbf{B}$ in $\mathbb{C}^{n_1\times n_2}$ is defined as $\langle \mathbf{A},\mathbf{B}\rangle=\operatorname{Tr}(\mathbf{A}^{*}\mathbf{B})$, where $\operatorname{Tr}(\cdot)$ stands for the trace and $\mathbf{A}^{*}$ is the conjugate transpose of $\mathbf{A}$; the inner product of two tensors $\mathcal{A}$ and $\mathcal{B}$ in $\mathbb{C}^{n_1\times n_2\times n_3}$ is $\langle \mathcal{A},\mathcal{B}\rangle=\sum_{i=1}^{n_3}\big\langle \mathbf{A}^{(i)},\mathbf{B}^{(i)}\big\rangle$.
For any $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3}$, the complex conjugate of $\mathcal{A}$ is denoted by $\operatorname{conj}(\mathcal{A})$, which takes the complex conjugate of each entry of $\mathcal{A}$. $\lceil b\rceil$ denotes the nearest integer greater than or equal to $b$. Now we consider applying the Discrete Fourier Transform (DFT) to tensors. $\bar{\mathcal{A}}$ is the result of the DFT of $\mathcal{A}$ along the third dimension, that is,

$$
\bar{\mathcal{A}}=\operatorname{fft}(\mathcal{A},[\,],3), \qquad (3)
$$

and, conversely, $\mathcal{A}=\operatorname{ifft}(\bar{\mathcal{A}},[\,],3)$.
We denote by $\bar{\mathbf{A}}=\operatorname{bdiag}(\bar{\mathcal{A}})$ the block diagonal matrix whose $i$-th diagonal block is the $i$-th frontal slice $\bar{\mathbf{A}}^{(i)}$ of $\bar{\mathcal{A}}$, i.e.,

$$
\bar{\mathbf{A}}=\operatorname{bdiag}(\bar{\mathcal{A}})=\begin{bmatrix}\bar{\mathbf{A}}^{(1)} & & \\ & \ddots & \\ & & \bar{\mathbf{A}}^{(n_3)}\end{bmatrix}. \qquad (4)
$$

Next, we define the block circulant operator $\operatorname{bcirc}(\cdot)$, which reshapes the tensor $\mathcal{A}$ into a block circulant matrix of size $n_1 n_3\times n_2 n_3$, i.e.,

$$
\operatorname{bcirc}(\mathcal{A})=\begin{bmatrix}\mathbf{A}^{(1)} & \mathbf{A}^{(n_3)} & \cdots & \mathbf{A}^{(2)}\\ \mathbf{A}^{(2)} & \mathbf{A}^{(1)} & \cdots & \mathbf{A}^{(3)}\\ \vdots & \vdots & \ddots & \vdots\\ \mathbf{A}^{(n_3)} & \mathbf{A}^{(n_3-1)} & \cdots & \mathbf{A}^{(1)}\end{bmatrix}. \qquad (5)
$$
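The objects above are easy to reproduce numerically. The following NumPy sketch (illustrative variable names, not the authors' code) builds $\bar{\mathcal{A}}=\operatorname{fft}(\mathcal{A},[\,],3)$, $\operatorname{bdiag}(\bar{\mathcal{A}})$, and $\operatorname{bcirc}(\mathcal{A})$, and checks the well-known fact that the DFT block-diagonalizes the block-circulant matrix.

```python
# Minimal sketch of Eqs. (3)-(5): DFT along the third dimension, bdiag(.), bcirc(.).
import numpy as np

def dft3(A):
    """A_bar = fft(A, [], 3): DFT along the third dimension, Eq. (3)."""
    return np.fft.fft(A, axis=2)

def bdiag(A_bar):
    """Stack the frontal slices of A_bar on the diagonal of a block matrix, Eq. (4)."""
    n1, n2, n3 = A_bar.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A_bar.dtype)
    for i in range(n3):
        M[i*n1:(i+1)*n1, i*n2:(i+1)*n2] = A_bar[:, :, i]
    return M

def bcirc(A):
    """Block-circulant matrix of size (n1*n3) x (n2*n3), Eq. (5)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):          # block row
        for j in range(n3):      # block column
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

if __name__ == "__main__":
    A = np.random.randn(4, 3, 5)
    # The DFT block-diagonalizes bcirc(A): (F ⊗ I) bcirc(A) (F^{-1} ⊗ I) = bdiag(A_bar)
    F = np.fft.fft(np.eye(5))
    lhs = np.kron(F, np.eye(4)) @ bcirc(A) @ np.kron(np.linalg.inv(F), np.eye(3))
    print(np.allclose(lhs, bdiag(dft3(A))))   # True (up to round-off)
```

This block-diagonalization is the reason the t-product defined below reduces to independent matrix products in the Fourier domain.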
2.2. T-Product and T-SVD
For a tensor $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, the operator $\operatorname{unfold}(\cdot)$ unfolds the tensor $\mathcal{A}$ as follows:

$$
\operatorname{unfold}(\mathcal{A})=\begin{bmatrix}\mathbf{A}^{(1)}\\ \mathbf{A}^{(2)}\\ \vdots\\ \mathbf{A}^{(n_3)}\end{bmatrix},\qquad \operatorname{fold}\big(\operatorname{unfold}(\mathcal{A})\big)=\mathcal{A}. \qquad (6)
$$

According to [17], the t-product is equivalent to matrix multiplication in the Fourier domain, i.e., $\mathcal{C}=\mathcal{A}*\mathcal{B}$ is equivalent to $\bar{\mathbf{C}}=\bar{\mathbf{A}}\,\bar{\mathbf{B}}$. The tensor product of two third-order tensors can be defined as follows.
Definition 1 (T-product [16]) Let $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and $\mathcal{B}\in\mathbb{R}^{n_2\times l\times n_3}$. Then the t-product $\mathcal{A}*\mathcal{B}$ is defined to be a tensor of size $n_1\times l\times n_3$,

$$
\mathcal{A}*\mathcal{B}=\operatorname{fold}\big(\operatorname{bcirc}(\mathcal{A})\cdot\operatorname{unfold}(\mathcal{B})\big). \qquad (7)
$$
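Since the t-product reduces to slice-wise matrix products in the Fourier domain, it can be computed without ever forming $\operatorname{bcirc}(\mathcal{A})$. A minimal sketch (assuming real input tensors; names are illustrative):

```python
# t-product of Definition 1 via the Fourier domain: C_bar^(i) = A_bar^(i) B_bar^(i).
import numpy as np

def t_product(A, B):
    """C = A * B computed slice-wise in the Fourier domain."""
    n1, n2, n3 = A.shape
    _, l, _ = B.shape
    A_bar = np.fft.fft(A, axis=2)
    B_bar = np.fft.fft(B, axis=2)
    C_bar = np.empty((n1, l, n3), dtype=complex)
    for i in range(n3):
        C_bar[:, :, i] = A_bar[:, :, i] @ B_bar[:, :, i]
    return np.real(np.fft.ifft(C_bar, axis=2))   # A, B real => C real

if __name__ == "__main__":
    A = np.random.randn(4, 3, 5)
    B = np.random.randn(3, 2, 5)
    C = t_product(A, B)          # size 4 x 2 x 5, as stated in Definition 1
    print(C.shape)
```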
Definition 2 (Conjugate Transpose [16]) The conjugate transpose of a tensor $\mathcal{A}$ of size $n_1\times n_2\times n_3$ is the $n_2\times n_1\times n_3$ tensor $\mathcal{A}^{*}$ obtained by conjugate transposing each frontal slice and then reversing the order of the transposed frontal slices 2 through $n_3$. If the tensor is real-valued, the conjugate transpose is also written as $\mathcal{A}^{\mathrm{T}}$.
Definition 3 (Identity Tensor [16]) The identity tensor $\mathcal{I}\in\mathbb{R}^{n\times n\times n_3}$ is the tensor whose first frontal slice is the $n\times n$ identity matrix and whose other frontal slices are all zeros.
Definition 4 (Orthogonal Tensor [16]) A tensor $\mathcal{Q}\in\mathbb{R}^{n\times n\times n_3}$ is orthogonal if it satisfies

$$
\mathcal{Q}^{*}*\mathcal{Q}=\mathcal{Q}*\mathcal{Q}^{*}=\mathcal{I}. \qquad (8)
$$
Definition 5 (F-diagonal Tensor [16]) A tensor is called F-diagonal if each of its frontal slices is a diagonal matrix.
As stated above, tensor-SVD can be defined.
Theorem 1 (T-SVD [17]) Let $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$. Then it can be factorized as

$$
\mathcal{A}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}, \qquad (9)
$$

where $\mathcal{U}\in\mathbb{R}^{n_1\times n_1\times n_3}$ and $\mathcal{V}\in\mathbb{R}^{n_2\times n_2\times n_3}$ are orthogonal tensors, and $\mathcal{S}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is an F-diagonal tensor.
It is worth noting that T-SVD can be efficiently computed based on the matrix SVD in the Fourier domain. The details of T-SVD are described in Algorithm 1.

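Algorithm 1 itself is not reproduced here, but the following NumPy sketch shows the usual way to realize Theorem 1: take the matrix SVD of the first $\lceil (n_3+1)/2\rceil$ frontal slices of $\bar{\mathcal{A}}$, fill in the remaining slices by conjugate symmetry, and transform back. It is an illustrative implementation, not the authors' code.

```python
# Illustrative T-SVD (Theorem 1): slice-wise SVD in the Fourier domain.
import numpy as np

def t_svd(A):
    """Return U, S, V with A = U * S * V^*, U, V orthogonal and S F-diagonal."""
    n1, n2, n3 = A.shape
    A_bar = np.fft.fft(A, axis=2)
    U_bar = np.zeros((n1, n1, n3), dtype=complex)
    S_bar = np.zeros((n1, n2, n3), dtype=complex)
    V_bar = np.zeros((n2, n2, n3), dtype=complex)
    k = min(n1, n2)
    half = n3 // 2 + 1                       # = ceil((n3+1)/2), see the notation above
    for i in range(half):
        u, s, vh = np.linalg.svd(A_bar[:, :, i])
        U_bar[:, :, i] = u
        S_bar[:k, :k, i] = np.diag(s)
        V_bar[:, :, i] = vh.conj().T
    for i in range(half, n3):                # conjugate symmetry of the real FFT
        U_bar[:, :, i] = np.conj(U_bar[:, :, n3 - i])
        S_bar[:, :, i] = S_bar[:, :, n3 - i]
        V_bar[:, :, i] = np.conj(V_bar[:, :, n3 - i])
    U = np.real(np.fft.ifft(U_bar, axis=2))
    S = np.real(np.fft.ifft(S_bar, axis=2))
    V = np.real(np.fft.ifft(V_bar, axis=2))
    return U, S, V

if __name__ == "__main__":
    A = np.random.randn(4, 3, 5)
    U, S, V = t_svd(A)
    # Check A ≈ U * S * V^* slice-wise in the Fourier domain
    Ub, Sb, Vb = (np.fft.fft(X, axis=2) for X in (U, S, V))
    rec = np.stack([Ub[:, :, i] @ Sb[:, :, i] @ Vb[:, :, i].conj().T for i in range(5)], axis=2)
    print(np.allclose(np.real(np.fft.ifft(rec, axis=2)), A))   # True
```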
Definition 6 (Tensor Tubal Rank [20] [21]) For $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, the tensor tubal rank, denoted by $\operatorname{rank}_{t}(\mathcal{A})$, is defined as the number of nonzero singular tubes of $\mathcal{S}$, where $\mathcal{S}$ is from the T-SVD of $\mathcal{A}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}$. The tensor tubal rank is also equal to the number of nonzero singular values of $\mathcal{S}(:,:,1)$. That is,

$$
\operatorname{rank}_{t}(\mathcal{A})=\#\{i:\mathcal{S}(i,i,:)\neq 0\}=\#\{i:\mathcal{S}(i,i,1)\neq 0\}, \qquad (10)
$$

where $\mathcal{S}(i,i,1)=\frac{1}{n_3}\sum_{j=1}^{n_3}\bar{\mathcal{S}}(i,i,j)$, which uses the property induced by the inverse DFT. Denote by $\sigma_{i}(\bar{\mathbf{A}})$ the $i$-th singular value of $\bar{\mathbf{A}}=\operatorname{bdiag}(\bar{\mathcal{A}})$; the tensor nuclear norm and the tensor Frobenius norm are then defined as follows.
Definition 7 (Tensor Nuclear Norm (TNN) [22]) For a tensor $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, based on the result about the tensor nuclear norm derived from the t-product, the tensor nuclear norm (TNN) is defined to be

$$
\|\mathcal{A}\|_{*}=\frac{1}{n_3}\big\|\bar{\mathbf{A}}\big\|_{*}=\frac{1}{n_3}\sum_{i=1}^{n_3}\big\|\bar{\mathbf{A}}^{(i)}\big\|_{*}=\frac{1}{n_3}\sum_{i}\sigma_{i}\big(\bar{\mathbf{A}}\big). \qquad (11)
$$
Theorem 2 (Tensor Frobenius Norm (TFN)) Given a tensor $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, if $\mathcal{A}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}$, the following holds:

$$
\|\mathcal{A}\|_{F}^{2}=\frac{1}{n_3}\big\|\bar{\mathbf{A}}\big\|_{F}^{2}=\frac{1}{n_3}\sum_{i}\sigma_{i}^{2}\big(\bar{\mathbf{A}}\big). \qquad (12)
$$
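Both quantities can be checked numerically from the singular values of the frontal slices of $\bar{\mathcal{A}}$; the short sketch below assumes the $1/n_3$ scaling used in (11) and (12).

```python
# Numeric check of Definition 7 and Theorem 2 from Fourier-domain singular values.
import numpy as np

A = np.random.randn(4, 3, 5)
A_bar = np.fft.fft(A, axis=2)
sv = np.concatenate([np.linalg.svd(A_bar[:, :, i], compute_uv=False) for i in range(5)])

tnn = sv.sum() / 5                    # Eq. (11): (1/n3) * sum of singular values
tfn_sq = (sv ** 2).sum() / 5          # Eq. (12): (1/n3) * sum of squared singular values
print(tnn, tfn_sq)
print(np.isclose(tfn_sq, np.linalg.norm(A) ** 2))   # True: equals ||A||_F^2
```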
3. Proposed Algorithm Scheme
Based on the tensor nuclear norm in Definition 7 and the tensor Frobenius norm in Theorem 2, we present a new nonconvex regularizer, defined as the sum of the weighted tensor nuclear norm and the square of the weighted tensor Frobenius norm (WTNFN) of $\mathcal{X}$:

$$
\|\mathcal{X}\|_{\mathrm{WTNFN}}=\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}, \qquad (13)
$$

where $\mathbf{W}$ is the weight matrix with nonnegative entries $w_{ij}\ge 0$ and $\gamma$ is a positive parameter.
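A literal reading of (13), with one weight per Fourier-domain singular value applied to both the nuclear-norm term and the squared-Frobenius term, can be evaluated as follows. The exact weighting and scaling convention is an assumption of this sketch, not taken from the authors' implementation.

```python
# Illustrative evaluation of the WTNFN regularizer in Eq. (13):
# (1/n3) * sum_j [ sum_i w_ij * sigma_ij + gamma * sum_i w_ij * sigma_ij^2 ].
import numpy as np

def wtnfn(X, W, gamma):
    """W[i, j] weights the i-th singular value of the j-th frontal slice of X_bar."""
    n1, n2, n3 = X.shape
    X_bar = np.fft.fft(X, axis=2)
    val = 0.0
    for j in range(n3):
        s = np.linalg.svd(X_bar[:, :, j], compute_uv=False)   # non-increasing order
        val += np.sum(W[:, j] * s) + gamma * np.sum(W[:, j] * s ** 2)
    return val / n3

X = np.random.randn(4, 3, 5)
W = np.ones((3, 5))          # uniform weights reduce the first term to the TNN of Eq. (11)
print(wtnfn(X, W, gamma=1e-6))
```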
3.1. Problem Formulation
Consider the general weighted tensor nuclear and Frobenius norm framework for tensor tubal rank minimization problems,

$$
\min_{\mathcal{X}}\; f(\mathcal{X})+\lambda\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big), \qquad (14)
$$

where $\lambda>0$ and $\gamma>0$ are positive parameters, and the loss function $f$ satisfies the following assumptions:
・ $f$ is continuously differentiable with Lipschitz continuous gradient, i.e., for any $\mathcal{X}_{1},\mathcal{X}_{2}$,

$$
\big\|\nabla f(\mathcal{X}_{1})-\nabla f(\mathcal{X}_{2})\big\|_{F}\le L_{f}\,\big\|\mathcal{X}_{1}-\mathcal{X}_{2}\big\|_{F}, \qquad (15)
$$

where $L_{f}>0$ is the Lipschitz constant.
・ $f$ is coercive, i.e., $f(\mathcal{X})\to\infty$ when $\|\mathcal{X}\|_{F}\to\infty$.
To enhance low-rankness, we need to design a scheme that keeps the weights of large singular values sufficiently small and the weights of small singular values sufficiently large, which leads to a nearly unbiased low-rank approximation. Intuitively, one can set each weight, i.e., each entry of the matrix $\mathbf{W}$, to be inversely proportional to the corresponding singular value, which penalizes large singular values less and overcomes the unfair penalization of different singular values.
3.2. Methodology
We solve problem (14) by generating an iterative sequence $\{\mathcal{X}^{k}\}$, where $\mathcal{X}^{k+1}$ is the minimizer of a weighted tensor nuclear and Frobenius norm regularized problem obtained from a linearized approximation of the objective function at $\mathcal{X}^{k}$. To this end, we approximate the loss function $f$ by the quadratic function

$$
\hat{f}\big(\mathcal{X},\mathcal{X}^{k}\big)=f\big(\mathcal{X}^{k}\big)+\big\langle \nabla f\big(\mathcal{X}^{k}\big),\,\mathcal{X}-\mathcal{X}^{k}\big\rangle+\frac{\mu}{2}\big\|\mathcal{X}-\mathcal{X}^{k}\big\|_{F}^{2}. \qquad (16)
$$

Since optimizing $f$ directly is hard, we minimize its approximation (16) instead. Hence, we solve the following relaxed problem to update $\mathcal{X}^{k+1}$:

$$
\mathcal{X}^{k+1}=\arg\min_{\mathcal{X}}\;\hat{f}\big(\mathcal{X},\mathcal{X}^{k}\big)+\lambda\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big). \qquad (17)
$$

By ignoring constant terms and combining the others, (17) can be expressed equivalently as

$$
\mathcal{X}^{k+1}=\arg\min_{\mathcal{X}}\;\frac{\mu}{2}\Big\|\mathcal{X}-\Big(\mathcal{X}^{k}-\frac{1}{\mu}\nabla f\big(\mathcal{X}^{k}\big)\Big)\Big\|_{F}^{2}+\lambda\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big). \qquad (18)
$$

Thus, setting $\mathcal{Y}^{k}=\mathcal{X}^{k}-\frac{1}{\mu}\nabla f\big(\mathcal{X}^{k}\big)$, the above formula is a nonconvex proximal operator problem. Next, we prove that (18) has a closed-form solution by exploiting its special structure.
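For completeness, the passage from (17) to (18) is just a completion of the square in the quadratic model (16) as written above; the dropped terms do not depend on $\mathcal{X}$:

$$
\begin{aligned}
\hat{f}\big(\mathcal{X},\mathcal{X}^{k}\big)
&=f\big(\mathcal{X}^{k}\big)+\big\langle \nabla f\big(\mathcal{X}^{k}\big),\,\mathcal{X}-\mathcal{X}^{k}\big\rangle+\frac{\mu}{2}\big\|\mathcal{X}-\mathcal{X}^{k}\big\|_{F}^{2}\\
&=\frac{\mu}{2}\Big\|\mathcal{X}-\Big(\mathcal{X}^{k}-\frac{1}{\mu}\nabla f\big(\mathcal{X}^{k}\big)\Big)\Big\|_{F}^{2}+f\big(\mathcal{X}^{k}\big)-\frac{1}{2\mu}\big\|\nabla f\big(\mathcal{X}^{k}\big)\big\|_{F}^{2},
\end{aligned}
$$

so minimizing (17) over $\mathcal{X}$ is the same as minimizing (18) with $\mathcal{Y}^{k}=\mathcal{X}^{k}-\frac{1}{\mu}\nabla f(\mathcal{X}^{k})$.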
3.3. WTNFNSVT Operator
The weighted tensor nuclear and Frobenius norm singular value thresholding (WTNFNSVT) operator is defined as follows to compute the proximal minimizer of the tensor weighted nuclear and Frobenius norm minimization.
Theorem 3 (WTNFNSVT Operator) Let $\mathcal{Y}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}$ be the T-SVD of $\mathcal{Y}\in\mathbb{R}^{n_1\times n_2\times n_3}$. For any $\tau>0$ and $\gamma>0$, the solution of the problem

$$
\min_{\mathcal{X}}\;\frac{1}{2}\|\mathcal{X}-\mathcal{Y}\|_{F}^{2}+\tau\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big) \qquad (19)
$$

is given by $\mathcal{X}^{*}=\mathcal{D}_{\mathbf{W},\tau,\gamma}(\mathcal{Y})$, where the WTNFNSVT operator is defined as

$$
\mathcal{D}_{\mathbf{W},\tau,\gamma}(\mathcal{Y})=\mathcal{U}*\mathcal{S}_{\mathbf{W},\tau,\gamma}*\mathcal{V}^{*}, \qquad (20)
$$

where $\mathcal{S}_{\mathbf{W},\tau,\gamma}$ is the F-diagonal tensor obtained by shrinking the singular values of $\mathcal{Y}$ with the weight tensor $\mathcal{W}$, and $\mathcal{W}$ is F-diagonal with the diagonal entries of its $j$-th frontal slice equal to the $j$-th column of the weight matrix $\mathbf{W}$. The entries of the weight matrix $\mathbf{W}$ satisfy $0\le w_{1j}\le w_{2j}\le\cdots\le w_{nj}$ for each $j$, with $n=\min(n_1,n_2)$.
Theorem 3 shows that the WTNFNSVT operator $\mathcal{D}_{\mathbf{W},\tau,\gamma}(\cdot)$ gives a closed-form proximal minimizer of the WTNFN minimization with monotone nonnegative weights, which is a natural extension of the weighted matrix SVT; the feasibility of this extension is guaranteed by the T-SVD framework. Using Algorithm 1, we show how to compute $\mathcal{D}_{\mathbf{W},\tau,\gamma}(\mathcal{Y})$ efficiently in Algorithm 2.

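The following sketch mimics the structure of Algorithm 2: shrink the singular values of each frontal slice of $\bar{\mathcal{Y}}$ and transform back. The per-value rule used here, $\sigma \mapsto \max(\sigma-\tau w,0)/(1+2\tau\gamma w)$, is the minimizer of $\tfrac12(x-\sigma)^2+\tau(w x+\gamma w x^{2})$ over $x\ge 0$; it assumes the WTNFN penalty decouples across singular values in exactly this way, so treat it as an illustrative stand-in for (20), not the paper's exact formula.

```python
# Illustrative WTNFNSVT operator: slice-wise weighted shrinkage in the Fourier domain.
import numpy as np

def wtnfnsvt(Y, W, tau, gamma):
    """Proximal step for the WTNFN regularizer.
    W[i, j] weights the i-th singular value of the j-th frontal slice of Y_bar;
    W is assumed to respect the FFT's conjugate symmetry (W[:, j] == W[:, n3-j]),
    which holds for the data-driven weights of Section 3.4."""
    n1, n2, n3 = Y.shape
    Y_bar = np.fft.fft(Y, axis=2)
    X_bar = np.zeros_like(Y_bar)
    half = n3 // 2 + 1                      # exploit conjugate symmetry of the FFT
    for j in range(half):
        u, s, vh = np.linalg.svd(Y_bar[:, :, j], full_matrices=False)
        s_new = np.maximum(s - tau * W[:, j], 0.0) / (1.0 + 2.0 * tau * gamma * W[:, j])
        X_bar[:, :, j] = (u * s_new) @ vh
    for j in range(half, n3):
        X_bar[:, :, j] = np.conj(X_bar[:, :, n3 - j])
    return np.real(np.fft.ifft(X_bar, axis=2))
```

With $\gamma=0$ the rule reduces to the usual weighted singular value thresholding, and with all weights equal it reduces to plain singular value thresholding, matching the remark below.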
Obviously, when $\gamma=0$, weighted tensor nuclear norm minimization (WTNNM) is a special case of this problem; and when all weights are set equal, tensor nuclear norm minimization (TNNM) is a special case as well. Therefore, the solution of this model covers the solutions of traditional TNNM and WTNNM.
3.4. The Setting of Weighting Matrix
In previous sections, we proposed to utilize the WTNFNM model and its solving method. By introducing the weight vector, the WTNFNM model improves the flexibility of the original TNNM model. However, the weight vector itself also brings more parameters in the model. Appropriate setting of the weights plays a crucial role in the success of the proposed WTNFNM model.
Inspired by the weighted nuclear norm proposed by Gu et al. [23], a similar weighting scheme can be used for the weighted tensor nuclear and Frobenius norm, that is,

$$
w_{ij}=\frac{C}{\sigma_{ij}\big(\mathcal{X}^{k}\big)+\epsilon}, \qquad (21)
$$

where $\sigma_{ij}(\mathcal{X}^{k})$ is the $i$-th singular value of the $j$-th frontal slice of $\bar{\mathcal{X}}^{k}$ in the $k$-th iteration, $w_{ij}$ is the corresponding weight, $C>0$ is a constant, and $\epsilon$ is a small positive number that avoids division by zero. Because (21) is monotonically decreasing in the singular values, the non-descending order of the weights with respect to the singular values is kept throughout the reweighting process. According to Remark 1 in [23], we obtain Corollary 1 as follows.
Corollary 1. Let $\mathcal{Y}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}$ be the T-SVD of the tensor $\mathcal{Y}$, and let $\sigma_{ij}(\mathcal{Y})$ denote the singular values of $\mathcal{Y}$. Suppose the regularization parameter $C$ is positive and the positive value $\epsilon$ is small enough. Using the reweighting formula (21) and the initial estimate $\mathcal{X}^{0}=\mathcal{Y}$, the reweighted problem
(22)
has a closed-form solution
(23)
where the shrunk singular values form an F-diagonal tensor. Combining this closed form with the reweighting formula, we get
(24)
The proof of Corollary 1 is given in Appendix B. Corollary 1 shows that although the reweighting strategy (21) is used, we do not need to iterate between the thresholding and weight-calculation operations. In each iteration of the WTNFNM model for typical tensor recovery problems, and in the tensor completion algorithm, the reweighting strategy is performed within the WTNFNSVT subproblem (step 2 of Algorithm 2) to adjust the weights based on the current $\mathcal{X}^{k}$. In our implementation, we initialize $\mathcal{X}^{0}$ as the observation tensor. This weight-setting strategy greatly facilitates the WTNFNM calculations. Note that two parameters, $C$ and $\epsilon$, remain in the WTNFNM implementation; in all of our experiments they are set empirically for each task. Please see the following sections for details.
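For reference, the reweighting rule (21) amounts to one SVD per frontal slice of the current iterate; a minimal sketch (with illustrative defaults for $C$ and $\epsilon$) is:

```python
# Illustrative weight update following Eq. (21): w = C / (sigma + eps).
import numpy as np

def update_weights(X, C=1.0, eps=1e-6):
    """Return W with W[i, j] = C / (sigma_ij(X) + eps), where sigma_ij(X) is the
    i-th singular value of the j-th frontal slice of fft(X, [], 3)."""
    n1, n2, n3 = X.shape
    sigma = np.stack([np.linalg.svd(np.fft.fft(X, axis=2)[:, :, j], compute_uv=False)
                      for j in range(n3)], axis=1)      # shape (min(n1, n2), n3)
    return C / (sigma + eps)   # non-descending in i, since sigma is non-increasing
```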
3.5. Convergence Analysis
In this section, we give the convergence analysis for (18). Suppose that the sequence $\{\mathcal{X}^{k}\}$ is generated by Algorithm 2. For simplicity of notation, we denote by $\sigma_{ij}^{k}$ the singular values of $\mathcal{X}^{k}$ in the $k$-th iteration and by $w_{ij}^{k}$ the corresponding weights.
Theorem 4. In problem (14), assume that $f$ satisfies the above assumptions and $\nabla f$ is Lipschitz continuous with constant $L_{f}$. Consider the iterates generated by (18) with the parameter $\mu>L_{f}$, and denote the objective of (14) by $F(\mathcal{X})=f(\mathcal{X})+\lambda\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big)$. Then the sequences $\{\mathcal{X}^{k}\}$ and $\{F(\mathcal{X}^{k})\}$ generated by WTNFNM satisfy the following properties:
1) $F(\mathcal{X}^{k})$ is monotonically decreasing. Indeed,

$$
F\big(\mathcal{X}^{k+1}\big)\le F\big(\mathcal{X}^{k}\big)-\frac{\mu-L_{f}}{2}\big\|\mathcal{X}^{k+1}-\mathcal{X}^{k}\big\|_{F}^{2}. \qquad (25)
$$

2) The sequence $\{\mathcal{X}^{k}\}$ is always bounded. Moreover, $\lim_{k\to\infty}\big\|\mathcal{X}^{k+1}-\mathcal{X}^{k}\big\|_{F}=0$.
3) Any accumulation point $\mathcal{X}^{*}$ of the sequence $\{\mathcal{X}^{k}\}$ is a critical point of (14).
Proof. (1) First, with the definition of
, i.e.,
(26)
We have
Then
which can be rewritten as
(27)
Second, note that $f$ has a Lipschitz continuous gradient. According to Lemma 2.11 in [24], we have
(28)
Finally, combining (27) and (28), we obtain an inequality which leads to
(29)
Since, for any integer $k$, the above inequality always holds when $\mu>L_{f}$, the monotone decrease criterion (25) is always satisfied, and hence $F(\mathcal{X}^{k})$ is monotonically decreasing.
2) It is easy to see that $F(\mathcal{X}^{k})$ is bounded. Then, from the coercivity assumption, the sequence $\{\mathcal{X}^{k}\}$ is bounded as well. Besides, summing up the inequality (27) over all $k$, we get
(30)
From (30), it follows that
(31)
3) Because of the boundedness of the sequence $\{\mathcal{X}^{k}\}$, there exists a subsequence that converges to an accumulation point $\mathcal{X}^{*}$. Based on the convergence result (31), the successive differences vanish, which implies that the corresponding singular values and weights converge for all $i$ and $j$. Note that the subsequence of weights is obviously bounded; according to Proposition 2.1.5 in [25], there exists a further subsequence such that the weights converge for all $i$ and $j$. Denote
(32)
Since $\mathcal{X}^{k+1}$ is optimal for the corresponding subproblem, there exists a subgradient such that
(33)
and hence
(34)
Letting the subsequence index tend to infinity in the above equation, we obtain that there exists
(35)
Thus $\mathcal{X}^{*}$ is a critical point of problem (14). This completes the proof.
4. Experiments
In this section, we conducted a series of experiments on both synthetic and real-world data to demonstrate the superiority of the proposed model. We empirically demonstrated that the WTNFN regularization algorithm leads to more stable solutions than WTNN (i.e., the case $\gamma=0$) and achieves better performance than previous algorithms.
We solved the low-tubal-rank tensor completion problem with the following program, which uses the DFT-based tensor nuclear norm together with the proposed weighted tensor nuclear and Frobenius norm:

$$
\min_{\mathcal{X}}\;\frac{1}{2}\big\|\mathcal{P}_{\Omega}(\mathcal{X}-\mathcal{M})\big\|_{F}^{2}+\lambda\big(\|\mathcal{X}\|_{\mathbf{W},*}+\gamma\,\|\mathcal{X}\|_{\mathbf{W},F}^{2}\big), \qquad (36)
$$

where $\mathcal{M}$ and $\mathcal{X}$ are, respectively, the observed data and the underlying recovery result, and $\Omega$ is a binary support indicator tensor. Zeros in $\Omega$ indicate the missing entries of the observed tensor, and $\mathcal{P}_{\Omega}$ is the linear operator that keeps the entries in $\Omega$ unchanged and sets those outside $\Omega$ to zero. The gradient of the squared loss is Lipschitz continuous with Lipschitz constant $L_{f}=1$. The quality of the restored image is evaluated in terms of the peak signal-to-noise ratio (PSNR), defined by

$$
\mathrm{PSNR}=10\log_{10}\frac{n_{1}n_{2}n_{3}\,\|\mathcal{M}\|_{\infty}^{2}}{\|\hat{\mathcal{X}}-\mathcal{M}\|_{F}^{2}}. \qquad (37)
$$
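For the completion loss $f(\mathcal{X})=\tfrac12\|\mathcal{P}_{\Omega}(\mathcal{X}-\mathcal{M})\|_{F}^{2}$, the gradient is simply $\mathcal{P}_{\Omega}(\mathcal{X}-\mathcal{M})$ and its Lipschitz constant is 1, so the linearized update of Section 3.2 needs no step-size search. The sketch below also implements PSNR as written in (37), with the peak value taken as $\|\mathcal{M}\|_{\infty}$; that convention is an assumption.

```python
# Gradient of the completion loss in (36) and the PSNR measure of (37).
import numpy as np

def grad_f(X, M, Omega):
    """Gradient of 0.5 * ||P_Omega(X - M)||_F^2; Omega is a boolean mask tensor."""
    return np.where(Omega, X - M, 0.0)

def psnr(X_hat, M):
    """10 * log10( n1*n2*n3 * ||M||_inf^2 / ||X_hat - M||_F^2 )."""
    mse = np.mean((X_hat - M) ** 2)
    peak = np.max(np.abs(M))
    return 10.0 * np.log10(peak ** 2 / mse)
```

A full solver would alternate this gradient step with the WTNFNSVT operator of Section 3.3 and the weight update of Section 3.4.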
4.1. Stability of the WTNFN Regularization Model
In this section, we show that the WTNFN model leads to more stable solutions and better experimental results than WTNN for color images. In our experiments, the testing images shown in Figure 1 were “Im1” and “Im5” with size 300 × 300 × 3. For each image, pixels were sampled randomly with sampling rates of 30%, 50%, and 70%. These two images were used to illustrate the effectiveness of the Frobenius norm in the WTNFN.
We first applied the WTNFN with different $\gamma$ values to “Im1” and “Im5”. A suitable value of $\gamma$ should lead to a stable solution and, at the same time, a solution with a high PSNR value. We denote by WTNN the WTNN regularization model with $\gamma=0$, by WTNFN1 the model with $\gamma$ about $10^{-6}$, and by WTNFN2 the model with a larger $\gamma$ about $10^{-5}$. Figure 1 shows the PSNR values at sampling rates of 30%, 50%, and 70% for recovering the images “Im1” and “Im5” 10 times each. Comparing the WTNFN with different values of $\gamma$, we find that the PSNR values of WTNFN1 and WTNFN2 are more stable than those of WTNN, which indicates that the Frobenius norm in the WTNFN penalty induces stable solutions. In addition, WTNFN1 and WTNFN2 behave very similarly in terms of PSNR for the same image at the same sampling rate. Hence, a moderate value of $\gamma$ should be chosen to balance the stability and the quality of the restored image; the results in Figure 1 show that such a setting of $\gamma$ consistently achieves good results.
4.2. Synthetic Experiments
We conducted an experiment to demonstrate the practical applicability of the WTNFN heuristic for recovering low-rank tensors. We generated a tensor $\mathcal{M}=\mathcal{P}*\mathcal{Q}$, where the entries of $\mathcal{P}\in\mathbb{R}^{n\times r\times n_{3}}$ and $\mathcal{Q}\in\mathbb{R}^{r\times n\times n_{3}}$ were independently sampled from a standard Gaussian distribution, so that the tubal rank of $\mathcal{M}$ is $r$. We solve (36) and obtain the solution $\hat{\mathcal{X}}$. We then report the tubal rank of $\hat{\mathcal{X}}$ and the relative error defined as

$$
\mathrm{RelErr}=\frac{\|\hat{\mathcal{X}}-\mathcal{M}\|_{F}}{\|\mathcal{M}\|_{F}}. \qquad (38)
$$
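The synthetic data of this subsection can be generated with a few lines; the t-product helper is re-inlined here so the sketch is self-contained, and the sizes are illustrative.

```python
# Generate a tensor of tubal rank r as M = P * Q and measure relative error, Eq. (38).
import numpy as np

def t_product(A, B):
    A_bar, B_bar = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    C_bar = np.einsum('ikn,kjn->ijn', A_bar, B_bar)   # slice-wise matrix products
    return np.real(np.fft.ifft(C_bar, axis=2))

n, r, n3, p = 50, 5, 50, 0.5
P = np.random.randn(n, r, n3)
Q = np.random.randn(r, n, n3)
M = t_product(P, Q)                                   # tubal rank r (almost surely)
Omega = np.random.rand(n, n, n3) < p                  # observe a fraction p of entries

def rel_err(X_hat, M):
    return np.linalg.norm(X_hat - M) / np.linalg.norm(M)
```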
We considered tensors of size $n\times n\times n_{3}$ for several values of $n$ and of the tubal rank $r$. To
Figure 1. Performance of the WTNFN on images “Im1” and “Im5” for different randomly sampled sets. (a) Results for sampling rate 30% “Im1”. (b) Results for sampling rate 30% “Im5”. (c) Results for sampling rate 50% “Im1”. (d) Results for sampling rate 50% “Im5”. (e) Results for sampling rate 70% “Im1”. (f) Results for sampling rate 70% “Im5”.
create the known entries, we sampled $p n_{1}n_{2}n_{3}$ entries uniformly at random, where $p$ is the sampling rate; a reference quantity is also defined as in the literature [26]. We set the parameters of model (36) empirically in this experiment. Table 1 gives the recovery results for the linear transform DFT. It can be seen that the WTNFN program (36) gives the correct rank estimation of $\mathcal{M}$ in all cases and that the relative errors are small. These results validate the recovery guarantees of our algorithm.
Next, we studied the recovery behavior for different tubal ranks $r$ of $\mathcal{M}$ and different sampling rates $p$. We generated a 50 × 50 × 50 tensor $\mathcal{M}$ in the same way as in the previous experiment. The tubal rank $r$ varied from 1 to 20 in steps of 1, and the sampling rate $p$ varied from 0.01 to 0.99 in steps of 0.01; we conducted an experiment for each $(r,p)$ pair. The algorithms stopped when the relative error was less than $10^{-5}$. Figure 2 reports the recovery success for each pair for WTNFN and TNN under the linear transform DFT, in which the yellow region reflects fully successful recovery and the blue region fully failed recovery; the larger the yellow area, the stronger the recovery ability. It can be seen that the WTNFN method has a much larger yellow area than the TNN method, i.e., WTNFN has a stronger recovery ability than TNN. In short, the WTNFN minimization model is more robust and efficient than the classic TNN minimization model.
4.3. Real World Experiments
Due to coding and transmission difficulties, photos may be partially damaged or obscured by text or branding. Matrix completion methods, of course, can be used to recover the missing information in these images. We considered the
Table 1. Exact tensor completion on random data.
Figure 2. Comparison of the recovery capacity varying tubal rank r and sampling rate p of the WTNFN (a) and TNN (b). The x-axis represents the sampling rate p, ranging from 0.01 to 0.99, the y-axis represents the tubal rank r, ranging from 1 to 20.
WTNFN for color image recovery with partially missing pixels, since color images with three channels (red, green, blue) can be modeled as third-order low-tubal-rank tensors. The peak signal-to-noise ratio (PSNR) is a widely used measure of image quality; to compare the performance of the different approaches, we use the PSNR value of the recovered image, with larger values indicating better restoration.
We considered four methods for image recovery and compared their performance: 1) TNN-DFT [26]: tensor completion with the discrete Fourier transform based tensor nuclear norm; 2) TNNR [27]: tensor truncated nuclear norm regularization; 3) Lp [22]: tensor completion by nonconvex Lp function regularization; 4) TCTF [28]: tensor completion by tensor factorization.
a) Random Mask: We first conducted experiments to deal with a relatively easy tensor completion problem. We randomly selected 5 images with a size of 300 × 300 × 3 from the LJU dataset, and randomly sampled 50% of pixels for each image.
Five recovered image examples with a sampling rate of 50% for the different methods are shown in Figure 5. The magnified views of comparable regions in the colored boxes reveal that our recovered images have clearer shapes, fuller color, and better brightness than those of the other four state-of-the-art tensor completion methods. Furthermore, Figure 4 displays the PSNR values achieved by the various approaches on the five images at a sampling rate of 50%; it demonstrates quantitatively that the WTNFN-based algorithm achieves much higher PSNR values than the other four state-of-the-art tensor completion methods for the majority of the images.
b) Text Mask: Text removal is difficult because the pixels covered by the text are not spread evenly throughout the image, and the text may obscure essential textural information. We start by looking at the text's location and treating the
Figure 3. Five images of natural scenes.
Table 2. PSNR values of the various methods on text mask for image recovery.
Figure 4. Comparison of the PSNR values obtained by TNN-DFT, TNNR, Lp, TCTF and WTNFN on the tested images in Figure 5.
entries that correspond to it as missing values. As a result, the text removal problem can be treated as a tensor completion problem. In this experiment, we apply our method to recover the images under the text mask.
As previously noted, we compared our results with the TNN-DFT, TNNR, Lp, and TCTF methods, using PSNR to assess performance. The images recovered by the five tensor completion algorithms are shown in Figure 6. On all five image sequences, we rounded the experimental results to two decimal places; Table 2 provides the PSNR values of the compared approaches. We
Figure 5. Performance comparison for image recovery on some sample images. (a) original image; (b) observed image; (c) recovered images by TNN-DFT, (d) recovered images by TNNR, (e) recovered images by Lp, (f) recovered images by TCTF, (g) recovered images by WTNFN.
can see that the WTNFN method has a clear advantage, as evidenced by its much higher PSNR values.
5. Conclusions
In this work, we have presented a novel method, weighted tensor nuclear and Frobenius norm regularization, for low-tubal-rank tensor recovery. We derived the solution of the weighted tensor nuclear and Frobenius norm proximal (WTNFNSVT) operator. The WTNFN model was then extended to tensor completion tasks, and efficient algorithms were developed to solve them using the derived WTNFNSVT operator. Inspired by the weighted nuclear norm, we also developed a rational scheme for automatic weight setting, which yields closed-form solutions of the WTNFN operator and shortens the computation time.
We ran extensive experiments on both synthetic and real visual datasets. Compared with state-of-the-art tensor completion approaches based on the tensor nuclear norm or tensor factorization, the experimental findings indicate the
Figure 6. Comparison of tensor completion algorithms for the text removal problem. (a) original image; (b) observed image; (c) recovered images by TNN-DFT, (d) recovered images by TNNR, (e) recovered images by Lp, (f) recovered images by TCTF, (g) recovered images by WTNFN.
advantages of the WTNFN-based algorithms.
Acknowledgements
This work is supported by Innovative talents support plan of colleges and universities in Liaoning Province (2021).
Appendix A. Proof of Theorem 2
Lemma 1. Given a third-order tensor $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, if $\mathcal{Q}$ is orthogonal, i.e., $\mathcal{Q}*\mathcal{Q}^{*}=\mathcal{Q}^{*}*\mathcal{Q}=\mathcal{I}$, then

$$
\|\mathcal{Q}*\mathcal{A}\|_{F}=\|\mathcal{A}\|_{F}. \qquad (39)
$$

Theorem 2 can be proved by Lemma 1.
Proof. By Lemma 1 and the T-SVD $\mathcal{A}=\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}$, we get

$$
\|\mathcal{A}\|_{F}^{2}=\|\mathcal{U}*\mathcal{S}*\mathcal{V}^{*}\|_{F}^{2}=\|\mathcal{S}\|_{F}^{2}=\frac{1}{n_3}\|\bar{\mathcal{S}}\|_{F}^{2}=\frac{1}{n_3}\sum_{i}\sigma_{i}^{2}\big(\bar{\mathbf{A}}\big). \qquad (40)
$$

End of proof.
Appendix B. Proof of Theorem 3
Proof of Theorem 3 requires the following Lemma 2.
Lemma 2 (Proximal Operator for WNFN) For a matrix $\mathbf{Y}$, denote by $\mathbf{Y}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{*}$ the singular value decomposition of the matrix $\mathbf{Y}$, and consider the following minimization problem:
(41)
The solution is
(42)
where the diagonal entries are the shrunk singular values.
B1. Proof of Lemma 2
Proof In order to prove Lemma 2, we first show the following theorem for the proximal operator of weighted nuclear and Frobenius norm for real-valued matrices. Then, we extend the result to prove Lemma 2 for complex-valued matrices.
For any
, denote by
and
the singular value decomposition of matrix
and
, where
, and
are the diagonal singular value matrices. Based on the property of Frobenius norm, the following derivations hold:
Based on the von Neumann trace inequality, we know that
achieves its upper bound
if
,
. Then we have
From the above derivation, we can see that the optimal solution of the WNFNP problem is
, where
is the optimum of the constrained quadratic optimization problem.
The following argument extends Lemma 2 to the complex field.
Let
be an arbitrary complex-valued matrix and its SVD be
, where
,
,
,
,
,
,
, and
. Then, it is easy to verify that the SVD of the matrix
(43)
can be expressed as
(44)
Let
,
,
, and the functions
(45)
(46)
From the decomposition formula of the above matrix and according to the real-valued result above, one minimizer of the following minimization problem
(47)
is obtained in the same closed form, and it attains the same minimum value as in the real-valued case.
End of proof.
B2. Proof of Theorem 3
Proof. Theorem 3 describes one minimizer of the following minimization problem:
(48)
From Lemma 2, the optimal solution of the above problem can be expressed as
(49)
where the singular value decomposition of each frontal slice of $\bar{\mathcal{Y}}$ is used. We collect the resulting factors into three tensors whose frontal slices are the corresponding matrices, perform the inverse fast Fourier transform along the third dimension to obtain $\mathcal{U}$, $\mathcal{S}_{\mathbf{W},\tau,\gamma}$, and $\mathcal{V}$, and arrive at the optimal solution
(50)
End of proof.
B3. Proof of Corollary 1
Proof. From Lemma 1, we can see that, for the weighted proximal operator problem
(51)
the closed-form solution is
(52)
where
(53)
According to the update rule of the algorithm, in each iteration the weights are in non-descending order, and the singular values can be updated independently in non-increasing order as
(54)
Based on the prior condition that the weights are initialized as above, we have
(55)
Given the above, in the next iteration we have
(56)
Then, the weights in the first and second iterations satisfy:
(57)
Hence, suppose that in the $l$-th iteration the inequality holds for all the singular values; then
(58)
In summary, these relations hold in every iteration. Because the singular values are bounded below by 0, the sequence is convergent and converges to a non-negative limit:
(59)
Then we have
(60)
where
and
.
End of proof.