On Accelerated Singular Value Thresholding Algorithm for Matrix Completion
L. Wang et al.

An accelerated singular value thresholding (SVT) algorithm for matrix completion was introduced in a recent paper [1]. It applies an adaptive line search scheme and improves the convergence rate from $O(1/N)$ for SVT to $O(1/N^2)$, where $N$ is the number of iterations. In this paper, we show that it is the same as Nemirovski's approach, and then modify it to obtain an accelerated Nemirovski technique and prove its convergence. Our preliminary computational results are very favorable.


Introduction
In many practical problems of interest, such as recommender systems [2]-[4], one would like to recover a matrix from a small sample of its entries. These problems can be formulated as the following matrix completion problem:

$$\min_{X} \ \operatorname{rank}(X) \quad \text{s.t.} \quad X_{ij} = M_{ij}, \ (i,j) \in \Omega, \tag{1}$$

where $M \in \mathbb{R}^{m \times n}$ is the given incomplete data matrix and $\Omega$ is the set of locations corresponding to the observed entries. Problem (1) is NP-hard because the rank function is non-convex and discontinuous. However, the rank of a matrix equals the number of its nonvanishing singular values, and the nuclear norm $\|\cdot\|_*$ is the sum of the singular values. Thus, the rank minimization problem (1) can be relaxed to the following convex program:

$$\min_{X} \ \|X\|_* \quad \text{s.t.} \quad X_{ij} = M_{ij}, \ (i,j) \in \Omega. \tag{2}$$

Candès and Recht [5] proved that, under suitable conditions, programs (1) and (2) are formally equivalent in the sense that they have exactly the same unique solution. There are many algorithms for solving model (2), such as the singular value thresholding (SVT) algorithm [6] and the accelerated proximal gradient (APG) algorithm [7]. The SVT algorithm solves the following problem:

$$\min_{X} \ \tau \|X\|_* + \frac{1}{2}\|X\|_F^2 \quad \text{s.t.} \quad P_\Omega(X) = P_\Omega(M), \tag{3}$$

where $P_\Omega$ denotes the orthogonal projector onto the span of matrices vanishing outside of $\Omega$, so that the $(i,j)$th component of $P_\Omega(X)$ is equal to $X_{ij}$ if $(i,j) \in \Omega$ and zero otherwise. In [6], Cai et al. showed that the optimal solution to problem (3) converges to that of (2) as $\tau \to \infty$.
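The SVT iteration of [6] for problem (3) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function names `shrink`, `project` and `svt`, the fixed step size `delta` and the iteration count are assumptions introduced here.

```python
import numpy as np

def shrink(Z, tau):
    """Singular value shrinkage: soft-threshold the singular values of Z by tau."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def project(X, mask):
    """P_Omega: keep the observed entries (mask == True) and zero the rest."""
    return np.where(mask, X, 0.0)

def svt(M, mask, tau, delta=1.0, iters=200):
    """Plain SVT iteration with a fixed step size delta (an assumed choice)."""
    M = np.asarray(M, dtype=float)
    Y = np.zeros_like(M)   # dual variable, stays supported on Omega
    X = np.zeros_like(M)   # primal estimate
    for _ in range(iters):
        X = shrink(Y, tau)                     # primal update
        Y = Y + delta * project(M - X, mask)   # dual ascent step
    return X
```

Only the entries of `M` selected by `mask` influence the result, so unobserved entries may hold arbitrary placeholder values.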
However, SVT has only a global convergence rate of $O(1/N)$, where $N$ is the number of iterations. Liu Jun et al. [1] then proposed an accelerated SVT algorithm by considering the Lagrange dual problem of (3):

$$\min_{Y} \ h(Y) = \frac{1}{2}\|D_\tau(P_\Omega(Y))\|_F^2 - \langle Y, P_\Omega(M) \rangle, \tag{4}$$

where $D_\tau$ is the singular value thresholding operator [1], and $h(Y)$ is convex and continuously differentiable with Lipschitz continuous gradient. In the accelerated SVT algorithm, an adaptive line search scheme was adopted based on Nemirovski's technique [8]. For reference, we list the related key steps of the algorithm in [1]:

Step 7: compute $Y_{k+1} = Z_k - \frac{1}{L_k}\nabla h(Z_k)$;
Step 8: if $h(Y_{k+1}) \le h(Z_k) - \frac{1}{2L_k}\|\nabla h(Z_k)\|_F^2$ then
Step 9: go to step 14;
Step 10: else
Step 11: $L_k = 2L_k$;
Step 12: end if;
Step 13: end while;
Step 14: set $\omega = \ldots$, $L_{k+1} = \ldots$

Compared with Nemirovski's scheme for updating $L_k$, i.e. $L_{k+1} = L_k$ in step 14, Liu Jun et al. [1] [9] used a more flexible scheme, in which $L_k$ is not required to increase monotonically. When $L_k$ is decreased, the step size $1/L_k$ is increased, so the number of iterations is expected to be reduced. The idea is attractive. However, as shown in the following section, the approach turns out to be the same as Nemirovski's algorithm.
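An accelerated gradient method with a Nemirovski-type doubling line search, applied to the dual of (3), might look like the sketch below. It is an illustration under stated assumptions, not the algorithm of [1] or the modified algorithm of this paper: the dual objective is assumed to have the form $h(Y) = \frac{1}{2}\|D_\tau(P_\Omega(Y))\|_F^2 - \langle Y, P_\Omega(M)\rangle$, the momentum rule is the standard FISTA-style sequence, and the names `dual`, `accelerated_dual` and the small tolerance `tol` are hypothetical.

```python
import numpy as np

def shrink(Z, tau):
    """Soft-threshold the singular values of Z by tau."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def dual(Y, M, mask, tau):
    """Return h(Y), its gradient, and the primal estimate X.

    Assumed form: h(Y) = 0.5*||X||_F^2 - <Y, P_Omega(M)> with
    X = D_tau(P_Omega(Y)), so that grad h(Y) = P_Omega(X - M).
    """
    X = shrink(np.where(mask, Y, 0.0), tau)
    h = 0.5 * np.sum(X * X) - np.sum(np.where(mask, Y * M, 0.0))
    return h, np.where(mask, X - M, 0.0), X

def accelerated_dual(M, mask, tau, L1=1.0, iters=100):
    """Accelerated gradient on h with a doubling line search (L never decreases)."""
    M = np.asarray(M, dtype=float)
    Y_prev = np.zeros_like(M)
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    t_prev, t, L = 1.0, 1.0, L1
    for _ in range(iters):
        # Search point: current iterate plus momentum.
        Z = Y + ((t_prev - 1.0) / t) * (Y - Y_prev)
        h_Z, g_Z, _ = dual(Z, M, mask, tau)
        g2 = np.sum(g_Z * g_Z)
        while True:
            Y_new = Z - g_Z / L                      # gradient step of size 1/L
            h_new, _, X = dual(Y_new, M, mask, tau)
            tol = 1e-12 * max(1.0, abs(h_Z))         # rounding safeguard (assumption)
            if h_new <= h_Z - g2 / (2.0 * L) + tol:  # sufficient-decrease test
                break
            L *= 2.0                                 # otherwise double L and retry
        Y_prev, Y = Y, Y_new
        t_prev, t = t, 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
    return X, L
```

Because `L` is only ever doubled, this corresponds to the monotone scheme; a scheme that also shrinks `L` between iterations would change the last lines of the outer loop.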

Main Procedure
First, we claim that the value of $\omega$ (in step 14) is never greater than 2, which means that the scheme is the same as Nemirovski's line search scheme.
Since $h(Y)$ is convex and continuously differentiable with Lipschitz continuous gradient, we have …, where $L$ is the Lipschitz constant of the gradient [10].
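As a hedged aside, the standard descent lemma for a function with $L$-Lipschitz gradient shows why a sufficient-decrease line search test of the usual form is passed once the trial constant is at least $L$. The search point $Z$, the trial constant $L_k$ and the exact form of the test are assumed here for illustration.

```latex
% Descent lemma (standard result for an L-Lipschitz gradient):
h(Y) \le h(Z) + \langle \nabla h(Z),\, Y - Z \rangle + \frac{L}{2}\,\|Y - Z\|_F^2 .

% Substitute the gradient step Y = Z - \frac{1}{L_k}\nabla h(Z):
h\!\left(Z - \tfrac{1}{L_k}\nabla h(Z)\right)
  \le h(Z) - \frac{1}{L_k}\,\|\nabla h(Z)\|_F^2
           + \frac{L}{2L_k^2}\,\|\nabla h(Z)\|_F^2 .

% If L_k \ge L, then \frac{L}{2L_k^2} \le \frac{1}{2L_k}, hence
h\!\left(Z - \tfrac{1}{L_k}\nabla h(Z)\right)
  \le h(Z) - \frac{1}{2L_k}\,\|\nabla h(Z)\|_F^2 .
```

In particular, a doubling loop that starts from $L_1$ stops as soon as $L_k \ge L$, so the trial constant never exceeds $\max(2L, L_1)$.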
Define the function … which, along with …. From (5), (6) and (7), we have …. Combining (8) and (9), ….

Then, we make a few improvements to the original algorithm to obtain a revised algorithm. The overall steps are organized as follows:

Algorithm 1 The Modified ASVT Algorithm
Step 1: Input: …;
Step 4: compute …;
Step 5: compute …;
Step 6: while 1 do
Step 7: compute …;
Step 8: if … then
Step 9: …, go to step 3;
Step 10: else
Step 11: …;
Step 12: end if;
Step 13: end while;
Step 14: set …;
Step 15: end for.

The convergence of Algorithm 1 is given by the following theorem. In this algorithm, the upper bound on the convergence rate does not depend on the initial value $L_1$, which means that the modified scheme improves and accelerates Nemirovski's technique.
Theorem 1. For large enough $k$, the approximate solution $Y_k$ obtained by the modified algorithm satisfies …, where $R_h$ is the distance from the starting point to the optimal solution set.
Proof. According to the convergence of Nemirovski's algorithm ([8], Theorem 10.2.2), we have …. In the following, we show that … for large enough $k$ in our modified algorithm.

Suppose that there exists a positive integer $k$ such that …, where $l$ denotes the number of times step 11 is executed in the $k$th iteration. Since the test condition in step 8 is satisfied when $L_k \ge L$, it easily follows that …. Therefore, we have …; then, by recurrence, we obtain L → ∞, which is impossible by the finiteness of the initial value $L_1$. Thus, when $k$ is large enough, we have …. This completes the proof.

Computational Results
In this section, numerical experiments in MATLAB are performed to compare our approach with the algorithm based on Nemirovski's technique, and to gain insight into the behavior of our approach on a synthetic dataset.
We coded Ne-SVT and M-ASVT, based on the SVT method with Nemirovski's technique and on our modified algorithm, respectively. The test problems are as follows. Similar to [1], we generate $m \times n$ matrices $M$ of rank $r$ by randomly selecting two matrices $M_L$ and $M_R$ of dimensions $m \times r$ and $r \times n$, respectively, each having independent identically distributed Gaussian entries, and setting $M = M_L M_R$. Suppose $M_\Omega$ is the observed part of $M$, and the set of observed indices $\Omega$ is sampled uniformly at random. Let $p$ be the ratio of observed entries, that is, $p = n_z/(mn)$, where $n_z$ is the number of observed entries. The algorithms are then used to recover the missing entries from the partially observed information by solving the optimization problem (3) with a given parameter $\tau$. In our tests, $\tau$ was set to $t\sqrt{mn}$ on the basis of Cai et al. [6], where $2 \le t \le 5$. We adopt the relative reconstruction error defined by

$$\text{error} = \frac{\|X - M\|_F}{\|M\|_F},$$

where $X$ is the computed solution of an algorithm, and we use this "error" to evaluate the quality of the algorithm. In addition, we set $L_1 = 1$ and $\eta = 0.8$ for all test problems. Both Ne-SVT and M-ASVT were implemented in MATLAB 2010 and run under Windows XP on an AMD Fusion APU E-450 1.65 GHz personal computer with 1.98 GB of memory and about 16 digits of precision.
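The synthetic test setup described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' MATLAB script: the helper names `make_problem` and `relative_error`, the seed, and the Frobenius-norm form of the relative error $\|X - M\|_F / \|M\|_F$ are assumptions.

```python
import numpy as np

def make_problem(m, n, r, p, seed=0):
    """Generate a rank-r matrix M = M_L @ M_R and a uniformly random mask."""
    rng = np.random.default_rng(seed)
    M_L = rng.standard_normal((m, r))   # i.i.d. Gaussian left factor
    M_R = rng.standard_normal((r, n))   # i.i.d. Gaussian right factor
    M = M_L @ M_R
    n_z = int(round(p * m * n))         # number of observed entries
    idx = rng.choice(m * n, size=n_z, replace=False)
    mask = np.zeros(m * n, dtype=bool)
    mask[idx] = True
    return M, mask.reshape(m, n)

def relative_error(X, M):
    """Relative reconstruction error in the Frobenius norm (assumed form)."""
    return np.linalg.norm(X - M, "fro") / np.linalg.norm(M, "fro")

# Setting of the first experiment: m = 100, n = 50, r = 5, p = 0.9.
M, mask = make_problem(100, 50, 5, 0.9)
tau = 2.0 * np.sqrt(100 * 50)   # tau = t * sqrt(m * n) with t = 2
```

Sampling the index set without replacement gives exactly $n_z$ observed entries, so the ratio $p = n_z/(mn)$ holds exactly rather than only in expectation.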
First, we compare the relative error of Ne-SVT and M-ASVT on a randomly generated low-rank matrix problem. The parameters are as follows: $m = 100$, $n = 50$, $r = 5$, $p = 0.9$, meaning that 90% of the entries are observed. We recover the other 10% of the entries by running Ne-SVT and M-ASVT separately. Table 1 reports the relative reconstruction error of the two methods after 30, 60, 90 and 120 iterations. We observe that M-ASVT converges faster than Ne-SVT, which is consistent with our analysis; this is further illustrated in Figure 1.
We then test these algorithms with different settings on the following low-rank matrix completion problems:
1) Fix the matrix size $(m, n)$, the rank $r$ and the ratio of observed entries $p$, and test the performance with respect to different choices of the parameter $\tau$. We fix $m = 100$, $n = 50$, $r = 3$, $p = 0.9$, and let $\tau$ vary from $2\sqrt{mn}$ to $5\sqrt{mn}$;
2) Keep the other parameters fixed, and let $p$ vary from 0.5 to 0.9;
3) Keep the other parameters fixed, and let the rank $r$ vary from 3 to 15;
4) Keep the other parameters fixed, and change only the size of $M$.
Finally, Table 2 shows the comparative results on randomly generated matrix completion problems, in which we choose error $< 10^{-6}$ as the stopping condition. Clearly, the smaller $\tau$, the larger $p$, the smaller the rank $r$ and the larger the size, the smaller the error and the better the efficiency. We also observe that M-ASVT outperforms Ne-SVT in all cases.

Conclusion
In this paper, we point out that the ASVT algorithm [1] is essentially the same as Nemirovski's approach, then modify it to obtain an accelerated Nemirovski approach and prove its convergence. We also compare the convergence rates of Ne-SVT and M-ASVT. The preliminary computational results show that our approach is more efficient. We empirically chose $\eta = 0.8$ in our tests; in future work, we plan to study better choices of $\eta$ and to develop the adaptive line search scheme further to improve the algorithm.

Figure 1. Convergence rate of Ne-SVT and M-ASVT on synthetic data.

Table 1. Relative error comparison between Ne-SVT and M-ASVT.

Table 2. Comparisons between Ne-SVT and M-ASVT on the synthetic dataset with different settings.