A Geometric Gaussian Kaczmarz Method for Large Scaled Consistent Linear Equations ()
1. Introduction
We are concerned with the approximation solution of large-scaled linear systems of equation of the form
(1)
where
is a real matrix,
is a real vector and
is an unknown vector to be determined.
The Kaczmarz [2] method is a good candidate for solving large problems (1) due to its simplicity and good performance, which is applied in numerous fields, such as image reconstruction [3] [4] and digital signal processing [5]. The Kaczmarz method can be formulated as
(2)
where
, i.e. all m equations in the linear systems (1) are swept through after m iterations.
There are many extended Kaczmarz methods are derived in recent years. Strohmer and Vershynin in [6] proposed a randomized Kaczmarz (RK) method with the expected exponential rate of convergence, which is also known as “linear convergence”. In 2018, Bai and Wu [7] proposed a greedy randomized Kaczmarz (GRK) method. GRK introduces an effective probability criterion to grasp larger entries of the residual vector at each iteration and was proved to be of the linear convergence rate. The index set is determined by
(3)
where
(4)
In order to improve the influence on the convergence rate by diagonal scalings of the coefficient matrix in (1). Yang in [1] presented a geometric probability randomized Kaczmarz algorithm (GPRK) to determine a subset
of
by a given rule such that the magnitude of the value of
should be larger than a prescribed threshold if the row index
, where
(5)
Generally, the block Kaczmarz methods [8] [9] select some rows from the coefficient matrix A to construct the block matrix and compute the Moore-Penrose pseudoinverse of the block at each iteration. However, the cost of computing the Moore-Penrose pseudoinverse of a matrix is so expensive, which impacts gravely the CPU time of the algorithm.
Gower and Richtrik proposed a Gaussian Kaczmarz (GK) method [10] whose iteration process is defined by
(6)
where
is a Gaussian vector with mean
and the covariance matrix
, i.e.,
. Here I denotes the identity matrix. The expected linear convergence rate was analyzed in the case that A is of full column rank. The idea of the GK method is also used in [11] to avoid computing the pseudoinverses of submatrices. We will adapt this iteration process in our method.
In this paper, we improve the GPRK method in [1] and present a geometric Gaussian Kaczmarz (GGK) method. The GGK introduces a new block strategy and uses the iteration process of the GK method in (6). The block set strategy of GGK is more efficient than that in [1] because it can grasp as many as possible rows of the system matrix to participate in projection. This is illustrated in Section 3.
The rest of this paper is arranged as follows. A geometric Gaussian Kaczmarz algorithm is presented in Section 2. Its convergence is also proved. Section 3 shows several numerical examples for the proposed method and Section 4 draws some conclusions.
2. A Geometric Gaussian Kaczmarz Algorithm
This section describes a geometric Gaussian Kaczmarz (GGK) algorithm to compute the solution of (1). Algorithm 1 summarizes the GGK algorithm. Steps 2, 3 and 4 determine the block control sequence
, which is simpler than and different from that in [1], which determines the block control sequence by probability
Steps 5 and 6 give the iteration process of the GGK method.
The following results show the convergence of Algorithm 1.
Theorem 1. Assume the linear system (1) is consistent, and then the iterative sequence
generated by Algorithm 1 converges to the least-norm solution
of linear systems (1). Moreover, the solution error of linear systems (1) satisfies
(7)
Proof. According to Algorithm 1, we have
Algorithm 1. A geometric Gaussian Kaczmarz algorithm (GGK).
Denote the projector
, then
is orthogonal because
and
. Thus we have
The last equality holds because
Let
denote the matrix whose columns orderly are constituted of all the vector
with
, then
. Denoted by
, we have
moreover,
It then holds that
For each
, since
we have
This completes the proof. □
We remark that
which means that Algorithm 1 is convergent. In fact,
and
it follows that
3. Numerical Experiments
In this section, we use Algorithm 1 for solving different types of consistent linear systems (1) and compare it with GPRK in [1]. All experiments are carried out using MATLAB (version R2019b) on a personal computer with 2.0 GHz central processing unit (Inter(R) Core(TM) i7-8565U CPU) and 8 GB memory, and 64 bit Windows operation system.
The coefficient matrix
is either generated by the MATLAB function
or taken from the University of Florida sparse matrix collection [12]. To ensure each linear system (1) with a selected matrix A is consistent, we let
in (1) be generated by
, where the exact solution
is produced randomly by the MATLAB function randn and is to be determined. The efficiency of each method is evaluated by the number of iterations (IT) and the CPU time (CPU). It can be measured by the speed-up of GGK against GPRK (SU) is defined by
The effectiveness of both methods is measured by the relative residual (RR) defined by
We set the initial solution
be 0 in all experiments, and the iteration does not terminate until
. The numerical results of each method shown in this section are arithmetical average quantities with respect to 50 repeated trials. We set
in GGK for each example.
3.1. Experiments with Sparse Matrix
This subsection considers the linear systems (1) with sparse matrices. These matrices include some flat ones in Table 1 and the tall ones (their transposes) in Table 2 from the University of Florida sparse matrix collection [12].
Table 1. Comparison of IT, CPU and SU of the GGK method with that of the GPRK method for different flat spare systems.
Table 2. Comparison of IT, CPU and SU of the GGK method with that of the GPRK method for different thin spare systems.
Table 1 and Table 2 list the average number of iterations (IT), computing time (CPU), and the related speed-up (SU) of GGK against GPRK. Figure 1 and Figure 2 plot RR versus IT (left) and CPU (right) of different methods for solving (1) with the matrix bibd_16_8 and crew1T, respectively. From Table 1 and Table 2 we see that the GGK method needs smaller IT and CPU than the GPRK method does in all cases. We can see whether system 1 is overdetermined or underdetermined. When the matrix is flat, the value of SU locates in the interval (2.70, 628.92), while for the case of the thin matrix, the value of SU ranges from 2.49 to 641.08. It is observed from Figure 1 and Figure 2 that the GGK method converges faster than the GPRK method.
3.2. Experiments with Dense Matrix
In this subsection, the test matrices are dense normally distributed random matrices including thin and flat matrices. The sizes of rows and columns of the selected matrices vary from 2000 to 30,000.
Table 3 lists the test results of IT, CPU and SU for flat dense matrix with different size. Table 4 displays IT, CPU and SU for a different thin dense matrix with different size. Figure 3 and Figure 4 plot RR versus IT (left) and CPU (right) of GGK and GPRK methods for solving (1) with the matrix
and
, respectively. From Table 1 and Table 2, we can
Figure 1. RR versus IT (left) and CPU (right) of the GGK method compared with the GPRK method for solving (1) with the matrix bibd_16_8.
Figure 2. RR versus IT (left) and CPU (right) of the GGK method compared with the GPRK method for solving (1) with the matrix crew1T.
Table 3. Comparison of IT, CPU and SU of the GGK method with that of the GPRK method for different flat dense systems.
Table 4. Comparison of IT, CPU and SU of the GGK method with that of the GPRK method for different thin dense systems.
see in all cases, GGK needs less IT and CPU time than GPRK does, and the speed-up of GGK against GPRK ranges from tens of times to hundreds of times, i.e., the speed-up of GGK against GPRK varies from 73.30 to 227.15 in the case of flat and from 11.39 to 425.19 in the case of thin. Similar results to Figure 1
Figure 3. RR versus IT (left) and CPU (right) of the GGK method compared with the GPRK method for solving (1) with the matrix
.
Figure 4. RR versus IT (left) and CPU (right) of the GGK method compared with the GPRK method for solving (1) with the matrix
.
and Figure 2 are drawn from Figure 3 and Figure 4 that the GGK method converges much faster than the GPRK method.
4. Conclusion
We develop a geometric Gauss-Kaczmarz (GGK) algorithm for solving large-scale consistent linear systems and the convergence is proved for this algorithm. Numerical experiments show that the GGK algorithm has better efficiency and effectiveness than the GPRK algorithm. In our future work, we will focus on block Kaczmarz methods to solve ill-posed problems.
Acknowledgements
The authors thank the reviewers for providing some helpful comments. Research by G.H. was supported in part by Application Fundamentals Foundation of STD of Sichuan (grant 2020YJ0366) and the Opening Project of Sichuan Province University Key Laboratory of Bridge Non-destructionDetecting and Engineering Computing (grant 2020QZJ03), and research by F.Y. was partially supported by NNSF (grant 11501392) and SUSE (grant 2019RC09, 2020RC25), and research by Y.M. Liao was partially supported by the Innovation Fund of Postgraduate of SUSE (grant y2021101).