This paper presents a geometric Gaussian Kaczmarz (GGK) method for solving large-scale consistent linear systems of equations. The GGK method improves the geometric probability randomized Kaczmarz (GPRK) method in [1] by introducing a new block set strategy and a new iteration process. The GGK method is proved to converge linearly. Several numerical examples demonstrate its efficiency and effectiveness.

We are concerned with the approximate solution of large-scale linear systems of equations of the form

$$A x = b, \tag{1}$$

where $A \in \mathbb{R}^{m \times n}$ is a real matrix, $b \in \mathbb{R}^{m}$ is a real vector and $x \in \mathbb{R}^{n}$ is an unknown vector to be determined.

The Kaczmarz [

$$x_{k+1} = x_k + \frac{b^{(i)} - A^{(i)} x_k}{\|A^{(i)}\|_2^2} \left(A^{(i)}\right)^T, \tag{2}$$

where $i = (k \bmod m) + 1$, i.e., all $m$ equations of the linear system (1) are swept through after $m$ iterations.
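The cyclic sweep in (2) can be illustrated with a minimal pure-Python sketch. The function name `kaczmarz_cyclic`, the fixed number of sweeps and the small 3 × 2 test system are illustrative assumptions, not part of the paper.

```python
def kaczmarz_cyclic(A, b, sweeps=50):
    """Classical Kaczmarz: one row projection per step, rows visited cyclically."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for k in range(sweeps * m):
        i = k % m  # 0-based counterpart of i = (k mod m) + 1
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        norm_sq = sum(a * a for a in row)
        step = residual / norm_sq
        # x_{k+1} = x_k + step * (A^(i))^T
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Small consistent system with known solution (1, 2); values chosen by hand.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x = kaczmarz_cyclic(A, b)
print(x)
```

Each step only touches one row of A, which is why the method suits very large systems.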

Many extended Kaczmarz methods have been derived in recent years. Strohmer and Vershynin in [

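$$u_k = \left\{ i \;\middle|\; \left|b^{(i)} - A^{(i)} x_k\right|^2 \ge \varepsilon_k \|b - A x_k\|_2^2 \, \|A^{(i)}\|_2^2 \right\}, \tag{3}$$

where

$$\varepsilon_k = \frac{1}{2} \left( \frac{1}{\|b - A x_k\|_2^2} \max_{1 \le i \le m} \left\{ \frac{\left|b^{(i)} - A^{(i)} x_k\right|^2}{\|A^{(i)}\|_2^2} \right\} + \frac{1}{\|A\|_F^2} \right). \tag{4}$$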
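As a rough illustration of (3) and (4), the sketch below computes the threshold and the index set for one iterate. The helper name `greedy_index_set` and the test data are assumptions made for this example; indices are 0-based, and the residual is assumed to be nonzero.

```python
def greedy_index_set(A, b, x):
    """Return (eps_k, u_k) as in Eqs. (3)-(4); indices are 0-based."""
    residuals = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
    row_norms_sq = [sum(a * a for a in row) for row in A]
    res_norm_sq = sum(r * r for r in residuals)   # ||b - A x_k||_2^2, assumed > 0
    fro_sq = sum(row_norms_sq)                    # ||A||_F^2
    largest = max(r * r / nr for r, nr in zip(residuals, row_norms_sq))
    eps = 0.5 * (largest / res_norm_sq + 1.0 / fro_sq)
    u = [i for i, (r, nr) in enumerate(zip(residuals, row_norms_sq))
         if r * r >= eps * res_norm_sq * nr]
    return eps, u


A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
eps, u = greedy_index_set(A, b, [0.0, 0.0])
print(eps, u)
```

For this data only the second row passes the threshold, so the set contains the single 0-based index 1.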

To reduce the influence of diagonal scalings of the coefficient matrix in (1) on the convergence rate, Yang in [

$$d_i^k = \left\| \frac{b^{(i)} - A^{(i)} x_k}{\|A^{(i)}\|_2^2} \left(A^{(i)}\right)^T \right\|. \tag{5}$$
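Since the norm in (5) simplifies to $|b^{(i)} - A^{(i)} x_k| / \|A^{(i)}\|_2$, the distances are unchanged when a row of the system is rescaled. The sketch below checks this on a hand-picked example; the function name `row_distances` and the data are illustrative assumptions.

```python
import math


def row_distances(A, b, x):
    """d_i = |b_i - A_i x| / ||A_i||_2, i.e. the norm of the correction in Eq. (5)."""
    d = []
    for row, bi in zip(A, b):
        residual = bi - sum(a * xj for a, xj in zip(row, x))
        d.append(abs(residual) / math.sqrt(sum(a * a for a in row)))
    return d


A = [[3.0, 4.0], [1.0, 0.0]]
b = [10.0, 2.0]
x0 = [0.0, 0.0]
d = row_distances(A, b, x0)

# Multiply the first equation by 10: the distances stay the same.
A_scaled = [[30.0, 40.0], [1.0, 0.0]]
b_scaled = [100.0, 2.0]
d_scaled = row_distances(A_scaled, b_scaled, x0)
print(d, d_scaled)
```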

Generally, the block Kaczmarz methods [

Gower and Richtárik proposed a Gaussian Kaczmarz (GK) method [

$$x_{k+1} = x_k + \frac{\zeta_k^T (b - A x_k)}{\|A^T \zeta_k\|_2^2} A^T \zeta_k, \tag{6}$$

where $\zeta_k$ is a Gaussian vector with mean $0 \in \mathbb{R}^{m}$ and covariance matrix $I \in \mathbb{R}^{m \times m}$, i.e., $\zeta_k \sim N(0, I)$. Here $I$ denotes the identity matrix. The expected linear convergence rate was analyzed in the case that $A$ is of full column rank. The idea of the GK method is also used in [
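A minimal sketch of iteration (6), assuming standard normal entries drawn with Python's `random.Random.gauss`; the function name, the seed, the iteration count and the test system are illustrative choices, not taken from the paper.

```python
import random


def gaussian_kaczmarz(A, b, iters=300, seed=0):
    """Gaussian Kaczmarz, Eq. (6): project along the random direction A^T zeta."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        zeta = [rng.gauss(0.0, 1.0) for _ in range(m)]  # zeta ~ N(0, I)
        residual = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        direction = [sum(A[i][j] * zeta[i] for i in range(m)) for j in range(n)]  # A^T zeta
        scale = sum(z * r for z, r in zip(zeta, residual)) / sum(v * v for v in direction)
        x = [xj + scale * dj for xj, dj in zip(x, direction)]
    return x


A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]  # consistent, with solution (1, 2)
x_gk = gaussian_kaczmarz(A, b)
print(x_gk)
```

Unlike (2), every step uses all rows of A through the product $A^T \zeta_k$, so one iteration costs about as much as a full sweep.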

In this paper, we improve the GPRK method in [

The rest of this paper is arranged as follows. A geometric Gaussian Kaczmarz algorithm is presented in Section 2. Its convergence is also proved. Section 3 shows several numerical examples for the proposed method and Section 4 draws some conclusions.

This section describes a geometric Gaussian Kaczmarz (GGK) algorithm to compute the solution of (1). Algorithm 1 summarizes the GGK algorithm. Steps 2, 3 and 4 determine the block control sequence $\{\tau_k\}_{k \ge 0}$, which is simpler than and different from that in [

$$\Pr(\mathrm{row} = i_k) = \frac{\max_{1 \le i \le m} \left(d_i^k\right)^2}{\sum_{i=1}^{m} \left(d_i^k\right)^2}.$$

Steps 5 and 6 give the iteration process of the GGK method.
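The following is a sketch of one possible reading of the GGK iteration, not the authors' listing. It assumes that the block is $\tau_k = \{ i : (d_i^k)^2 \ge \eta \max_i (d_i^k)^2 \}$ and that $\zeta_k = \sum_{i \in \tau_k} (b^{(i)} - A^{(i)} x_k) e_i$, which is what the convergence proof appears to use. The function name `ggk`, the stopping rule and the test system are likewise assumptions.

```python
import math


def ggk(A, b, eta=0.3, tol=1e-6, max_iter=10_000):
    """Sketch of a geometric Gaussian Kaczmarz loop under the stated assumptions."""
    m, n = len(A), len(A[0])
    row_norms_sq = [sum(a * a for a in row) for row in A]
    b_norm = math.sqrt(sum(v * v for v in b))
    x = [0.0] * n
    for k in range(max_iter):
        r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        if math.sqrt(sum(v * v for v in r)) <= tol * b_norm:
            return x, k
        d_sq = [ri * ri / nr for ri, nr in zip(r, row_norms_sq)]  # (d_i^k)^2
        threshold = eta * max(d_sq)
        # Assumed zeta_k: residual entries on the block tau_k, zero elsewhere.
        zeta = [ri if dsq >= threshold else 0.0 for ri, dsq in zip(r, d_sq)]
        direction = [sum(A[i][j] * zeta[i] for i in range(m)) for j in range(n)]  # A^T zeta
        # zeta^T (b - A x_k) equals ||zeta||_2^2 for this choice of zeta.
        scale = sum(z * z for z in zeta) / sum(v * v for v in direction)
        x = [xj + scale * dj for xj, dj in zip(x, direction)]
    return x, max_iter


A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]  # consistent, with solution (1, 2)
x_ggk, iterations = ggk(A, b)
print(x_ggk, iterations)
```

With this choice the update is iteration (6) with a deterministic, residual-weighted vector in place of the Gaussian one.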

The following results show the convergence of Algorithm 1.

Theorem 1. Assume that the linear system (1) is consistent. Then the iterative sequence $\{x_k\}_{k \ge 0}$ generated by Algorithm 1 converges to the least-norm solution $x^{*} = A^{\dagger} b$ of the linear system (1). Moreover, the solution error satisfies

$$\|x_{k+1} - x^{*}\|_2^2 \le \left( 1 - \eta \, \frac{\|A_{\tau_k}\|_F^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \, \frac{\lambda_{\min}\left(A^T A\right)}{\|A\|_F^2} \right) \|x_k - x^{*}\|_2^2. \tag{7}$$

Proof. According to Algorithm 1, we have

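$$\begin{aligned}
x_{k+1} - x^{*} &= x_k - x^{*} + \frac{\zeta_k^T (b - A x_k)}{\|A^T \zeta_k\|_2^2} A^T \zeta_k \\
&= x_k - x^{*} - \frac{\left(A^T \zeta_k\right)^T (x_k - x^{*})}{\|A^T \zeta_k\|_2^2} A^T \zeta_k \\
&= \left( I - \frac{A^T \zeta_k \left(A^T \zeta_k\right)^T}{\|A^T \zeta_k\|_2^2} \right) (x_k - x^{*}).
\end{aligned}$$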

Algorithm 1. A geometric Gaussian Kaczmarz algorithm (GGK).

Denote $P_k = \dfrac{A^T \zeta_k \left(A^T \zeta_k\right)^T}{\|A^T \zeta_k\|_2^2}$. Then $P_k$ is an orthogonal projector because $P_k^T = P_k$ and $P_k^2 = P_k$. Thus we have

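$$\begin{aligned}
\|x_{k+1} - x^{*}\|_2^2 &= \left\| \left( I - \frac{A^T \zeta_k \left(A^T \zeta_k\right)^T}{\|A^T \zeta_k\|_2^2} \right) (x_k - x^{*}) \right\|_2^2 \\
&= \|x_k - x^{*}\|_2^2 - \left\| \frac{\zeta_k^T (b - A x_k)}{\|A^T \zeta_k\|_2^2} A^T \zeta_k \right\|_2^2 \\
&= \|x_k - x^{*}\|_2^2 - \frac{\left|\zeta_k^T (b - A x_k)\right|^2}{\|A^T \zeta_k\|_2^2} \\
&= \|x_k - x^{*}\|_2^2 - \frac{\|\zeta_k\|_2^4}{\|A^T \zeta_k\|_2^2}.
\end{aligned}$$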

The last equality holds because

$$\zeta_k^T (b - A x_k) = \sum_{i \in \tau_k} \left(b^{(i)} - A^{(i)} x_k\right) e_i^T (b - A x_k) = \sum_{i \in \tau_k} \left|b^{(i)} - A^{(i)} x_k\right|^2 = \|\zeta_k\|_2^2.$$
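The two identities above can be checked numerically on a small example. The sketch below takes $\zeta_k = \sum_{i \in \tau_k} (b^{(i)} - A^{(i)} x_k) e_i$, as the displayed sum suggests; the block `tau` is picked by hand and the data are illustrative.

```python
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_star = [1.0, 2.0]
b = [sum(a * t for a, t in zip(row, x_star)) for row in A]
x_k = [0.0, 0.0]
tau = [0, 1]  # hand-picked block (0-based row indices), for illustration only

m, n = len(A), len(A[0])
r = [bi - sum(a * xj for a, xj in zip(row, x_k)) for row, bi in zip(A, b)]
zeta = [r[i] if i in tau else 0.0 for i in range(m)]

zeta_dot_r = sum(z * ri for z, ri in zip(zeta, r))  # zeta^T (b - A x_k)
zeta_sq = sum(z * z for z in zeta)                  # ||zeta||_2^2
direction = [sum(A[i][j] * zeta[i] for i in range(m)) for j in range(n)]  # A^T zeta
dir_sq = sum(v * v for v in direction)

x_next = [xj + zeta_dot_r / dir_sq * dj for xj, dj in zip(x_k, direction)]
err_old = sum((xj - s) ** 2 for xj, s in zip(x_k, x_star))
err_new = sum((xj - s) ** 2 for xj, s in zip(x_next, x_star))
predicted = err_old - zeta_sq ** 2 / dir_sq         # last line of the error identity
print(err_new, predicted)
```

Both printed numbers agree up to rounding, so the squared error drops by exactly $\|\zeta_k\|_2^4 / \|A^T \zeta_k\|_2^2$ in this step.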

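Let $E_k \in \mathbb{R}^{m \times |\tau_k|}$ denote the matrix whose columns are, in order, the vectors $e_i \in \mathbb{R}^{m}$ with $i \in \tau_k$; then $A_{\tau_k} = E_k^T A$. Denoting $\hat{\zeta}_k = E_k^T \zeta_k$, so that $\zeta_k = E_k \hat{\zeta}_k$ because $\zeta_k$ vanishes outside $\tau_k$, we have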

$$\|A^T \zeta_k\|_2^2 = \zeta_k^T A A^T \zeta_k = \hat{\zeta}_k^T E_k^T A A^T E_k \hat{\zeta}_k = \hat{\zeta}_k^T A_{\tau_k} A_{\tau_k}^T \hat{\zeta}_k = \|A_{\tau_k}^T \hat{\zeta}_k\|_2^2,$$

moreover,

$$\|A^T \zeta_k\|_2^2 \le \lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right) \|\hat{\zeta}_k\|_2^2 = \lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right) \|\zeta_k\|_2^2.$$

It then holds that

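$$\begin{aligned}
\|x_{k+1} - x^{*}\|_2^2 &\le \|x_k - x^{*}\|_2^2 - \frac{\|\zeta_k\|_2^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \\
&= \|x_k - x^{*}\|_2^2 - \frac{\sum_{i \in \tau_k} \left|b^{(i)} - A^{(i)} x_k\right|^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \\
&= \|x_k - x^{*}\|_2^2 - \frac{\sum_{i \in \tau_k} \left(d_i^k\right)^2 \|A^{(i)}\|_2^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \\
&\le \|x_k - x^{*}\|_2^2 - \eta \max_{i \in \tau_k} \left(d_i^k\right)^2 \frac{\sum_{i \in \tau_k} \|A^{(i)}\|_2^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \\
&= \|x_k - x^{*}\|_2^2 - \eta \max_{i \in \tau_k} \left(d_i^k\right)^2 \frac{\|A_{\tau_k}\|_F^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)}.
\end{aligned}$$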

For each k ≥ 0 , since

$$\max_{i \in \tau_k} \left(d_i^k\right)^2 = \max_{i \in \tau_k} \frac{\left|b^{(i)} - A^{(i)} x_k\right|^2}{\|A^{(i)}\|_2^2} \ge \sum_{i=1}^{m} \frac{\|A^{(i)}\|_2^2}{\|A\|_F^2} \, \frac{\left|b^{(i)} - A^{(i)} x_k\right|^2}{\|A^{(i)}\|_2^2} = \frac{\|b - A x_k\|_2^2}{\|A\|_F^2},$$

we have

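$$\begin{aligned}
\|x_{k+1} - x^{*}\|_2^2 &\le \|x_k - x^{*}\|_2^2 - \eta \, \frac{\|b - A x_k\|_2^2}{\|A\|_F^2} \, \frac{\|A_{\tau_k}\|_F^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \\
&\le \|x_k - x^{*}\|_2^2 - \eta \, \frac{\lambda_{\min}\left(A^T A\right)}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \, \frac{\|A_{\tau_k}\|_F^2}{\|A\|_F^2} \, \|x_k - x^{*}\|_2^2 \\
&= \left( 1 - \eta \, \frac{\lambda_{\min}\left(A^T A\right)}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \, \frac{\|A_{\tau_k}\|_F^2}{\|A\|_F^2} \right) \|x_k - x^{*}\|_2^2.
\end{aligned}$$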

This completes the proof. □

We remark that

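$$0 \le 1 - \eta \, \frac{\lambda_{\min}\left(A^T A\right)}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \, \frac{\|A_{\tau_k}\|_F^2}{\|A\|_F^2} \le 1,$$

which means that Algorithm 1 is convergent. In fact,

$$\frac{\|A_{\tau_k}\|_F^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \ge 1,$$

and

$$0 < \frac{\lambda_{\min}\left(A^T A\right)}{\|A\|_F^2} \le 1,$$

it follows that

$$1 - \eta \, \frac{\|A_{\tau_k}\|_F^2}{\lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right)} \, \frac{\lambda_{\min}\left(A^T A\right)}{\|A\|_F^2} \le 1 - \eta \, \frac{\lambda_{\min}\left(A^T A\right)}{\|A\|_F^2} < 1.$$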
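The two ratio bounds in this remark follow from the fact that the squared Frobenius norm is the trace of the Gram matrix, i.e., the sum of its nonnegative eigenvalues. A short derivation is sketched here, where $\lambda_j(\cdot)$ is shorthand introduced for this note only:

$$\begin{aligned}
\|A_{\tau_k}\|_F^2 &= \operatorname{tr}\left(A_{\tau_k} A_{\tau_k}^T\right) = \sum_j \lambda_j\left(A_{\tau_k} A_{\tau_k}^T\right) \ge \lambda_{\max}\left(A_{\tau_k} A_{\tau_k}^T\right), \\
\|A\|_F^2 &= \operatorname{tr}\left(A^T A\right) = \sum_j \lambda_j\left(A^T A\right) \ge \lambda_{\min}\left(A^T A\right).
\end{aligned}$$

The strict inequality $0 < \lambda_{\min}(A^T A)$ presumably relies on $A$ having full column rank; otherwise the smallest eigenvalue of $A^T A$ is zero.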

In this section, we use Algorithm 1 for solving different types of consistent linear systems (1) and compare it with GPRK in [

The coefficient matrix $A \in \mathbb{R}^{m \times n}$ is either generated by the MATLAB function randn(m, n) or taken from the University of Florida sparse matrix collection [

$$\mathrm{SU} = \frac{\text{CPU of GPRK}}{\text{CPU of GGK}}.$$

The effectiveness of both methods is measured by the relative residual (RR) defined by

$$\mathrm{RR} = \frac{\|b - A x_k\|_2}{\|b\|_2}.$$

We set the initial solution $x_0$ to be $0$ in all experiments, and the iteration is terminated once $\mathrm{RR} < 10^{-6}$. The numerical results reported for each method in this section are arithmetic averages over 50 repeated trials. We set $\eta = 0.3$ in GGK for each example.
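The two measures used in the experiments can be written as small helpers. This is a pure-Python sketch with hypothetical function names; the speed-up call reuses the CPU times reported for bibd_16_8 in the flat sparse case.

```python
import math


def relative_residual(A, b, x):
    """RR = ||b - A x||_2 / ||b||_2."""
    r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
    return math.sqrt(sum(v * v for v in r)) / math.sqrt(sum(v * v for v in b))


def speed_up(cpu_gprk, cpu_ggk):
    """SU = CPU time of GPRK divided by CPU time of GGK."""
    return cpu_gprk / cpu_ggk


A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
rr_start = relative_residual(A, b, [0.0, 0.0])  # at x_0 = 0 the residual is b itself
rr_exact = relative_residual(A, b, [1.0, 2.0])  # at the exact solution
su = speed_up(0.9867, 0.3650)
print(rr_start, rr_exact, round(su, 2))
```

A solver loop would stop as soon as `relative_residual` falls below `1e-6`, matching the stopping rule stated above.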

This subsection considers the linear systems (1) with sparse matrices. These matrices include some flat ones in

Matrix Name | Matrix Size | Density | GPRK IT | GPRK CPU | GGK IT | GGK CPU | SU |
---|---|---|---|---|---|---|---|
bibd_16_8 | 120 × 12,870 | 23.33% | 2142 | 0.9867 | 566.5 | 0.3650 | 2.70 |
crew1 | 135 × 6469 | 5.38% | 6346.9 | 1.3223 | 804 | 0.1815 | 7.29 |
df2177 | 630 × 10,358 | 0.34% | 3186.4 | 0.4443 | 40 | 0.0078 | 56.96 |
us04 | 163 × 28,016 | 6.52% | 4134.1 | 5.9783 | 1635 | 1.8062 | 3.31 |
GL7d25 | 2789 × 21,074 | 0.14% | 15,160 | 11.2236 | 556.1 | 0.3721 | 30.16 |
stat96v5 | 2307 × 75,779 | 0.13% | 17,897 | 21.0132 | 149 | 0.1499 | 140.18 |
bibd_81_3 | 3240 × 85,320 | 0.09% | 14,044 | 23.0241 | 38 | 0.0371 | 620.60 |
abtaha1^{T} | 209 × 14,596 | 1.68% | 30,081 | 6.6795 | 1118.7 | 0.2533 | 26.37 |
abtaha2^{T} | 331 × 37,932 | 1.09% | 62,803 | 48.0985 | 967.7 | 0.6389 | 75.28 |
mk12-b2^{T} | 1485 × 13,860 | 0.20% | 6477.2 | 1.2226 | 33.1 | 0.0115 | 106.31 |
ch7-9-b2^{T} | 1512 × 17,640 | 0.20% | 6387.7 | 2.2341 | 31.7 | 0.0097 | 230.32 |
relat7^{T} | 1045 × 21,924 | 0.36% | 53,325 | 27.6583 | 1020.2 | 0.4063 | 68.07 |
Franz9^{T} | 4164 × 19,588 | 0.12% | 19,264 | 8.5559 | 171.8 | 0.0700 | 122.23 |
Franz10^{T} | 4164 × 19,588 | 0.12% | 19,322 | 10.7741 | 172.6 | 0.0673 | 160.09 |
ch7-6-b3^{T} | 4200 × 12,600 | 0.10% | 21,260 | 9.4338 | 48.4 | 0.0150 | 628.92 |
relat7b^{T} | 1045 × 21,924 | 0.36% | 65,284 | 27.8051 | 1005.9 | 0.3955 | 70.30 |

Matrix Name | Matrix Size | Density | GPRK IT | GPRK CPU | GGK IT | GGK CPU | SU |
---|---|---|---|---|---|---|---|
bibd_16_8^{T} | 12,870 × 120 | 23.33% | 1040.3 | 1.0199 | 299.2 | 0.2809 | 3.63 |
crew1^{T} | 6469 × 135 | 5.38% | 2613 | 0.4065 | 522.4 | 0.1632 | 2.49 |
df2177^{T} | 10,358 × 630 | 0.34% | 2023.7 | 0.7251 | 43.3 | 0.0102 | 71.09 |
us04^{T} | 28,016 × 163 | 6.52% | 1898.3 | 2.6322 | 548.7 | 0.5170 | 5.09 |
GL7d25^{T} | 21,074 × 2789 | 0.14% | 12,911 | 14.6954 | 443.7 | 0.1799 | 81.69 |
stat96v5^{T} | 75,779 × 2307 | 0.13% | 8190.1 | 19.1244 | 153.0 | 0.1722 | 111.06 |
bibd_81_3^{T} | 85,320 × 3240 | 0.09% | 9241.2 | 20.9510 | 61.6 | 0.0738 | 283.89 |
abtaha1 | 14,596 × 209 | 1.68% | 1819.6 | 0.6833 | 235.8 | 0.0498 | 13.72 |
abtaha2 | 37,932 × 331 | 1.09% | 1855.5 | 1.5998 | 140.9 | 0.0814 | 19.65 |
mk12-b2 | 13,860 × 1485 | 0.20% | 4828.3 | 2.4008 | 43.8 | 0.0131 | 183.27 |
ch7-9-b2 | 17,640 × 1512 | 0.20% | 4734.2 | 2.3630 | 44.1 | 0.0128 | 184.61 |
relat7 | 21,924 × 1045 | 0.36% | 46,674 | 35.1415 | 887.4 | 0.3432 | 102.39 |
Franz9 | 19,588 × 4164 | 0.12% | 19,363 | 14.9236 | 203.9 | 0.0861 | 173.33 |
Franz10 | 19,588 × 4164 | 0.12% | 19,354 | 16.4306 | 200.8 | 0.0842 | 195.14 |
ch7-6-b3 | 12,600 × 4200 | 0.10% | 16,394 | 8.9110 | 51.9 | 0.0139 | 641.08 |
relat7b | 21,924 × 1045 | 0.36% | 46,708 | 35.0971 | 920.3 | 0.3826 | 91.73 |

^{T}, respectively. From

In this subsection, the test matrices are dense normally distributed random matrices including thin and flat matrices. The sizes of rows and columns of the selected matrices vary from 2000 to 30,000.

Matrix Size | GPRK IT | GPRK CPU | GGK IT | GGK CPU | SU |
---|---|---|---|---|---|
2000 × 10,000 | 21,569 | 297.5576 | 79.3 | 1.9810 | 98.85 |
2000 × 20,000 | 14,052 | 331.0652 | 50 | 2.5747 | 73.73 |
2000 × 30,000 | 11,762 | 495.7764 | 42.2 | 3.2314 | 73.30 |
2200 × 10,000 | 25,696 | 496.4009 | 86.9 | 6.0598 | 111.93 |
2200 × 20,000 | 16,259 | 537.1076 | 53.5 | 5.7038 | 135.89 |
2200 × 30,000 | 13,282 | 359.6292 | 43.9 | 7.1961 | 112.97 |
2400 × 10,000 | 29,720 | 535.0830 | 96.1 | 3.6249 | 205.64 |
2400 × 20,000 | 18,336 | 351.9513 | 55.6 | 3.9913 | 203.36 |
2400 × 30,000 | 15,271 | 546.2991 | 45.6 | 6.2161 | 182.75 |
2600 × 10,000 | 35,591 | 337.0178 | 105.7 | 7.0060 | 162.16 |
2600 × 20,000 | 20,914 | 390.1169 | 59.5 | 5.9782 | 194.22 |
2600 × 30,000 | 17,055 | 494.0944 | 47.6 | 6.3896 | 168.46 |
2800 × 10,000 | 41,358 | 527.2221 | 116.6 | 6.6916 | 171.33 |
2800 × 20,000 | 23,539 | 497.0007 | 61.6 | 6.0664 | 196.34 |
2800 × 30,000 | 19,092 | 599.8468 | 49.3 | 6.0754 | 204.63 |
3000 × 10,000 | 46,862 | 544.3403 | 127.2 | 2.5496 | 213.50 |
3000 × 20,000 | 26,704 | 603.3008 | 65.6 | 2.6693 | 226.02 |
3000 × 30,000 | 21,106 | 718.8279 | 51.5 | 3.1646 | 227.15 |

Matrix Size | GPRK IT | GPRK CPU | GGK IT | GGK CPU | SU |
---|---|---|---|---|---|
10,000 × 2000 | 12,234 | 171.3824 | 61.5 | 0.8017 | 213.77 |
20,000 × 2000 | 7567.4 | 140.4876 | 34.7 | 1.3755 | 102.14 |
30,000 × 2000 | 6470 | 176.8242 | 27.6 | 1.3860 | 127.58 |
10,000 × 2200 | 14,776 | 441.3869 | 69.8 | 1.0381 | 425.19 |
20,000 × 2200 | 8633.6 | 392.7304 | 37.8 | 1.0948 | 358.72 |
30,000 × 2200 | 7323.6 | 405.4806 | 29.3 | 1.3893 | 291.86 |
10,000 × 2400 | 17,784 | 556.8869 | 76.9 | 1.8008 | 309.24 |
20,000 × 2400 | 9879.4 | 490.8223 | 40.8 | 1.6075 | 305.33 |
30,000 × 2400 | 8213.8 | 521.1035 | 31.2 | 2.4817 | 209.98 |
10,000 × 2600 | 21,949 | 633.4774 | 86.1 | 3.6521 | 173.46 |
20,000 × 2600 | 11,227 | 660.0913 | 43.2 | 2.3399 | 282.10 |
30,000 × 2600 | 9188.2 | 684.8802 | 33.3 | 3.3046 | 207.25 |
10,000 × 2800 | 25,698 | 758.3984 | 96.7 | 4.6805 | 162.00 |
20,000 × 2800 | 12,680 | 712.5364 | 46.2 | 3.1025 | 229.67 |
30,000 × 2800 | 10,136 | 737.2275 | 35.3 | 2.1491 | 343.04 |
10,000 × 3000 | 30,249 | 662.1158 | 106.8 | 6.3898 | 103.62 |
20,000 × 3000 | 14,236 | 595.5270 | 50 | 5.3462 | 11.39 |
30,000 × 3000 | 11,201 | 639.9166 | 36.6 | 4.5954 | 139.25 |

see that in all cases GGK needs fewer iterations (IT) and less CPU time than GPRK does, and that the speed-up of GGK over GPRK ranges from tens to hundreds of times: from 73.30 to 227.15 for flat matrices and from 11.39 to 425.19 for thin matrices. Similar results to

and

We develop a geometric Gaussian Kaczmarz (GGK) algorithm for solving large-scale consistent linear systems and prove its convergence. Numerical experiments show that the GGK algorithm is more efficient and effective than the GPRK algorithm. In future work, we will focus on block Kaczmarz methods for solving ill-posed problems.

The authors thank the reviewers for their helpful comments. Research by G.H. was supported in part by the Application Fundamentals Foundation of STD of Sichuan (grant 2020YJ0366) and the Opening Project of Sichuan Province University Key Laboratory of Bridge Non-destruction Detecting and Engineering Computing (grant 2020QZJ03). Research by F.Y. was partially supported by NNSF (grant 11501392) and SUSE (grants 2019RC09 and 2020RC25). Research by Y.M. Liao was partially supported by the Innovation Fund of Postgraduate of SUSE (grant y2021101).

The authors declare no conflicts of interest regarding the publication of this paper.

Wen, L., Yin, F., Liao, Y.M. and Huang, G.X. (2021) A Geometric Gaussian Kaczmarz Method for Large Scaled Consistent Linear Equations. Journal of Applied Mathematics and Physics, 9, 2954-2965. https://doi.org/10.4236/jamp.2021.911189