Probabilistic Approach to the Asynchronous Iteration

In this work we consider asynchronous iteration algorithms. As is well known, on multiprocessor
computers the parallel application of iterative methods often shows poor
scaling and suboptimal parallel efficiency. Asynchronous iterative
methods often achieve much better parallel efficiency because processors almost
never need to wait for one another to communicate. We study a
probabilistic approach to asynchronous iteration algorithms and present a mathematical description
of this computational process in the multiprocessor environment. The results of
our simple numerical experiments demonstrate the convergence and efficiency of
asynchronous iterative processes for the nonlinear problems considered.


Introduction
Iterative methods for the solution of linear and nonlinear equations are widely used because of their simplicity, fault tolerance, and ease of parallelization. Historically, iterative algorithms were created and studied for single-processor computers. On multiprocessor computers, the parallel application of iterative methods often shows poor scaling and suboptimal parallel efficiency. In contrast, asynchronous iterative methods often have much better parallel efficiency, as processors almost never need to wait for one another to communicate.
Our investigation studies asynchronous algorithms with a probabilistic approach and presents a mathematical description of this computational process in the multiprocessor environment.

Description of problem
Let X be a Banach space. In X we consider a nonlinear equation with a polynomial nonlinearity, Equation (1.1). Denote by D_δ(x) the open/closed balls in X with radius δ and center at x. Let I be the identity operator, and let L(X,X) be the space of linear continuous operators acting from X to X. Under the conditions stated in [1], the iterative sequence (1.2) converges to ϕ*, where ϕ* is a solution of Equation (1.1), has minimal norm among all solutions of (1.1) in X, and is unique in D_δ(0) (Lemma 1). The proof of this lemma is given in [1].

In particular, Equation (1.1) can be converted into an integral equation with a polynomial non-linearity, as considered in [2].
Let D^i be the direct product of i copies of D. Setting x = x_1, multiplying the obtained equation by ϕ(x_2), then again by ϕ(x_3), and so on, we obtain an infinite system of linear integral equations (1.4); each function ψ_i is defined on D^i, and the corresponding operators are defined accordingly. Equation (1.1) is thus formally replaced by this infinite system of linear equations. We close the system (1.5) at some finite N, completing it with additional conditions. Let the domain D be a simple interval [a, b]. Replacing the integrals in system (1.5) with cubature formulas using ν_i mesh points in each variable, we obtain a system of algebraic equations with the same number of unknowns. This system of linear algebraic equations has a specific structure, which is considered in detail in [1]. The system is rewritten in the form (1.8), which we can write in operator form as ψ = F(ψ).

Convergence of the Asynchronous Method
Let us define [3,4] an asynchronous iterative method for solving the system (1.8). Let {J_n} be a sequence of non-empty subsets of the set {1, 2, ..., N}, which is called a chaotic sequence of sets.
Let the initial vector ψ(0) be given. We construct the sequence of iterations in the following way [5-7], obtaining the method of chaotic iterations (1.10), which is a generalization of sequential iterative methods. If J_n = J = {1, 2, ..., N}, then all components of the iteration vector are updated at the same time, and as a result we get the simple Jacobi iterative method. The components of the iteration vector are refreshed cyclically if J_n = {n (mod N)}; in computing each component we then use the previous ones, which have already been computed, as in a Gauss-Seidel iteration. The main property of chaotic iterations is the random updating of the components of the iteration vector, which allows an efficient implementation on multiprocessor systems.
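The special cases above can be illustrated on a small linear fixed-point problem x = Bx + c. The following sketch (the matrix B, vector c, and the update sets are illustrative placeholders, not the operators of the paper) shows that the chaotic scheme, which refreshes only the components listed in J_n at step n, recovers the Jacobi method when J_n is always the full index set, and still converges when J_n is a random non-empty subset:

```python
import numpy as np

def chaotic_iteration(B, c, J_sets, x0):
    """Chaotic iteration for x = Bx + c: at step n, refresh only the
    components listed in J_sets[n], leaving the others unchanged."""
    x = x0.copy()
    for J in J_sets:
        x_new = x.copy()
        for i in J:
            x_new[i] = B[i] @ x + c[i]   # refresh component i only
        x = x_new
    return x

rng = np.random.default_rng(0)
N = 6
B = rng.uniform(-0.1, 0.1, (N, N))       # ||B||_inf <= 0.6 < 1: contraction
c = rng.uniform(-1, 1, N)
exact = np.linalg.solve(np.eye(N) - B, c)

# Jacobi: all components refreshed at every step
jacobi = chaotic_iteration(B, c, [range(N)] * 200, np.zeros(N))
# chaotic: a random non-empty subset refreshed at every step
subsets = [rng.choice(N, size=rng.integers(1, N + 1), replace=False)
           for _ in range(2000)]
chaotic = chaotic_iteration(B, c, subsets, np.zeros(N))
```

Both runs reach the exact solution because the iteration matrix is a contraction and every index keeps reappearing in the update sets.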
The generalization of the method of chaotic iterations is the method of asynchronous iterations. The method of asynchronous iterations for (1.8) is constructed by the following rule. Let ψ(0) be given; then the iterates are built by (1.11). Here ψ_i is a component of the vector ψ, and {S_i(n)} is a sequence of non-negative integers satisfying the following conditions, where every element i occurs infinitely often in the sets J_n, n = 1, 2, .... The S_i(n), i = 1, ..., N, are called delays or lags. Condition (1.12) says that only components of previous iterates can be used in the evaluation of a new iterate. Condition (1.13) says that values of an early iterate eventually cannot be used any more in further evaluations, so that more and more recent values of the components have to be used instead. The last condition is that every element i occurs infinitely often in the sets J_n. When the delays are bounded by S_1, S_1 is the maximum number of iterations saved: when computing iterations we use components of vectors of previous iterates no more than S_1 steps in the past. Let every subset J_n contain at least one element of {1, 2, ..., N}. Then the following statement is valid. Lemma 2. For convergence of the asynchronous iterative method in R^n to the solution of (1.8), it is necessary that the spectral radius ρ(F) < 1.
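The delay mechanism of (1.11)-(1.13) can be simulated directly. The sketch below (a toy linear contraction, random update sets, and randomly drawn bounded delays; all names are illustrative assumptions, not the paper's exact scheme) updates a random subset of components at each step, where each refreshed component reads component j from the stale iterate n - S_j(n) with S_j(n) bounded by a fixed maximum:

```python
import numpy as np

def async_iteration(B, c, n_steps, max_delay, seed=0):
    """Asynchronous iteration for x = Bx + c with random bounded delays:
    component reads at step n come from iterate n - S_j(n), S_j(n) <= max_delay."""
    rng = np.random.default_rng(seed)
    N = len(c)
    history = [np.zeros(N)]                       # iterates y(0), y(1), ...
    for n in range(n_steps):
        x = history[-1].copy()
        J = rng.choice(N, size=rng.integers(1, N + 1), replace=False)
        for i in J:
            # delayed reads: component j taken from iterate n - S_j(n)
            delays = rng.integers(0, min(max_delay, n) + 1, size=N)
            stale = np.array([history[n - delays[j]][j] for j in range(N)])
            x[i] = B[i] @ stale + c[i]
        history.append(x)
    return history[-1]

rng = np.random.default_rng(1)
N = 5
B = rng.uniform(-0.1, 0.1, (N, N))                # ||B||_inf <= 0.5 < 1
c = rng.uniform(-1, 1, N)
exact = np.linalg.solve(np.eye(N) - B, c)
approx = async_iteration(B, c, n_steps=3000, max_delay=4)
err = np.max(np.abs(approx - exact))
```

Despite stale reads, the error still vanishes: the contraction property plus bounded delays and infinitely recurring update indices are exactly the hypotheses used in the convergence results below.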
The proof of this lemma is given in [8, p. 100]. This condition also provides convergence of a simple iterative process for the system (1.8). It is known [2] that the system (1.8) is equivalent to Equation (1.1) in the following sense: if the iterative method converges for (1.1) with initial value ϕ(0) = f, then the method for (1.8) will converge starting from the initial vector (f, 0, 0, ..., 0), and vice versa. Using the results of Lemmas 1 and 2 for the convergence of the asynchronous iterations (1.11), the following assertion holds [1].
Theorem 1. If the conditions of Lemma 1 are satisfied, then the asynchronous iterative process (1.11) converges to the solution of the system (1.8).
The main difference between asynchronous iterations and other parallel iterative methods is the chaotic behavior of the vector components, which is expressed by the chaotic sequence of sets J_n. The chaotic iterative process has two main advantages: 1) each coordinate of the iteration vector can be computed independently from the others (as in the Monte Carlo method); 2) the convergence rate is higher, because this method sometimes essentially becomes an implicit iterative method like the Gauss-Seidel method.
Definition 1 introduces the sediment of a chaotic sequence {J_n}. Consider the equation ψ = F(ψ), where F is a nonlinear operator. Definition 3. Let D(F) be the domain of the operator F. We call the operator F a p-Lipschitzian contraction on D(F) if there exists an N × N matrix L with non-negative elements satisfying the inequality p(F(y) − F(z)) ≤ L p(y − z) for all y, z ∈ D(F). The matrix L is called a Lipschitzian matrix for the operator F. Let {J_n} be a sequence of nonempty subsets of the set N = {1, ..., N}. Each element of J_n is generated via a distribution over the set N. Let the elements of J_n be generated in the following way: the element i ∈ N occurs in J_n with probability p_1, and the probability of absence of i ∈ N from J_n is p_2 = 1 − p_1. The average number of occurrences of i ∈ N can then be computed: in n steps it equals n p_1. Every element i ∈ N thus occurs infinitely often in the sets J_n as n → ∞. Thus we can formulate the following statement without proof.
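This Bernoulli construction of the sets J_n is easy to check empirically. In the sketch below (the values of N, p_1, and the number of steps are arbitrary illustrative choices), each index enters J_n independently with probability p_1, and over n steps each index indeed occurs about n p_1 times, so no index ever stops appearing:

```python
import numpy as np

# Bernoulli construction of the chaotic sets J_n: each index i is included
# in J_n independently with probability p1; over n steps the expected number
# of occurrences of i is n * p1.
rng = np.random.default_rng(42)
N, p1, n_steps = 8, 0.3, 20000
J_sets = [np.nonzero(rng.random(N) < p1)[0] for _ in range(n_steps)]

counts = np.zeros(N, dtype=int)
for J in J_sets:
    counts[J] += 1          # tally how often each index was selected

freq = counts / n_steps     # empirical inclusion frequency per index
```

The empirical frequencies cluster tightly around p_1, which is the finite-sample picture behind the "every element occurs infinitely often" requirement of the convergence lemmas.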
Lemma 3. If the probability of occurrence of an arbitrary element i in J_n obeys the Bernoulli distribution with probability of success 0 < p_1 < 1, or obeys the Poisson law, then the chaotic sequence {J_n} has maximal sediment. Theorem 2. Suppose the operator F: X → X is a p-contraction on X with contraction matrix L and the conditions of Lemma 3 are satisfied. Then for any y(0) ∈ X, the sequence of iterations built by the asynchronous method (1.11) converges to the unique fixed point of the operator F in X.
Proof. By the p-contraction property of the operator F on X, there is a unique fixed point ξ, and p(F(y) − F(z)) ≤ L p(y − z). Since L is a contraction matrix, there are a positive vector ν ∈ R^n and a positive scalar θ ∈ (0, 1) such that Lν ≤ θν. Let y(0) be an initial approximation. There exist an α > 0 and iteration numbers n_q, q = 1, 2, ..., such that for n ≥ n_q the iterates satisfy p(y(n) − ξ) ≤ α θ^q ν. (1.13a) From inequality (1.13a), and since θ < 1, it follows that y(n) → ξ as q → ∞.

We prove (1.13a) by mathematical induction. Assume it is true for n = k − 1. From the definition of the asynchronous iterative process, the components of the vector y(k) are computed from delayed iterates; since θ < 1, the resulting inequality gives p(y(n) − ξ) ≤ α ν. Suppose now that the iteration numbers n_1, n_2, ..., n_{q−1} are known and condition (1.13a) is satisfied; we find n_q for which (1.13a) holds. Since n − S_i(n) → ∞ as n → ∞, there exists an r such that p(y(r) − ξ) ≤ α θ^{q−1} ν. For n > r, the p-contraction property of the operator F yields the required bound for every component i whenever n ≥ n_q. From the definition of n_q it follows that all components of the iteration vector y(n_q) differ from the components of the iteration vector y(r) by at most the required amount, and the theorem is proven.
There are various methods for implementing asynchronous iterations.
To conclude this section, we describe the asynchronous iterative method with memory. Suppose ψ(0), ..., ψ(m − 1) are known. The sequence of vectors constructed from the m most recent iterates is called an asynchronous iterative method with memory.
It is known [9] that if the operator F is a p-contraction operator and the chaotic sequence {J_n} has maximal sediment, then the asynchronous iterative process with memory converges to the unique fixed point of the operator F.
The asynchronous iterative method with memory size m > 2 has a drawback: it requires saving a very large number of items in memory. However, this kind of asynchronous method is useful in special cases. For example, consider the Newton-Kantorovich method for solving the equation ψ = F(ψ). This iterative process is not of the form of the asynchronous iterative processes (1.11): to compute the current iterate, we use the results of two previous iterations. In the general case, if an iterative process of m-th order is used, i.e. the new iterate depends on the m previous ones, then we cannot use the asynchronous method (1.11); in this case, we must use an asynchronous iterative method with memory m.
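A minimal sequential sketch of the memory-m idea keeps a buffer of the last m iterates and lets each new iterate depend on all of them. The generic driver and the memory-2 example below are illustrative assumptions (a simple averaged fixed-point scheme for cos), not the Newton-Kantorovich process of the paper:

```python
import math
from collections import deque

def iterate_with_memory(F, G, x0_list, n_steps):
    """Generic m-step process y(n+1) = G(y(n), ..., y(n-m+1)):
    a buffer holds the last m iterates, and G builds the next one."""
    m = len(x0_list)
    buf = deque(x0_list, maxlen=m)      # keeps only the m newest iterates
    for _ in range(n_steps):
        buf.append(G(F, list(buf)))     # oldest iterate is evicted
    return buf[-1]

# memory-2 example: fixed point of F(x) = cos(x), with the new iterate
# computed from the average of the two stored iterates
F = math.cos
G = lambda F, hist: F(sum(hist) / len(hist))
x = iterate_with_memory(F, G, [0.0, 1.0], 200)
```

Because the scheme's fixed point satisfies ξ = F(ξ), the buffer collapses onto the same fixed point as the one-step iteration; only the storage pattern (m saved iterates) changes.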
To estimate the speed of convergence of an iterative method we usually use the value [6] R = −lim (1/n) log ||ψ(n) − ξ|| as n → ∞, where ξ is a fixed point of the operator F. To measure the efficiency of an iterative method we use the ratio of the rate R to the cost, where c_n is the cost associated with the evaluation of the first n iterates. Usually c_n is chosen proportional to the number of arithmetic operations necessary to compute the first n iterations, or to the computer time used for computing n iterations. We note that if c_n/n tends to some finite τ (which corresponds to the average cost per step), then the efficiency is simply given by E = R/τ. In the asynchronous case, we can determine c_n as the sum of the cardinalities |J_k| of the sets J_k over the first n steps, i.e. the number of components evaluated at each step of the iteration. This cost measure is better suited to a parallel implementation and can be evaluated through the classical tools of queuing theory. Denoting by n_q the number of macro iterations (accumulated iterations) in the asynchronous process, we obtain an estimate of R in terms of the contraction matrix L of the operator F. The results obtained in this section will be applied to solving a real-life problem in the next section.
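These two quantities can be measured empirically on a run of the chaotic iteration. In the sketch below (the linear test problem and all parameter values are illustrative placeholders), R is estimated from the final error over n steps, c_n accumulates |J_k| at each step, τ is the average cost per step, and E = R/τ:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 6
B = rng.uniform(0.05, 0.09, (N, N))    # positive contraction, ||B||_inf < 1
c = rng.uniform(-1, 1, N)
xi = np.linalg.solve(np.eye(N) - B, c) # fixed point of psi = B psi + c

x = np.zeros(N)
cost = 0
errors = []
n_steps = 120
for n in range(n_steps):
    J = rng.choice(N, size=rng.integers(1, N + 1), replace=False)
    x_new = x.copy()
    for i in J:
        x_new[i] = B[i] @ x + c[i]
    x = x_new
    cost += len(J)                     # c_n accumulates |J_n|
    errors.append(np.max(np.abs(x - xi)))

R = -np.log(errors[-1]) / n_steps      # empirical rate per step
tau = cost / n_steps                   # average cost per step, c_n / n
E = R / tau                            # efficiency E = R / tau
```

Counting cost as the number of component evaluations (rather than whole-vector sweeps) is what makes the measure comparable between synchronous and asynchronous runs: a step that refreshes one component is charged far less than a full Jacobi sweep.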

Asynchronous Algorithms with Bernoulli and Poisson Distributions
In this section we give another way of organizing asynchronous computations for solving equations. Suppose a computer has S processors. The first l_1 components of system (1.8) are solved on the first processor, the next l_2 components of the system (1.8) on the second processor, and so on. Each of the S processors computes its own group of components of the unknown vector ψ simultaneously. At the end of an iteration, the values of the group of components are copied to the computer memory, where they become accessible to all processors. This procedure works for shared-memory multiprocessors. In this case, asynchronous calculations can be represented in the form (2.2), where ψ^{T_i} is the vector of the current approximation calculated at time T_i, and δ is a diagonal matrix whose components δ_jj are equal to one if component x_j is calculated by the i-th processor and zero otherwise. The coefficients q_i^T, i = 1, ..., S, are set to one when the time interval T − T_i is long enough to calculate the vector ψ^{T_i} and write it into the shared memory of the computer, and to zero otherwise.

The time interval T − T_i depends on the time T, and T_i denotes the beginning of the next time step for calculating the group of components. T can be treated as a random variable with a specific distribution.
Let us divide T into time intervals. Equation (2.2) gives a new method of organizing the asynchronous computational process for solving Equation (1.8). Analyzing and comparing the two asynchronous methods (2.2) and (1.11)-(1.13), we can conclude that they are equivalent. Indeed, in (2.2), to compute a component we use the values of the other components of ψ^T that were calculated up to time T, i.e. the most recent available values; this meets condition (1.12). In (2.2), the number of iterations of each component of ψ tends to infinity as T → ∞, which agrees with condition (1.13). In the asynchronous iterative process (1.11), the actual task is to manipulate the sequences J_n and S_i(n). But in real computations on multiprocessor and parallel systems, these sequences are unknown: they are determined implicitly by the process itself. This is essentially the difficulty of studying asynchronous computations. Under the assumption made at the beginning of this section, the sequences J_n and S_i(n) can be considered random with a specific distribution. In [10], a new result was obtained on convergence conditions for asynchronous computations in case (2.2).
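The shared-memory block scheme can be mimicked by a time-stepped simulation. In the sketch below (the linear test problem, the number of processors, and the completion probability p are illustrative assumptions), S processors each own a block of components; at every tick a processor completes its block update with probability p (playing the role of the flag q_i^T) and only then publishes its block to the shared vector, which all processors read:

```python
import numpy as np

rng = np.random.default_rng(7)
N, S, p, ticks = 12, 3, 0.6, 800
blocks = np.array_split(np.arange(N), S)   # component groups l_1, ..., l_S

B = rng.uniform(0.02, 0.06, (N, N))        # ||B||_inf <= 0.72 < 1: contraction
c = rng.uniform(-1, 1, N)
exact = np.linalg.solve(np.eye(N) - B, c)

shared = np.zeros(N)                       # shared-memory copy of psi
for _ in range(ticks):
    for s in range(S):
        if rng.random() < p:               # q^T_s = 1: this block finished
            idx = blocks[s]
            # compute the block from the current shared values and publish it
            shared[idx] = B[idx] @ shared + c[idx]
err = np.max(np.abs(shared - exact))
```

Each processor reads whatever is currently in shared memory, so the reads are exactly the implicitly determined delayed iterates of (1.11); the simulation converges for the same reason the abstract scheme does.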
We will now consider the asynchronous process (1.11) in more detail. We believe that a probabilistic model describes the practical situation much better. Suppose, for example, that each set J_n is chosen randomly from the subsets of {1, ..., N}. Further, we assume that the computation time of the next iterate on each processor is nondeterministic: it is an independent random variable with some simple distribution.
First, suppose F is a linear operator. Closing the system of linear algebraic Equations (1.5)-(1.6), we write it in the form ψ = Aψ + b, where A is an (N × N) matrix. Suppose there are m parallel processors. We assume that, in a random time interval, each processor is able to compute the n-th iterate with probability p, and fails to compute it with probability q = 1 − p. When a processor does not finish in time, on the next iteration step we use the known component according to (1.11), with S_i(j) = j − i. Let ||·|| denote some norm. Then the following theorem is valid.

Theorem 3. Suppose the processors behave in accordance with the Bernoulli distribution. If the vector ξ(n) is the random value of the n-th iteration, ψ is the solution of the problem, and ||A|| < 1, then the expected error of ξ(n) tends to zero as n → ∞.

Proof. It is easy to see that the probability of reaching the n-th full iteration is p^n. The probability P_n(k) of reaching the n-th step with k successful iterations is P_n(k) = C(n, k) p^k q^{n−k}. Computing the expected value of the random variable ξ(n) and using ||A|| < 1, we obtain the required bound, and this proves the theorem.

Now consider the case when the time is distributed according to a Poisson distribution with parameter λ = np, where n is the iteration number and p is the probability of a successful computation at the next iteration step. In this case the following theorem is valid.

Theorem 4. Suppose the processors behave in accordance with the Poisson distribution. If the vector ξ(n) is the random value of the iteration at the n-th step and ||A|| < 1, then, for fixed p, the expected error of ξ(n) tends to zero as n → ∞, and this proves the theorem.

Now let us look at the nonlinear case ψ = F(ψ). (2.5) Here F is a nonlinear p-Lipschitzian contraction operator on X with contraction matrix L. Suppose that we have N processors (N is the number of equations), each with probability p successfully computing the next iterate of its components of ψ(n). Let the sequence of random sets J_n have a Bernoulli distribution. If p ≠ 0, the sequence tends to have the maximal sediment. According to Theorem 2 of the previous section, the asynchronous iterative method then converges to the unique fixed point ξ of the operator F. In this case q_n = np, i.e. q_n is the average number of successes in n steps of the iteration. Let R be the rate of convergence. Then, from (1.14), R satisfies the inequality R ≥ −p log ρ(L), since it is known that θ can be chosen as close to ρ(L) as desired.
We recall that in the case of the simple iterative method, R ≥ −log ρ(L); for p = 1 the two bounds coincide. Thus, since the probability of success satisfies p ≤ 1, the asynchronous iterative method has a convergence-rate bound reduced by the factor p relative to the simple iterative method.
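The factor-p slowdown is easy to observe in a Monte Carlo experiment. In the sketch below (a linear contraction standing in for F, with each processor's per-step success drawn as an independent Bernoulli trial; all names and parameter values are illustrative), the run with p = 1 is the synchronous iteration, while the run with p = 0.5 completes only about half the component updates and so ends at a visibly larger error after the same number of steps:

```python
import numpy as np

def run(p, n_steps, seed):
    """Bernoulli-asynchronous iteration for x = Ax + b: each of the N
    'processors' succeeds in updating its component with probability p."""
    rng = np.random.default_rng(seed)
    N = 8
    rng_sys = np.random.default_rng(0)        # same system for every run
    A = rng_sys.uniform(0.04, 0.08, (N, N))   # ||A||_inf <= 0.64 < 1
    b = rng_sys.uniform(-1, 1, N)
    xi = np.linalg.solve(np.eye(N) - A, b)
    x = np.zeros(N)
    for _ in range(n_steps):
        done = rng.random(N) < p              # which processors succeeded
        x_new = x.copy()
        x_new[done] = A[done] @ x + b[done]   # failed ones keep stale values
        x = x_new
    return np.max(np.abs(x - xi))

err_full = run(p=1.0, n_steps=30, seed=1)     # synchronous: every update lands
err_half = run(p=0.5, n_steps=30, seed=1)     # about half the updates land
```

In wall-clock terms this slowdown is offset by the absence of synchronization waits, which is the trade-off the efficiency measure E = R/τ is meant to capture.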
