Advances in Theory of Neural Network and Its Application

In this article we introduce a large class of optimization problems that can be approximated by neural networks. Furthermore, for a large category of these optimization problems the action of the corresponding neural network reduces to linear or quadratic programming, so the global optimum can be obtained immediately.


Introduction
Many problems in industry involve the optimization of a complicated function of several variables, usually subject to a set of constraints. The complexity of the function and of the constraints makes it almost impossible to solve the problem with deterministic methods, so most often we have to approximate the solutions. The approximating methods are usually very diverse and specific to each case. Recent advances in the theory of neural networks provide a completely new approach, one that is more comprehensive and can be applied to a wide range of problems at the same time. In the preliminary section we introduce the neural network methods based on the works of J. Hopfield, Cohen and Grossberg; see (section-4) [1] and (section-14) [2]. We use a generalized version of these methods to find the optimum points for some given problems. The results in this article are based on our joint work with Greg Millbank of praxis group. Many of our products use neural networks of some sort. Our experience shows that by choosing appropriate initial data and weights we can approximate the stability points quickly and efficiently. In section-2 and section-3 we extend the Cohen and Grossberg theorem to a larger class of dynamic systems. For a good reference on linear programming, see [3], written by S. Gass. The appearance of a new generation of supercomputers will give neural networks a much more vital role in industry, machine intelligence and robotics.

On the Structure and Application of Neural Networks
Neural networks are based on associative memory: we give a content to the neural network and get an address or identification back. Most classic neural networks have input nodes and output nodes. In other words, every neural network is associated with two integers $m$ and $n$, where the inputs are vectors in $\mathbb{R}^n$ and the outputs are vectors in $\mathbb{R}^m$. Neural networks can also consist of deterministic processes such as linear programming, or of complicated combinations of other neural networks. There are two kinds of neural networks: those with learning abilities and those without.

The simplest neural networks with learning abilities are perceptrons. A perceptron takes input vectors in $\mathbb{R}^n$ and applies an activation $g$, a logistic function with parameter $1 > \beta > 0$. The network is trained on a sufficient number of corresponding patterns until the synaptic values stabilize. The perceptron is then able to identify unknown patterns in terms of the patterns that were used to train it. For more details about this subject see, for example, (Section-5) [1]. The neural network called back propagation is an extended version of the simple perceptron. It has a similar structure but adds one or more layers of neurons called hidden layers. It has a very powerful ability to recognize unknown patterns and has more learning capacity. The only problem with this neural network is that the synaptic values do not always converge. There are more advanced versions of the back propagation neural network, called recurrent neural networks and temporal neural networks. They have more diverse architectures and can handle time series, games, forecasting and the travelling salesman problem. For more information on this topic see (section-6) [1].

Neural networks without a learning mechanism are often used for optimization. The results of J. Hopfield, Cohen and Grossberg on a special category of dynamical systems, see (section-14) [2] and (section-4) [1], provide us with neural networks that can solve optimization problems. The input and output of these neural networks are vectors in $\mathbb{R}^m$ for some integer $m$. The input vector is chosen randomly. The action of the neural network $N$ on an input vector $X_1$ produces a sequence of vectors $X_1, X_2, \dots, X_n, \dots$, where $X_{k+1} = N(X_k)$, and the output (if it exists) is the limit of this sequence. These neural networks result from digitizing the corresponding differential equation, and it has been proven that the limit of the above sequence coincides with the limit point of the trajectory passing through $X_1$. Recent advances in the theory of neural networks provide us with a robust and comprehensive approach that can be applied to a wide range of problems.

At this point we can indicate some of the main differences between neural networks and conventional algorithms. A back propagation neural network produces the output for a given input almost immediately, whereas a conventional algorithm has to do the same job over and over again. On the other hand, in practice the algorithms driving neural networks are quite messy and are never bug free, which means that the system can crash when given new data. Hence conventional methods usually produce more precise outputs, because they repeat the same process on the new data. Another defect of neural networks is that they are based on the gradient descent method, which is at times slow and often converges to the wrong vector. Recently another method, the Kalman filter (see (section-15.9) [2]), which is more reliable and faster, has been suggested to replace the gradient descent method.
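The training of a simple perceptron described earlier in this section, where patterns are presented repeatedly until the synaptic values stabilize, can be sketched as follows. This is a minimal illustration only: the logistic form $g(x) = 1/(1+e^{-\beta x})$, the value $\beta = 0.5$, the update rule, the learning rate, the stopping tolerance and the AND patterns are all assumptions chosen for the example and are not taken from the source.

```python
import math


def logistic(x, beta=0.5):
    """Logistic activation; this exact form and beta=0.5 are assumptions."""
    return 1.0 / (1.0 + math.exp(-beta * x))


def predict(weights, bias, x):
    """Output of a single-unit perceptron for the input vector x."""
    return logistic(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(patterns, targets, rate=0.5, tol=1e-3, max_epochs=10000):
    """Present the patterns repeatedly until the weight updates become small.

    The update w_j += rate * (target - output) * x_j is one common choice
    (hypothetical here); training stops once every update in an epoch is
    below tol, or after max_epochs passes over the patterns.
    """
    n = len(patterns[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(max_epochs):
        largest_update = 0.0
        for x, t in zip(patterns, targets):
            err = t - predict(weights, bias, x)
            for j in range(n):
                weights[j] += rate * err * x[j]
            bias += rate * err
            largest_update = max(largest_update, abs(rate * err))
        if largest_update < tol:
            break
    return weights, bias


# Hypothetical training set: the logical AND of two binary inputs.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
weights, bias = train_perceptron(patterns, targets)
labels = [round(predict(weights, bias, x)) for x in patterns]
```

After training, rounding the perceptron output recovers the target label of each training pattern. The AND patterns are linearly separable, which is why a single unit suffices in this example.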

On the Nature of Dynamic Systems Induced from Energy Functions
In order to solve optimization problems using neural network machinery, we first construct a corresponding energy function $E$ such that the optimum of $E$ coincides with the optimum point of the optimization problem. Next, the energy function $E$, which is usually positive, induces the dynamic system $L_E$. The trajectories of $L_E$ converge hyperbolically to local optima of our optimization problem. Finally we construct the neural network $NN_E$, the digitized version of $L_E$, which converges to some local optimum depending on the initial point.

As we indicated in section-1, a certain category of dynamic systems, called Hopfield systems, and their generalization, called Cohen and Grossberg dynamic systems, induce neural networks that are able to solve some well-known NP problems. More recently, more advanced dynamic systems based on generalizations of the above have been used in [4] to solve or prove many interesting problems, including the four color theorem.

In the following sequence of lemmas and theorems we show that if the dynamic system $L$ satisfies a certain commuting condition, then it can be induced from an energy function $E_L$, which is not usually positive, and all its trajectories converge hyperbolically to corresponding attractor points. Furthermore, the attractor corresponding to the global optimum is located on a non trivial trajectory. Note that the energy function induced from an optimization scenario is always positive and the corresponding dynamic system $L_G$ is a commuting dynamic system.

Suppose we are given a dynamic system $L$ as in the following,

$$\frac{dx_i}{dt} = f_i(x_1, x_2, \dots, x_n), \qquad i = 1, 2, \dots, n.$$

Definition 2.1 We say that the above system $L$ satisfies the commuting condition if for each two indices $i, j \le n$,

$$\frac{\partial f_i}{\partial x_j} = \frac{\partial f_j}{\partial x_i}.$$

This is very similar to the properties of commuting squares in the V. Jones index theory [5].
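The three-step construction described at the start of this section (energy function, induced dynamic system, digitized network) can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the double-well energy $E(x, y) = (x^2 - 1)^2 + y^2$, the sign convention $dx/dt = -\nabla E$, the forward Euler discretization, the step size and the two starting points.

```python
def energy(x):
    """Hypothetical positive energy with two local optima, at (1, 0) and (-1, 0)."""
    return (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2


def grad_energy(x):
    """Gradient of the energy above, computed by hand."""
    return [4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]]


def run_network(x0, step=0.01, n_steps=3000):
    """Digitized gradient flow: forward Euler steps x <- x - step * grad E(x).

    Returns the final point and the energy recorded at every step.
    """
    x = list(x0)
    energies = [energy(x)]
    for _ in range(n_steps):
        g = grad_energy(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        energies.append(energy(x))
    return x, energies


# Two initial points that lie in different basins of attraction.
x_right, e_right = run_network([0.5, 1.0])
x_left, e_left = run_network([-2.0, -1.0])
```

In this example the first run settles near $(1, 0)$ and the second near $(-1, 0)$, and the recorded energy does not increase along either run, which matches the statement that the limit depends on the initial point.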
The advantage of a commuting system, as we will show later, is that each trajectory converges to a critical point $x_\infty$, and $x_\infty$ is asymptotically stable.
In particular, note that if the dynamic system is induced from an energy function $E$, then the induced neural network $N_E$ is robust and stable, in the sense that beginning from any point $x_0 \in \mathbb{R}^n$ the neural network converges asymptotically to a critical point. This property, together with some other techniques, makes it possible to find the optimum value of $E$. The following lemma leads to these conclusions.

Lemma 2.2 Suppose the dynamic system $L$ has the commuting property. Then there exists a function $E_L$ acting on $\mathbb{R}^n$ such that for every integer $i \le n$ we have $\partial E_L / \partial x_i = -f_i$. Furthermore, for every trajectory $x(t)$ we have $dE_L/dt \le 0$, and the equality $dE_L/dt = 0$ holds only at the attracting points.

Note that $E_L$ is not always a positive function. In the case that $E_L$ is a positive function we have the following lemma.
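The energy decrease in Lemma 2.2 follows from a short chain-rule computation. The sketch below assumes that the system is written as $dx_i/dt = f_i(x)$ and adopts the sign convention $\partial E_L/\partial x_i = -f_i$. The explicit line-integral formula for $E_L$ is a standard construction on $\mathbb{R}^n$ and is an assumption here, not a formula taken from the source.

```latex
\begin{align*}
E_L(x) &= -\int_0^1 \sum_{i=1}^{n} x_i \, f_i(s x)\, ds ,\\[4pt]
\frac{\partial E_L}{\partial x_k}
  &= -\int_0^1 \Big( f_k(s x) + s \sum_{i=1}^{n} x_i \,
       \frac{\partial f_i}{\partial x_k}(s x) \Big)\, ds \\
  &= -\int_0^1 \Big( f_k(s x) + s \sum_{i=1}^{n} x_i \,
       \frac{\partial f_k}{\partial x_i}(s x) \Big)\, ds
     && \text{(commuting condition)} \\
  &= -\int_0^1 \frac{d}{ds}\big( s\, f_k(s x) \big)\, ds = -f_k(x) ,\\[4pt]
\frac{dE_L}{dt}
  &= \sum_{i=1}^{n} \frac{\partial E_L}{\partial x_i}\, \frac{dx_i}{dt}
   = -\sum_{i=1}^{n} f_i(x)^2 \;\le\; 0 .
\end{align*}
```

Under these assumptions the derivative vanishes exactly where every $f_i(x) = 0$, that is, at the critical points of the system.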
Lemma 2.3 Following the notation above, suppose $L$ is a commuting system and that $x_\infty$ is a critical point for the system $L$ which is on a non trivial trajectory with [...].

Proof. Following the definition of a Lyapunov function and using Lemma 2.2, the fact that [...] implies that, with regard to the trajectory passing through $x_\infty$, it is asymptotically stable. Q.E.D.
In some cases we can choose $E_L$ to be a positive function, as the following lemma shows.

Lemma 2.4 Keeping the same notation as above, suppose that there exists a number [...]. Then there exists a positive energy function $F_L$ for the dynamic system $L$.

Proof. Let us define $F_L$ acting on $\mathbb{R}^n$ by [...].

Suppose $L$ is a commuting dynamic system and let $E_L \ge 0$ be an induced energy function. The main goal of the corresponding neural network $N_L$ is to reach a point at which $E_L$ attains its optimum value. In Lemma 2.5 we proved that any non trivial trajectory passing through $x_\infty$ converges to $x_\infty$ asymptotically. In the following we show that, in general, this property holds for any attracting point of a commuting dynamic system.

Lemma 2.6 Suppose $L$ is a commuting dynamic system and $x_\infty$ an attractive point. Then the trajectory passing through $x$ will converge asymptotically to $x_\infty$.

Proof. Let [...]. In order to complete the proof we have to show that with regard to the trajectory [...].

Suppose $L$ is a commuting dynamic system and $x_\infty$ is a point on some non trivial trajectory at which $E_L$ reaches its infimum. We want to find conditions that guarantee the existence of a non trivial trajectory passing through $x_\infty$.

Definition 2.6 Keeping the same notation as above, for a commuting dynamic system $L$, we call [...].

Before proceeding to the next theorem let us set the following notation.

Let [...] be the first point at which the trajectory [...].

Lemma 2.7 Following the above notation, suppose without loss of generality that $x_\infty = 0$ and that there is no trajectory passing through a point $x \ne x_\infty$ and converging to $x_\infty$. Then there exist $\rho > 0$ and a sequence [...] such that for each $i \in \mathbb{N}$, [...].

Now consider the trajectory [...]. Thus, applying the mean value theorem implies (1) [...].

To complete the proof of Theorem 2.8, we need the following lemma.

Lemma 2.9 Keeping the same notation as above, there exists $\omega > 0$ such that for $j \in \mathbb{N}$ large enough there exists a positive number $\alpha > 0$ with the property that for every [...].

Proof. Otherwise there exists a sequence of numbers [...]. This, using the fact that $E_L$ is canonical, leads to a contradiction. Q.E.D.

To complete the proof of Theorem 2.8, note that for every point $x_i$, the points located on the trajectory $X(x_i, t)$ can be expressed as a continuous function $\Phi(x, t)$ of $x$ and $t$. The function [...], which is a compact set. Hence there exists a number Length such that [...]. Let us define the following points, [...] be the points corresponding to the above partitions. Furthermore consider the following countable set of points, [...]. Furthermore for each triple [...].

Lemma 2.10 Keeping the same notation as above, suppose that for a given commuting system $L$ the induced energy function $E_L$ is a positive function. Then $E_L$ is canonical.

Proof. If $E_L$ is not canonical then there exists an increasing sequence of positive numbers [...]. Next we can assume that there exists a line $U(t)$ in $\mathbb{R}^n$ connecting the sequence of points $u_i$ to the limiting point $u_\infty$, such that [...]. Let us set [...]; then using L'Hôpital's rule we have [...]. Continuing this process, suppose by induction that [...]; then we get by L'Hôpital's rule [...]. Therefore the above arguments imply that [...], and this is a contradiction to the assumption. Q.E.D.
As a result of Lemma 2.10, if $E_L \ge 0$ then the system $L$ is always canonical; hence we have the following corollary.

Corollary 2.11 Keeping the same notation as above, the results of Theorem 2.8 hold as long as $E_L \ge 0$.

At this point we should mention that non trivial trajectories give us a much better chance of hitting the global optimum when we perform a random search to locate it.
For some dynamic systems $L$ expressed in the usual form $dx_i/dt = f_i(x)$, the commuting condition does not hold. For example, consider the Hopfield neural network and its corresponding dynamic system, [...] where [...]. It is clear that the above system does not possess the commuting property. Let us multiply both sides of the $i$th equation in the above by [...] to get the dynamic system $L_1$, given in the following as (2) [...]. Let us denote [...]; the system $L_1$ can be expressed as [...]. It is clear that if $U(t)$ is a trajectory of the system $L$, and provided [...], then $U(t)$ is a trajectory of the system $L_1$ too. Now the commuting property holds for the right side of Equation (2), hence using the same techniques as before we can construct the energy function $E(u)$ for the system (2) such that [...]. This implies that $dE/dt < 0$ except at the attractive points. By the results of Lemma 2.3, this implies that any trajectory $U(t, u_0)$ converges asymptotically to the corresponding attractive point $u_\infty$.
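The effect of multiplying each equation by a suitable factor can be checked numerically. The sketch below rests on several assumptions that are not confirmed by the source: the Hopfield system is taken in the standard continuous form $du_i/dt = -u_i + \sum_j w_{ij}\, g(u_j)$ with a symmetric weight matrix, $g$ is the logistic function with $\beta = 1$, and the multiplier for the $i$th equation is $g'(u_i)$. The weights and the test point are hypothetical.

```python
import math


def g(x, beta=1.0):
    """Logistic activation (assumed form)."""
    return 1.0 / (1.0 + math.exp(-beta * x))


def g_prime(x, beta=1.0):
    """Derivative of the logistic activation."""
    s = g(x, beta)
    return beta * s * (1.0 - s)


def hopfield_field(u, W):
    """Right side of the assumed Hopfield system: -u_i + sum_j W[i][j] * g(u_j)."""
    n = len(u)
    return [-u[i] + sum(W[i][j] * g(u[j]) for j in range(n)) for i in range(n)]


def scaled_field(u, W):
    """Right side after multiplying the i-th equation by g'(u_i)."""
    f = hopfield_field(u, W)
    return [g_prime(u[i]) * f[i] for i in range(len(u))]


def jacobian(field, u, W, h=1e-5):
    """Central finite-difference Jacobian J[i][j] = d field_i / d u_j."""
    n = len(u)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, dn = list(u), list(u)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = field(up, W), field(dn, W)
        for i in range(n):
            J[i][j] = (f_up[i] - f_dn[i]) / (2.0 * h)
    return J


def asymmetry(J):
    """Largest violation of the commuting condition J[i][j] == J[j][i]."""
    n = len(J)
    return max(abs(J[i][j] - J[j][i]) for i in range(n) for j in range(n))


# Hypothetical symmetric weights and test point.
W = [[0.0, 1.0, -2.0], [1.0, 0.0, 0.5], [-2.0, 0.5, 0.0]]
u = [0.3, -1.2, 2.0]
asym_original = asymmetry(jacobian(hopfield_field, u, W))
asym_scaled = asymmetry(jacobian(scaled_field, u, W))
```

Under these assumptions the original field has a clearly asymmetric Jacobian, because its off-diagonal entries are $w_{ij}\, g'(u_j)$, while the scaled field has off-diagonal entries $g'(u_i)\, w_{ij}\, g'(u_j)$, which are symmetric up to finite-difference error.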
As we mentioned before, a more general version of the Hopfield dynamic system, called the Cohen and Grossberg dynamic system, is given as in the following, (3) [...]. Furthermore [...]. As with the Hopfield system, system (3) is not a commuting system. But if we multiply both sides of the $i$th equation in system (3) by [...], then we get a dynamic system whose right side is commuting. Hence each of the trajectories converges asymptotically to the corresponding attracting point.

Reduction of Certain Optimization Problems to Linear or Quadratic Programming
In solving optimization problems using neural networks we first form an energy function $E(u)$ corresponding to the optimization problem. Next, this energy function induces the dynamic system $L$ whose trajectories converge to local optimum solutions of the optimization problem.
Given the energy function $E(u)$ acting on $\mathbb{R}^n$, the induced dynamic system $L$ is given in the following,

$$\frac{du_i}{dt} = -\frac{\partial E}{\partial u_i}, \qquad i = 1, 2, \dots, n.$$

As we showed, $L$ is a commuting system. As an example consider the travelling salesman problem. As expressed in section 4.2, page 77 of [1], the energy function $E$ is given by [...], where we represent the points to be visited by the travelling salesman as [...]. Furthermore let us set the following energy function to be optimized, [...]. At this point, using the above arguments, it is enough to find $Q$ and $P$ that satisfy the above equalities and optimize the following expression, [...]. Therefore the above equations together with the optimization expression form a linear programming system that reaches the optimal solution almost immediately.
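To make the travelling salesman energy concrete, the sketch below evaluates an energy of the standard Hopfield and Tank type. The exact formula is an assumption, since the expression in the text above is not recoverable: a tour-length term $\tfrac12 \sum_{i \ne j,\, a} d_{ij}\, n_{ia} (n_{j,a+1} + n_{j,a-1})$ plus a penalty $\tfrac{\gamma}{2}$ times the squared violations of "one city per stop" and "one stop per city". The names `tsp_energy` and `tour_matrix`, the value of `gamma` and the four cities are hypothetical.

```python
import math


def tour_matrix(order):
    """Build n[i][a] = 1 if city i is visited at stop a, else 0."""
    size = len(order)
    n = [[0] * size for _ in range(size)]
    for stop, city in enumerate(order):
        n[city][stop] = 1
    return n


def tsp_energy(n, points, gamma=10.0):
    """Assumed Hopfield-Tank style energy: tour length plus constraint penalties."""
    size = len(points)
    length_term = 0.0
    for i in range(size):
        for j in range(size):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            for a in range(size):
                # Stops are cyclic, so stop indices wrap around modulo size.
                neighbours = n[j][(a + 1) % size] + n[j][(a - 1) % size]
                length_term += d * n[i][a] * neighbours
    # Each stop should hold exactly one city, and each city exactly one stop.
    stop_penalty = sum((1 - sum(n[i][a] for i in range(size))) ** 2 for a in range(size))
    city_penalty = sum((1 - sum(n[i][a] for a in range(size))) ** 2 for i in range(size))
    return 0.5 * length_term + 0.5 * gamma * (stop_penalty + city_penalty)


# Hypothetical instance: the four corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter_energy = tsp_energy(tour_matrix([0, 1, 2, 3]), square)
crossing_energy = tsp_energy(tour_matrix([0, 2, 1, 3]), square)
empty_energy = tsp_energy([[0] * 4 for _ in range(4)], square)
```

For a valid tour both penalties vanish and the energy equals the tour length, so the perimeter tour scores lower than the tour that crosses both diagonals. An assignment that violates the constraints, such as the all-zero matrix, is pushed up by the penalty term.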

Conclusion
In this article we introduced methods for approximating the solutions of optimization problems using neural network machinery. In particular, we proved that for a certain large category of optimization problems the application of neural network methods guarantees that the problems reduce to linear or quadratic programming. This is an important conclusion, because the solutions of the optimization problems in these categories can then be reached immediately.
[...] only on the corresponding critical point $x_\infty$ on which [...] the set $\bar{S}$, the closure of $S$ in $\mathbb{R}^n$, is a non trivial trajectory passing through $x_\infty$. But this is a contradiction to our assumption. Q.E.D.

[...] is an analytic function we get that $g(t) = 0$, which is a contradiction. Hence there exists an integer $m \in \mathbb{N}$ such that [...]


[...] meet the conditions of Case-1 and Case-2 over the indices. Now the above system is equivalent to finding the optimum solution for [...]. So as before, assuming [...]. On the other hand it is easy to show that [...], where the indices satisfy the conditions of Case-1 and Case-2.

[...] then as $t$ tends to infinity we have [...]

[...] is a point on a non trivial trajectory on which $E_L$ achieves an optimum $\alpha$, then the trajectory passing through $x_\infty$ will converge asymptotically to $x_\infty$.

[...] and over the trajectories $dE_L/dt \le 0$; it is equal to zero only over the attracting points. This will complete the proof of the lemma. Q.E.D.

[...] $E_L \ge 0$ be the induced energy function. Then if $x_\infty \in \mathbb{R}^n$ [...]

[...] $E$ is an energy function for $L$ and since [...] will intersect $C_\rho$ first [...] will lie in $O_\rho$. This implies the existence of the sequence [...] be the trajectory through the point $x$. Now let [...] $\rho$, and $E_L$ is analytic, this implies that $E_L(x) = 0$, which is a contradiction. Q.E.D.

[...] $E_L(x_\infty) =$ [...]. Then there exists a non trivial trajectory $X(t)$ converging to $x_\infty$.

[...] intersect $C_\rho$. We have, [...]