Hausdorff Dimension of Multi-Layer Neural Networks

This paper investigates the Hausdorff dimension of the output space of multi-layer neural networks. When the factor map from the covering space of the output space to the output space has a synchronizing word, the Hausdorff dimension of the output space relates to its topological entropy. This clarifies the geometrical structure of the output space in more detail.


Introduction
Multi-layer neural networks (MNNs, [1,2]) have received considerable attention and have been successfully applied to many areas such as signal processing, pattern recognition ([3,4]), and combinatorial optimization ([5,6]) over the past few decades. The investigation of mosaic solutions is essential in MNN models because of the learning algorithm and training processing. In [7][8][9], the authors proved that the output solution space of a 2-layer MNN forms a so-called sofic shift space, which is a factor of a classical subshift of finite type. Thus, the MNN model indeed produces abundant output patterns and makes the learning algorithm more efficient. A useful quantity for classifying the output solution space is the topological entropy $h$ ([10]). We call the output solution space pattern formation if $h = 0$, and call it spatial chaos if $h > 0$. The case $h = 0$ indicates that the number of output patterns grows subexponentially, whereas it grows exponentially for $h > 0$. For positive entropy systems, the explicit value of $h$ indicates how chaotic the system is. In [7], Ban and Chang provided a method to compute explicit values of $h$ for a 2-layer MNN. The method is quite general and makes the computation of $h$ possible for arbitrary $N \geq 2$, i.e., for $N$-layer MNNs.
From the dynamical system (DS) point of view, the topological entropy reveals the complexity of the global patterns. However, it provides less information about the inner structure of a given DS, e.g., self-similarity or recurrence properties. A quantity that can reveal such properties is the Hausdorff dimension (HD, [11]), since the Hausdorff dimension is an indicator of the geometrical structure. For most DS, the computation of the Hausdorff dimension is not an easy task, and the box dimension (BD) is usually computed first to give an upper bound for the HD. Due to the relationship between the topological entropy and the BD ([12]) of a symbolic DS¹, the previous work ([7]) on topological entropy gives an upper bound for the HD of an $N$-layer MNN. A natural question arises: given an MNN, how can the explicit value of its HD be computed?
The aim of this paper is to establish the HD formula for $N$-layer MNNs. Using the tools of symbolic DS, the HD formula is established for $N$-layer MNNs that possess a synchronizing word (Theorem 2.4). This result allows us to explore the inner structure of an $N$-layer MNN. We believe that further interesting applications of the results presented here (or of their generalizations) can be obtained.
This paper is organized as follows. Section 2 contains a brief discussion of the computation of topological entropy in [7]. The main result is stated and proved therein. Section 3 presents an MNN model for which we can compute its HD.

Preliminaries and Main Results
A one-dimensional multi-layer neural network (MNN) is realized as for some , and . The finite subset indicates the neighborhood, and the piecewise linear map and threshold where , and for some .
of a mosaic solution is called a mosaic pattern, where . The solution space $Y$ of (1) stores the patterns , and the output space $Y^{(N)}$ of (1) is the collection of the output patterns; more precisely, . In [7], the authors showed that 2-layer MNNs with nearest neighborhood are essential for the investigation of MNNs. In the rest of this manuscript, MNN refers to a 2-layer MNN with nearest neighborhood unless otherwise stated.

Topological Entropy and Hausdorff Dimension
Since the neighborhood is finite and is invariant for each , the output space is determined by the so-called basic set of admissible local patterns. Replacing the patterns $-1$ and $1$ by $-$ and $+$, respectively, the basic sets of admissible local patterns of the first and second layer are subsets of , respectively, where denotes . To ease the notation, we denote . Given a template , the basic set of admissible local patterns is determined, where and are the basic sets of admissible local patterns of the first and second layer, respectively. Let denote the parameter space of (1). Theorem 2.1 asserts that the parameter space can be partitioned into finitely many subregions so that two templates in the same subregion exhibit the same basic set of admissible local patterns.
Theorem 2.1. (See [7]) There is a positive integer and a unique set of open subregions 2) for some if and only if k
Since the template of MNNs is spatially invariant, the so-called transition matrix is used to investigate the complexity of MNNs. The transition matrix is defined by 1, , and , ; , 0, otherwise; herein is presented as . Furthermore, the transition matrix of the second layer is defined by , and the transition matrix of the first layer is defined by .
Theorem 2.2. (See [7]) Suppose $T$ is the transition matrix of (1), and $T_1$ and $T_2$ are the transition matrices of the first and second layer, respectively. Let be defined as in (7). Then , where is a matrix with all entries equal to 1, and $\circ$ and $\otimes$ are the Hadamard and Kronecker products, respectively.
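The two matrix operations named in Theorem 2.2 can be illustrated with a short sketch. The decomposition formula itself is not reproduced here; the matrix `A` below is a hypothetical 0-1 matrix chosen only for illustration, and the helper names `hadamard`, `kronecker`, and `ones` are assumptions, not notation from [7].

```python
def hadamard(a, b):
    """Entrywise (Hadamard) product of two matrices of equal shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("Hadamard product needs matrices of the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kronecker(a, b):
    """Kronecker product: entry a[i][j] is replaced by the block a[i][j] * b."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def ones(n):
    """n x n matrix with all entries equal to 1."""
    return [[1] * n for _ in range(n)]


# Hypothetical 2 x 2 matrix, NOT one of the paper's transition matrices.
A = [[1, 1], [1, 0]]
E = ones(2)

# Combining both products: (A kron E) hadamard (E kron A).
M = hadamard(kronecker(A, E), kronecker(E, A))
```

For this toy input, the combined matrix `M` coincides with `kronecker(A, A)`, which is a convenient sanity check on both helpers.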
As demonstrated in [7][8][9][13], the solution space is a so-called shift of finite type (SFT, also known as a topological Markov shift) and the output space is a sofic shift. More specifically, an SFT can be represented as a directed graph, and a sofic shift can be represented as a labeled graph for some labeling and finite alphabet. A labeled graph is called right-resolving if, for every vertex $I$, the restriction of the labeling to the set of edges starting from $I$ is one-to-one. If a labeled graph is not right-resolving, there exists a right-resolving labeled graph, derived by applying the subset construction method (SCM) to it, such that the sofic shift it represents is identical to the original space. See [14] for details.
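The right-resolving condition and the subset construction can be sketched as follows. This is a minimal illustration under stated assumptions: a labeled graph is encoded as a list of `(source, label, target)` triples, only subsets reachable from singletons are generated to keep the sketch short, and the example graph is hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def is_right_resolving(edges):
    """True if no vertex has two outgoing edges carrying the same label."""
    seen = set()
    for src, label, _dst in edges:
        if (src, label) in seen:
            return False
        seen.add((src, label))
    return True


def subset_construction(edges):
    """Return the edges of a right-resolving labeled graph built from subsets.

    Vertices of the new graph are nonempty sets of original vertices. The
    edge labeled `a` leaving a set S ends at the set of all targets of
    `a`-labeled edges that start in S.
    """
    step = defaultdict(set)
    vertices = set()
    for src, label, dst in edges:
        step[(src, label)].add(dst)
        vertices.update((src, dst))
    labels = sorted({label for _, label, _ in edges})

    start = [frozenset([v]) for v in sorted(vertices)]
    todo, seen, new_edges = list(start), set(start), []
    while todo:
        subset = todo.pop()
        for label in labels:
            target = frozenset(
                d for v in subset for d in step.get((v, label), ())
            )
            if not target:
                continue  # no edge with this label leaves the subset
            new_edges.append((subset, label, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return new_edges


# Hypothetical labeled graph: vertex 0 has two outgoing edges labeled "a".
example = [(0, "a", 0), (0, "a", 1), (1, "b", 0)]
resolved = subset_construction(example)
```

Each subset has at most one outgoing edge per label by construction, so the resulting graph passes `is_right_resolving` even when the input does not.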

One of the most frequently used quantities for measuring spatial complexity is the topological entropy. Let $X$ be a symbolic space and let $\Gamma_n(X)$ denote the collection of the patterns of length $n$ in $X$. The topological entropy of $X$ is defined by $h(X) = \lim_{n \to \infty} \frac{\log |\Gamma_n(X)|}{n}$, provided the limit exists.

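Theorem 2.3. The topological entropies of the solution space and the output space are $\log \rho(T)$ and $\log \rho(H)$, respectively, where $\rho$ denotes the spectral radius, $T$ is the transition matrix of (1), and $H$ is the transition matrix of the labeled graph obtained by applying SCM to the labeled graph representation of the output space.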
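The link between pattern counting and transition matrices can be checked numerically. The sketch below relies on the standard fact that the topological entropy of an SFT equals the logarithm of the spectral radius of its transition matrix; it compares that value with the finite-length estimate (log of the number of patterns of length n, divided by n). The golden-mean matrix is an illustrative assumption, not a matrix from the paper, and the power iteration assumes a primitive nonnegative matrix.

```python
import math


def count_words(matrix, n):
    """Number of admissible words of length n (paths through n vertices)."""
    size = len(matrix)
    counts = [1] * size  # words of length 1 ending at each symbol
    for _ in range(n - 1):
        counts = [
            sum(counts[i] * matrix[i][j] for i in range(size))
            for j in range(size)
        ]
    return sum(counts)


def spectral_radius(matrix, iterations=200):
    """Power iteration with max-norm scaling (assumes a primitive matrix)."""
    size = len(matrix)
    vec = [1.0] * size
    radius = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
        radius = max(nxt)
        vec = [x / radius for x in nxt]
    return radius


# Illustrative SFT (golden-mean shift): the symbol 1 may not follow itself.
golden_mean = [[1, 1], [1, 0]]

estimate = math.log(count_words(golden_mean, 40)) / 40
exact = math.log(spectral_radius(golden_mean))
```

The finite-length estimate approaches the spectral value slowly, which is one reason the matrix formula is preferred for explicit computations.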

Aside from the topological entropy, the Hausdorff dimension characterizes the geometrical structure of a space. The concept of the Hausdorff dimension generalizes the notion of the dimension of a real vector space and helps to distinguish between sets of measure zero. Let be a finite set with cardinality , which we consider to be an alphabet of symbols. Without loss of generality, we usually take . The full -shift is the collection of all bi-infinite sequences with entries from . More precisely, . It is well known that is a compact metric space endowed with the metric , , . It follows that two spaces associated with $X$ can each be embedded in the closed interval $[0,1]$. Moreover, and can be mapped onto the closed interval . This makes the elucidation of the Hausdorff dimension of the output space comprehensible. (Recall that the alphabet of ).
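The passage of a symbolic sequence to a point of the unit interval can be sketched as follows. The exact map and metric used by the authors are not recoverable from the text, so the base-n expansion below is an assumption; the function names are hypothetical. The second helper computes the quotient of an entropy value by log n, which is the box-dimension relation attributed to Pesin in the footnote, again assuming a base-n metric.

```python
import math
from fractions import Fraction


def embed(prefix, n):
    """Send a finite prefix x_0 x_1 ... over {0, ..., n-1} to sum x_i / n**(i+1)."""
    if any(not 0 <= x < n for x in prefix):
        raise ValueError("symbols must lie in {0, ..., n-1}")
    return sum(Fraction(x, n ** (i + 1)) for i, x in enumerate(prefix))


def box_dimension_from_entropy(entropy, n):
    """Quotient entropy / log n, assuming the base-n metric on the shift."""
    return entropy / math.log(n)


# Example: the binary prefix 1 0 1 lands at 1/2 + 0/4 + 1/8.
point = embed([1, 0, 1], 2)

# The full 2-shift has entropy log 2, so the quotient is 1.
full_shift_dim = box_dimension_from_entropy(math.log(2), 2)
```

Exact rational arithmetic via `Fraction` avoids rounding when long prefixes are compared, which matters because nearby sequences map to nearby points.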

Main Result
Suppose $X$ and $Y$ are shift spaces and there is a factor map from $X$ to $Y$. We say that the factor map has a synchronizing word if there is a finite word in $Y$ whose preimages all admit the same terminal entry. More precisely, for any . Suppose is a labeled graph representation of the output space of (1). Denote by $W$ the SFT represented by the graph if is right-resolving; otherwise, denote by $W$ the SFT represented by the graph , where is obtained by applying SCM to . It follows that $W$ is a covering space of the output space and there is a factor map from $W$ onto the output space, which is represented by the labeling (or ). Theorem 2.4 asserts that the Hausdorff dimension of the output space relates to the topological entropy of its covering space.
Proof. Suppose $X$ is an SFT and $\mu$ is an invariant probability measure on $X$. The Variational Principle indicates that the topological entropy of $X$ is the supremum of the measure-theoretic entropy of $X$; more precisely, $h(X) = \sup_{\mu} h_{\mu}(X)$. Let $\mu$ be a Markov measure which is derived from the transition matrix of $X$. Then $\mu$ is the unique measure that satisfies $h_{\mu}(X) = h(X)$ if $X$ is topologically transitive (cf. [15]). Ban and Chang showed that, if the factor map has a synchronizing word, then the Hausdorff dimension of the output space is , where is a maximal measure of $W$ (see [16], Theorem 2.6). Since the labeled graph is right-resolving, the factor map is finite-to-one. It follows that . Theorem 2.3 demonstrates that the topological entropy of the output space . Hence we have and .
The transition matrices for the first and second layer are 1 2 0 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 and 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 respectively. Therefore, the transition matrix and the symbolic transition matrix of the MNN are 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 and a a a a a a a

*Ban is partially supported by the National Science Council (Contract No NSC 100-2115-M-259-009-MY2).
#Chang is grateful for the partial support of the National Science Council (Contract No NSC 101-2115-M-035-002-).
¹Pesin showed that the box dimension of a shift space $X$ is the quotient of the topological entropy and the metric on $X$. More precisely, if the metric is defined by

Theorem 2.2 ([7]) decomposes $T$ as the product of $T_1$ and $T_2$. It is seen from the symbolic transition matrix $S$ that the labeled graph is not right-resolving, and applying SCM to derives a right-resolving labeled graph (cf.

Figure 2. The fractal set of the output space $Y^{(2)}$.
has a synchronizing word.