Dale's Principle is necessary for an optimal neuronal network's dynamics

We study a mathematical model of biological neuronal networks composed of any finite number $N \geq 2$ of not necessarily identical cells. The model is a deterministic dynamical system governed by finite-dimensional impulsive differential equations. The static structure of the network is described by a directed and weighted graph whose nodes are certain subsets of neurons, and whose edges are the groups of synaptical connections among those subsets. First, we prove that among all the possible networks whose respective graphs are mutually isomorphic, there exists a dynamical optimum. This optimal network exhibits the richest dynamics: namely, it is capable of showing the most diverse set of responses (i.e. orbits in the future) under external stimuli or signals. Second, we prove that all the neurons of a dynamically optimal neuronal network necessarily satisfy Dale's Principle, i.e. each neuron must be either excitatory or inhibitory, but not mixed. Thus, Dale's Principle is a necessary mathematical consequence of a theoretic optimization process of the dynamics of the network. Finally, we prove that Dale's Principle is not sufficient for the dynamical optimization of the network.


Introduction
Based on experimental evidence, Dale's Principle in Neuroscience (see for instance [1,2]) postulates that most neurons of a biological neuronal network send the same set of biochemical substances (called neurotransmitters) to the other neurons connected with them. Most neurons release more than one neurotransmitter (the "cotransmission" phenomenon [3,4]), but the set of released neurotransmitters is constant for each cell.
Nevertheless, during plastic phases of nervous systems, the neurotransmitters released by certain groups of neurons change according to the development of the neuronal network. This plasticity allows the network to perform diverse and adequate dynamical responses to external stimuli: "Evidence suggests that during both development (in utero) and the postnatal period, the neurotransmitter phenotype of neurons is plastic and can be adapted as a function of activity or various environmental signals" [4]. A certain phenotypic plasticity also occurs in some cells of the nervous system of mature animals, "suggesting that a dormant phenotype can be put in play by external inputs" [4].
Some mathematical models of neuronal networks represent them as deterministic dynamical systems (see for instance [5,6,7,8]). In particular, the dynamical evolution of the state of each neuron during the interspike intervals, and the dynamics of the bursting phenomenon, can be modelled by a finite-dimensional ordinary differential equation (see for instance [8,9], and in particular [10] for a mathematical model of a neuron as a dynamical system evolving on a multi-dimensional space). When considering a network of many neurons, the synaptical connections are frequently modelled by impulsive coupling terms between the equations of the many cells (see for instance [7,11,12,13]). In such a mathematical model, Dale's Principle translates into the following statement:

Dale's Principle: Each neuron is either inhibitory or excitatory.

We recall that a neuron i is called inhibitory (resp. excitatory) if its spikes produce, through the electrobiochemical actions that are transmitted along the axons of i, only null or negative (resp. positive) changes in the membrane potentials of all the other neurons j ≠ i of the network. The amplitudes of those changes may depend on many variables. For instance, they may depend on the instantaneous membrane permeability of the receiving cell j. But the sign of the postsynaptical actions is usually attributed only to the electro-chemical properties of the substances that are released by the sending cell i. In other words, the sign depends only on the set of neurotransmitters that are released by i. Since this set of substances is fixed for each neuron i (if i satisfies Dale's Principle), the sign of its synaptical actions on the other neurons j ≠ i is fixed for each cell i, and thus independent of the receiving neuron j.
In this paper we adopt a simplified mathematical model of the neuronal network with a finite number N ≥ 2 of neurons, by means of a system of deterministic impulsive differential equations. This model is taken from [11,13], with an adaptation that allows the state variable x_i of each cell i to be multidimensional. Precisely, x_i is a vector of finite dimension, or equivalently, a point in a finite-dimensional manifold of a Euclidean space. The dimension of the state variable x_i is at least 1 and, besides, may depend on the neuron i. The dynamical model of the network is the solution of a system of impulsive differential equations. This dynamics evolves on a product manifold whose dimension is the sum of the dimensions of the state variables of its N neurons.
We do not assume a priori that the neurons of the network satisfy Dale's Principle. In Theorem 16 we prove this principle a posteriori, as a necessary final consequence of a dynamical optimization process. We assume that during this process a plastic phase of the neuronal network occurs, possibly changing the total numbers of neurons and synaptical connections, but such that the graph-scheme of the synaptic connections among groups of mutually identical cells remains unchanged. We assume that a maximal amount of dynamical richness is pursued during such a plastic development of the network. Then, by means of a rigorous deduction from the abstract mathematical model, we prove that, among all the mathematically theoretic networks N of such a model, those exhibiting an optimal dynamics (i.e. the richest or the most versatile dynamics) necessarily satisfy Dale's Principle (Theorem 16).
The mathematical criterion to decide the dynamical optimization is the following. First, in Definition 9, we classify all the theoretic neuronal networks (also those that hypothetically do not satisfy Dale's Principle) into uncountably many equivalence classes. Each class is a family of mutually equivalent networks with respect to their internal synaptical connections among groups of cells (we call those groups of cells synaptical units in Definition 6). Second, in Definitions 11 and 14, we agree to say that a network N has an optimal dynamics conditioned to its class if the dynamical system modelling any other network N′ in the same class as N has a space of orbits in the future that is a subset of the space of orbits of N. In other words, N is the network capable of performing the richest dynamics, namely the most diverse set of possible evolutions in the future among all the networks in the same class.

RESULTS TO BE PROVED:
In Theorem 15 we prove that the theoretic dynamical optimum exists in any equivalence class of networks that have isomorphic synaptical graphs. In Main Theorem 16 we prove that such an optimum is achieved only if the network N satisfies Dale's Principle.
In Main Theorem 17 we prove that the converse of Theorem 16 is false: Dale's Principle is not sufficient for a network to exhibit the optimal dynamics within its synaptical equivalence class.
The results are abstract and theoretically deduced from the mathematical model. They are epistemologically suggestive, since they give a possible answer to the following question:

Epistemological question: Why does Dale's Principle hold for most cells in the nervous systems of animals?
Mathematically, the hypothesis of searching for an optimal dynamics implies (through Theorem 16) that at some step of the optimization process all the cells must satisfy Dale's Principle. In other words, this principle would be a consequence, instead of a cause, of an optimization process during the plastic phase of the network. This conclusion holds under the hypothesis that the dynamical optimization (i.e. the maximum dynamical richness) is one of the "natural" pursued aims during a certain changeable development of the network.
Finally, we notice that the converse of Theorem 16 is false: there exist mathematical examples of simple abstract networks whose cells satisfy Dale's Principle but are not dynamically optimal (Theorem 17). Thus, Dale's Principle is necessary but not sufficient for the dynamical optimization of the network.
Structure of the paper and purpose of each section: In Section 2 we state the hypotheses of Main Theorems 16 and 17. From Section 3 to Section 6 we prove Main Theorem 16. The proof is developed in four steps, one in each separate section. The first step (Section 3) is devoted to proving the intermediate result of Proposition 7. The second step (Section 4) is deduced from Proposition 7. The third step (Section 5) is logically independent of the first and second steps, and is devoted to obtaining the two intermediate results of Proposition 13 and Theorem 15. Section 6 exposes the fourth step (the end) of the proof of Main Theorem 16, from the logical junction of the previous three steps. Finally, in Section 8 we state the conclusions obtained from all the mathematical results that are proved along the paper.


The hypothesis (The model by a system of impulsive differential equations)

We assume a simplified (but very general) mathematical model of the neuronal network, which is defined along this section. The model, up to an abstract reformulation, and a generalization that allows any finite dimension for the impulsive differential equation governing each neuron, is taken from [11] and [13]. In the following subsections we describe the mathematical assumptions of this model:

Model of an isolated neuron
Each neuron i, while it does not receive synaptical actions from the other cells of the network, and while its membrane potential is lower than a (maximum) threshold level θ_i > 0 and larger than a lower bound L_i < 0, is assumed to be governed by a finite-dimensional differential equation of the form

dx_i/dt = f_i(x_i),     (1)

where t is time, x_i is a finite-dimensional vector (x_{i,1}, ..., x_{i,k}) whose components are real variables describing the instantaneous state of the cell i, and f_i : R^k → R^k is a Lipschitz continuous function giving the velocity vector dx_i/dt of the changes in the state of the cell i as a function of its instantaneous vectorial value x_i(t). The function f_i is the so-called vector field in the phase space of the cell i. This space is assumed to be a finite-dimensional compact manifold. The advantage of allowing dim(x_i) ≥ 1 (not necessarily 1) is, among others, the possibility of showing dynamical bifurcations between different rhythms and oscillations that appear in some biological neurons [10], which would not appear if the mathematical model of every neuron were necessarily one-dimensional.
One of the components of the vectorial state variable x i (which with no loss of generality we take as the first component x i,1 ) is the instantaneous membrane potential x i,1 (t) = V i (t) of the cell i.
In the sequel, we denote by x_{i,1}(t^-) and x_{i,1}(t^+) the left and right limits of x_{i,1} at the instant t. In addition to the differential equation (1), the following spiking condition is assumed [9]: if there exists an instant t_1 such that the potential V_i(t_1^-) = x_{i,1}(t_1^-) equals the threshold level θ_i, then x_{i,1}(t_1^+) = 0. In brief, the following logical assertion holds by hypothesis:

x_{i,1}(t_1^-) = θ_i  ⟹  x_{i,1}(t_1^+) = 0.     (2)

Here, 0 is the reset value. It is normalized to be zero after a change of variables, if necessary, that refers the membrane potential of the cell i to the reset value. A more realistic model would consider a relatively short positive time-delay Δt_1 between the instant t_1 when the membrane potential arrives at the threshold level θ_i and the instant t_1 + Δt_1 at which the potential takes its reset value 0. During this short time-delay, the membrane potential shows an abrupt pulse of large amplitude, which is called a spike of the neuron i. The impulsive simplified model approximates the spike by an abrupt discontinuity jump, taking the time-delay Δt_1 equal to zero. Then, the spike becomes an instantaneous jump of the membrane potential x_{i,1}(t) from the threshold level θ_i to the reset value 0, occurring at t = t_1 according to condition (2).
We denote by δ_{θ_i}(x_{i,1}) the Dirac delta supported on θ_i. Namely, the term −θ_i δ_{θ_i}(x_{i,1}) (via the abstract integration theory with respect to the Dirac delta probability measure) denotes a discontinuity step −θ_i that occurs in the potential x_{i,1}(t) at each instant t = t_1 such that x_{i,1}(t_1^-) = θ_i. After the above notation is adopted, the dynamics of each cell i (while isolated from the other cells of the network) is modelled by the following "impulsive differential equation":

dx_i/dt = f_i(x_i) − θ̄_i δ_{θ_i}(x_{i,1}).     (2.1)

In the above equality θ̄_i = (θ_i, 0, ..., 0) is the jump vector, with dimension equal to the dimension of the state variable x_i. Namely, at each spiking instant only the first component x_{i,1} (the membrane potential) is abruptly reset, since the jump vector has all the other components equal to zero. Strictly speaking, equation (2.1) is not a differential equation, but a hybrid between the differential equation dx_i/dt = f_i(x_i) and a rule, denoted by −θ̄_i δ_{θ_i}(x_{i,1}). This impulsive rule imposes a discontinuity jump of amplitude vector −θ̄_i in the dependence of the state variable x_i(t) on t. Therefore, x_i(t) is not continuous, and thus not everywhere differentiable: it is in fact discontinuous at each instant t = t_1 such that x_{i,1}(t_1^-) = θ_i. Nevertheless, the theory of impulsive differential equations follows rules similar to those of ordinary differential equations. It was initiated early by Milman and Myshkis [14], cited in [15]. In particular, the existence and uniqueness of the solution for each initial condition, and theorems of stability, still hold for the impulsive differential equation (2.1), as if it were an ordinary differential equation [14,15].
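As a concrete illustration of how equation (2.1) can be integrated numerically, the sketch below applies forward Euler steps to a hypothetical two-dimensional cell. The vector field f and all parameter values are our own assumptions, chosen only to exercise the threshold-and-reset rule (2); they are not taken from [11,13].

```python
import numpy as np

def simulate_isolated(f, x0, theta, dt=1e-3, t_max=5.0):
    """Euler-integrate dx/dt = f(x); reset x[0] (the membrane
    potential) to 0 whenever it reaches the threshold theta.
    Returns the trajectory and the list of spiking instants."""
    x = np.array(x0, dtype=float)
    traj, spikes = [x.copy()], []
    for k in range(int(t_max / dt)):
        x = x + dt * f(x)
        if x[0] >= theta:          # spiking condition (2); ">=" because
            spikes.append((k + 1) * dt)  # a discrete step can overshoot
            x[0] = 0.0             # instantaneous reset: jump -theta
        traj.append(x.copy())
    return np.array(traj), spikes

# Hypothetical 2-d cell: the potential x[0] relaxes toward a constant
# drive stored in the second component x[1].
f = lambda x: np.array([x[1] - x[0], 0.0])
traj, spikes = simulate_isolated(f, x0=[0.0, 2.0], theta=1.0)
```

Since the drive (2.0) exceeds the threshold (1.0), the potential crosses θ repeatedly and the cell spikes periodically, while the recorded potential always stays below θ because the reset is applied within the same step.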

Model of the synaptical interactions among the neurons
The synaptical interactions are modelled by the following rule: if the membrane potential x_{i,1} of some neuron i reaches (or exceeds) its threshold level θ_i at instant t_1, then the cell i sends an action Δ_{i,j} to each other neuron j ≠ i. In particular, Δ_{i,j} may be zero if no synaptical connection exists from the cell i to the cell j. This action produces a discontinuity jump in the membrane potential x_{j,1}. We denote by Δ_{i,j} the signed amplitude of the discontinuity jump in the membrane potential x_{j,1}(t) of the neuron j, which is produced by the synaptical action from the neuron i when i spikes. The real value Δ_{i,j} may depend on the instantaneous state x_j of the receiving neuron j just before the synaptical action from neuron i arrives. For simplicity we do not explicitly write this dependence. Thus, the symbol Δ_{i,j} denotes a real function of x_j, which we assume to be either identically null or of constant sign.
We denote by Δ̄_{i,j} = (Δ_{i,j}, 0, ..., 0) the discontinuity jump vector, with dimension equal to the dimension of the state variable x_j of the cell j. In other words, the discontinuity jump in the instantaneous vector state x_j of the cell j, produced when the cell i spikes, is null on all the components of x_j except the first one x_{j,1}, i.e. except on the membrane potential of the neuron j. Thus, the dynamics of the whole neuronal network is modelled by the following system of impulsive differential equations:

dx_j/dt = f_j(x_j) − θ̄_j δ_{θ_j}(x_{j,1}) + Σ_{i ≠ j} Δ̄_{i,j} δ_{θ_i}(x_{i,1}),   j = 1, ..., N,     (5)

where N is the number of cells in the network.

Definition 1 (Excitatory, inhibitory and mixed neurons)

The cell i is called excitatory (resp. inhibitory) if the synaptical actions Δ_{i,j} that it sends to the other cells j ≠ i are all non-negative (resp. non-positive), i.e. if its spikes produce only positive or null (resp. negative or null) jumps in the membrane potentials of the other neurons. The cell i is called mixed if it is neither excitatory nor inhibitory. Dale's Principle (which we do not assume a priori to hold) states that no neuron is mixed.
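For a finite weight matrix, Definition 1 can be checked mechanically. The following sketch classifies each neuron from the signs of its row of Δ; the matrix is a made-up example, and reading "excitatory" as "all outgoing actions non-negative" is one way to interpret the definition when some connections are absent (Δ_{i,j} = 0).

```python
import numpy as np

def classify(Delta, i):
    """Classify neuron i (Definition 1) from the signs of its outgoing
    synaptical actions Delta[i, j], j != i."""
    out = [Delta[i, j] for j in range(Delta.shape[1]) if j != i]
    if all(a == 0 for a in out):
        return "indifferent"            # excluded by the model (no cell
                                        # sends only null actions)
    if all(a >= 0 for a in out):
        return "excitatory"
    if all(a <= 0 for a in out):
        return "inhibitory"
    return "mixed"                      # forbidden by Dale's Principle

# Hypothetical 3-neuron example: neuron 2 violates Dale's Principle.
Delta = np.array([[0.0,  0.7,  0.7],
                  [0.0,  0.0, -0.4],
                  [0.9, -0.2,  0.0]])
labels = [classify(Delta, i) for i in range(3)]
```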
Remark 2 It is not restrictive to assume that no cell i is indifferent, namely that no cell i sends null synaptical actions to all the other cells, i.e. there exists no i ∈ {1, 2, ..., N} such that Δ_{i,j} = 0 for all j ≠ i. In fact, if there existed such a cell i, it would not send any action to the other cells of the network N. So, the global dynamics of the network is not modified (except for having one less variable) if we take the cell i out of N. All along the paper we assume that the network N has at least 2 neurons and that no neuron is indifferent.

The refractory rule
To obtain a well-defined deterministic dynamics from the system (5), other complementary assumptions are adopted by the model. First, a refractory phenomenon (see for instance [16, page 725]) is considered as follows: if some fixed neuron j spikes at instant t_1, then its potential x_{j,1} is reset to zero, becoming indifferent to the synaptical actions that it may receive (at the same instant t_1) from the other neurons. Second, if for some fixed neuron j at some instant t_1, the sum Σ_{i≠j} max{0, Δ_{i,j}} of the excitatory actions that j simultaneously receives from the other neurons of the network is greater than or equal to θ_j − x_{j,1}(t_1^-), then j itself spikes at instant t_1, regardless of whether x_{j,1}(t_1^-) reaches θ_j or not. In this case, at instant t_1 the cell j sends synaptical actions Δ_{j,h} to the other neurons h of the network, and then the respective potentials x_{h,1} suffer a jump Δ_{j,h} at instant t_1. This process may make new neurons h spike, in an avalanche process (see [13]). This avalanche is produced instantaneously, when some excitatory neuron spontaneously arrives at its threshold level.
But due to the refractory rule, once each neuron spikes, its membrane potential refracts all the excitations or inhibitions arriving at the same instant. So, the avalanche phenomenon is produced instantaneously, but includes each neuron i at most once. Then, each interaction term Δ_{i,j} δ_{θ_i}(x_{i,1}) in the sum on the right of Equation (5) is added only once at each spiking instant t_1.
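The refractory rule and the avalanche can be operationalized at a single spiking instant as the following fixed-point iteration. This is a sketch under our own reading of the two rules above: in each round, the neurons that already spiked are frozen, the excitatory part of the incoming jumps decides who spikes next, and the full signed jumps are applied only to the neurons that do not spike. The numeric example is invented.

```python
import numpy as np

def resolve_spikes(V, theta, Delta):
    """Resolve which neurons spike at one instant.
    V[j]: membrane potential just before the instant;
    theta[j]: threshold of neuron j;
    Delta[i, j]: jump that a spike of i produces on the potential of j.
    Returns (spiked, V_after). Each neuron spikes at most once per
    avalanche (refractory rule) and spiking neurons are reset to 0."""
    N = len(V)
    V = np.asarray(V, dtype=float).copy()
    spiked = np.zeros(N, dtype=bool)
    frontier = [j for j in range(N) if V[j] >= theta[j]]
    while frontier:
        for i in frontier:
            spiked[i] = True
            V[i] = 0.0                      # reset; now refractory
        exc = np.zeros(N)                   # excitatory part only
        tot = np.zeros(N)                   # full signed jumps
        for i in frontier:
            for j in range(N):
                if not spiked[j]:
                    exc[j] += max(0.0, Delta[i, j])
                    tot[j] += Delta[i, j]
        frontier = []
        for j in range(N):
            if spiked[j]:
                continue
            if exc[j] >= theta[j] - V[j]:   # triggered spike
                frontier.append(j)          # will be reset next round
            else:
                V[j] += tot[j]              # ordinary synaptical jump
    return spiked, V

# Invented 3-neuron instant: neuron 0 reaches threshold spontaneously,
# its excitatory action triggers neuron 1, whose inhibitory action
# lowers neuron 2 without making it spike.
Delta = np.array([[0.0, 0.6,  0.3],
                  [0.0, 0.0, -0.5],
                  [0.0, 0.0,  0.0]])
spiked, V_after = resolve_spikes([1.0, 0.5, 0.2], [1.0, 1.0, 1.0], Delta)
```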

First step of the proof (Graphs, parts and units)
The purpose of this section is to prove Proposition 7 and to state the existence of an "inter-units graph" associated to the network (see Figure 1). To unify the notation, we agree:
• N denotes either the network or its graph;
• i is either a cell of N or a node of the graph;
• Δ_{i,j} denotes either the synaptical action from i to j, or the weight of the edge e_{i,j} in the graph, or this edge itself.

Definition 4 (Structurally identical cells)
Two different cells i ≠ j are structurally identical if F_i = F_j in the respective differential equations (2.1), Δ_{i,j} = Δ_{j,i} = 0, and Δ_{h,i} = Δ_{h,j} for all h ≠ i, j. These conditions imply that the dynamical systems that govern neurons i and j are the same. So, their future dynamics may differ only because their initial states x_i(0) and x_j(0) may be different. Note that, if i and j are structurally identical, then by definition the edges of the graph at the receiving nodes i and j (from any other fixed sending node h) are equally weighted: Δ_{h,i} = Δ_{h,j}. Nevertheless, the edges from i and j as sending nodes of the network are not necessarily identically weighted, i.e. Δ_{i,h} may differ from Δ_{j,h}. In Figure 1 we represent a graph G with three mutually identical cells 1, 2 and 3, provided that F_1 = F_2 = F_3 and Δ_{h,1} = Δ_{h,2} = Δ_{h,3} for h = 4, 5. Besides, the graph G has two other nodes, which correspond to the neurons 4 and 5. The cells 4 and 5 are not mutually identical because the synaptical actions that they receive from the other cells are not equal.
The above definitions and the following ones are just mathematical tools, with no other purpose than enabling us to prove Theorems 16 and 17. They are not aimed at explaining physiological or functional roles of subsets of real biological neurons in the brain or in the nervous system. Nevertheless, it is rather surprising that the following abstract mathematical tools, which we include here just to prove Theorems 16 and 17, do bear a resemblance to concepts or phenomena that are studied by Neuroscience. In particular, the following Definitions 5 and 6, of homogeneous part and synaptical unit of a neuronal network, are roughly analogous to the concepts of regions, subnetworks or groups of many similar neurons, characterized by a certain structure and a collective physiological role. For instance, some subnetworks or layers of biological or artificial neurons are defined according to the role of their synaptical interactions with other subnetworks or layers [17].
Definition 5 (Homogeneous part) A homogeneous part of the neuronal network is a maximal subset of cells of the network that are mutually pairwise identical (cf. Definition 4). As a particular case, we agree to say that a homogeneous part is composed of a single neuron i when no other neuron is structurally identical to i. In Figure 1 we draw the graph of a network composed of three homogeneous parts A, B and C. The homogeneous part A is composed of the three identical neurons 1, 2 and 3, provided that F_1 = F_2 = F_3 and Δ_{h,1} = Δ_{h,2} = Δ_{h,3} for h = 4, 5. The homogeneous parts B = {4} and C = {5} have a single neuron each because Δ_{h,4} ≠ Δ_{h,5} for some h (for instance for h = 2).

Definition 6 (Synaptical unit)
A synaptical unit is a subset U ⊂ A of a homogeneous part A of a neuronal network such that: • For any neuron h of the network there exists at most one neuron i ∈ U such that Δ_{i,h} ≠ 0. • A is partitioned into a minimal number of sets U possessing the above property.
In particular, a synaptical unit may be composed of a single neuron. This occurs, for instance, when for some neuron h ∉ A and for every neuron i ∈ A the synaptical interaction Δ_{i,h} from i to h is nonzero. In Figure 1 we draw the graph of a network composed of three homogeneous parts A, B and C such that A is composed of three identical neurons 1, 2 and 3, which form two synaptical units U_1 := {1} and U_2 = U_3 := {2, 3}. In fact, the cells 1 and 2 cannot belong to the same unit because there exist nonzero actions departing from both of them to neuron 4. One can also form the two synaptical units of A by defining U_1 = U_3 := {1, 3} and U_2 := {2}. The homogeneous part B is composed of a single neuron 4, and thus it is a single synaptical unit U_4 := {4} = B. Analogously, C is composed of a single neuron 5, and thus it is a single synaptical unit U_5 := {5} = C. The total number of neurons of the network in Figure 1 is 5, the total number of synaptical units is 4, the total number of homogeneous parts is 3, and the total number of nonzero synaptical interactions among the neurons is 9, but the total number of synaptical interactions among different homogeneous parts is only 5 (see Figure 2).
When a synaptical unit U has more neurons, the following quotient Q_U diminishes: Q_U is the number of synaptical connections departing from the cells of U divided by the total number of neurons of U. In fact, by Definition 6, for each synaptical unit U there exists at most one nonzero synaptical action towards any other fixed neuron h of the network, regardless of how many cells compose U. So, if we enlarge the number of cells in U, the number of nonzero synaptical actions departing from the cells of U remains constant, and thus the quotient Q_U diminishes. Although Q_U becomes smaller when the number of neurons of the synaptical unit U grows, in Theorem 15 we will rigorously prove the following result: the dynamical system governing a neuronal network N with the maximum number of neurons in each of its synaptical units is the richest one, i.e. N will exhibit the largest set of different orbits in the future, and so it will be theoretically capable of performing the most diverse set of processes.
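On an invented 5-cell weight matrix mimicking the network described for Figure 1 (cells 0, 1, 2 playing the roles of neurons 1, 2, 3 of part A), the part can be split into synaptical units and the quotient Q_U computed as follows. This is a sketch: the first-fit grouping below always returns a valid partition in the sense of the first bullet of Definition 6, but minimality of the number of units is only evident here because the example is tiny.

```python
import numpy as np

def synaptical_units(part, Delta):
    """Group the cells of a homogeneous part into units such that, for
    every neuron h of the network, at most one member of each unit
    sends a nonzero action to h (first-fit heuristic)."""
    units = []                    # list of (members, union of targets)
    for i in part:
        t_i = {h for h in range(Delta.shape[1]) if Delta[i, h] != 0}
        for members, targets in units:
            if targets.isdisjoint(t_i):
                members.append(i)
                targets |= t_i
                break
        else:
            units.append(([i], set(t_i)))
    return [members for members, _ in units]

def Q(U, Delta):
    """Quotient Q_U: departing nonzero connections per neuron of U."""
    departing = sum(1 for i in U for h in range(Delta.shape[1])
                    if Delta[i, h] != 0)
    return departing / len(U)

# Invented weights consistent with the described zero/nonzero pattern.
Delta = np.zeros((5, 5))
Delta[3, 0] = Delta[3, 1] = Delta[3, 2] = 0.5
Delta[4, 0] = Delta[4, 1] = Delta[4, 2] = -0.3
Delta[0, 3] = Delta[1, 3] = 1.0      # cells 0 and 1 both act on 3
Delta[2, 4] = 0.7                    # cell 2 acts on 4
units = synaptical_units([0, 1, 2], Delta)
```

Cells 0 and 1 cannot share a unit (both act on neuron 3), so the part splits into the two units {0, 2} and {1}; each unit sends one connection per member here, so Q equals 1.0 for both.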
The following result proves that any neuronal network, according to the mathematical model of Section 2, decomposes as the union of at least two homogeneous parts, and that each of these parts decomposes into pairwise disjoint synaptical units. It also states the existence of an upper bound for the number of neurons that any synaptical unit can have.

Proof: (i) Since the equivalence classes of any equivalence relation in any set determine a partition of this set, the network, as a set of neurons, is the union of its pairwise disjoint homogeneous parts. Denote by l ≥ 1 the total number of different homogeneous parts that compose the network. Let us prove that l ≥ 2. In fact, if l were equal to 1, then, by Definition 4, Δ_{i,j} = 0 for any pair of cells, contradicting the assumption that no cell is indifferent (see the end of Remark 2). We have proved Assertion (i).
(ii) Fix a homogeneous part A, and fix some neuron i ∈ A. Consider the set of neurons S_i := {h : Δ_{i,h} ≠ 0}. The set S_i is nonempty because the cell i is not indifferent (see Remark 2). Choose and fix a neuron h ∈ S_i. We discuss two cases: either h ∈ S_j for all j ∈ A, or the set {j ∈ A : h ∉ S_j} is nonempty.
In the first case, for each neuron j ∈ A the singleton {j} (formed by the single element j) satisfies Definition 6. Thus {j} is a synaptical unit for all j ∈ A, and Assertion (ii) is proved.
In the second case, consider the set A′ := {j ∈ A : h ∈ S_j}. Consider also (if they exist) all the singletons {j} where j ∈ A is such that h ∉ S_j. These latter sets {j} satisfy Definition 6 and thus they are pairwise disjoint synaptical units, which are also disjoint from A′. Besides, their union with A′ composes A. So, it is now enough to prove that A′ is also the union of pairwise disjoint synaptical units. Now, we choose and fix a neuron i′ ∈ A′ such that i′ ≠ i. (Such a neuron exists because A′ ≠ {i}.) By construction of the set A′, we have h ∈ S_{i′}. But, since the neuron i′ is not indifferent, there exists h′ ∈ S_{i′}. So, we can repeat the above argument putting i′ in the role of i, h′ in the role of h, and A′ in the role of A.
Since the number of neurons is finite, after a finite number of steps (repeating the above argument at each step), we obtain a decomposition of A into a finite number of pairwise disjoint sets that are synaptical units, ending the proof of Assertion (ii).
(iii) Since any neuron i ∈ U ⊂ A is not indifferent, there exists at least one homogeneous part B ≠ A such that Δ_{i,B} ≠ 0 (i.e. i sends a nonzero action to the neurons of B). Besides, applying Definition 6, for each homogeneous part B ≠ A there exists at most one neuron i ∈ U such that Δ_{i,B} ≠ 0. The last two assertions imply that there is an injective (not necessarily surjective) map from the set of neurons in U to the set of homogeneous parts different from A. Then, the number of neurons in U is not larger than the number of existing homogeneous parts B ≠ A, i.e. it is not larger than l − 1. We have proved Assertion (iii).
(iv) Fix an arbitrary synaptical unit U ⊂ A (where A is the homogeneous part that contains U) and an arbitrary homogeneous part B (in particular, B may be A). As in the above proof of Assertion (iii), for each neuron j ∈ B there exists at most one neuron i ∈ U such that Δ_{i,j} ≠ 0; and, by Definition 4, this value Δ_{i,j} is the same for all the neurons j ∈ B. So the synaptical weight Δ_{U,B} from the unit U to the part B is well defined.

Each synaptical unit acts, in the inter-units graph, as if it were a single neuron. The spatial static structure of groups of synaptical connections is the only object observed by this graph. Besides, the inter-units graph does not change if the numbers of neurons composing the synaptical units change. In the following section, we will condition the study of the networks to those that have mutually isomorphic inter-units graphs, i.e. that have the same static structure of synaptical connections among groups of identical cells.
In Section 5, we will look at the dynamical responses of the networks that have the same (static) inter-units graph of synaptical connections. Any change in the number of neurons changes the space of possible initial states, and so the space of possible orbits and the global dynamics. So, among all the networks that have isomorphic inter-units graphs, the network with more neurons should, a priori, exhibit a larger diversity of theoretic possible dynamical responses to external stimuli.
For instance, two identical neurons 1 and 2 in a synaptical unit U define a space of initial states (and so of orbits) composed of all the pairs (x_1(0), x_2(0)) of vectors in the phase space of each neuron. But three identical neurons 1, 2 and 3 in U define a space of initial states composed of all the triples (x_1(0), x_2(0), x_3(0)) of vectors. So, the diversity of orbits that a neuronal network can exhibit enlarges when the number of neurons of each synaptical unit enlarges. In Section 5, we will study the theoretical optimum in the dynamical response of a family of networks that are synaptically equivalent. We will prove that this optimum exists and that it is achieved when the network has the maximum number of cells (Theorem 15).

Second step of the proof (Synaptical equivalence between networks)
The purpose of this section is to prove the existence of an equivalence relation (Definition 9) in the space of all the neuronal networks modelled by the mathematical hypotheses of Section 2. This is the intermediate result in the second step of the proof of Main Theorems 16 and 17. We will deduce this intermediate result from the previous ones obtained in Section 3. Let N and N′ be two neuronal networks according to the model defined in Section 2. Denote:
• N and N′ the numbers of neurons of N and N′ respectively;
• i and i′ a (general) neuron of N and N′ respectively;
• l and l′ the respective numbers of homogeneous parts of N and N′, according to Definition 5;
• s and s′ the respective numbers of synaptical units, according to Definition 6;
• B and B′ a (general) homogeneous part of N and N′ respectively;
• U and U′ a (general) synaptical unit of N and N′ respectively;
• Δ_{U,B} and Δ_{U′,B′} the respective synaptical weights of N and N′, according to part (iv) of Proposition 7.

Definition 9 (Synaptically equivalent networks - Intermediate result in the proof of Main Theorems 16 and 17)
We say that N and N′ are synaptically equivalent if: • l = l′ and s = s′, according to the above notation.
• There exists a one-to-one and surjective correspondence ϕ from the set of synaptical units U of N onto the set of synaptical units U′ = ϕ(U) of N′ such that Δ_{U,B} = Δ_{ϕ(U),ϕ(B)} for every synaptical unit U and every homogeneous part B of N, where B′ = ϕ(B) is the homogeneous part of the network N′ whose synaptical units are the images by ϕ of the synaptical units that compose B.
• For any synaptical unit U of N, F_i = F_{i′} for every neuron i ∈ U and every neuron i′ ∈ ϕ(U), where F_i and F_{i′} are the second terms of the impulsive differential equations (2.1) that govern the dynamics of the neurons i and i′, respectively.
We note that, for synaptically equivalent networks, the number of neurons, and also the number of nonzero synaptical interactions, may vary. For instance, the networks of Figures 1 and 3 are synaptically equivalent, but their respective total numbers of neurons and of synaptical interactions are mutually different. Comments: The equivalence relation between networks N and N′, according to Definition 9, implies that both N and N′ will have exactly the same dynamical response (i.e. they will follow the same orbit), provided that, for any synaptical unit U of N, the initial states of all the neurons in U are mutually equal and also equal to the initial states of all the neurons in the synaptical unit U′ = ϕ(U) of the other network. In fact, since the impulsive differential equations (2.1) that govern the dynamics of all those neurons coincide, and since the synaptical jumps that each of those neurons receives from the other neurons of its respective network also coincide, their respective deterministic orbits in the future must coincide if the initial states are all the same.
Nevertheless, if not all those initial states are mutually equal, for instance if some external signal changes the instantaneous states of some but not all the neurons in a synaptical unit, then their respective orbits will differ during at least some finite interval of time. In this sense, each synaptical unit with more than one neuron is a group of identical cells that distributes the dynamical process among its cells, i.e. it has the capability of dynamically distributing the information.
In brief, two synaptically equivalent networks have, as a common feature, the same statical configuration or "anatomy" of the synaptical interactions between their units (i.e. between groups of identical cells, equally synaptically connected). Then, both networks would evolve equally, under the hypothetical assumption that all the initial states of the neurons of their respective synaptical units coincided. But the two networks may exhibit qualitatively different dynamical responses to external perturbations or signals, if these signals make the instantaneous states of different neurons in some synaptical unit mutually different. Such a difference produces a diverse distribution of the dynamical response among the cells.
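The unit-level comparison described above can be sketched in code. This is a minimal illustration, not the paper's formalism: the data layout, the field names (`type`, `weights`, `n_cells`), and the assumption that the homogeneous parts of both networks carry shared labels (so that ϕ may be found by matching signatures) are all simplifying assumptions introduced here.

```python
# Sketch: deciding synaptical equivalence at the unit level (Definition 9).
# A network is summarized by its synaptical units, each carrying a
# structural type (the common dynamics of its cells) and the nonzero
# weights Delta_{U,B} toward the homogeneous parts it acts on.

def unit_signature(unit):
    """Canonical description of one synaptical unit: its cells' common
    structural type and its outgoing weights, order-independent."""
    return (unit["type"], tuple(sorted(unit["weights"].items())))

def synaptically_equivalent(net_a, net_b):
    """True iff the multisets of unit signatures coincide, i.e. some
    bijection phi between the units preserves types and weights.
    The number of cells inside each unit is deliberately ignored."""
    sig_a = sorted(unit_signature(u) for u in net_a)
    sig_b = sorted(unit_signature(u) for u in net_b)
    return sig_a == sig_b

# Two networks with the same units but different cell counts per unit:
net1 = [{"type": "exc", "weights": {"B2": 0.5}, "n_cells": 3},
        {"type": "inh", "weights": {"B1": -0.7}, "n_cells": 1}]
net2 = [{"type": "inh", "weights": {"B1": -0.7}, "n_cells": 4},
        {"type": "exc", "weights": {"B2": 0.5}, "n_cells": 2}]
print(synaptically_equivalent(net1, net2))  # True
```

As in the text, equivalence holds even though the total numbers of neurons differ; changing any single weight ∆ U,B breaks it.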

Third step of the proof (Dynamically optimal networks)
The purpose of this section is to prove Proposition 13 and Theorem 15. These are intermediate results (the third step) of the proof of Main Theorems 16 and 17. We will prove these intermediate results by logical deduction from several previous statements and hypotheses. So, we start by including the needed previous statements in the following series of mathematical definitions, remarks and notation agreements. We condition the study to the networks of any fixed single class, which we denote by C, of synaptically equivalent networks according to Definition 9. In this section we search for networks exhibiting an optimal dynamics conditioned to C.

Notation:
We consider the mathematical model of a general neuronal network N ∈ C, given by the system (5) of impulsive differential equations. We denote by M the compact manifold where the states of N evolve. The solution X(t), t ≥ 0, of the system (5) of impulsive differential equations that govern the dynamics of N exists and is unique, provided that the initial condition X(0) = X 0 ∈ M is given (see for instance [14], cited in [15]). We denote: Φ(X 0 , t) := X(t) such that X(0) = X 0 , and call Φ the (deterministic) dynamical system (or, in brief, the dynamics) associated to the network N . It is an autonomous deterministic dynamical system.
For any autonomous deterministic dynamical system (also if it were not modelled by differential equations), we have the following semigroup property: Φ(X 0 , T + s) = Φ(Φ(X 0 , T ), s) for all T, s ≥ 0. So, for any fixed instant T > 0 the state Φ(X 0 , T ) plays the role of a new "initial" state, from which the orbit {Φ(Φ(X 0 , T ), s)} s≥0 evolves for time s ≥ 0. This orbit coincides with the piece of orbit {Φ(X 0 , T + s)} s≥0 (for time ≥ T ) that had the initial state X 0 .
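The determinism and the semigroup property of Φ can be illustrated with a toy instance of an impulsive network. System (5) is abstract in the paper, so everything numerical below is an assumption introduced here: a linear leak dx i /dt = −x i , a firing threshold of 1 with instantaneous reset to 0, instantaneous synaptic jumps ∆ j,i added when neuron j fires, and a forward Euler scheme.

```python
import numpy as np

THETA, DT = 1.0, 1e-3   # firing threshold and Euler step (illustrative)

def step(x, delta):
    """One Euler step of the impulsive dynamics: leak dx_i/dt = -x_i,
    instantaneous reset at threshold, and synaptic jumps delta[j, i]
    received from every neuron j that fires at this step."""
    x = x * (1.0 - DT)                  # subthreshold leak
    fired = x >= THETA
    x = np.where(fired, 0.0, x)         # instantaneous reset of spiking cells
    x = x + delta[fired].sum(axis=0)    # jumps received from spiking senders
    return x

def orbit(x0, delta, n):
    """Deterministic orbit Phi(x0, t) sampled at n Euler steps."""
    x = np.array(x0, dtype=float)
    states = [x.copy()]
    for _ in range(n):
        x = step(x, delta)
        states.append(x.copy())
    return np.array(states)

delta = np.array([[0.0, 0.3], [-0.2, 0.0]])   # excitatory 1->2, inhibitory 2->1
full = orbit([1.2, 0.5], delta, 200)
# Semigroup property: running T = 120 steps and then s = 80 more steps
# from the reached state reproduces the tail of the single long run.
head = orbit([1.2, 0.5], delta, 120)
tail = orbit(head[-1], delta, 80)
print(np.allclose(full[120:], tail))  # True
```

The final check is exactly the statement above: Φ(X 0 , T + s) = Φ(Φ(X 0 , T ), s), here verified numerically for the sampled orbit.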

Definition 11 (Partial Order in C)
Let N and N ′ be two networks in C and denote by Φ and Φ ′ the dynamics of N and N ′ respectively. Denote by M and M ′ the compact manifolds where Φ and Φ ′ respectively evolve.
We say that N is dynamically richer than N ′ , and write N ′ ⊑ N , if there exists a continuous and one-to-one (non necessarily surjective) mapping ψ : M ′ → M such that ψ(Φ ′ (X ′ 0 , t)) = Φ(ψ(X ′ 0 ), t) for all t ≥ 0, (7) for any initial state X ′ 0 ∈ M ′ . In other words, N ′ ⊑ N if and only if the dynamical system Φ ′ of N ′ is a subsystem of the dynamical system Φ of N , up to the continuous change ψ of the state variables.
From Definition 11 the following assertion is immediately deduced: N ′ ⊑ N and N ⊑ N ′ if and only if their respective dynamical systems Φ ′ and Φ are topologically conjugate.
This means that the dynamics of N and N ′ are "the same topological dynamical system," up to a homeomorphic change in their variables which is called a conjugacy. So, we deduce: "⊑" is a partial order in the class C of synaptically equivalent networks up to conjugacies.
As an example, assume that the numbers N and N ′ of neurons of N and N ′ satisfy N = 2N ′ , each neuron of N ′ being duplicated in N , and define ψ(x ′ 1 , x ′ 2 , . . . , x ′ N ′ ) := (x ′ 1 , x ′ 1 , x ′ 2 , x ′ 2 , . . . , x ′ N ′ , x ′ N ′ ). If this function ψ satisfies Equality (7), then each orbit {Φ ′ (X ′ 0 , t)} t≥0 of the dynamical system of N ′ is identified with one orbit Φ(X 0 , t) of the dynamics of N . Along this orbit Φ(X 0 , t), each two consecutive identical neurons have the same initial states, and thus also have coincident instantaneous states for all t ≥ 0. Nevertheless, the whole dynamics Φ of the network N also includes many other different orbits, which are obtained if the initial states of some pair of consecutive identical neurons of N are mutually different.
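The duplication example can be made concrete with the simplest possible dynamics. This is an illustration, not the paper's model: we assume one isolated leaky neuron as N ′ (so Φ ′ (x, t) = x e −t ) and a network N of two identical such neurons with no coupling, so that the diagonal embedding satisfies Equality (7) exactly.

```python
import numpy as np

def phi_small(x0, t):
    """Dynamics Phi' of the one-neuron network N' (pure leak, assumed)."""
    return x0 * np.exp(-t)

def phi_big(x0, t):
    """Dynamics Phi of the two-neuron network N (componentwise leak)."""
    return np.asarray(x0, dtype=float) * np.exp(-t)

def psi(x0):
    """Continuous, one-to-one (not surjective) embedding of M' into M:
    duplicate the state onto the diagonal."""
    return np.array([x0, x0])

# Equality (7): psi carries each orbit of Phi' to an orbit of Phi.
x0, t = 0.8, 1.3
print(np.allclose(psi(phi_small(x0, t)), phi_big(psi(x0), t)))  # True

# N is dynamically richer: orbits starting OFF the diagonal, e.g. from
# (0.8, 0.2), are not the psi-image of any orbit of N'.
```

Off-diagonal initial states produce orbits outside ψ(M ′ ), which is exactly the sense in which N ′ ⊑ N can hold strictly.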
Remark 12 From Definition 11, since ψ : M ′ → M is continuous and one-to-one, we deduce that the image ψ(M ′ ) is a submanifold of M which is homeomorphic to M ′ . This is a direct application of the "Domain Invariance Theorem" (see for instance [18]). Therefore: dim(ψ(M ′ )) = dim(M ′ ).
Besides, ψ(M ′ ) ⊂ M . So, M contains the submanifold ψ(M ′ ), which has the same dimension as M ′ . We deduce the following statement:
Proposition 13 If N ′ ⊑ N , then N ′ ≤ N , where N ′ and N denote the numbers of neurons of the networks N ′ and N respectively.
Proof: Both networks are synaptically equivalent; so, each neuron i of N is structurally identical to some neuron (which we still call i) of N ′ . This implies that the finite dimension of the variable x i ∈ M i in the network N is equal to the finite dimension of the corresponding variable x ′ i ∈ M ′ i in the network N ′ . Applying Equality (6) to the networks N and N ′ respectively, we obtain dim(M ) as the sum of the dimensions of the variables x i of the N neurons of N , and dim(M ′ ) as the sum of the dimensions of the variables x ′ i of the N ′ neurons of N ′ . From Inequality (8) we have dim(M ′ ) = dim(ψ(M ′ )) ≤ dim(M ). Finally, since the dimensions of corresponding variables coincide, we conclude N ′ ≤ N , as wanted.
Definition 14 (Dynamically optimal networks) We say that a network N ∈ C is a dynamical optimum conditioned to the synaptical equivalence class C (i.e. within the class C) if N ′ ⊑ N for all N ′ ∈ C.
Theorem 15 (Existence of the dynamical optimum - Intermediate result in the proof of Main Theorems 16 and 17) For any class C of synaptically equivalent neuronal networks there exists a dynamical optimum network conditioned to C. This optimal network has the maximum number of cells among all the networks of the class C.
Proof: The class C of synaptically equivalent networks is characterized by the numbers l and s of homogeneous parts and synaptical units respectively, and by the real values ∆ U,B of the synaptical connections between the dynamical units U and the homogeneous classes B ⊃ U .
For each dynamical unit U , we denote by l U ≤ l − 1 the number of homogeneous classes B ⊃ U such that ∆ U,B ≠ 0. Thus, l U ≥ 1 because each cell i ∈ U is not indifferent, and so, there exists at least one nonzero synaptical action departing from i. (Recall that by Definitions 4 and 5, the nonzero synaptical actions only exist between cells belonging to different homogeneous parts.) Construct a network N as follows: First, compose each dynamical unit U with exactly l U cells. Then, there exists a one-to-one and surjective correspondence Λ U between the set of cells i ∈ U and the set of homogeneous parts B ⊃ U satisfying ∆ U,B ≠ 0.
Second, define the synaptical connections departing from each cell i of each dynamical unit U by the following equalities: ∆ i,B := ∆ U,B if B = Λ U (i), and ∆ i,B := 0 otherwise. (12) We will prove that the network N thus constructed is dynamically optimal within the class C: Fix any network N ′ ∈ C. Consider the dynamical systems Φ and Φ ′ corresponding to the networks N and N ′ respectively. Denote by M and M ′ the compact manifolds where Φ and Φ ′ respectively evolve. According to Definition 11, to prove that N ′ ⊑ N it is enough to construct a continuous one-to-one mapping ψ : M ′ → M satisfying Equality (7). To do so, we must define the initial state x i (0) of any cell i of the network N .
So, fix a neuron i of N . Denote by U the synaptical unit to which i belongs, and denote by B = Λ U (i) the unique homogeneous class of the network N satisfying (12). We denote ∆ i,B = ∆ U,B ≠ 0, (14) where i ∈ U and B = Λ U (i).
Since N ′ is synaptically equivalent to N (because both networks N and N ′ belong to the same class C), we apply Definition 9 to deduce the equalities ∆ U ′ ,ϕ(B) = ∆ U,B , where U ′ = ϕ(U ). (15) From Definition 6, there exists a unique cell i ′ ∈ U ′ such that ∆ i ′ ,ϕ(B) = ∆ U ′ ,ϕ(B) ≠ 0. (16) Summarizing, for any fixed neuron i ∈ U ⊂ N we have constructed a unique cell i ′ ∈ U ′ = ϕ(U ) ⊂ N ′ such that Equalities (14), (15) and (16) hold. In other words, we have constructed a mapping Π : i ′ = Π(i), defined from the synaptical equivalence between the networks N and N ′ , such that ∆ Π(i),ϕ(B) = ∆ i,B ≠ 0, (17) where B = Λ U (i) is the unique homogeneous class in N satisfying (12). Assertion A: The mapping Π transforms each cell i of the network N into the cell i ′ = Π(i) in the network N ′ , which is structurally identical to i. In fact, Assertion A follows from the fact that N and N ′ are synaptically equivalent (cf. Definition 9) and from Equality (17).
Let us prove that Π is surjective. In fact, for each i ′ ∈ N ′ , there exists at least one homogeneous part B ′ such that ∆ i ′ ,B ′ ≠ 0, because i ′ is not indifferent. By Definition 9, B ′ = ϕ(B) where ϕ is a one-to-one and surjective transformation between the homogeneous parts of N and N ′ . Therefore, there exists a unique homogeneous part B of N such that B ′ = ϕ(B). By construction of the network N , if ∆ U,B ≠ 0, then there exists a unique i ∈ U such that ∆ U,B = ∆ i,B . Then, we deduce that ∆ i ′ ,ϕ(B) = ∆ i,B ≠ 0. Joining with (17), and recalling that for each synaptical unit U ′ there exists at most one cell i ′ ∈ U ′ such that ∆ i ′ ,B ′ ≠ 0, we deduce i ′ = Π(i). This proves that Π is surjective.
We define the initial state x i (0) of the cell i ∈ N by x i (0) := x ′ Π(i) (0), and the mapping ψ : M ′ → M by ψ(X ′ 0 ) = X 0 such that x i (0) = x ′ Π(i) (0) for all i ∈ N . (18) The mapping ψ is continuous because each component x i (0) of ψ(X ′ 0 ) equals the component x ′ Π(i) (0) of X ′ 0 . Besides, the mapping ψ is one-to-one (but non necessarily surjective). In fact, if ψ(X ′ 0 ) = ψ(Y ′ 0 ), then, since Π is surjective, for each i ′ ∈ N ′ there exists i ∈ Π −1 (i ′ ) such that x ′ i ′ (0) = x i (0) = y ′ i ′ (0); hence X ′ 0 = Y ′ 0 , proving that ψ is one-to-one. To end the proof of the first part of Theorem 15, it is now enough to check that the mapping ψ satisfies Equality (7): From Equality (18) and from the surjectiveness of Π, for each initial state X ′ 0 of the network N ′ , and for each neuron i ′ ∈ N ′ , the corresponding set of neurons i ∈ Π −1 (i ′ ) ⊂ N have initial states x i (0) which equal x ′ i ′ (0). Besides, from Assertion A, i ∈ Π −1 (i ′ ) and i ′ are structurally identical. Now, we consider Equalities (14), (15) and (17), applied to any neuron j ∈ N and j ′ = Π(j) ∈ N ′ , in the respective roles of i and i ′ = Π(i). We deduce that the synaptical interaction jumps ∆ j ′ ,i ′ that i ′ receives from any other neuron j ′ ∈ N ′ coincide with the synaptical interaction jumps ∆ j,i that i ∈ Π −1 (i ′ ) receives from j ∈ Π −1 (j ′ ) in the network N . Therefore, both i ′ and Π −1 (i ′ ) satisfy the same impulsive differential equation (5). Besides, their respective initial conditions x ′ i ′ (0) and x Π −1 (i ′ ) (0) coincide, due to Equality (18). Since the solution of the impulsive differential equation (5) that satisfies a specified initial condition is unique, we deduce the following statement: For any instant t ≥ 0 the state x ′ i ′ (t) coincides with the instantaneous state x i (t), where i ∈ Π −1 (i ′ ).
Recalling Definition 10 of the dynamics Φ and Φ ′ of the networks N and N ′ respectively, we deduce ψ(Φ ′ (X ′ 0 , t)) = Φ(ψ(X ′ 0 ), t). Applying again Equality (18), which defines the mapping ψ, for each fixed instant t ≥ 0 taken as the new initial state, we conclude that this equality holds for all t ≥ 0, proving Equality (7), as wanted.
We have proved that N ′ ⊑ N for all N ′ ∈ C. Thus, in each synaptical equivalence class C there exists a network N that is the dynamical optimum conditioned to C. Now, let us prove the second part of Theorem 15. We have to show that the number N of neurons in N is the maximum number of neurons of all the networks in the class C. In fact, since N ′ ⊑ N , after Proposition 13 we get N ′ ≤ N , where N ′ is the number of neurons of N ′ , for all N ′ ∈ C.
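The construction in the proof of Theorem 15 is algorithmic, and can be sketched in code. The data layout below is an illustrative assumption (units given as a mapping from unit names to their nonzero outgoing weights ∆ U,B ); the point is only the rule of the proof: each unit U receives exactly l U cells, and the bijection Λ U routes each nonzero weight through its own dedicated cell, as in Equality (12).

```python
# Sketch of the optimal-network construction (proof of Theorem 15).
# units: {unit_name: {part_name: weight}} lists the unit-level weights
# Delta_{U,B}; every name here is illustrative, not the paper's notation.

def optimal_network(units):
    """Return the per-cell outgoing weights of the dynamical optimum:
    one cell per nonzero outgoing weight of its unit (l_U cells in U),
    the i-th cell carrying only the weight toward Lambda_U(i)."""
    cells = {}
    for u, weights in units.items():
        targets = [b for b, w in weights.items() if w != 0.0]
        for i, b in enumerate(targets):          # the bijection Lambda_U
            cells[f"{u}.{i}"] = {b: weights[b]}  # Delta_{i,B} as in (12)
    return cells

units = {"U1": {"B2": 0.5, "B3": 0.8},   # l_U1 = 2 nonzero outgoing weights
         "U2": {"B1": -0.7}}             # l_U2 = 1
opt = optimal_network(units)
print(len(opt))        # 3 cells in total
print(opt["U1.0"])     # {'B2': 0.5}: one dedicated weight per cell
```

Each cell of the result is sign-pure by construction, which anticipates Theorem 16: the optimum built this way automatically satisfies Dale's Principle.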

End of the proof of Dale's Principle
Let ℵ be the set of all the neuronal networks according to the mathematical model defined in Section 2. Let C ⊂ ℵ be a fixed class of synaptically equivalent networks. The purpose of this section is to end the proof of the following Main Theorem of the paper:

Theorem 16 (Dale's Principle is necessary for the dynamical optimization)
If N is the dynamical optimum network conditioned to C, then all the neurons of N satisfy Dale's Principle.
Namely, any neuron of N is either inhibitory or excitatory.
End of the proof of Theorem 16: Let N be the dynamical optimum among the networks in C. Therefore, N ′ ⊑ N for all N ′ ∈ C. Thus, applying Proposition 13, the numbers N and N ′ of neurons in N and N ′ respectively satisfy N ′ ≤ N. (19) Denote by ∆ j,h the synaptical action from the neuron j to the neuron h ≠ j in N , for any j ∈ {1, . . . , N }. Assume by contradiction that there exists a neuron i ∈ N which is mixed, according to Definition 1. Let us fix such a value of i. Now, we construct a new network N ′ ∈ C as follows: First, include in N ′ all the neurons j ∈ N , in particular j = i. Define in N ′ the synaptical interactions ∆ ′ j,h as follows: ∆ ′ j,h := ∆ j,h for all j ≠ i; ∆ ′ i,h := ∆ i,h if ∆ i,h > 0, and ∆ ′ i,h := 0 otherwise. Second, add one more neuron in N ′ , say the (N + 1)-th neuron, which we make, by construction, structurally identical to the i-th neuron. Define ∆ ′ N +1,h := ∆ i,h if ∆ i,h < 0, and ∆ ′ N +1,h := 0 otherwise. It is immediate to check that N ′ is synaptically equivalent to N . In fact, all the neurons, except the added (N + 1)-th cell in N ′ , are respectively structurally identical in the networks N and N ′ . Besides, all the synaptical interactions, except those that depart from i and N + 1, are the same in both networks. Finally, also the nonzero synaptical interactions that depart from i in the network N are equal, either to the synaptical interactions that depart from i in N ′ (if positive), or to those that depart from the new neuron N + 1 in N ′ (if negative). So, N ′ is synaptically equivalent to N . In other words, N ′ ∈ C. To end the proof, we note that the number N ′ of neurons of N ′ is N ′ = N + 1 > N , contradicting Inequality (19).
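The splitting step of this proof is easy to make concrete. A minimal sketch, under assumptions introduced here: the synaptical actions are arranged in an N × N matrix with rows indexed by senders, and a "mixed" neuron is a row containing both positive and negative entries.

```python
import numpy as np

# Sketch of the construction in the proof of Theorem 16: a mixed neuron i
# is replaced by two structurally identical cells, one keeping the
# positive (excitatory) weights and a new (N+1)-th cell keeping the
# negative (inhibitory) ones. The matrix layout is an assumption.

def split_mixed(delta, i):
    """delta: N x N matrix of synaptical jumps Delta_{j,h} (row j = sender).
    Returns the (N+1) x (N+1) matrix of the equivalent network N'."""
    n = delta.shape[0]
    out = np.zeros((n + 1, n + 1))
    out[:n, :n] = delta
    out[i, :n] = np.where(delta[i] > 0, delta[i], 0.0)  # i keeps excitation
    out[n, :n] = np.where(delta[i] < 0, delta[i], 0.0)  # new cell: inhibition
    return out

delta = np.array([[0.0,  0.4, -0.3],    # neuron 0 is mixed: +0.4 and -0.3
                  [0.2,  0.0,  0.0],
                  [0.0, -0.5,  0.0]])
out = split_mixed(delta, 0)
print(out.shape)   # (4, 4): one more neuron than N, contradicting (19)
```

After the split, row i is nonnegative and the new row is nonpositive, so the offending neuron now satisfies Dale's Principle, while the outgoing actions of the pair sum to the original mixed row; the cell count strictly increases, which is the contradiction used above.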

Counter-example
The purpose of this section is to exhibit a counter-example that shows that the converse of Main Theorem 16 is false (Theorem 17).
Theorem 17 is the second Main Theorem of the paper. Its proof is deduced from the intermediate results that were previously obtained along the paper, and is ended by showing the explicit counter-example from Figures 1, 2 and 3.

Theorem 17 (Dale's Principle is not sufficient for the dynamical optimization)
There exist neuronal networks according to the mathematical model of Section 2 that satisfy Dale's Principle and are not dynamically optimal conditioned to their respective synaptical equivalence classes.
End of the proof of Theorem 17: We will show an explicit example of a dynamically suboptimal network N within a synaptical equivalence class C, such that N satisfies Dale's Principle. We will exhibit such an example with N = 5 neurons, but it can be repeated (after obvious adaptations) with any arbitrarily chosen number N ≥ 3.
In the network N of Figure 1, the neurons 2 and 4 are excitatory and the neurons 1, 3 and 5 are inhibitory. Thus, all the neurons of the network N satisfy Dale's Principle.
As shown in Section 4, the network N ′ of Figure 3 is synaptically equivalent to the network N of Figure 1. In other words, both networks N and N ′ belong to the same equivalence class C. Since N ′ has exactly 6 neurons and N has 5 neurons, applying Proposition 13 we deduce that N ′ ⊑ N is false. Thus N is not the optimal network of its class C.

Final Comments
In Section 2 we posed the simplified (but general) mathematical model of biological neuronal networks, given by the system (5) of deterministic impulsive differential equations. In its essence, this model was taken from [11] (some particular conditions of the model were also taken from [9,8,12,13,10] and from the bibliography therein).
On the one hand, the mathematical model is an idealized simplification of the network, because the spiking of each neuron is reduced to an instantaneous reset, without delay, of its membrane potential. Also the synaptical actions are assumed to be instantaneous and have no delay.
On the other hand, the abstract mathematical model is general, since we require neither particular formulae, nor numerical specification, nor computational algorithms for the functions f i , F i and ∆ i,j of Equations (1), (2.1) and (5), nor specific values for the parameters.
In Section 3 we defined the homogeneous parts of the network, composed by mutually identical cells. The groups of neurons, which we call "synaptical units", are formed by structurally identical and synaptically representative neurons. In Proposition 7 we proved that any neuronal network, according to the mathematical model described in Section 2, is decomposable into more than one homogeneous part, and that each homogeneous part is decomposable into pairwise disjoint dynamical units. Then, a simplified graph, which we called "inter-units graph", mathematically represents the statical structure of the synaptical connections among the groups of neurons in the network. This theoretical approach has rough similarities to empirical research in Neuroscience [17], in which the structure of synaptical connections among groups of neurons or regions in the brain is studied, regardless of how many neurons exactly compose each region.
In Section 4 we conditioned the study to a fixed family of networks that are mutually synaptically equivalent. We denote this family by C, and call it a class. Even if this condition may appear as a restriction, it is not. In fact, first, any neuronal network (provided that it is mathematically modelled by the equations of Section 2) belongs to one such class C. Second, all the results that we proved along the paper stand for any arbitrarily chosen class C of synaptically equivalent networks.
Each class C of synaptically equivalent networks gives a particular specification for the number of synaptical units and for the inter-units graph. This specification implies a particular statical "anatomy" in the synaptical structure of the network, described by the different groups of mutually identical neurons (and not by the neurons themselves). Each group of neurons is a synaptical unit that has a characteristic functional role in the complex synaptical structure of the network.
Roughly speaking, a class C of mutually synaptically equivalent neuronal networks works as an abstraction of the following analogous example: When a Neuroscientist studies the nervous system of a certain species of animals, he is investigating a class of neuronal networks composed of a relatively large number of particular cases that are indeed different networks (one particular case for each individual of the same species). But all the neuronal networks in that class share a certain structure, which is given, for instance, by the genetic neurological characteristics of the species. Some type of synaptical connections between particular groups of neurons with specific physiological roles is shared by all the healthy individuals of the species. However, the exact number of neurons, and the exact number and weight of synaptical connections between particular neurons, may vary from one individual to another of the same species, or from an early age to a mature age of the same individual.
In Section 5 we studied the abstract dynamical system of any neuronal network defined by the mathematical model of Section 2, and conditioned to a certain fixed class C of mutually synaptically equivalent networks. In Theorem 15 we proved that (theoretically) a dynamically optimal network exists in each class C.
The proof of Theorem 15 is constructive: first, we defined a particular network N ∈ C, and second, we proved that N is the richest network of its class. This means that N would potentially exhibit the most diverse set of dynamical responses (orbits in the future) when external signals change the instantaneous state of some of its neurons.
Since the system is assumed to be deterministic, any network according to this model will reproduce a unique response if the same instantaneous state occurs for all its neurons. So, the space of responses is represented by the space of instantaneous states (or initial states, if time T is translated to become 0). Nevertheless, this space may change from one network to another of the same synaptical equivalence class C. If we assumed that the "natural pursued aim" in the development of a biological neuronal network were to optimize the space of dynamical responses under stimulus, preserving the same characteristic and functional structure between groups of cells, then, theoretically, the final (but maybe never reached) network would be N , constructed in the proof of Theorem 15.
In Section 6 we proved Theorem 16, which is one of the main results of the paper. It states that the dynamically optimal network N in the class C must satisfy Dale's Principle (i.e. all its neurons are either excitatory or inhibitory but not mixed). So, if the natural pursued aim in the development of the neuronal network were to optimize the space of possible dynamical responses, then the tendency of the network during its plastic phases would lead as many neurons as possible to satisfy Dale's Principle. From this point of view, Theorem 16 shows that Dale's Principle is a consequence of an optimization process. So, it gives a mathematically possible answer to the following epistemological question: Why does Dale's Principle hold for most neurons of most biological neuronal networks?
Mathematical answer: Because maybe biological networks evolve pursuing the theoretical optimum or richest dynamics, conditioned to preserve the synaptical connections among its different homogeneous groups of neurons.
Finally, in Theorem 17 we proved that Dale's Principle is not enough for the neuronal network to be a dynamical optimum within its synaptical equivalence class. In other words, Dale's Principle would be just a stage of a plastic optimization process of the neuronal network, but its validity does not ensure that the end of that hypothetical optimization process has been reached.