
Proposed here is a new framework for the analysis of complex systems as a non-explicitly programmed mathematical hierarchy of subsystems, using only the fundamental principle of causality, the mathematics of groupoid symmetries, and a basic causal metric needed to support measurement in Physics. The complex system is described as a discrete set S of state variables. Causality is described by an acyclic partial order w on S, and is considered as a constraint on the set of allowed state transitions. Causal set (*S*, *w*) is the mathematical model of the system. The dynamics it describes is uncertain. Consequently, we focus on invariants, particularly group-theoretical block systems. The symmetry of S by itself is characterized by its symmetric group, which generates a trivial block system over S. The constraint of causality breaks this symmetry and degrades it to that of a groupoid, which may yield a non-trivial block system on S. In addition, partial order w determines a partial order for the blocks, and the set of blocks becomes a causal set with its own, smaller block system. Recursion yields a multilevel hierarchy of invariant blocks over S with the properties of a scale-free mathematical fractal. This is the invariant being sought. The finding hints at a deep connection between the principle of causality and a class of poorly understood phenomena characterized by the formation of hierarchies of patterns, such as emergence, self-organization, adaptation, intelligence, and semantics. The theory and a thought experiment are discussed, and previous evidence is referenced. Several predictions about the human brain are confirmed on a wide experimental base. Applications are anticipated in many disciplines, including Biology, Neuroscience, Computation, Artificial Intelligence, and areas of Engineering such as system autonomy, robotics, systems integration, and image and voice recognition.

Groups and groupoids are very similar, but a critically important case where they behave very differently appears to have remained unnoticed or under-reported. We introduce this case first, and then we review its considerable physical meaning in the context of complex systems. For the present purposes, we are only interested in finite groups and groupoids.

A group is a finite set with a binary function to itself. A groupoid is a finite set with a partial binary function to itself. A block system over some finite set
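The distinction between a total and a partial binary function can be made concrete with a small sketch. The following code is purely illustrative and uses our own representation (partial bijections stored as Python dictionaries) and our own helper name `compose`; it is not notation from this paper:

```python
# Illustrative sketch: a group carries a total composition, while a groupoid
# carries a partial one. Here groupoid elements are partial bijections (dicts),
# and g∘f is defined only when every output of f lies in the domain of g.

def compose(g, f):
    """Partial composition: returns None when the composition is undefined."""
    if not all(v in g for v in f.values()):
        return None  # undefined composition: this is what makes it a groupoid
    return {k: g[f[k]] for k in f}

# Two partial bijections on subsets of {0, 1, 2}
f = {0: 1, 1: 2}   # defined on {0, 1}
g = {1: 0, 2: 1}   # defined on {1, 2}
h = {0: 2}         # defined on {0}

assert compose(g, f) == {0: 0, 1: 1}  # defined: f maps into g's domain
assert compose(f, h) is None          # undefined: h(0) = 2 is outside f's domain
```

In a group, `compose` would never return `None`; the possibility of undefined products is exactly the extra freedom that the causal constraint exploits below.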

The action of the group or groupoid on

The case of interest arises when set

Without the partial order, the procedure immediately breaks down. The initial permutation groupoid contains all

In summary, given

The partial order is what makes the difference. Function

So far, we have discussed only pure Mathematics with no physical content. In Mathematics, the groupoid symmetries will work with any partial order, whether it has a physical meaning or not, and the resulting hierarchies will be correct but not necessarily related to the natural hierarchies.

In Physics, instead, we recognize the fundamental principle of causality as the common entity that underlies all complex systems. We choose the partial order to be causal, as dictated by the principle, and we expect the resulting structures to have a physical meaning, to be observable and measurable, and to show a quantitative agreement with those observed. When the partial order is chosen to be causal, the mathematical object

Causal set

A basic metric for causal sets is introduced to support measurement. The metric is not needed in Mathematics, but is necessary in Physics because it makes the invariants observable and therefore measurable. A new theory grounded on the principle of causality naturally emerges for a discrete causal space where a dynamics is defined as a process of graph search and invariants with a semantic value are derived directly from the groupoid symmetries without any intervening dynamic law. The findings hint at a deep connection between the principle of causality and a class of little-understood phenomena characterized by the formation of hierarchical patterns, including emergence, self-organization, adaptation, intelligence, and semantics. The causal groupoid symmetries of the complex system, described by function

The link between symmetry and invariance is well known in Mathematics and Physics. The theory formalizes this link. Set

From the point of view of Mathematical Logic, function

Causal groupoid symmetries were considered in [

Studies of symmetry, structure and invariance are traditionally carried out in the context of groups, but a generalization to groupoids was proposed [

Studies of adaptive dynamics resulting from the constraint of causality were also conducted in a very different context, that of Cosmology [

In yet another very different context, similar results were derived from Price’s equation of evolutionary biology, suggesting that emergence is a property of causal information [

Our work did not follow any of the paths just reported. It was prompted by the experimental discovery, around 2005, that certain canonical matrices had extraordinary self-organizing properties [

This is a bottom-up theory that follows directly from the principle of causality and an abstract metric for causal sets that is necessary to support measurement in Physics, and nothing else. There are no heuristics, no arbitrary constants. As such, the theory should be expected to have considerable impact, scope, and unifying power across disciplines.

The next two sections cover the theory. The abstract algebraic aspects of the theory are discussed in Section 2. Physics is discussed in Section 3, where the basic metric and an action functional are introduced. We show that groupoids arise naturally from causal models of complex systems. We also show that fractals and scale invariance are a direct consequence of causality. Block systems and conditions for them to be non-trivial are discussed in Section 4. A thought experiment that serves as a simple example of application of the theory is covered in Section 5. Practical details for constructing causal models of physical systems are covered in Section 6. Practical implementations and predictions are discussed in Sections 7 and 8, respectively.

Supplementary material included with this publication consists of two MS Word files. Other published studies include a study of randomly generated small systems [

In this section, we examine the Mathematics of causal groupoid symmetries in more detail. Let

irreflexive:

acyclic if

transitive if
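The three properties just listed (irreflexivity, acyclicity, transitivity) are the standard conditions for a strict partial order on a finite set, and can be checked mechanically. The following sketch is our own illustration, with helper names of our choosing; the relation is represented as a set of ordered pairs:

```python
# Illustrative checks (our own helpers) that a finite relation w ⊆ S×S is
# irreflexive, acyclic, and transitive, i.e. a strict partial order.

def is_irreflexive(w):
    return all(a != b for a, b in w)

def is_transitive(w):
    # whenever (a, b) and (b, d) are in w, (a, d) must also be in w
    return all((a, d) in w for a, b in w for c, d in w if b == c)

def is_acyclic(w):
    # depth-first search for a directed cycle
    succ = {}
    for a, b in w:
        succ.setdefault(a, []).append(b)
    def reaches(x, target, seen):
        for y in succ.get(x, []):
            if y == target or (y not in seen and reaches(y, target, seen | {y})):
                return True
        return False
    return not any(reaches(a, a, {a}) for a, _ in w)

w = {("a", "b"), ("b", "c"), ("a", "c")}  # a < b < c, transitively closed
assert is_irreflexive(w) and is_acyclic(w) and is_transitive(w)
assert not is_acyclic({("a", "b"), ("b", "a")})  # a 2-cycle is rejected
```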

Define now

To begin our study of dynamical invariants, let now

where permutations are represented in two-line notation, and

is a groupoid, where

A block system is, in turn, a causal set where the blocks are the elements. When blocks are constructed, some of the original ordered pairs become encapsulated in the blocks, and the remaining ones are induced in the block system and form new ordered pairs linking blocks. The new causal set has its own smaller causal space, groupoid, and block system. Repeating the construction, a fractal hierarchy of invariant blocks is obtained, all determined by the original causal set
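One contraction step of the construction just described can be sketched as follows. This is our own minimal illustration, not the paper's implementation: given the ordered pairs of a causal set and an assignment of elements to blocks, pairs internal to a block are encapsulated and the remaining pairs induce an order on the blocks.

```python
# Illustrative sketch of one contraction step: pairs inside a block are
# encapsulated; the remaining pairs induce ordered pairs linking blocks.

def contract(w, block_of):
    """Induce the ordered pairs between blocks from pairs between elements."""
    return {(block_of[a], block_of[b]) for a, b in w
            if block_of[a] != block_of[b]}

w = {("a", "b"), ("b", "c"), ("c", "d")}                  # a chain of 4 elements
block_of = {"a": "B1", "b": "B1", "c": "B2", "d": "B2"}   # two blocks of 2

induced = contract(w, block_of)
assert induced == {("B1", "B2")}   # (a, b) and (c, d) were encapsulated
```

Repeating `contract` on the block causal set, with the blocks as the new elements, produces the recursive hierarchy described in the text.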

To study dynamical invariants, we introduce a partial metric and a functional. Both were experimentally observed by the author [

where

Functional

where the factor 2 is included for backward compatibility. This functional is called the action of trajectory

Functional

We now return to our original goal of finding groupoids with non-trivial block systems, while at the same time avoiding introducing any heuristics or adjustable parameters into the theory. We have, however, intentionally left one open parameter: set

For example, in the thought experiment of Section 5, the 10 input variables

Given a causal set

reachable goals for a given

When a complex system evolves from state

From the two most fundamental principles of Physics, we know of two cases where such processes exist. From the Second Law we know that isolated systems spontaneously evolve towards a state of maximum uncertainty, and from the Principle of Least-Action we know that a system that evolves between two states follows a stationary-action trajectory. Are these two processes appropriately accounted for by the causal theory? Is our assumption that the functional of Equation (5) represents action justified? Are there or could there be any other processes that favor some form of selection of goals? And if so, what is the connection with emergent phenomena?

The plot in

We know that any

The

The

When viewed from the graph-theoretical point of view, the behavior of

Causal entropic forces always point in the direction of increasing entropy. An isolated system with a fixed energy will maximize its entropy and become an

But we still can’t predict when block systems are non-trivial. There is no known closed-form mathematical answer to that, and there is no proof for the existence of one either. For now, we must be satisfied with construction.

An imaginary scientist who lived before Newton and knew nothing about Euler's equations took measurements at fixed intervals of time using mirrors attached to rotating tops that reflected light onto fixed rulers. She noticed that shorter intervals resulted in better accuracy, but other than that, the measurements had no meaning for her. She measured 10 real-valued variables and gave them generic names:

The 21 lower-case variables become the 21 elements of set

The alphabetical order of the elements of

There is no need to determine the groupoid explicitly. Set

where we have re-inserted the algebraic signs. When the same step is completed for all 3 components, a pattern becomes apparent. To enhance the effect, we first rename the 10 input and 3 output variables as follows:

After renaming, the 3 final equations are:

The final step is to take the mathematical limit

The meaning of the variables is now familiar. The scientist’s problem is now solved, and the procedure used to solve it is of a general nature and completely unrelated to the specifics of the problem. The final expressions in Equation (12) are new facts derived directly from facts given in Equation (6). The symbols now have a semantic value because they carry meaning that did not originally exist. These results were not explicitly programmed. They follow directly from the Mathematics of groupoids and the fundamental principles of nature that describe the properties of physical matter.

A person who wanted to solve this problem would need skills possibly comparable to those of a mid-level Physics student. This person would start by examining the sequences in which the variables in Equation (6) are calculated. She would notice that variable

By contrast, it is possible to imagine an “intelligent machine” that would mechanize the entire process using causal sets and causal groupoid symmetries, and solve the problem without any human intervention. This machine would have no need for any of the skills expected from the Physics student. It would complete execution perhaps in microseconds instead of hours, simply because it would use far less information than the student, and because hardware runs much faster than wetware. Even the mathematical limit, needed to write the final Equation (12), is a simple causal operation in second-order causal logic (see Section 2.6 in [

Furthermore, the positive definite functional of Equation (5) can only be minimized by minimizing the positive contribution from each one of the causal pairs in the causal set. These contributions are only locally interdependent, and can be minimized locally, without the need for any information of a global nature. The property of locality naturally leads to a massively parallel computational architecture. A machine designed along these lines has been proposed in Section 4.3 of [

What is fascinating about such a machine is that it solves the problem without having been asked to do so. Nothing in the machine is specific to the given problem, nothing has been specifically programmed for any purpose other than minimizing the functional. The machine does not know any Physics, does not know what it is doing or why, and has not been trained with any skills other than the fundamental causal metric. The machine works only from input data containing a causal description of the topology of the problem. It simply decides to do what it does on its own, without any external direction. There is only one machine of this kind for all problems.

To the best of our knowledge, this is the first time that high cognitive behavior has been quantitatively shown to emerge from a non-explicitly programmed model.

Of course, the thought experiment is very small, and does not sufficiently represent the power of the ideas being presented. In 1985, on page 633 of Metamagical Themas, and in his familiar style, Douglas Hofstadter asked: “The major question of AI is this: What in the world is going on to enable you to convert 100,000,000 retinal dots into one single word ‘mother’ in one tenth of a second?” Each retinal dot generates a causal pair, of which the cause is the beam of light, and the effect is the signal the dot sends to the brain. There is going to be a causal set with 100 million pairs (additional geometric information is necessary but is ignored here for simplicity). The causal set will likely partition into many components, each representing features of the image. But the question of which features, and how to represent and inter-relate them, is decided by the process itself. There is no need to prescribe any “feature vectors” specific to the problem at hand. There is also going to be a groupoid and symmetries and a hierarchy of structures for each component. Will those structures be sufficient to identify the image? Will the machine solve the recognition problem in microseconds? Will the theory scale appropriately to such a large problem? We believe it will. We know it will scale because the fractal property of the invariants guarantees a constant execution time, independent of the size of the problem. We believe it will solve the recognition problem in microseconds. But the only way to know for sure is to try. Pronouncing the word “mother” would be a little beyond scope here, but it can be reduced to groupoids as well.
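The partition of a causal set into connected components, mentioned above, is a standard graph operation. The sketch below is our own illustration (our helper names, a union-find over the undirected skeleton of the causal pairs), not code from this work:

```python
# Illustrative sketch: partition a causal set into its connected components,
# each a candidate "feature" in the sense discussed above.

def components(elements, pairs):
    parent = {x: x for x in elements}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in pairs:                     # union over the undirected skeleton
        parent[find(a)] = find(b)
    groups = {}
    for x in elements:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

elems = {"p1", "p2", "p3", "q1", "q2"}
pairs = {("p1", "p2"), ("p2", "p3"), ("q1", "q2")}  # two disjoint fragments
parts = components(elems, pairs)
assert sorted(map(len, parts)) == [2, 3]
```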

As usual, the first step for the analysis of a complex system is to prepare a mathematical model of it. In the causal theory, this model is a causal set. Causal set models are ubiquitous. They can be found everywhere, and are part of our daily life and of all our scientific activities. But their direct use is relatively new in Science. To dispel concerns, we open the section with the following rules of thumb:

1. If you can write a computer program about it, then you can also write a causal set about it.

2. The output of any sensor is a causal set.

3. The input to any actuator is a causal set.

In fact, a computer program is a causal set written in a condensed human-readable notation. Everything in the program is finite, even “real” numbers. Everything in the program is causal and deterministic, even “random” numbers. The principle of causality posits that effects follow their causes, but does not require that causes be known for every effect or that “all” causal pairs for a given effect be known. The typical situation is one where we know some cause-effect pairs for some effects and want to test what we know, compare with observation, and eventually seek to obtain more information and repeat the process. Conversions from software notation to causal set notation are admittedly cumbersome, but will not be once they are automated for the most important programming languages. They have been discussed in Section III in [
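Rule 1 above can be illustrated with a toy conversion. The sketch below is our own, not the automated converter discussed in the cited work: each assignment statement `x = f(u, v, ...)` contributes the causal pairs `(u, x), (v, x), ...`, with causes preceding their effect.

```python
# Illustrative sketch of rule 1: a straight-line program, given as a list of
# (output, [inputs]) assignments in execution order, yields a causal set.

def program_to_causal_set(assignments):
    pairs = set()
    for out, inputs in assignments:
        for inp in inputs:
            pairs.add((inp, out))  # cause-effect pair: input causes output
    return pairs

# A tiny straight-line program:  s = a + b ;  t = s * c
prog = [("s", ["a", "b"]), ("t", ["s", "c"])]
cs = program_to_causal_set(prog)
assert cs == {("a", "s"), ("b", "s"), ("s", "t"), ("c", "t")}
```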

The same considerations apply to algorithms, sets of equations, either algebraic or differential, theories, and many other forms of expression. This is a causal world, and causality underlies everything. The rules of thumb above provide some idea of how to go from underlying causality to a causal set model.

The entire body of Science is a constant quest for causes. We use them to create mental models that allow us to predict, and we constantly refine them by searching for more detail. We find these activities natural.

At this point, it is important to compare the causal theory with the methods of statistical physics. In statistical physics, we deal with complex systems where it would be too difficult to gather enough information to directly apply the laws of Physics to every interaction. We are forced to disregard details that appear to be irrelevant, and to apply probabilistic methods trying to approximately explain the properties of matter in aggregate. Once disregarded, we do not have a direct mechanism to evaluate the relevance of a particular detail without significant changes to the model.

In causal physics, particularly when we deal with a complex system, we may again run into a situation where information is limited and incomplete, and it may be difficult or unfeasible to gather more detail. We start with a coarse model, and we still want to test what we have and compare with observations in the hope that the test will either be satisfactory or at least tell us where more detail may be necessary. When more detail becomes available, then that detail itself is a causal set, and we simply merge it with the original causal set and re-calculate the invariants. This is the causal version of the procedure known as machine learning in Artificial Intelligence. See Sections IV and V.F in [

The critical difference between causal analysis and statistical analysis is that statistical methods discard detail, while causal methods preserve it; in fact, there is no limit to the degree of detail they can carry. And because adding detail is a simple merge, it can be mechanized, with the additional detail obtained either by experiments carried out by human scientists or directly by sensors. Permanent causal models can be created that evolve automatically by learning from their environments.
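The merge operation referred to above is simply a union of elements and ordered pairs, after which the invariants are recomputed. The sketch below is our own illustration of this idea, with our own representation of a causal set as an `(elements, pairs)` tuple:

```python
# Illustrative sketch: refining a causal model by merging newly observed
# detail, which is itself a causal set, into the existing one.

def merge(cs1, cs2):
    """Union of elements and ordered pairs; invariants are then recomputed."""
    elements = cs1[0] | cs2[0]
    pairs = cs1[1] | cs2[1]
    return elements, pairs

coarse = ({"a", "c"}, {("a", "c")})                    # coarse model: a causes c
detail = ({"a", "b", "c"}, {("a", "b"), ("b", "c")})   # new detail: b mediates

elements, pairs = merge(coarse, detail)
assert elements == {"a", "b", "c"}
assert pairs == {("a", "c"), ("a", "b"), ("b", "c")}
```

Note that the coarse pair ("a", "c") survives the merge; nothing is discarded, in contrast with the statistical approach.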

In this sense, it can be said that the causal theory offers an alternative to Statistical Physics, with very different features.

The most interesting application of the causal theory to complex systems is expected to be the brain. It is our working hypothesis that the brain acquires causal information directly from the sensory organs and transforms it through a causal process. Other applications will include Artificial Intelligence and Computer Science.

The Supplementary Material contains examples of causal sets and of the procedures used to obtain them. Additional literature includes [

In this Section we review some practical details for a prototype computer implementation of causal groupoid symmetries and the causal theory that needs them. The host-guest process described above is a system with various basic components. It needs an input module that can receive causal data from a variety of sources, such as sensors, computer files, and existing algorithms or computer programs in various languages, and deliver the corresponding causal set. It also needs an output module with presentation capabilities and support for the user to view and use the results. And it needs a central processing unit that can find the least-action or maximum-entropy groupoid symmetries and corresponding structural hierarchies for the given causal set. The core unit consists of a linear array of simple computational units, called neurons, each of which can communicate only with its two nearest neighbors, but which are otherwise completely independent. All neurons work at the same time, each minimizing its own contribution, resulting in massive parallelism. The execution time for a given problem is constant and independent of the size of the problem, provided the number of neurons equals or exceeds the number of elements in the causal set.

This machine is not specific to any particular problem. It is general, the same machine for all problems. There is only one such machine, although many different implementations on the same or different substrates are possible. The difference will be in the kind of training it receives and in the size and computational power needed to address that kind of training. Once a machine is trained in some specific area, then copies can be made and individual machines created that specialize in subareas, as needed. Except for the “copies” part, the reader may have noticed that the present paragraph would apply to humans almost without change.

The functional in Equation (5) is positive definite, defined as the sum of positive contributions from each one of the causal pairs. It can only be minimized by minimizing each one of the contributions. The positive definiteness property connects local to global and is essential for emergence because it allows a collection of independent agents to achieve a global effect that exceeds the sum of their individual contributions. This positive definiteness is also critically necessary to allow the massively parallel implementation of the action minimization algorithm. Based on the independence of the agents, the functional is minimized when each agent locally minimizes its own contribution without the need for any global information or sense of purpose.
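The claim that purely local moves minimize the global functional can be demonstrated with a toy sketch. The quadratic per-pair contribution below is our own stand-in, not the paper's Equation (5), and all names are ours; the point is only that when the functional is a sum of positive, locally interdependent contributions, each agent can minimize its own share with no global information:

```python
# Illustrative sketch: a positive-definite functional that is a sum of positive
# per-pair contributions is minimized by independent local moves (coordinate
# descent). The contribution (x_a - x_b)^2 is a stand-in, not the paper's metric.

def functional(x, pairs):
    return sum((x[a] - x[b]) ** 2 for a, b in pairs)

def local_sweep(x, pairs, steps=200):
    """Each element repeatedly moves to minimize only its own pairs' terms."""
    for _ in range(steps):
        for k in x:
            nbrs = [x[b] for a, b in pairs if a == k] + \
                   [x[a] for a, b in pairs if b == k]
            if nbrs:
                x[k] = sum(nbrs) / len(nbrs)  # local minimizer of sum (x_k - x_n)^2
    return x

pairs = [("a", "b"), ("b", "c")]
x = {"a": 0.0, "b": 5.0, "c": 0.0}
before = functional(x, pairs)
after = functional(local_sweep(x, pairs), pairs)
assert after < before   # purely local moves reduce the global functional
```

Each update uses only nearest-neighbor values, which is what permits the linear-array, massively parallel architecture described above.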

These elements of design have been discussed in Section 4 of [

This design uses a large number of very simple processors, each with only a tiny amount of memory. The number of processors is required to be at least as large as the number of elements of the causal set to be processed, and the total execution time is constant, independent of the number of elements, provided the condition is satisfied that each processor can minimize its local contribution in a period of time that does not exceed a certain constant value. These conditions are strongly reminiscent of the proverbial massive parallelism of the human brain and of its apparently very fast response time even to the largest of problems. In fact, they suggest an explanation for the large size of the human brain. However, the hardware prototype would be expected to run perhaps 10^{6} times faster than the brain just because electronics is much faster than wetware. In time, we expect that microchips with millions, perhaps billions of processors will be built.

No unit of the type proposed here has yet been built, at least not to our knowledge. All computational work and examples discussed in this paper were performed on a personal computer.

In [

Based on the working hypothesis, a causal set model of the brain was proposed. The model is a network where nodes correspond to elements and links to relations in the causal set. The physical length of the connections corresponds to the abstract measures

One of the predictions states that, if the brain does have the ability to minimize action, then dendritic trees must be as short as possible. At that time, the belief in Neuroscience was that the total length of dendritic trees followed a 4/3 power law, which is not optimally short. Later, however, a team of neuroscientists working independently from us proposed an optimally short 2/3 power law with a wide experimental base including the human brain and even across species [

Another prediction was also made in [

Still under the same working hypothesis, we predicted that the brain should operate like a host-guest dynamical system, where the host is unconscious and of a thermodynamic nature, while the guest is conscious and algorithmic. We predicted this independently, from the model, and later found out that in the mid-19th century, in his studies of the visual system in humans, Physicist and Physiologist Hermann von Helmholtz had predicted that a process of inference should exist in order to complete the processing of an image. He named this process “Unconscious Inference”, because we are not aware it is taking place. We also learned of the Dual Process theory in Psychology, which proposes an implicit, unconscious process that is automatic and that we cannot control, and an explicit, conscious process that we do control. This convergence of concepts from three very different sources is remarkable, particularly because the causal theory is not a theory of the brain and contains nothing indicating that brains even exist.

The European Example in the Supplementary Material is a black-box brain experiment, where the results of a task usually performed only by humans and considered a high brain function are compared with results predicted directly by the causal theory. The task in question is object-oriented design: the design of the class and inheritance structure for the architecture of a computer program, usually done by a human analyst from a given problem statement. To do the experiment, we start from the finished product, an OO design and program developed by human analysts. We remove all structure and order, leaving only the causal relationships in the program, but no information at all about the classes. This step leaves only a causal set, nothing else. Then, we apply the usual causal theory, which has not been explicitly programmed to do OO design and knows nothing about computers or programs, and certainly not about software engineering. The result is a class structure nearly identical to the original man-made one.

Another example in the Supplementary Material, the Point Separation example, is concerned with the problem of image recognition, and corresponds to

It should be noted at this point that the limitation is one of architecture, and would not be solved by a supercomputer. The correct architecture is discussed in Section 4.3 of [

Finally, the thought experiment in Section 5 starts from (imaginary) experimental information supposedly measured for a rotating top, and leads to the Euler equations for rotating bodies obtained directly from the causal theory and the notion of mathematical limit. Once more, the causal theory knows nothing about tops or theories of Physics and has not been told to derive any equations.

The value of these predictions and verifications stems from the fact that the same non-explicitly programmed system was used for all cases, without any modifications and without any programming specific to any particular problem.

A causal theory of Physics is proposed where a causal set is considered as a mathematical model of a complex system and groupoid symmetries existing in the causal set are applied directly to derive a hierarchy of invariants for that system. The theory is fundamental because it follows directly from the principle of causality and a basic causal metric, and does not include any heuristics or adjustable parameters. It applies to all causal systems, even under conditions where only partial information is available, because any model can always be refined by adding more information as it becomes available. As such, it provides an alternative to traditional statistical methods.

A discrete causal space independent of traditional space and time is introduced, where states, trajectories and invariants exist, and invariance is proposed as the base for semantics. Groupoid symmetries with invariant hierarchies of block systems naturally exist in causal space as a consequence of the partial order of trajectories, and many of them may be imprimitive, meaning their invariant hierarchies are non-trivial. These hierarchies are the invariants being sought.

The metric naturally partitions the causal space first into connected components, then into macrostates, and finally into block systems that are invariant under the action of the groupoid. The partition gives rise to exactly three types of macrostates that can be specified by optimization and are therefore fundamental. Any of the other macrostates would have to be specified by parameter and would not be fundamental. In correspondence with the fundamental macrostates, three very different classes of physical systems can be identified, designated as

These considerations hint at a deep connection between causality and invariance with self-organization, emergence, semantics, and, ultimately, intelligence and adaptation. The theory is proposed as a formalization of this connection. Preliminary computational experiments and predictions made for the brain, both quantitative and qualitative, some already confirmed, suggest a correlation with observed cognitive structures.

To the best of our knowledge, this is the first time that high cognitive behavior has been quantitatively shown to emerge from a non-explicitly programmed model. The results are remarkable because the software has never been told to do what it did, and was never given any problem-specific goals. It simply decided to do it. We said in the Introduction that our focus was on groupoids with non-trivial block systems. That’s our focus, but our vision goes much farther: it is to make intelligence a problem of Physics and Applied Mathematics.