Motoyosi Sugita: A "Widely Unknown" Japanese Thermodynamicist Who Explored the 4th Law of Thermodynamics for the Creation of a Theory of Life

The purpose of this paper is to introduce to Western readers Motoyosi Sugita, a Japanese thermodynamicist who is now "widely unknown", together with his study of the thermodynamics of transient phenomena and his theory of life. He was one of the top theoretical physicists in Japan before, during, and after WWII, and after the war he promoted the establishment of the Biophysical Society of Japan as one of its founding members. Nevertheless, both he and his studies seem to be totally forgotten today, even though his work was of great importance for the study of life. In this paper I therefore present, as the first review of the physics work of Motoyosi Sugita, what kind of person he was and what he studied. I follow his past studies to introduce his ideas in theoretical physics as well as in biophysics. He proposed bright ideas such as the quasi-static change in the broad sense, the virtual heat, and the field of chemical potential in order to establish his own thermodynamics of transient phenomena as a generalization of the Onsager-Prigogine theory of irreversible processes. With the concept of the field of chemical potential, which accommodates nonlinear transport, he seemingly succeeded in going beyond the scope of Onsager and Prigogine. Once he had established his thermodynamics, he explored the existence of a 4th law of thermodynamics as the foundation of a theory of life, and he applied it to broad categories of transient phenomena, including life and living beings, for example in his theory of metabolism. He regarded the 4th law of thermodynamics as a maximum principle in transient phenomena and tried to prove it throughout his life.
Since I have recently found that his maximum principle can be included in a more general one, known in the theory of optimal control as Pontryagin's maximum principle, I would also like to explain these theories produced by Motoyosi Sugita.


Who knows Motoyosi Sugita

Who is Motoyosi Sugita
The name of Motoyosi Sugita (see Figure 1) is "widely unknown" all over the world today, even in Japan. In this paper I would like to introduce this important Japanese theoretical physicist. I knew neither his name nor his work until the spring of 2016. When I recently started writing a paper on the thermodynamics of irreversible processes, I came across him, one of the brightest figures of early postwar Japan. He was in fact one of the top theoretical physicists in Japan right after WWII.
Motoyosi Sugita was one of the founding members of the Biophysical Society of Japan. To launch the society and to show the Japanese Government how important a scientific society for biophysics would be, the founders presented a book on biophysics as the proceedings of the first meeting of Japanese biophysicists [1]. In Part III of this book, Motoyosi Sugita wrote a review, "The Biological Open Systems and Fluid Equilibrium", in which "fluid equilibrium" means what is usually called "dynamical equilibrium". It was his lifetime objective to construct a "thermodynamics of life".
Whenever I read one of his articles, I am impressed by his very deep thought on life as well as on thermodynamics. His ideas seem to me very important, indeed crucial, for understanding the physical aspects of life, and therefore very promising for making a breakthrough in theoretical biology. Thus, in this paper I would like to summarize what I have learned from his works.
In September 1956, he became a professor at both the Tokyo Commercial University and Hitotsubashi University, an arrangement that lasted until it was abolished by the Government's change of the school system in 1962.
In April 1959, he became a lecturer for the intensive course on Modern Technology at the Department of Management Science, Konan University, and a lecturer for the intensive course on General Commercial Engineering at the Department of Economics, Oita University.

Marriage
Motoyosi Sugita married Ms. Grace Sakae Oyama in 1933 (see Figure 4). Grace is her Canadian name. She was a Japanese Canadian whose Christian family had immigrated from Hirosaki, Aomori, Japan. She was born in Vancouver, British Columbia, Canada, and graduated from the Victoria High School in Vancouver. She entered the Nursing School in Lamont, Alberta, Canada in 1928 and graduated at the top of her class. She became the first Japanese-Canadian nurse in Canada.
Figure 4: The picture was taken when he was 56, in front of her parents' home in Hirosaki, Aomori, Japan, in July 1961, when they visited to greet the family right before leaving to attend conferences in Canada and the USA. During that trip abroad she was able to meet her family and relatives in Canada for the first time in the 28 years since she had come to Japan to marry him. (Courtesy of Ms. Setsu Honda.)
She came to Japan in 1933 to marry Motoyosi Sugita, leaving her family in Canada. During WWII she went through a very sad time, because Japan and Canada became enemies and her family in Canada was forcibly sent to internment camps there. Long after WWII, when the couple visited the USA and Canada to attend international conferences on biophysics and bioengineering, she was able to meet her family members in Canada for the first time in 28 years.
They had one son, Yūkiti. After Motoyosi and Sakae died, Yūkiti went to Indonesia on business. However, his business failed and he returned to Japan. He spent his final days in his family's summer house in Hokuto-shi, Yamanashi, Japan, and died on 2 August 2012. The only relative of Motoyosi Sugita's family in Japan is Ms. Setsu Honda, who lives in Hirosaki, Aomori. The other relatives now live in Canada.
The above information was kindly provided in a letter from Ms. Setsu Honda, for which I am grateful from the bottom of my heart.

Visits abroad
In July 1961, he visited the United States of America and Canada for three months. He attended the 4th International Conference on Medical Electronics held in New York and the International Conference on Mathematical Biology held in North Carolina.
In August 1965, he visited the U.S.S.R., Austria, Italy, France, England, West Germany, and Denmark for three months. He attended the International Conference on Molecular Biology held in Napoli, Italy, and the 2nd International Conference on Biometrics held on Helgoland, West Germany.
At that time, he became the president for Bioengineering of the Japanese Society for Medical and Biological Engineering (until 1967).
In 1967, he visited the U.S.S.R., Sweden, West Germany, the Netherlands, Belgium, France, Switzerland, and Austria for three months. He attended the 7th International Conference on Medical Electronics held in Stockholm, Sweden, and the 3rd International Conference on Biometrics held on Helgoland, West Germany.
In 1969, he retired from Hitotsubashi University and became a professor emeritus (see Figure 5). On 14 January 1990, he passed away at the age of 85.

The Research History of Motoyosi Sugita
He began to study physics as early as the 1930s, before WWII. At first he seems to have spent much of his time translating German physics papers, such as those of Carl Wagner [93] and Georg Siemens [94], into Japanese and publishing the translations in the Journal of the Mathematical and Physical Society of Japan (Su-butsu Gakkai Shi). Once he had found the concept of the virtual heat, he applied it to the thermodynamics of transient phenomena and published the resulting papers in German in Japanese journals [14,15,16,17,18,19,20,21].
He thus seems to have been an expert in the German language within the Japanese physics community before WWII. The Japanese education system of that time had been modeled admiringly on the German system, and German was taught as the first foreign language in Japanese schools. Before WWII, Germany was also one of the leading countries in the sciences, including chemistry and physics.
Although it is now entirely forgotten, Japan was for a long time a leading economic supporter of Germany, whose economy had been ruined by WWI. Many Japanese businessmen also supported German society privately. A famous example was Hajime Hoshi, one of the richest men in Japan at that time and the founder of the Hoshi Pharmaceutical Company and the Hoshi College of Pharmacy. Hajime Hoshi supported the Chemical Society of Germany for a long time while Germany recovered [95], up to the era of Adolf Hitler's Third Reich.
Nearly ten years before WWII, Sugita published several well-known papers in German as well as in English in Japanese journals [14,15,16,17,18,19,20,21], as mentioned above. After WWII, however, his works seem to have been ignored in the Japanese physics community. The reason is that the Japanese education system was then totally changed from the German-based model of the prewar period to one fitted to the English-speaking society of the U.S.A., and English replaced German as the first foreign language in schools.
Motoyosi Sugita studied the foundation of thermodynamics for biological systems [2] and continued this work after WWII. Following the line of his studies of German physics, Germany being the leading country in physics at that time as mentioned above, he studied the theories of the German physicists Becker and Döring [96] and Volmer [97] and of the Soviet physicist Frenkel [98] on cluster growth in metastable phases of supersaturated vapors.
As early as 1948, when the damage of WWII to society had only slightly eased, Motoyosi Sugita published in the Japanese journal Kagaku an important paper that discussed the relationship between metastable (or quasi-static) phenomena in thermodynamics and biological phenomena [23]. It was also included in his textbook entitled Thermodynamics of Transient Phenomena [3].
As Japanese society was recovering, by 1950 he published a more fundamental paper in the Japanese journal Seibutsu Kagaku [29]. After a long study of the thermodynamics of transient phenomena such as life, he first postulated that a 4th law of thermodynamics might exist; otherwise one cannot understand biological phenomena. He stated his considerations in §5, entitled "Can one consider the 4th law of thermodynamics?". I would like to quote the corresponding part from its English version [60]:
"· · · · · · By the way, let us think here the circumstance deeply. According to the 2nd law of thermodynamics, Gibbs' free energy, G, of the world has tendency to decrease in isothermal and isometric change. On the other hand, we find the tendency that the velocity of decreasing of G, i.e., Ġ wants to take a large value as far as possible. This might be a general principle of nature which I should like to call temporarily the 4th law of thermodynamics.
The foundation of such a large principle will be discussed later, and we can suggest here that it is very important and beneficial idea that the nature of the transient phenomena as well as the living system may be clarified and explained uniformly by this principle.
There are many delicate problems concerning human thought if we propose to clarify the nature of life on the basis of physical and chemistry. In any way the matter looks like as if it were concerned in the 4th law. · · · · · ·"
Hence, following the line of thought of Motoyosi Sugita [29], I can summarize the laws of thermodynamics as follows:
(i) The first law (W. Thomson's principle): The Gibbs free energy G is conserved in a closed system; G = 0.
(iii) The third law (Nernst's theorem): The entropy approaches zero as the absolute temperature T approaches zero; S(T = 0) = 0.
(iv) The 4th law (M. Sugita's postulate): The decreasing rate of the Gibbs free energy always takes the maximum in any process; |Ġ| = max, where Ġ ≡ −|Ġ| ≤ 0.
Fortunately, Sugita published the above paper in English one year later [60]. Unfortunately, the journal in which it appeared, that of Hitotsubashi University (to which he belonged), was not at all well known among Western or even Japanese physicists. Moreover, it was not available to the public for a long time, until it recently became accessible through the internet.
In 1953 Motoyosi Sugita found a way to apply the thermodynamics of transient phenomena to more realistic biosystems such as metabolic systems [41,42,43,44,45,76,77,78,79]. With this, his research entered its second stage, the construction of a thermodynamics of life.
Step by step, his way of thinking became cybernetics-like, with feedback control systems playing an important role in his theory [46,47,48,49,50,51,52,53,54,55,56,57,58,59,81,83,84,85,86,87,88,89,90,91]. One of these works was cited in Stuart Kauffman's famous book, The Origins of Order [99]. Since no computer was readily available for such biosystem calculations in Japan at that time, Motoyosi Sugita collaborated with electrical engineers to construct analog-digital computer circuits. They used the circuits to simulate their own models of metabolic control systems and to obtain the solutions. These ideas were summarized in books [1,5,6].
After retiring from Hitotsubashi University, Motoyosi Sugita began to write and publish many books for the general public [7,8,9,10,11,12,13]. Although he published many scientific papers and textbooks as well as many general books in Japanese, he published only about ten papers in English, for unknown reasons. That is why he has remained so unknown in Western countries as well as in Japan. Hence hardly anybody knows him nowadays, and neither did I, in spite of his extremely important contributions to thermodynamics.
In this paper I would like to review for readers, especially those in Western countries, some important consequences of his theory and to discuss the maximum principle in open non-equilibrium systems as the foundation of the 4th law of thermodynamics.
In Section 2, I will present the bright ideas of Motoyosi Sugita, such as the concepts of the broad quasi-static change, the irreversible cycle, and the virtual heat.
In Section 3, I will discuss Motoyosi Sugita's approach to diffusion phenomena as the first successful application of his concepts.
In Section 4, I will review the theory of phase change and condensation as preparation for the following sections.
In Section 5, I will present Motoyosi Sugita's thermodynamics of transient phenomena, where his theory of chemical reactions is formulated using the concept of the field of chemical potential.
In Section 6, I will present Motoyosi Sugita's concept of the maximum principle in transient phenomena. Here the conjecture |Ġ| = max, which is offered as a demonstration of the existence of the 4th law of thermodynamics, will be discussed.
In Section 7, I will compare the work of Motoyosi Sugita with that of Lars Onsager and Ilya Prigogine. I hope that the content of this section will be shared with Western readers.
In Section 8, I will discuss the maximum principle of Motoyosi Sugita and that of Pontryagin, as well as Bellman's principle of optimality. This includes my own theory of the application of Pontryagin's maximum principle to thermodynamics, and I therefore regard this section as the most important one.
In Section 9, I will present Motoyosi Sugita's theory of metabolism, which is the first application of his maximum principle to the theory of life. He spent as many as 20 years studying this problem repeatedly from many sides.
In Section 10, I will present Motoyosi Sugita's way of thinking about the theory of life. This opens the way to the thermodynamics of life, or of living beings, as well as to network thermodynamics.
In Section 11, the final section, a brief summary will be given.

The Bright Ideas of Motoyosi Sugita
A couple of years after Onsager published his seminal papers on the reciprocal relations in irreversible processes in 1931 [100,101], Motoyosi Sugita published a theory of thermoelectric effects and Kelvin's relation in 1933 [17]. Much later, during WWII, this was published in Japanese in a Japanese journal [20,21] and included in his new textbook of thermodynamics, "Netsu Rikigaku Shinko" (New Lectures on Thermodynamics) [2]. In this research he introduced the concepts of the broad quasi-static change, the virtual heat, and the irreversible cyclic process in order to describe irreversible changes in the thermodynamics of transient phenomena.

Sugita's Concept of the Broad Quasi-static Change
By the term "broad quasi-static change", Motoyosi Sugita [2,3] meant the quasi-static change "in the broad sense". The broad quasi-static change is defined when two conditions, (i) and (ii), may be assumed, in which $f_1$ and $f_2$ are some functions of $V$ and $T$, respectively. These conditions generalize the case of ideal gases, where $P = kT/V$ and $U = \frac{3}{2}kT$ hold with $k$ being the Boltzmann constant. When they are satisfied, one can largely follow the standard approach to quasi-static processes in thermodynamics, in which the process is taken to be indefinitely slow. Indeed, even in non-ideal cases the treatment of the quasi-static change in the normal sense has been applied and has given plausible results.
This means that local equilibrium can hold even in non-equilibrium states, so that the broad quasi-static change makes sense provided one excludes relaxation phenomena, in which conditions (i) and (ii) are not satisfied. As long as relaxation phenomena are left aside, the broad quasi-static change can be applied to most irreversible processes. The concept has remarkable potential for development once it is combined with statistical mechanics.
Suppose that the system consists of many macroscopic parts, each in an equilibrium state. Let us denote by $S_i$ the entropy of the $i$-th small part and by $F_i$ its Helmholtz free energy. The entropy $S$ and the Helmholtz free energy $F$ of the whole system are given by $S = \sum_i S_i$ and $F = \sum_i F_i$. Therefore, if we use $S_i = k \log W_i$ and $F_i = -kT_i \log Z_i$, with $T_i$ the temperature of the $i$-th part, we have $S = k \sum_i \log W_i$ and $F = -k \sum_i T_i \log Z_i$. This means that when one considers the broad quasi-static change in irreversible processes, one should not, and need not, take $W$ or $Z$ over the entire phase space. One must cut the phase space into small pieces that are each in local equilibrium and concatenate them to cover the whole phase space. This is the meaning of local equilibrium from the point of view of Motoyosi Sugita.
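The additivity over locally equilibrated parts can be checked with a few lines of Python. This is only a minimal sketch: it assumes the Boltzmann relation $S_i = k \log W_i$ for each part, and the microstate counts are hypothetical numbers chosen for illustration, not values from Sugita's work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

# Hypothetical microstate counts W_i of three locally equilibrated parts.
part_microstates = [1.0e6, 3.0e8, 2.5e4]


def boltzmann_entropy(w: float) -> float:
    """Entropy S = k log W of one part (assumed Boltzmann relation)."""
    return K_B * math.log(w)


# Additive route: sum the entropies of the individual parts.
entropy_sum = sum(boltzmann_entropy(w) for w in part_microstates)

# Multiplicative route: multiply the W_i first, then take a single logarithm.
entropy_from_product = boltzmann_entropy(math.prod(part_microstates))

print(entropy_sum, entropy_from_product)
```

Both routes give the same number, which is why the state count never has to be taken over the entire phase space at once: it is enough to know it piece by piece.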

Sugita's Concept of the Virtual Heat
Another assumption underlying the theory of irreversible processes is the concept of the virtual heat. Consider the case in which the heat $\delta Q_e$ is ejected from a reservoir outside the system. Suppose that the reservoir undergoes the same kind of broad quasi-static change as the system under consideration. Let us denote by $\delta Q$ the heat absorbed by the thermodynamic system and by $T$ its temperature. Let us assume that the origin of the irreversibility lies inside the system and consider friction inside the system. Then the energy balance of the system holds, in which $dU$ is the change in the internal energy and $dW$ is the work done on the outside. For the expansion of a piston-cylinder system, $dW = P\,dV$. Suppose that there is friction between the wall of the cylinder and the piston and that the work $f\,dV$ is consumed by the friction.
What is important here is the distinction between $\delta Q$ and $\delta Q_e$, and that between Eq. (2.A) and Eq. (2.B). In Eq. (2.5), if the work $f\,dV$ is consumed and becomes heat outside the cylinder, then for the inside of the system (e.g., the gas) we have $dU + P\,dV = \delta Q_e = \delta Q = T\,dS$. (2.12) For the outside of the system, the reservoir gives the heat $\delta Q_e$ to the gas and at the same time receives, as heat, the energy lost from the system through the work $f\,dV$. The entropy balance of the reservoir, and hence that of the whole system, follows from these two contributions. Thus, Sugita's concept of the virtual heat is very natural and important when one considers the thermodynamics of transient phenomena.
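The bookkeeping of $\delta Q$, $\delta Q_e$, and the virtual heat can be illustrated numerically. The sketch below is not taken from Sugita's texts. It assumes one mole of ideal gas expanding isothermally, a constant hypothetical friction pressure `friction_pressure`, a reservoir at the same temperature as the gas, and friction heat that stays inside the gas, so that $\delta Q = \delta Q_e + f\,dV$ (the opposite of the case of Eq. (2.12), where the friction heat leaves the cylinder). All names and numbers are illustrative.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def isothermal_expansion_with_friction(temp, v1, v2, friction_pressure):
    """Heat and entropy bookkeeping for 1 mol of ideal gas, V1 -> V2 at fixed T."""
    gas_work = R * temp * math.log(v2 / v1)       # integral of P dV
    virtual_heat = friction_pressure * (v2 - v1)  # integral of f dV, kept by the gas
    heat_absorbed = gas_work                      # dQ = dU + P dV with dU = 0
    heat_from_reservoir = heat_absorbed - virtual_heat  # dQ_e = dQ - f dV
    return {
        "virtual_heat": virtual_heat,
        "ds_gas": heat_absorbed / temp,               # dS = dQ / T
        "ds_reservoir": -heat_from_reservoir / temp,  # reservoir loses dQ_e
    }


# Hypothetical numbers: 300 K, volume doubling from 1 L, friction pressure 10 kPa.
result = isothermal_expansion_with_friction(300.0, 1.0e-3, 2.0e-3, 1.0e4)
total_entropy_change = result["ds_gas"] + result["ds_reservoir"]
print(result, total_entropy_change)
```

In this sketch the entropy of the gas changes by the state-function value $R \log(V_2/V_1)$ whatever the friction is, while the total entropy of gas plus reservoir grows by exactly the virtual heat divided by $T$.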

Sugita's Concept of the Irreversible Cycle
Let us now consider Sugita's concept of the irreversible cycle. Let us denote by $T_0$ the lowest temperature available from the thermal reservoir. Let us denote by $dS_e$ the entropy change of the reservoir at temperature $T_e$ and by $\Delta Q_e$ the heat it loses in the course of the irreversible cyclic process. Then we have
$dS_e = -\frac{\Delta Q_e}{T_e}$, (2.15)
since the reservoir gives the heat $\Delta Q_e$ to the system. Let us denote by $T$ and $S$ the temperature and the entropy of the system (working material) that performs the cycle. Then we have
$dS = \frac{\Delta Q_e + \Delta H}{T}$, (2.16)
since the system receives the heat $\Delta Q_e$ in addition to the virtual heat $\Delta H$ of the process, where $T < T_e$. Hence, for the whole system, we have
$dS + dS_e = \frac{\Delta Q_e + \Delta H}{T} - \frac{\Delta Q_e}{T_e}$. (2.17)
After finishing the cyclic process, the system has to come back to its initial state, which yields
$\oint dS = \oint \frac{\Delta Q_e + \Delta H}{T} = 0$. (2.18)
Hence, the total entropy change in the cycle is given by
$\oint (dS + dS_e) = \oint dS_e \ge 0$. (2.19)
On the other hand, for reversible processes, since $\Delta H = 0$ and $T = T_e$, this total change vanishes. Thus, in the irreversible cycle the entropy change occurs only in the thermal reservoir outside the system, so that Eq. (2.18) does not conflict with the second law of thermodynamics. The equality in Eq. (2.18) has the important meaning that the integral of $dQ/T$ vanishes when the path is closed around the irreversible cycle. Therefore, one can treat the thermodynamics of the irreversible cycle quantitatively in the same way as that of the reversible cycle.
I would like to note that this aspect of Sugita's concept of the irreversible cycle differs from Prigogine's concept, in which the entropy change of the system is treated as a quantity that always increases during the process, $dS > 0$ (in their notation, $d_i S > 0$) [102,103,104,105,106,107,108,109,110,111,112]. This is a consequence of the fact that they never consider irreversible cycles. Now let us consider the change in the thermal reservoir. The heat $dQ_e$ is ejected to the system. Therefore, its entropy change $dS_e$ is given by
$dS_e = -\frac{dQ_e}{T_e}$, (2.22)
since $-dQ_e$ is the heat absorbed by the reservoir. Here the quantity $dQ_e/T_e$ has been called the reduced heat. However, the reduced heat is not a real heat for the reservoir, since its sign is opposite to that of the entropy change of the reservoir [Eq. (2.22)]. Thus, although the concept of the reduced heat has played a historical role, it is not so important as a physical quantity. From Eq. (2.19) together with Eq. (2.15), the following holds:
$\oint \frac{\Delta Q_e}{T_e} \le 0$. (2.23)
This is nothing but Clausius' inequality for the irreversible cycle, where the equality holds for reversible processes. It can be regarded as a generalization of the standard proof of Clausius' inequality to irreversible cycles [2,3,4,113]. Historically, Clausius was the first to derive this relation. From the fact that the equality holds in reversible processes, he showed that $\Delta Q/T$ is an exact differential, and from this he derived the entropy. In his approach he represented the closed curve of any thermodynamic cycle by a staircase of adiabatic and isothermal curves.
Regarding the cycle as a combination of many infinitesimal Carnot cycles, denote by $T_{n1}$ and $\Delta Q_{n1}$ the temperature and the received heat of the $n$-th high-temperature source, and by $T_{n2}$ and $\Delta Q_{n2}$ the temperature and the ejected heat of the $n$-th low-temperature source. Taking into account the sign of each heat, he proved the relation by summing the reduced heats of all the sources, Eq. (2.24). If the intervals are made infinitesimally small, Eq. (2.24) becomes an integral around the cycle, Eq. (2.25), which is Clausius' inequality as he derived it himself [113]. However, once we look at the expressions in Eq. (2.24), the sign of the reduced heat $\Delta Q_{n1}/T_{n1}$ or $-\Delta Q_{n2}/T_{n2}$ is opposite to that of the entropy change of the thermal source. In quasi-static reversible processes it becomes exactly the entropy change of the working material, and hence the equality in Eq. (2.25) can hold. For this reason it has usually been thought that "one cannot treat the theory quantitatively since the inequality holds in the irreversible processes" or that "when one seeks the entropy by $dQ/T$, $dQ$ must be the heat received under the reversible process". This is because one avoids the complexity of the irreversible processes by not taking the working material into account and by discussing the cycle only in terms of the heats lost by the sources. On the other hand, when the process can be regarded as a broad quasi-static change (which is not always the case) and when the true character of the virtual heat, such as friction heat or thermal conduction, is clearly known, we can take $\Delta Q = \Delta Q_e + \Delta H$ instead of $\Delta Q_e$. Then, as discussed above, even in irreversible processes we can define the entropy by Eq. (2.17) and can use the equality of Eq. (2.18) as well.
Since the state returns to the initial state after the completion of the cycle, it seems trivial that "the entropy returns to its initial value as well". However, because it has been said that in irreversible processes "one cannot say anything about the entropy in the midst of the process", statements like the above could not be made. We should note that $dQ$ is obtained neither by ignoring the irreversibility nor by intentionally regarding the system as reversible.
In order to derive the entropy, one must consider quasi-static reversible processes and define the "heat" of the processes in the standard way. But once the entropy has been defined under the quasi-static change, one can use the relation $dQ = T\,dS$ in the broad quasi-static change, whether the process is reversible or irreversible. This point seems to us to be neither clear nor emphasized in the standard texts. In other words, $dQ$ and $dQ_e$ are not distinguished carefully enough according to one's needs. That is, if one takes $dQ_e$, then the inequality of Eq. (2.23) holds; if one takes $dQ$, then the equality of Eq. (2.18) holds. The above argument can be generalized to cases such as thermal conduction within the working material or a non-uniform temperature.
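The contrast between the equality for $dQ$ and the inequality for $dQ_e$ can be made concrete with a toy cycle. The following Python sketch is an illustration under stated assumptions, not a calculation from Sugita's papers: one mole of monatomic ideal gas runs a Stirling-like cycle (two isotherms, two isochores); friction with a hypothetical constant pressure `friction_pressure` heats the gas on both isothermal strokes; on the isochores the gas exchanges heat with a reservoir held at the target temperature, so that heat flows across a finite temperature difference.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)
C_V = 1.5 * R    # molar heat capacity of a monatomic ideal gas


def cycle_sums(t_hot, t_cold, v1, v2, friction_pressure):
    """Return (sum of dQ_e/T_e, sum of dS of the gas) over one full cycle."""
    ln_r = math.log(v2 / v1)
    virtual = friction_pressure * (v2 - v1)  # virtual heat of each isothermal stroke

    steps = []  # (reduced heat dQ_e/T_e, entropy change dQ/T of the gas)

    # 1) Isothermal expansion at t_hot; dQ = P dV = dQ_e + virtual heat.
    q = R * t_hot * ln_r
    steps.append(((q - virtual) / t_hot, q / t_hot))

    # 2) Isochoric cooling, gas in contact with the reservoir at t_cold.
    steps.append((-C_V * (t_hot - t_cold) / t_cold, C_V * math.log(t_cold / t_hot)))

    # 3) Isothermal compression at t_cold; friction again heats the gas.
    q = -R * t_cold * ln_r
    steps.append(((q - virtual) / t_cold, q / t_cold))

    # 4) Isochoric heating, gas in contact with the reservoir at t_hot.
    steps.append((C_V * (t_hot - t_cold) / t_hot, C_V * math.log(t_hot / t_cold)))

    clausius_sum = sum(step[0] for step in steps)
    gas_entropy_change = sum(step[1] for step in steps)
    return clausius_sum, gas_entropy_change


clausius_sum, gas_entropy_change = cycle_sums(400.0, 300.0, 1.0e-3, 2.0e-3, 1.0e4)
print(clausius_sum, gas_entropy_change)
```

With these hypothetical numbers the sum of $dQ_e/T_e$ is negative, as in Eq. (2.23), while the sum of $dQ/T$ for the working gas vanishes up to rounding, as in Eq. (2.18). Setting the friction to zero and the two temperatures equal makes both sums vanish.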

Application to Kelvin's Relation in the Thermoelectric Effect
As an application of the above results, we are now able to treat the problem of thermoelectricity (see Figure 6), which had been thought difficult to handle. Let us denote by (1) the high-temperature part with temperature $T_1$ and by (2) the low-temperature part with temperature $T_2$ [3]. Denote by $H_1$ the heat ejected from (1) and by $W_1$ and $w_1$ the heats dissipating in wires A and B, respectively. Seen from the outside of (1), the ejected heat per second, $\dot{Q}_e$, is determined by these three heats. Next, let us denote the resistance by $r_1$. The Joule heat (regarded as the virtual heat) due to the electric current $i$ is $r_1 i^2$. Therefore, the heat that this part actually absorbs per second, $\dot{Q}$, is $\dot{Q}_e$ together with this Joule heat. This is the so-called Peltier heat. If we take $\pi$ as the proportionality constant, then $dQ = \pi i\,dt$, which gives the heat balance of part (1). For part (2) we similarly obtain the corresponding balance, where the minus sign on the left-hand side comes from the reverse direction of the current $i$ in the circuit. Next consider the element $\Delta x$ of wire A. Define by $W$ the heat conducted along the wire into the element and by $W'$ the heat conducted out of it. Denote by $dH$ the heat emitted from the element $\Delta x$ to the surroundings and by $r_A \Delta x$ the resistance of the element. The heat that this element absorbs then follows from these quantities and corresponds to the quantity $dQ$. This is called the Thomson heat, where $\sigma_A$ is a constant and $dT$ is the temperature difference across the element. Similarly, for wire B, considering the element $\Delta y$, we obtain the corresponding relation, where the sign on the right-hand side is due to the fact that the direction of the temperature gradient is opposite to the direction of the current $i$.
In the stationary state, the sum of the heats that the system absorbs equals the work that the electromotive force $E$ can do. Hence one has Eq. (2.32), which corresponds to the relation (2.B) in integrated form, where the integral of $P\,dV$ corresponds to $Ei$. On the other hand, the equation corresponding to the relation (2.A) is given by Eq. (2.33). Next, suppose that the relation $dQ = T\,dS$ (assuming $T = T'$) of Eq. (2.8) holds, where the assumptions (i) and (ii) are necessary in order to satisfy the relation (2.B). Assumption (i) is easy to satisfy for $U$. For (ii), however, it is not easy to relate $P$ to the electromotive force in equilibrium. Therefore, we assume $dQ = T\,dS$ itself as the foundation of the proof. This is obviously justified when the thermoelectric couple is in the stationary state. Then the entropy change occurs only in the thermal reservoir (or one may regard one round of an electron through the circuit as a cycle). From Eq. (2.32) and Eq. (2.33), Kelvin's relation follows. The conventional argument, by contrast, is based on Clausius' inequality, Eq. (2.37). The derivation of Eq. (2.37) is as follows. Denote by $a_A$ and $a_B$ the cross sections of wires A and B and by $\lambda_A$ and $\lambda_B$ their thermal conductivities, respectively. With this notation one writes the heat conduction relations for each wire, integrates by parts for wire A, and proceeds similarly for wire B. Here we regard $r_1 i^2$ and $r_2 i^2$ as included in the sums of $r_A \Delta x\, i^2$ and $r_B \Delta y\, i^2$. Therefore, based on the inequality (2.37), one cannot derive Kelvin's relation unless the thermal conduction and the Joule heat can be neglected. Namely, taking Eq. (2.37) as the basis is not wrong but not sufficient [3]. Boltzmann [114] had shown a related relation, and Sugita [2,3,4,20,21] pointed out the insufficiency of Tolman's argument [115] on the same problem.

Sugita's Approach to the Diffusion Phenomena
Let us next consider Motoyosi Sugita's approach to diffusion phenomena [3]. In order to see how the concepts of the broad quasi-static change and the virtual heat work in a concrete physical problem, he applied them to the diffusion problem [3]. Diffusion phenomena are of real importance when we construct the thermodynamics of transient phenomena. I would like to follow the argument in his textbook [3].

Langevin Equation
Now let us consider an ideal gas mixture of two species of molecules, 1 and 2. For simplicity, we assume that the density gradient exists only in the x direction. This is equivalent to considering the one-dimensional diffusion problem along the x direction.
Let us denote by $m_1$ ($m_2$) the mass of each molecule of species 1 (species 2), and by $n_{1i}$ ($n_{2i}$) the number of molecules per cm$^3$ of species 1 (species 2) that have the velocity $v_{1i}$ ($v_{2i}$) along the x direction. The mass of species 1 that passes per second through an interface of 1 cm$^2$ at x is $m_1 \sum_i n_{1i} v_{1i}$. Now let us denote by $n'_1$ and $n'_2$ the mole numbers per cm$^3$ of species 1 and 2, that is, $n'_j = \frac{1}{N_a} \sum_i n_{ji}$ ($j = 1, 2$), where $N_a = 6.022 \times 10^{23}$ is the Avogadro number. Taking $m_1 \approx m_2$, and if one can ignore the shift of the center of gravity due to the diffusion, one obtains from Fick's law a relation between the flux and the density gradient of each species $i = 1, 2$. If the shift cannot be ignored, one has to include the viscous term of the fluid. With a suitable definition of the friction coefficient $K_i$, the above equation takes the form of a Langevin-type equation, Eq. (3.6). So far we have considered the case of ideal gases. For more general cases, the right-hand side of Eq. (3.6) can be rewritten in terms of the gradient of the chemical potential [Eq. (3.6′)], or alternatively in terms of the flux $J_i$. As Sugita pointed out, the Langevin equation is an equation of motion for a motion that is regarded as statistically stationary, so that the acceleration has no effect on the averaged velocity and the "random forces" exerted by like particles in thermal motion are statistically averaged out.
Next, let us explain that when the work done against the frictional resistance $K_i v_i$ is added to the system as the virtual heat, the resulting increase of entropy equals the entropy of mixing of the molecules of species 1 and 2.

Mixing Entropy and Free Energy
where $\mu_{i0}$ is assumed to be constant in the system and $R = N_a k$. Then, the Gibbs free energy of the entire system should be given as If the equation is read in reverse, then the chemical potential has to be defined by Now, since $G$ decreases by the mixing, from Eq.(3.9) we have This means that in a mixture of ideal gases the mixing entropy increases as the molecules mix, and $G$ decreases accordingly. Moreover, since this decrease of $G$ is a decrease per second in the irreversible process, it tells us not only the difference between the initial and final states but also how $G$ changes in time in the midst of the process.
Based on the Langevin equation Eq.(3.6′), suppose that the molecule is driven by the force $-\mathrm{grad}\,\mu_i(x)$ and moves against the friction. Suppose further that the work done against the friction can be regarded as the virtual heat, so that the entropy of the system increases by the virtual heat.
For the sake of simplicity let us limit ourselves to ideal gases. Then the energy $K_i v_i^2$ per molecule is supplied per second as heat from the virtual heat source. Per unit volume this amounts to $n_i K_i v_i^2$, and for the whole system the heat supplied is $\sum_i \int n_i K_i v_i^2\,dx$. If we assume the process is an isothermal change, then replace $K_i v_i$ by Eq.(3.6′). We have The first term on the right hand side vanishes once the boundary condition is imposed. In the second term on the right hand side, since we have the continuity equation: if we change $n'_i$ to $n_i$ in Eq.(3.1′), then Eq.(3.13) can be represented as On the other hand, in the ideal system the change in $G$ arises only from the change of entropy. Therefore, if we rewrite $\Delta S$ as $\dot{S}$, then we obtain $\dot{G} = -T\dot{S}$. Hence, this agrees with Eq.(3.11). From the above consideration, we can recognize that the assumption that the force acting on the diffusing particle is given by Eq.(3.6′) is a reasonable one. Let us next consider the case of non-ideal systems [28]. In this case the chemical potential $\mu_{i0}$ becomes density dependent. In isothermal and isobaric mixing, a volume change and a flow of heat into and out of the system can occur, and therefore the entropy change is no longer given by that of ideal mixing, $R \sum_i n_i \log c_i$. Even in this case, Eq.(3.9) and Eq.(3.11) hold true, and the entropy increase due to the reception of the virtual heat is given by Eq.(3.15) as well. How can one interpret this? The mixing entropy in this case can differ because an exchange of ordinary heat occurs in addition to the virtual heat.
If temperature $T$ and pressure $P$ are kept constant, and if the work done on the outside of the system is given by $P\,dV$ and the change in the internal energy is given by $dU$, then the heat flowing into or out of the system is given by $\delta Q_e = dU + P\,dV$. Therefore, once we add the heats due to Eq.(3.15) and Eq.(3.16) to the above increase of entropy due to $\delta Q_e$, we find $\dot{S}$ This is not an inconsistency; on the contrary, everything is fully consistent. If we compare the above with Eq.(3.16), then $\dot{G}$ This means that $\dot{G}$ is produced by $\Delta S$ ($\equiv \dot{S}$). Thus, even in this case of non-ideal gases, the entropy increase due to the friction accompanying the mixing can be regarded as the heat received from the virtual heat source. This also gives thermodynamical support to the phenomenological construction of the theory in which a quantity such as $\mathrm{grad}\,\mu_i$ is regarded as a "field". Motoyosi Sugita called this approach the Onsager-Meixner-Sugita method [3].
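The statement that the frictional work, received as virtual heat, accounts for the decrease of $G$ in ideal mixing can be checked numerically. The following is only a minimal sketch under stated assumptions, not Sugita's own calculation: two species interdiffuse in a closed one-dimensional box at unit total density, $\mu_i = kT \log c_i$ (the constant $\mu_{i0}$ is dropped), the flux obeys Fick's law, and the grid, time step, initial profile and all function names are hypothetical choices made for illustration.

```python
import math

def mixing_free_energy(c1, kT=1.0, dx=1.0):
    """Ideal mixing part of G for two species at unit total density.

    G = kT * sum_x dx * [c1 ln c1 + c2 ln c2] with c2 = 1 - c1
    (the constant mu_i0 terms are dropped; assumption of this sketch).
    """
    return kT * dx * sum(c * math.log(c) + (1 - c) * math.log(1 - c) for c in c1)

def interface_fluxes(c1, D=1.0, dx=1.0):
    """Fick flux J1 of species 1 across each interior interface (J2 = -J1)."""
    return [-D * (c1[j + 1] - c1[j]) / dx for j in range(len(c1) - 1)]

def dissipation_rate(c1, kT=1.0, D=1.0, dx=1.0):
    """Discrete analogue of sum_i integral n_i K_i v_i^2 dx.

    Uses K_i v_i = -grad mu_i and J_i = n_i v_i, so each interface
    contributes -J_i * (difference of mu_i across the interface).
    """
    total = 0.0
    for j, J1 in enumerate(interface_fluxes(c1, D, dx)):
        dmu1 = kT * (math.log(c1[j + 1]) - math.log(c1[j]))
        dmu2 = kT * (math.log(1 - c1[j + 1]) - math.log(1 - c1[j]))
        total += -J1 * dmu1 + J1 * dmu2  # species 2 carries the flux -J1
    return total

def run(n_cells=20, steps=2000, dt=0.01, D=1.0, dx=1.0, kT=1.0):
    """Explicit time stepping; returns the history of G and the summed virtual heat."""
    c1 = [0.9] * (n_cells // 2) + [0.1] * (n_cells - n_cells // 2)
    G_history = [mixing_free_energy(c1, kT, dx)]
    virtual_heat = 0.0
    for _ in range(steps):
        virtual_heat += dissipation_rate(c1, kT, D, dx) * dt
        J = [0.0] + interface_fluxes(c1, D, dx) + [0.0]  # closed walls: no flux
        c1 = [c1[j] - dt * (J[j + 1] - J[j]) / dx for j in range(n_cells)]
        G_history.append(mixing_free_energy(c1, kT, dx))
    return G_history, virtual_heat

G_history, virtual_heat = run()
free_energy_drop = G_history[0] - G_history[-1]
```

In this sketch `G_history` never increases, and `virtual_heat` agrees with `free_energy_drop` up to the time-discretization error, which is the discrete counterpart of the agreement between the virtual heat and the decrease of $G$ discussed above.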

How to Count the Number of Partition
For this title, Motoyosi Sugita used the term "number of complexions". The number of complexions is nothing more than the number of partition in classical statistical mechanics, or the number of states in quantum statistical mechanics, in the modern terminology [116,117]. When the system is in equilibrium, the number of ways in which particles can interchange their positions in the system is nothing but the number of complexions in his terminology. In this equilibrium case, one can definitely define the chemical potential as in Eq.(3.8).
Can such a treatment be allowed even in the midst of a non-equilibrium process? This was Motoyosi Sugita's problem. Obviously, it is allowed for the equilibrium state. But it is not trivial for a non-equilibrium state in irreversible or transient phenomena.
Suppose that the previous argument that provides Eq.(3.8) is correct [28]. Assume that $\mu_{i0}$ is constant. Then one finds where $R = N_a k$ is the gas constant and $N_a$ the Avogadro number. Or, regarding the above integral over $dx$ as a sum over $\Delta x$, we can rewrite it in a different form: Hence, by taking the functional derivative of the above with respect to $n_i(x)\Delta x$, the chemical potential $\mu_i(x)$ is given by The second term on the right hand side comes from the mixing entropy of the system. Therefore, it corresponds to what one assumes that the number of partition at an instantaneous time in the midst of the process is given by is derived directly. Therefore, the meaning of $W_i$ is the partition number that is considered within the local part $\Delta x_i$. Now, suppose that the whole region of space is divided into small parts $\Delta x_i$ and that the traffic of molecules between the small parts is forbidden. In the time interval $\Delta t$ of the irreversible process, each part attains its local equilibrium with the concentration $c_i(x_j)$, that is, the concentration of the $i$-th molecule at $x_j$ within $\Delta x_j$, such that the number of partition is given by Eq.(3.23) and $G$ is given by Eq.(3.21). Corresponding to this state, a certain region in the phase space has to be considered in the statistical mechanical sense.
Next, if we remove the partitions and wait for a short time so that the diffusion occurs, and if we place the partitions in the system once again, then the distribution of the molecules becomes different concentrations of c i (x j ) from those before, since the diffusion mixes molecules within such a short time. And the system becomes a new local equilibrium since the partitions are entered in the system. If we looked at the corresponding area in the phase space, then it would move to the position slightly away from the previous one as long as the time interval is very short.
In this way, as the distribution of concentrations changes continuously by the diffusion, the corresponding region in phase space also moves continuously through the regions corresponding to each state in the midst of the irreversible process, and finally enters the region corresponding to the final equilibrium state. Once the true equilibrium is achieved, it becomes necessary to consider not the exchange or mixing of the molecules within each region $\Delta x_j$ but the exchange or mixing of the molecules over the entire space. Hence, in equilibrium the number of partition has to be calculated for the entire system.
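The counting with partitions inserted and removed can be made concrete with a small combinatorial sketch. It assumes, purely for illustration, a lattice-type counting in which the number of complexions of a cell holding $N_{ij}$ molecules of species $i$ is the multinomial coefficient $N_j!/\prod_i N_{ij}!$; the cell populations and the function names below are hypothetical and are not taken from [3].

```python
import math

def log_multinomial(counts):
    """ln( (sum counts)! / prod(counts!) ), evaluated with lgamma."""
    total = sum(counts)
    return math.lgamma(total + 1) - sum(math.lgamma(c + 1) for c in counts)

def log_W_local(cells):
    """ln W with the partitions in place: molecules permute only inside their cell."""
    return sum(log_multinomial(cell) for cell in cells)

def log_W_global(cells):
    """ln W in true equilibrium: molecules permute over the entire volume."""
    n_species = len(cells[0])
    totals = [sum(cell[i] for cell in cells) for i in range(n_species)]
    return log_multinomial(totals)

def stirling_local(cells):
    """-sum_j sum_i N_ij ln c_i(x_j): the local mixing entropy divided by k."""
    s = 0.0
    for cell in cells:
        n = sum(cell)
        s -= sum(n_ij * math.log(n_ij / n) for n_ij in cell if n_ij > 0)
    return s

# Hypothetical snapshots (species 1, species 2) in four cells of 1000 molecules each.
early = [(900, 100), (700, 300), (300, 700), (100, 900)]
later = [(600, 400), (550, 450), (450, 550), (400, 600)]
mixed = [(500, 500)] * 4

for name, cells in (("early", early), ("later", later), ("mixed", mixed)):
    print(name, log_W_local(cells), stirling_local(cells), log_W_global(cells))
```

In this toy counting the local value of $\log W$ grows from one snapshot to the next as diffusion equalizes the concentrations, stays close to the mixing-entropy expression $-\sum N_{ij}\log c_i(x_j)$, and never exceeds the value obtained when the partitions are removed for good, which corresponds to the expansion of the accessible region of phase space described in the text.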
There is no limit for time in the equilibrium state, since each process is reversible so that an infinite time can be spent. Therefore, even if molecules are far apart to each other, the state that the molecules exchange their positions can be realized by the ergodic assumption. When the diffusion occurs, c i (x j ) at each point can change continuously in time. If c i (x j ) is fixed at some moment, then for such a short time the molecules within ∆x j can exchange their positions and the macroscopic mixing state in the so-called relaxation time can be realized. The above result supports this sort of consideration.
In other words, the molecules within $\Delta x_j$ are ceaselessly jumping from one position to another and passing through microscopically different states by thermal fluctuations. In terms of the phase space, we think that the representative point performs a Brownian motion within a limited partial region of the phase space, and during the time interval considered above it wanders over the points of that region. This is a rougher approximation than what the ergodic assumption requires, but the point wanders to the extent that a macroscopic state near the equilibrium state can be realized. Hence, the number of partition is calculated within this region as in Eq.(3.23). Stated explicitly, the region in which the Brownian motion is carried out gradually shifts and expands, and finally becomes an extremely broad region that corresponds to the equilibrium. Although this speculation must still be justified by statistical mechanics [118,119], it is inevitably required by the macroscopic argument.
On the other hand, for each diffusion molecule, it should not move with the constant velocity v i in Eq.(3.6 ′ ) but the central point of the molecules moves with each molecule following the Brownian motion. In many molecules there are higher speed molecules and the lower speed molecules at the same time, and even there exist molecules that reverse their motion. The averaged velocity among such molecules like the molecule of species 1 is given by v 1 . In the same way, the chemical potential µ i is meaningful as an averaged value per molecule. This is related to v i . Thus, we consider the averaged quantities without considering each motion of the molecule such as the kinetic theory of gases. Although it is a phenomenological theory, in order to generalize the treatment of complex statistical mechanics kinematically and to relate it to thermodynamics, there exists nothing else so far.
In summary, the above is Motoyosi Sugita's argument on the application of the concepts of the broad quasi-static change and the virtual heat to diffusion phenomena [3]. He also applied these concepts to many other systems, such as the osmotic pressure across cell membranes. However, I would like to skip these here.

The Theory of Phase Change and Condensation
Before discussing Motoyosi Sugita's thermodynamics of transient phenomena in detail, let us look, as an excursion, at the theory of nucleation or condensation of the 1930s [96,97,98].
His theory was motivated by the aim of constructing a thermodynamic theory of life or living beings [3]. In order to construct the thermodynamics of life, he presented three main concepts: (i) the "field" of chemical potential; (ii) the generalized nonlinear Ohm's law; (iii) the maximum principle in transient phenomena.
He extracted these concepts from an intensive study of the theory of phase change and condensation or nucleation originated before WWII by the German physicists Becker and Döring [96] and Volmer [97] and the Soviet physicist Frenkel [98]. Therefore let us first review some of the early theory of nucleation phenomena as a prototype or precursor of Motoyosi Sugita's thermodynamics of transient phenomena.

Frenkel's Theory on Nucleation in the Supersaturated State
When Mayer [120] studied statistical mechanically the nature of vapor, he considered clusters that are aggregated into groups of several molecules in addition to single molecules. Frenkel [98] simplified this idea to phenomenologically represent such groups of several molecules as spherical clusters with a certain radius.
Let us now denote by (n) a cluster with $n$ molecules and by $\phi_n$ its free energy. Let us define by $\mu_l$ the chemical potential per molecule in the liquid state, and by $\sigma$ the surface energy per unit area of a small drop of radius $r$. In this setting, the free energy of the drop with radius $r$ is given by $\phi_n = n\mu_l + 4\pi r^2 \sigma$. (4.1) If we define by $\nu$ the number of molecules in 1 cm$^3$ in the liquid state, then a drop of radius $r$ contains $n = \frac{4}{3}\pi r^3 \nu$ molecules. Next, define by $N_n$ the number of (n)-clusters in the vapor and define their concentration by $c_n = N_n/N$; then the Gibbs free energy $G$ of the whole vapor is given by $G = \sum_n N_n \phi_n + kT \sum_n N_n \log c_n$, (4.4) where $T$ is the temperature of the system and $k$ is the Boltzmann constant. Here $\phi_1 = \mu_g$ is the chemical potential for a single molecule in the vapor, and the second term in Eq.(4.4) comes from the mixing entropy of the molecules. In evaluating the mixing entropy between single molecules and clusters of different sizes, the expression of the mixing entropy for molecules of identical size has been assumed and applied to this case. This is a problem, and as $n$ becomes large the expression for the mixing entropy must be modified. Furthermore, we cannot imagine that big liquid drops can float in the vapor. Therefore, the above expression of the Gibbs free energy is not quantitatively exact. However, if we restrict ourselves to the phenomenological argument, then it seems sufficient to represent the system approximately by Eq.(4.4).
On the other hand, let us denote by $N'$ the total number of molecules in the whole system. We now have $N' = \sum_n n N_n$. Suppose that this mixture of clusters in the vapor is in equilibrium. In this case, seeking the minimum of $G$ under the conditions $\delta N' = 0$ and $\delta G = 0$, we obtain after some manipulation $N_n = N \exp[-(\phi_n - n\mu_g)/kT]$, that is, $c_n = \exp[-(\phi_n - n\mu_g)/kT]$. (4.6), (4.7) This is the equation that determines the distribution of (n)-clusters, first derived by Frenkel [98]. It is the distribution of concentrations of (n)-clusters in the equilibrium state in the saturated and the supersaturated vapors. Its meaning is as follows: when some part of the gas forms a mixture of (n)-clusters and this mixture is included in the system, the entropy becomes larger than when all the molecules stay as single molecules in the vapor. If we look at each cluster, $\phi_n > n\mu_g$ means that, due to the effect of the surface energy, the free energy $\phi_n$ of the (n)-cluster is larger than $n\mu_g$, that is, $n$ times the free energy of a single molecule. However, if we look at the system as a whole, the free energy of the whole system may be lowered by the mixing entropy of the distribution of clusters of various sizes. Statistically speaking, the probability that the system is in a state in which (n)-clusters are mixed in exceeds the probability that all molecules remain single molecules.

Supersaturated Vapors
Next, following the argument of Frenkel [98], let us consider the supersaturated vapor. When the vapor is supersaturated, since $\mu_l$ (liquid) $< \mu_g$ (gas), from Eq.(4.1) we obtain $\phi_n - n\mu_g = -n(\mu_g - \mu_l) + \alpha n^{2/3}$, (4.8) where, if we denote by $\nu$ the number of molecules in 1 cm$^3$ in the liquid phase, $\alpha$ is given by $\alpha = 4\pi\sigma \left(\frac{3}{4\pi\nu}\right)^{2/3}$. (4.9) According to Eq.(4.8), as $n$ becomes large, $\phi_n - n\mu_g$ becomes large and negative. Therefore, according to Eq.(4.6), only clusters of large $n$ exist. This means that an infinitely large single cluster exists in the vapor, which is nothing but the liquid phase. Thus one can understand the behavior of the supersaturated vapor system phenomenologically. If $\mu_l = \mu_g$, then the vapor phase and the liquid phase are in equilibrium and the system is a saturated vapor; if $\mu_l > \mu_g$, then the system corresponds to a superheated liquid; if $\mu_l < \mu_g$, then the stable state of the system is the liquid phase. However, since the changeover is hindered, the system exists transiently as a vapor, i.e., the gas phase, which is the supersaturated vapor. If one treats the system as being in equilibrium, then one obtains the liquid phase only. So the supersaturated vapor phase can never be obtained by the standard approach of equilibrium thermodynamics.
Motoyosi Sugita pointed out the following [3,23]: As stated in the previous section, one can assume that even in transient phenomena the expression of $c_n$ in Eq.(4.7) holds true, and one can apply this $c_n$ to the expression for the Gibbs free energy $G$ in Eq.(4.4) in the midst of the transient phenomena. This means that one must not take the partition number over the entire space. If we did so, the supersaturated state would never appear theoretically, as mentioned above. In this case we are in effect cutting off the clusters larger than a certain size and freezing the present $c_n$ temporarily. Now, what hinders the changeover from the vapor phase to the liquid phase is the lack of surface area on which condensation can take place. In order for the liquid phase to emerge in the supersaturated vapor phase, the system has to pass through the state of extremely small liquid drops in the midst of the process. Large clusters are not yet present during the growth. The rate at which clusters gradually grow by condensation on the surface of small (n)-clusters is very small. Therefore, the system remains at a standstill as a vapor. According to Frenkel [98], this is the metastable state. The distribution of $N_n$ with small $n$ reaches equilibrium and is described by Eq.(4.6).
Motoyosi Sugita noted here the following: When the Gibbs free energy for the entire system G is decreasing as the second law of thermodynamics shows, there appears some part which has larger Gibbs free energy, and this supports the place where condensation occurs, and it determines the rate of the change in the transient phenomena.
The reason why clusters have large $\phi_n$ is that the arrangement of molecules is almost as crowded as in the liquid, while the potential energy between molecules is larger than that in a part of the same volume in the normal liquid. A cluster has a small entropy corresponding to the order of the arrangement, which is also important when we consider life or living beings. While $n$ is smaller than the size of the nucleus, since intermolecular forces exist, the probability that $n$ molecules get together to make a group of $n$ molecules is much larger than the probability of a chance $n$-tuple collision of molecules in an ideal gas where molecules move freely. This is also important when we consider life or living beings. It states that when the volume of phase space, that is, the thermodynamical probability, corresponding to the existence of a complex coacervate (the phase space when we consider only the molecules that constitute the coacervate) is small enough, its entropy is correspondingly small.
Next, let us differentiate Eq.(4.8) with respect to $n$ and find the maximum. Substituting the value of $\alpha$ and taking $\nu = \rho N_a / M$, where $\rho$ is the density of the liquid, $M$ the molecular weight and $N_a = 6.02 \times 10^{23}$ the Avogadro number, we have $r_K = \frac{2\sigma}{\nu(\mu_g - \mu_l)} = \frac{2\sigma M}{\rho N_a (\mu_g - \mu_l)}$, (4.10) where $r_K$ is the radius of the cluster at which the maximum occurs. On the other hand, let us denote by $p_a$ the pressure of the saturated vapor over a horizontally flat interface and by $p$ the pressure of the supersaturated vapor. Now we have $\mu_g - \mu_l = kT \log(p/p_a)$. From this we find Kelvin's equation: $\log\frac{p}{p_a} = \frac{2\sigma M}{\rho R T r_K}$, with $R = N_a k$. And if the maximum of $\phi_n - n\mu_g$ is written as $\phi_K - n_K \mu_g$, from Eq.(4.10) we obtain $\phi_K - n_K\mu_g = \frac{1}{3}\, 4\pi r_K^2 \sigma$, where $n_K$ is the number of molecules in the cluster at which the maximum occurs. According to Frenkel [98], the cluster of this size is the "nucleus of condensation". The range in which $\phi_n - n\mu_g$ increases with $n$ is the region $n < n_K$. At $n = n_K$ the saturation pressure in equilibrium with the nucleus is equal to the pressure $p$ of the supersaturated vapor. Hence (because $\phi_n - n\mu_g$ is at its maximum) an "unstable equilibrium" is realized between the vapor and the small liquid drops which become the nuclei. Note that smaller clusters come into equilibrium with the single molecules in the vapor only when many clusters exist in the vapor, whereas a nucleus can be in equilibrium even as a single nucleus, although this equilibrium is unstable.
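The size of the nucleus of condensation and the height of the free-energy maximum can be evaluated numerically from the standard capillarity expressions $\phi_n - n\mu_g = -n(\mu_g - \mu_l) + \alpha n^{2/3}$, $\alpha = 4\pi\sigma(3/4\pi\nu)^{2/3}$ and $\mu_g - \mu_l = kT\log(p/p_a)$. The sketch below is an illustration only: the material constants are rough, water-like values in SI units chosen as assumptions, the supersaturation ratio is arbitrary, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro number, 1/mol

def frenkel_alpha(sigma, nu):
    """alpha = 4*pi*sigma*(3/(4*pi*nu))**(2/3), the surface term coefficient."""
    return 4.0 * math.pi * sigma * (3.0 / (4.0 * math.pi * nu)) ** (2.0 / 3.0)

def excess_free_energy(n, dmu, alpha):
    """phi_n - n*mu_g = -n*(mu_g - mu_l) + alpha*n**(2/3)."""
    return -n * dmu + alpha * n ** (2.0 / 3.0)

def critical_nucleus(sigma, rho, M, T, S):
    """Nucleus of condensation for supersaturation ratio S = p/p_a > 1.

    sigma in N/m, rho in kg/m^3, M in kg/mol, T in K (assumed SI inputs).
    """
    nu = rho * N_A / M                      # molecules per m^3 of liquid
    dmu = K_B * T * math.log(S)             # mu_g - mu_l per molecule
    alpha = frenkel_alpha(sigma, nu)
    r_K = 2.0 * sigma / (nu * dmu)          # Kelvin radius
    n_K = (2.0 * alpha / (3.0 * dmu)) ** 3  # from d/dn of the excess free energy = 0
    barrier = 4.0 * math.pi * r_K ** 2 * sigma / 3.0
    return {"nu": nu, "dmu": dmu, "alpha": alpha,
            "r_K": r_K, "n_K": n_K, "barrier": barrier}

# Illustrative, water-like values (assumptions, not data from the source).
result = critical_nucleus(sigma=0.0728, rho=998.0, M=0.018015, T=293.0, S=4.0)
print(result["r_K"], result["n_K"], result["barrier"] / (K_B * 293.0))
```

For these assumed inputs the nucleus has a radius on the order of a nanometre and contains several tens of molecules, and the maximum of $\phi_n - n\mu_g$ equals one third of the surface energy $4\pi r_K^2\sigma$ of the nucleus, in line with the relation quoted above.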
In the above argument, we have assumed that in the metastable state the rate of growth of clusters is small enough. Although the system is not in equilibrium, we have treated it as if it were in equilibrium. This provides the variational principle $\delta G = 0$, which led to the distribution of clusters $N_n$ given by Eq.(4.6) or Eq.(4.7) for small $n$. Let us next discuss the rate of cluster growth in order to understand the essence of the metastable (quasi-static) state or the transient state. This concept seems to be common to the fundamental aspects of the problem when we consider the thermodynamics of life or living beings, although its theoretical appearance is rather different.

Kinetics of Cluster Growth
According to Becker and Döring [96], the growth of the cluster is carried out in the following process: (n) + 1 → (n + 1). (4.14) Here 1 stands for a single molecule, which collides with (n), the cluster consisting of $n$ molecules, and condenses onto it to form (n + 1), the cluster consisting of $n + 1$ molecules. Although many other processes may exist, such as (n − 1) + 1 + 1 → (n + 1), and so on, the probability of these processes must be very small compared with that of the collision process of Eq.(4.14), since triple collisions or collisions between (3) and 1 are, in terms of molecular dynamics, such rare events that they lose their significance for the formation of (n + 1). Thus, in the kinetics or the rate theory one has to pick out the rate-determining process only and ignore the other processes that do not determine the rate. This way of thinking is also helpful when we investigate life phenomena. Volmer [97] slightly improved the calculation of Becker and Döring [96]. In his calculation he denotes by $J$ the rate of the process of Eq.(4.14), which is the total number of clusters per unit time passing from (n) + 1 to (n + 1). Then he writes it as (4.16) where $N_n$ is the number of clusters (n), $W_I$ is the condensation rate per unit area, $W_{II(n+1)}$ is the evaporation rate per unit area, and $O'_n$ is the surface area of the sphere whose radius is the sum of the radius of the cluster (n) and the radius of a single molecule. Now in equilibrium of the growth process, the forward and backward processes balance each other: (n) + 1 ⇌ (n + 1). (4.17) This provides $J = 0$. From Eq.(4.16) together with this condition, the following relation follows: where $N$ is the total number of molecules in the system such that $N = \sum_n n N_n$. This can also be derived directly by applying Eq.(4.6) to $N_n$ and $N_{n+1}$. Hence this is nothing but the law of mass action for the chemical process of Eq.(4.17).
In the saturated vapor, it is no wonder that Eq.(4.18) must hold as well. However, even in the supersaturated vapor, the rate $J$ is expected to be small, such that $J \approx 0$. Therefore, in the latter case we expect that both Eq.(4.17) and Eq.(4.18) hold approximately. This is very important when we consider transient phenomena in the stationary state. On the one hand there exist main processes and undercurrent processes; on the other hand the processes (n) + 1 → (n + 1) and (n + 1) → (n) + 1 take place and the equilibrium between them is attained. This point is very important when we investigate the dynamic equilibrium.
In summary the above theory is the theory of condensation in the supersaturated state originated by Becker and Döring [96] and completed by Volmer [97] and Frenkel [98] in the Western countries before WWII.
In considering the kinetics of cluster growth thermodynamically, Motoyosi Sugita assumed that one half of Eq.(4.18) remains valid even in the non-equilibrium state, and he rewrote Eq.(4.18) as follows: Denote now by $c_n = N_n/N$ and $c_1 = N_1/N$ the concentration of the cluster (n) and the concentration of single molecules, respectively. Then $J$ is given as (4.19)
Let us introduce the concept of the chemical resistance $R_c$ such as (4.20) and let us define the chemical potential for a single molecule in the cluster (n) in the midst of the growth process by Then Eq.(4.19) can be rewritten as The above equation looks like Ohm's law, $J = \Delta\mu / R_c$, where $\Delta\mu = \mu_n + \mu_g - \mu_{n+1}$. The quantity in { } of Eq.(4.22) represents the "field of force". It is a kind of force that promotes the growth of clusters in such a way that the Gibbs free energy $G$ moves towards its minimum. It corresponds to the "chemical force" of Ostwald, and hence $R_c$ was called the "chemical resistance" [121]. From this point of view the growth process of clusters can be regarded as a system of resistances connected in series in an electrical circuit obeying Ohm's law. In the nucleation process the resistance $R_c$ at the size of the nucleus is the largest, and indeed the rate $J$ of the process is determined by it. Motoyosi Sugita noted the following: The mathematical form of Eq.(4.22) is much more convenient than that of Becker and others. It is not an equation that shows the destiny of the growth and the decomposition of individual clusters. Each cluster emerges as a result of "statistical and thermal fluctuations", and the system moves towards its dynamic equilibrium by following the field of force, as the hands of a clock move. This idea becomes very important when we think of crystal growth and living beings. What is important here is not the quantitative but the qualitative content of the theory. In order to introduce the concept of the "field of chemical potential", i.e., the "µ-field", the expression of $J$ was written in the form of Eq.(4.22) in terms of the chemical potentials $\mu_n$ and the chemical resistance $R_c$. Here $R_c$ has been assumed to be determined empirically by experiment. Thus, the above Ohm's-law-like theory is a phenomenological one, a rough preparation for the more exact and detailed theory that will appear in the future.
When $R_c$ is to be calculated mathematically using molecular statistics, one has to take fluctuations into account and improve the mathematical methods of statistical mechanics. At this stage, however, we restrict ourselves to the phenomenological, preparatory theory.
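The picture of cluster growth as a chain of resistances in series can be illustrated with a few lines of code. The sketch assumes the linear Ohm-type form $J = \Delta\mu / R_c$ for every step (n) + 1 → (n + 1) and a stationary state in which the same $J$ passes through all steps; the resistance values, the total drop of chemical potential and the function names are hypothetical numbers chosen only to show that the largest resistance, the one at the nucleus size, controls the rate.

```python
def steady_rate(delta_mu_total, resistances):
    """Stationary rate J through chemical resistances in series (linear Ohm-type law)."""
    return delta_mu_total / sum(resistances)

def potential_drops(delta_mu_total, resistances):
    """Drop of chemical potential across each growth step, Delta mu_n = J * R_n."""
    J = steady_rate(delta_mu_total, resistances)
    return [J * R for R in resistances]

# Hypothetical resistance profile peaked at the size of the nucleus (arbitrary units).
resistances = [1.0, 5.0, 50.0, 1000.0, 50.0, 5.0, 1.0]
delta_mu_total = 10.0

J = steady_rate(delta_mu_total, resistances)
drops = potential_drops(delta_mu_total, resistances)
J_bottleneck_only = delta_mu_total / max(resistances)

print(J, J_bottleneck_only)
print(drops)
```

In this toy example almost the whole drop of chemical potential falls across the largest resistance, and the rate computed from that single step alone is already close to the rate of the full chain, which is the sense in which the nucleus is the rate-determining step.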

Motoyosi Sugita's Theory of Chemical Reactions
Let us consider a chemical reaction of the following form [104]: $\sum_i a_i A_i \rightarrow \sum_j b_j B_j$, where $A_i$ ($B_j$) stand for the reacting (produced) molecules and $a_i$ ($b_j$) are positive integers representing the stoichiometry of the reaction. For example, a simple reaction case looks like the following: On the other hand, Motoyosi Sugita [3,60] frequently used the following expression for it: $\sum_i \nu_i A_i \rightarrow \sum_j \nu_j A_j$, where $A_i$ ($A_j$) stand for the reacting (produced) molecules and $\nu_i$ ($\nu_j$) are positive integers representing the stoichiometry of the reaction. He distinguished the reacting and the produced molecules by the indices $i$ and $j$. Denote by $N_i$ ($N_j$) the number of reacting (produced) molecules of the $i$th ($j$th) species in moles (simply called the mole numbers). Let us denote by The changes in mole numbers of the reacting and produced molecules are represented by where $a_i$ and $b_j$ are all positive integers. Here $d\xi$ and $d\xi/dt$ are called, following Théophile De Donder [102,103,104], the extent of reaction (or degree of advancement) and the reaction rate, respectively.
On the other hand, Motoyosi Sugita used $\Delta n$ instead of $d\xi$ and regarded it as the increment in molecular numbers such that where $i$ and $j$ stand for the reacting and produced molecules. In this way Motoyosi Sugita emphasized the change of molecular numbers in molecular statistics [3,5,60], while De Donder's school emphasized the change in the reaction equations [102,103,104]. Following Motoyosi Sugita's theory [5], let us define the Gibbs free energy $G$ as $G = \sum_k N_k \mu_k$, where $N_k$ and $\mu_k$ are the number and the chemical potential of the molecule $A_k$, respectively, and all indices for reacting ($i$) and produced ($j$) molecules have been included in the same index $k$. Equivalently, this leads to the following definition of the chemical potential: $\mu_k = \partial G / \partial N_k$. It means that if the amount $\Delta N_k$ of the $k$th molecules is added to the system from outside, then the work $\mu_k \Delta N_k$ is done such that the Gibbs free energy increases. Since the sum in the Gibbs free energy $G$ is linear in $N_k$, it is a homogeneous equation. Hence, we have Using the reaction equation of Eq.(5.4), we have where if $\Delta n > 0$ then the reaction proceeds from the left to the right. Substituting Eq.(5.9) into Eq.(5.8), we can derive the following: This is nothing but the affinity introduced in Western countries by De Donder [102] and Prigogine [103,104] before WWII, as well as by Marcelin [122] and Jouguet [123] even before WWI. From the knowledge of thermodynamics, $G$ becomes a minimum when the system reaches chemical equilibrium. Hence, in chemical equilibrium, $G = \min$; in other words $\delta G = 0$. Therefore, if we regard $\Delta G$ as a variation, then we find ∆G ≥ 0. (5.12) This yields the following criterion for the direction of the reaction of Eq.(5.2): This is the criterion given through the concept of affinity. Let us denote by $S_M$ the mixing entropy [see Eq.(3.23)] of the system of chemical molecules.
The Gibbs free energy $G$ can be written as Suppose simply that the mixing entropy is approximately given as where $R = N_a k$ is the gas constant as before. Substituting Eq.(5.15) into Eq.(5.14), the chemical potential $\mu_k$ is given by Substituting these into $\Delta\mu = 0$ in Eq.(5.11), we obtain ∑ where we have defined as Rewriting Eq.(5.18), the law of mass action in chemical equilibrium is given by where $K(T)$ is called the equilibrium constant at temperature $T$. Now let us consider the case when the system is not in equilibrium. In this case, Eq.(5.4) yields the rate equation: where $k_f$ means the reaction coefficient for the forward (from the left to the right) process while $k_b$ means the reaction coefficient for the backward (from the right to the left) process. We now define the relationship between $k_f$, $k_b$ and $K$ as Here if we follow the argument of Prigogine et al. [103,104], then we may define as

Motoyosi Sugita's Concept of the Generalized Nonlinear Ohm's Law
Following the theory of Motoyosi Sugita [5], let us define the chemical resistance $R_c$ of the chemical reaction by Here I would like to note the following: If we rewrite as respectively, then we can derive the relation of detailed balance: Thus, the equation for $R_c$ [Eq.(5.25)] expresses a kind of detailed balance. Motoyosi Sugita went beyond the standard way of thinking and assumed that the chemical potential is meaningful even when the system is not yet in equilibrium. Therefore, he assumed for all components of the molecules Using Eq.(5.28) together with Eq.(5.25), he was able to rewrite Eq.(5.24) as follows: Applying these to Eq.(5.21) or Eq.(5.23) and writing $J = dn/dt$, he was able to derive the following: This expression has the generalized nonlinear form of Ohm's law: Now we obtain the following relation: where ⇐⇒ means that the left hand side is equivalent to the right hand side. Thus we recover the same criterion as that in Eq.(5.13) by considering Eq.(5.32):

Motoyosi Sugita's Concept of the Field of Chemical Potential
Eq.(5.31) was first introduced by Motoyosi Sugita long ago [23,29,60]. He was inspired by the expression of Eq.(4.22), which was derived by Becker and Döring [96], Volmer [97] and Frenkel [98] (abbreviated as BDVF). They studied the theory of condensation by considering the nucleation of clusters in the supersaturated state, so that their expression describes the non-equilibrium state in the irreversible process of condensation. Motoyosi Sugita recognized that when chemical equilibrium is slightly broken, or when chemical reactions are going on, the physical conditions in chemical reactions are the same as those in condensation and nucleation. Thus I would like to express schematically the relationship between Sugita's $J$ and the BDVF $J$ in the following: The mathematical form in { } in the above represents a kind of "chemical force". This concept is different from another concept of "chemical force", the affinity of De Donder [102,103,104]. The affinity is defined as $\Delta\mu$ in Eq.(5.11). On the other hand, Sugita's concept of "chemical force" is related to the change in molecular numbers $\Delta n$ in Eq.(5.10). Motoyosi Sugita called it the "µ-field" or the "field of chemical potential" [3]. Both types of "chemical forces", acting on either $\Delta\mu$ or $\Delta n$ ($\propto J$), take place in irreversible processes. Thus, he emphasized the use of the concept of the "field of chemical potential" for almost all transient phenomena.
As was discussed in the previous section, one can derive the above expression for $J$ in Eq.(5.35) without any problem. However, the expression for $R_c$ cannot be derived from the theoretical framework of equilibrium statistical mechanics. It is a phenomenological expression that is supposed to be determined by experiment. Nevertheless, a kind of justification was later given by Eyring et al. [124,125] using the statistical method, and Becker and Döring [96] and Volmer [97] required a kinematical treatment in the theory. Therefore, Motoyosi Sugita called the standpoint concerning $R_c$ the kinematical situation. On the other hand, he stated that when one goes deep into the details, one gets stuck without being able to step forward any further, since one meets very difficult problems in statistical mechanics. In order to escape from these difficulties, one has to be satisfied with considering only the so-called quasi-thermodynamics, which is slightly generalized so as to fit transient phenomena by using the concept of the µ-field. By this approach, one becomes able to apply the idea and concept to many biological systems. He called this standpoint the equilibrium theoretical situation.
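The relation between a mass-action rate and a nonlinear Ohm-type law driven by chemical potentials can be illustrated numerically. The sketch below uses only the standard relations $J = J_f - J_b$ with $J_f = k_f \prod_i c_i^{\nu_i}$, $J_b = k_b \prod_j c_j^{\nu_j}$, the affinity $\Delta\mu = RT \log(J_f/J_b)$ that follows from $\mu_k = \mu_{k0} + RT\log c_k$ and $K = k_f/k_b$, and the identity $J = J_f\,[1 - \exp(-\Delta\mu/RT)]$. The effective linear resistance $RT/J_f$ used near equilibrium, the rate constants, the concentrations and the function names are assumptions made for illustration and need not coincide with Sugita's definition of $R_c$.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)

def forward_backward(kf, kb, reactants, products):
    """Forward and backward mass-action rates.

    reactants and products are lists of (concentration, stoichiometric coefficient).
    """
    jf = kf * math.prod(c ** nu for c, nu in reactants)
    jb = kb * math.prod(c ** nu for c, nu in products)
    return jf, jb

def affinity(kf, kb, reactants, products, T):
    """Delta mu = RT ln(J_f / J_b); positive when the reaction runs left to right."""
    jf, jb = forward_backward(kf, kb, reactants, products)
    return R_GAS * T * math.log(jf / jb)

def rate_mass_action(kf, kb, reactants, products):
    jf, jb = forward_backward(kf, kb, reactants, products)
    return jf - jb

def rate_nonlinear_ohm(kf, kb, reactants, products, T):
    """Same rate written as J = J_f * (1 - exp(-Delta mu / RT))."""
    jf, _ = forward_backward(kf, kb, reactants, products)
    a = affinity(kf, kb, reactants, products, T)
    return jf * (1.0 - math.exp(-a / (R_GAS * T)))

def rate_linear_ohm(kf, kb, reactants, products, T):
    """Near-equilibrium form J = Delta mu / R_eff with R_eff = RT / J_f (assumption)."""
    jf, _ = forward_backward(kf, kb, reactants, products)
    return affinity(kf, kb, reactants, products, T) / (R_GAS * T / jf)

# Hypothetical reaction A + 2B -> C at T = 300 K.
T = 300.0
kf, kb = 2.0, 1.0
far = ([(1.0, 1), (0.5, 2)], [(0.1, 1)])            # far from equilibrium
near = ([(1.0, 1), (0.5, 2)], [(0.5 / 1.01, 1)])    # J_f / J_b = 1.01
```

Far from equilibrium the nonlinear form reproduces the mass-action rate exactly while the linear form does not, and close to equilibrium all three expressions agree, which is the sense in which the Ohm-type law generalizes the linear theory.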
As an example, let us apply the concept to condensation on a surface (of either a liquid or a solid). Denote by J the number of molecules per second that collide with the surface. Denote by $\mu'$ the chemical potential for phase I and by $\mu''$ that for phase II. This is given by Here P ($P_a$) means the pressure of phase I (II), where $\mu'' = kT \log P$ ($\mu' = kT \log P_a$). T is the temperature of the system, M is the molecular weight of a molecule, and $S_0$ is the total area of the surface. α stands for the permeability coefficient, which measures how easily molecules colliding with the surface of phase II are absorbed into that phase. Therefore, if $\mu' = \mu''$ then both phases are in equilibrium, if $\mu' > \mu''$ then phase I goes to phase II, and if $\mu' < \mu''$ then phase II goes to phase I. Motoyosi Sugita recognized that this kind of rule for phase change seems very similar to Gibbs' phase rule for the equilibrium state [126]. In the former, phase change occurs as a consequence of the broad quasi-static change in the irreversible process of transient phenomena, while in the latter, phase change occurs as a consequence of the realization of the equilibrium state. Thus there is a conceptual difference between them: the former is time-dependent and the latter is not. Nevertheless, Motoyosi Sugita postulated the validity of applying the concept of the field of chemical potential to many biological nonuniform systems such as polymers or macromolecules in the protoplasm of cells.
Denote by K a part of a nonuniform system and by $\mu_i^{K}$ its chemical potential for component i in K, and so forth. If there are three parts K, K′ and K′′ in the biological nonuniform system, then we can assign chemical potentials $\mu_i^{K}$, $\mu_i^{K'}$, and $\mu_i^{K''}$, respectively. Now if the system is in equilibrium, then the chemical potentials satisfy the following condition: and if the chemical reaction in equilibrium in the part K is given by must be satisfied. On the other hand, if the equilibrium is not yet attained, then kinetic rate equations like Eq.(5.35) and Eq.(5.36) should be applied to describe the system. Even for dynamic systems such as living systems, if the system is in dynamic equilibrium such as fluid equilibrium or chemical equilibrium, then we may assume that Eq.(5.39) is approximately satisfied in the sense of broad quasi-static change and of local equilibrium. This is the concept of the field of chemical potential introduced by Motoyosi Sugita long ago. Now, let us turn back for a while to how Motoyosi Sugita arrived at the concept of the field of chemical potential. As mentioned in the previous sections, he studied the quasi-static change in classical thermodynamics in Japan before WWII, when Japan, rushing toward war, was very isolated from Western countries. He found a way of going beyond it and called it the broad quasi-static change in quasi-thermodynamics. He understood that the chemical potential µ has to be defined locally as a field over the coordinates of the system, such as µ(x, t). Otherwise he was not able to derive Kelvin's relation for the thermoelectric effect.
In my opinion, it is obvious that his way of thought came from this experience in his physics study, and that he then applied the concept to other physical and chemical examples such as chemical reactions. Motoyosi Sugita's first glimpse of this discovery seems to be the following simple equation: where $\phi_i(x)$ means the electric potential acting on the molecule of species i and the other symbols are the same as before.
Then he extended his way of thinking so that, even if no electric potential exists, the chemical potential as a field is still meaningful. Onsager [100,101], Debye-Hückel [127] and Onsager-Samaras [128] had used similar ideas before. Why not for other systems? What's wrong with this? So, Motoyosi Sugita stepped forward, going beyond equilibrium thermodynamics to the quasi-thermodynamics of transient phenomena. Thus, although in the standard point of view of thermodynamics we think of the chemical potential as a numerical value for the equilibrium state, Motoyosi Sugita never thought this way. He always thought that the chemical potential is a field defined on space-time, like the field in field theory, even for dynamic, nonuniform, irreversible, non-reproducible and transient phenomena. This is his philosophy on the field of chemical potential.

Relationship between Cooperative Phenomena and Chemical Potential
Motoyosi Sugita further mentioned that the chemical potential plays an important role when the system undergoes a phase change. This is not well known in recent textbooks on thermodynamics [103,104,105,106,108,116]. It is the cooperative effect that appears when the system undergoes a phase change in the irreversible process of transient phenomena. In other words, it can be described by a more modern term, the induction-association principle. The term "induction-association principle" was first introduced by Gilbert N. Ling [129,130] and has been advocated by Gerald H. Pollack [131,132] for a long time.
This very particular aspect of the phenomena is the following: When the system faces a phase change, the change does not occur as literally as one might expect. Rather, it does not occur even when the temperature of the system has already gone below the critical temperature at which the phase change is supposed to occur. This is the supersaturation phenomenon discussed before. At this moment, in order to make the system undergo the phase change, a nucleus or a stimulus, i.e., a kind of trigger, is needed. Otherwise, the system stays as it is. Conversely, even if the triggering stimulus is microscopically small, the entire system undergoes the phase change. Hence, the effect is very cooperative. This phenomenon is not perfectly understood even nowadays.
In order to understand this type of phenomenon, Motoyosi Sugita emphasized the importance of the concept of the "field of chemical potential", namely the "µ-field". According to Frenkel [98], even in a solid, partially melted parts are present near the melting point. But these parts do not develop so easily even below the melting point. He named this "pre-melting". Motoyosi Sugita postulated that in such a case the µ-field plays an important role. He imagined that the µ-field is always fluctuating thermally or locally under various conditions. One good example is the cluster growth in the previous section. A local rise in the value of the µ-field promotes the local phase change in the system until the system goes to equilibrium. A local variation of the µ-field initiates an action at a distance on another point in the system. It is a long-range interaction. Thus, a local variation of the µ-field acts as a trigger for the phase change through the long-range interaction of the µ-field. This is the nucleation in the supersaturated phases.
In this way, the system can communicate through the µ-field, just as it can through the electric potential. This means that the µ-field extends lines of force as an induction of the "generalized potential", the µ-field. Mathematically, they may be considered as the gradient of the µ-field. Therefore, the idea of the induction-association principle of Gilbert N. Ling seems very similar to that of the cooperativeness of the µ-field of Motoyosi Sugita.
The cause of such lines of force is the µ-field. The µ-field contains the mixing entropy term. Hence, the lines of force, or the long-range interaction between the parts of the system, appear as a consequence of the mixing entropy, e.g., the last term in Eq.(5.40). In the standard viewpoint this long-range interaction emerges as a consequence of the electric effect. However, Motoyosi Sugita extended the concept so that the same is true for the µ-field as well. On the other hand, the action of energy is not long-range but local or short-range. Thus, the induction-association principle of Gilbert N. Ling and the cooperative effect of Motoyosi Sugita come from the mixing entropy, not from the energy of the system. "Can one think of the field in living beings as the µ-field?" He sometimes asked such a question. In such biomaterials or living beings, the µ-field is not made of a single component but is constructed from a huge number of components. Obviously, the structure of the µ-field becomes very complex. If so, then it would be very difficult to calculate the entropy term by calculating the partition number such as in Eq.(3.23), since no simple formula exists for such complicated molecular systems. However, in principle and ideally, we can think of the Gibbs free energy G, the number of partition W and the sum of states Z, etc. Then we can expect that the system moves in the direction of δG < 0 and that the motive force for it is related to the µ-field from the point of view of quasi-thermodynamics.
His conclusion is as follows: The µ-field is long-range enough for the parts of the system to affect each other. A slight variation such as a temperature fluctuation can be eliminated by the cooperativeness of the entire system. Therefore, one has to apply thermodynamics to the system by considering the entire system as a whole. The second law of thermodynamics in particular is such a law. One must be very careful when one applies the second law of thermodynamics to partial systems. The following is sometimes said: "Since life is not a closed system, we cannot apply the second law of thermodynamics to life." This seems a ridiculous idea, since it misses the point. When one takes the Gibbs free energy into account, one may consider partial systems: under the condition that the temperature is constant, one may regard the other parts as thermal sources through which the parts affect each other by thermal communication.

The µ-field, as an Invisible Force
In the previous example of cluster growth in the supersaturated phase, the clusters grow like this: Here, between the neighboring cluster states (n − 1) and (n), Eq.(4.22) holds The field of chemical potential of { } in the above expression acts as an invisible force and statistically dominates this cluster growth. It tends to make the mixing entropy as large as possible and the Gibbs free energy as small as possible. This tendency appears as the hidden invisible force that makes clusters grow. As seen in Eq.(5.41), the $J_{n-1,n}$ act in series, which corresponds to the current between the nodes in an electrical circuit. The chemical resistances $R^c_{n-1,n}$ are in series as well, where chemical resistance corresponds to resistance in an electrical circuit. The current J is held to a very small value right before the nucleation size, being impeded by the very large chemical resistance $R^c$. Therefore, the cluster with maximum size (n) near the critical size of the nucleus ($n_K$) has its maximum resistance and hence the rate $J_{n,n_K} \approx 0$ (although $J_{n,n_K} \neq 0$). The system tends toward the nucleus but cannot pass beyond it; this means quasi-stability. This nearly equilibrium state in the supersaturated phase right before the phase change is sometimes called false equilibrium.
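The electrical analogy above can be made concrete with a small numerical sketch. It assumes, purely for illustration, an Ohm's-law-like relation in which the steady current through a chain of cluster states equals a total driving force divided by the sum of the series chemical resistances. The function name, the resistance values and the driving force are hypothetical and are not taken from Eq.(5.41).

```python
def series_current(total_force, resistances):
    """Steady current through resistances in series (assumed Ohm's-law analogy)."""
    return total_force / sum(resistances)

# Hypothetical chemical resistances for the steps (n-1) -> (n).
small_steps = [1.0, 1.0, 1.0, 1.0]
# Same chain, plus one very large resistance just before the critical size.
with_barrier = small_steps + [1.0e6]

force = 1.0  # hypothetical total driving force, arbitrary units

j_free = series_current(force, small_steps)
j_blocked = series_current(force, with_barrier)

print(j_free)     # current without the large resistance
print(j_blocked)  # very small, but not zero
```

In this toy chain a single large resistance controls the whole current, which is the sense in which the cluster just below the critical size is quasi-stable: the current is strongly suppressed without vanishing.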
In Eq.(4.8), as n becomes large, $\phi_n - n\mu_g$ becomes large and negative, such that the entropy becomes locally small. The reason why such a state appears is that if such singular parts (i.e., large clusters) are included in the system, it becomes easier to make the total G small. Such an action that makes the entropy small is also due to the invisible force, the µ-field. The second law of thermodynamics appears to be broken, but it is not so. Once we look at the entire system, the second law of thermodynamics is never broken.
After the growth processes of non-equilibrium thermodynamics are finished and such clusters or complicated living beings have developed their structures, the fully developed structures possess, by definition, a very large Gibbs free energy. This complicated situation seems to contradict the laws of thermodynamics.
Thus, Motoyosi Sugita noticed that there might exist some kind of the "hidden law of thermodynamics" in addition to the three laws of thermodynamics. He postulated that there might exist the 4th law of thermodynamics which dominates the speed of the transient phenomena as mentioned in the introduction.

Motoyosi Sugita's Concept of the Maximum Principle in Transient Phenomena
In this section, let us consider the most important contribution of Motoyosi Sugita in my viewpoint. As discussed in the previous section, the concept of the field of chemical potential is quite important when we consider the non-equilibrium processes.

Motoyosi Sugita's |Ġ| = Max Conjecture and the 4th Law of Thermodynamics
As early as 1950, Motoyosi Sugita wrote a paper in Japanese entitled "Biological Thermodynamics and its Method", which was published one year later in the Annals of the Hitotsubashi University under the English title "Thermodynamical Method in Biology" [60]. He stated in Japanese the existence of the 4th law of thermodynamics [29], in the paragraph quoted in the Introduction. He also applied, for the first time, his theory of thermodynamics of transient phenomena to the theory of metabolism.
As reasons for believing in the existence of the 4th law of thermodynamics, Motoyosi Sugita listed several examples that seem to be related to this 4th law, as follows: Here let us see many instances suggesting this large principle of thermodynamics.
(i) The cascade principle (Stufenregel) found by W. Ostwald [97] shows that the nature has the tendencies as if it wanted to take the pass of smaller resistance or make a de tour and want to establish the equilibrium as fast as possible.
(ii) Generalizing further the rule described above, it might be said that the nature prefers the line of the least resistance, if there are ways side by side for the equilibrium.
(a) According to Volmer [97], for instance, the crystal formation shows that such a pass is taken actually.
(b) Eyring and others [124,125] called such a process rate determining.
(c) Electric current in conductor takes the distribution that heat loss is minimum if the total current takes a given value. Therefore the heat generation must be maximum if the potential difference will be taken as constant. Therefore, if a cell is applied to drive the current, it will take the distribution to dissipate the free energy of the cell as fast as possible.
(d) Onsager [100,101] has derived his reciprocal relation from the principle of least dissipation function. This principle might be considered to the maximum velocity of entropy increase which will be discussed later.
(iii) If a new passage is built independently which has less resistance than others already existing, then the circumstance above described, that might be the 4th law of thermodynamics, may also be seen from our common sense.
(a) The new way may be considered having delicate catalytic action, therefore, large free energy of activation or small entropy. The free energy of activation determines the rate of development of such a passage acting as if the initial cost is to construct a highway. That is why the construction of the way of small resistance is retarded. Nevertheless, it becomes rate determining when it is performed and the old ways become only bypass or will be ruined.
(b) The idea of natural selection or struggle for life of biology may be considered as having the relation to this principle. That is the free energy discharged through the old passage is used to the free energy of activation of new way, and the material itself constituting the old way may be useful also as the material of construction (see (v) of VI).
(c) Such a circumstance like natural selection can be seen also in the inorganic worlds. For instance, let us observe the nuclear formation of ice in supersaturated water vapor under freezing point, and containing super-cooled water droplets. If the crystal nucleus is formed, not only the condensation occurs on this nucleus, but the super-cooled droplets vaporize and disappear. This is the consequence of the 4th law and the same phenomena can be seen on the discharged plate of PbSO 4 of battery and also in the case of recrystallization of metals and others, and they are playing a role to promote the tendency to the thermodynamic equilibrium.
Thus, from the very beginning of his research, he recognized and imagined the existence of the 4th law of thermodynamics, expecting that some generalization of Onsager's principle of the least dissipation of energy would be necessary. Therefore, I would like to call his expectation Motoyosi Sugita's |Ġ| = max conjecture.
In the next section of that paper [29,60], "VI. Mathematical Theory and Conclusion", he sketched the outline of the 4th law of thermodynamics as follows:
(i) First, on the base of microscopic reversibility, Onsager [100,101] has shown that ∆(Ṡ − Φ) ≥ 0. Exactly speaking, it is given by where Ṡ is the velocity of entropy increase of the total system and Φ is the dissipation function in the sense of Rayleigh [133] as usual or such.
(ii) On the same base as Onsager, Landau and Lifshitz have shown that in their statistical physics [134], where ϕ = TΦ.
(iii) Let us denote by $N_k$ the parameter expressing the transient state, and let us assume that G is expressed by $N_k$ using the cut off method, then Ġ where $\mu_k$ is the chemical potential of the kth component, $\mu_{k0}$ is its constant part, $c_k$ is its concentration and $\dot{N}_k$ is the reaction velocity.
(iv) Let us consider quasi-chemical processes between the components. This describes a set of chemical reactions such as chain reaction, whose set is denoted by s. It means that there are many chemical reactions that consist of a finite number set of molecules A i and A j labelled by i and j.
(v) Let us denote by $\dot{n}_s$ the reaction velocity of the process s from the left to the right. The reaction velocity $\dot{N}_k$ can be written in the form: where we have defined the affinity $\Delta\mu_s$ of the reaction s as I would like to note here that there is a sign mistake for Eq.(6.79) in the English version of this paper [60].
(vii) From the rate theory of chemical reaction, $\dot{n}_s$ can usually be written in the form: where $R_s$ is the chemical resistance of the process s, which corresponds with the circumstance of the theory of rate process, and the quantity in the bracket represents the µ-field, that corresponds with the circumstance of the theory of equilibrium.
(viii) Inserting Eq.(6.9) into Eq.(6.7), we can see that Ġ is equal to kTḢ, where H is the Boltzmann's H-function. If the mean value of H is taken in the momentum space and if it is assumed that s represents only the rate determining processes in the individual processes and that the higher term of Ḣ is negligible, The procedure, which neglected the higher term, corresponds with the cut off method discussed in Section 2. The summation of the right hand side of Eq.(6.10) may be interpreted as 2Φ, where Φ is the dissipation function of the quasi-chemical processes, and it may be considered as the virtual heat source discussed in Section 2.
(ix) The reversal of the µ-field can be interpreted if we consider the transition from the stage ∑
In the above paper [29,60] Motoyosi Sugita was not able to present the details of the proof of the conjecture; he was limited to suggesting its existence. However, in the succeeding papers [40,41,42,43,44,45,46,48,61,75,77,78,79] he elaborated on the sketch of the conjecture and repeatedly tried to prove it.

Relationship between the Boltzmann's H-function and the µ-Field
In order to investigate Motoyosi Sugita's conjecture, the so-called Boltzmann's H-function plays an important role. So, let us first consider the relationship between the Boltzmann's H-function and the µ-field for the molecular statistics. For this purpose I would like to restrict myself to the system of chemical reactions only. However, this way of thinking can be generalized to other physical, chemical and biological systems as well.
Since the basic idea for proving the conjecture is given in [75], I would like to follow it here. Motoyosi Sugita first defines the Boltzmann's H-function for chemical reactions by where the equality holds true only when $c_i = \bar{c}_i$. This means that the H-function in the equilibrium state is always minimum.
Next, let us consider the derivative of the H-function with respect to time along the course of time development. Then, Motoyosi Sugita considers the following: Let us consider the chemical reaction equations such as Eq.(5.21) for the chemical reactions of Eq.(6.5), for our case here. Associated with the choice of the reaction Eq.(6.6), we can write the chemical reaction equations in the following: where P and R mean production and reduction in the chemical reactions, respectively, and $J_s$ is defined by where $R_s$, $\bar{c}_i$ and $\bar{c}_j$ are defined by respectively. On the other hand the affinity for each chemical reaction is defined by Eq.
This is nothing but Eq.(6.10), where $J_s \Delta\mu_s / k$ can be regarded as the virtual heats in the transient chemical reactions. Motoyosi Sugita shows that this satisfies the following theorem: Let us now prove the H-theorem. Following an argument similar to that of Motoyosi Sugita [75], we find the following: where $J_s$ is given by Eq.(6.17). Now if we denote as then the summand looks like Since $(A_s - B_s) \ln (B_s / A_s) \leq 0$, the last expression is identically less than or equal to 0. The equality holds only for the equilibrium. Hence, the theorem is proved.
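The proof rests on the elementary inequality $(A_s - B_s)\ln(B_s/A_s) \leq 0$, which holds for any positive $A_s$ and $B_s$ because the two factors always have opposite signs. A minimal numerical check follows. It treats $A_s$ and $B_s$ simply as arbitrary positive numbers, since their definitions in terms of concentrations are not reproduced here, and the function name `h_summand` is hypothetical.

```python
import math
import random

def h_summand(a, b):
    """Return (a - b) * ln(b / a) for positive a and b."""
    return (a - b) * math.log(b / a)

random.seed(0)
samples = [(random.uniform(0.01, 10.0), random.uniform(0.01, 10.0))
           for _ in range(1000)]
values = [h_summand(a, b) for a, b in samples]

# Every sampled summand should be non-positive.
print(all(v <= 0.0 for v in values))
# The equality case a == b gives exactly zero.
print(h_summand(2.0, 2.0))
```

The only way to obtain zero is a = b, which corresponds to the statement that equality holds only at equilibrium.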

Motoyosi Sugita's Idea for the Proof of the Conjecture
Motoyosi Sugita also considered the more general case of nonlinear processes, which may be represented by the following equations: $\dot{x}_i = f_i(x)$, where x means the vector $x = (x_1, \cdots, x_n)$ and $f_i(x)$ stands for any function of x. The stationary state $\bar{x} = (\bar{x}_1, \cdots, \bar{x}_n)$ of these equations is assumed to be given by $\dot{x}_i = f_i(\bar{x}) = 0$. The stability of this system is investigated by the Lyapunov theorem. Let $y = \dot{x}$, and let us discuss the stability around the stationary state y = 0. The equation of motion for y is given by $\dot{y}_i = \sum_j J_{ij}(y)\, y_j$, where we have assumed that $\partial f_i(x)/\partial x_j$ can be represented in terms of $y_i$ such that $J_{ij}(y) \equiv \partial f_i(x)/\partial x_j \,|_{y=\dot{x}}$. Now the simplest Lyapunov function is defined as By differentiating this with respect to t, we obtain If all real parts of the eigenvalues of the Jacobian matrix $J(y) \equiv (J_{ij}(y))$ are negative, then the stationary state $\bar{y} = 0$ becomes asymptotically stable. This is a sufficient condition for the theorem. When $J_{ij}$ is symmetric, the theorem always holds true; otherwise, it does not necessarily hold. Motoyosi Sugita applied this theorem to the Boltzmann's H-function, and he proved that the H-theorem is valid if Lyapunov's theorem holds.
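The remark that the symmetric case is safe while the non-symmetric case is not can be illustrated with a two-dimensional sketch. It assumes the quadratic test function $V = \frac{1}{2}\sum_i y_i^2$, so that $\dot{V} = \sum_{ij} y_i J_{ij} y_j$; this is a common choice and may differ from the Lyapunov function intended in the text. Both matrices below are hypothetical constants and both have the double eigenvalue −1.

```python
def quad_form(J, y):
    """Return sum_i sum_j y_i * J_ij * y_j, i.e. dV/dt for V = 0.5 * |y|^2."""
    n = len(y)
    return sum(y[i] * J[i][j] * y[j] for i in range(n) for j in range(n))

# Triangular matrices: the eigenvalues are the diagonal entries, here -1 and -1.
J_sym = [[-1.0, 0.0], [0.0, -1.0]]      # symmetric
J_nonsym = [[-1.0, 10.0], [0.0, -1.0]]  # not symmetric

y = [1.0, 1.0]
print(quad_form(J_sym, y))     # -2.0, V decreases
print(quad_form(J_nonsym, y))  # 8.0, V increases at this point
```

For the non-symmetric matrix the stationary state is still asymptotically stable, but this particular V can grow transiently, so negative eigenvalues alone do not make the simple quadratic form monotonic.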
Let us define the general Boltzmann's H-function as in Eq.(6.32). By differentiating that equation, we immediately obtain Ḣ. Motoyosi Sugita proved a mathematical theorem: If $f_i(x)$ can be Taylor expanded around the stationary state $\bar{x}$, then it can be rewritten as where $\lambda_{ij}(x)$ are the parameter integrals defined by Let us follow his proof, which is short. By assumption, we expand $f_i(x)$ around the stationary state $x = \bar{x}$ in a Taylor series. We obtain Let us define $\Lambda_{i,\nu_1 \cdots \nu_n}$ and $f_{i,\nu_1 \cdots \nu_n}$ as Using Eq.(6.37), the Jacobian can be Taylor expanded in the following: The remaining step is to use Eq.(6.35) and Eq.(6.39) to derive the right-hand side of Eq.(6.34).
Hence, the theorem is proved. This is nothing more than the mean value theorem of analysis for analytic functions of many variables. Using this theorem, Eq.(6.33) turns out to be the following: Now if all the real parts of $\lambda_{ij}(x)$ are negative, then since $\ln(\bar{x}_i/x_i)$ and $(x_j - \bar{x}_j)$ have the same sign, we can obtain the following: By differentiating Eq.(6.33) with respect to t, we obtain The second term in the above is always positive. Let us now substitute Eq.(6.29) into the first term; we find By definition $\ln(\bar{x}_i/x_i)\,\dot{x}_i < 0$. And under our assumption that $J_{ij}(x)$ is diagonalizable and the Lyapunov theorem is valid, all the real parts of the eigenvalues of the Jacobian are negative. Hence, by multiplication, Eq.(6.44) is always positive. From this fact, we obtain the following property of the H-function: Let us define the entropy production σ(x) of the system by where k is the Boltzmann constant. Therefore, from Theorem 3 we immediately obtain the following theorem: This is nothing but the theorem of minimum entropy production, i.e., Prigogine's principle of minimum entropy production [103,104].
In summary, this is the outline of the proof of the conjecture proposed by Motoyosi Sugita [75]. He tried again and again to prove this conjecture from various points of view. However, a general proof was never completed in his lifetime.

The Ideas of Motoyosi Sugita as a Specific Development of Lars Onsager's Lifework
In 1951 Motoyosi Sugita first presented the theory of the maximum principle in transient phenomena such as those discussed in the previous section [38]. This paper was entitled "The Maximum Principle in the Transient Phenomena and the Application to Biology", in Japanese. In this paper he first stated his vision of the maximum principle in transient phenomena. He discussed the relationship between his idea of the maximum principle and older existing ideas such as the maximum-minimum principle for the Joule heat, Boltzmann's principle in the theory of gases, and Onsager's principle of the least dissipation of energy in the theory of irreversible processes [100,101]. He finally applied his idea to many biological topics such as the thermodynamics of metabolism, the relationship between the maximum principle and metabolism, the origins of life, dynamic equilibrium, relaxation oscillations, the wholeness of life, etc.
In the succeeding paper in 1952, he further studied the maximum principle in relation to Boltzmann's H-theorem [40]. This paper was entitled "The Relationship between the Boltzmann's H-Theorem and the Dissipation Function", in Japanese. This paper is a really instructive one. As discussed in the previous section, his theory preceded that of Prigogine [103,104]. So, in this section I would like to present his comparison of his own theory with Prigogine's theory as well as with Onsager's theory [100,101] and Katchalsky's theory [105,106,108]. Fortunately for Western readers, these Japanese papers were summarized in English versions [77,78,79].

Relationship between the Boltzmann's H-function and the Irreversible Work
As is shown in the previous section, we have obtained the Boltzmann's H-function, especially for the case of chemical reactions. Motoyosi Sugita first applied his idea of the virtual heat that has been discussed in the section II to the irreversible work of the system.
In order to see the difference between the method of Motoyosi Sugita and that of Ilya Prigogine more easily, let us change the notation of Motoyosi Sugita to match that of Prigogine. Let us denote by i the internal system, which is doing the irreversible work, and by e the external thermal reservoir, where we assume that no irreversible work is done. By definition, we have If the process is the broad quasi-static change (under isothermal and isobaric conditions), then we have which is equivalent to $\dot{G}_i = 0$. This is not satisfied when the irreversible work exists. In this case we have which is the isothermal irreversible work. Or equivalently, On the other hand, since the heat $\dot{U}_i + P\dot{V}_i$ comes out from the reservoir e, we have for the reservoir e Now, assuming that there is no other heat exchange, the total entropy of the system is given by Since there is no irreversible work in the reservoir e, $\dot{G}_e = 0$. And since G is always decreasing, we can state that $\dot{G}_i < 0$. This $\dot{G}_i$ is the irreversible work for the entire system, and it is nothing more than the "virtual heat" introduced by Motoyosi Sugita long ago. Then we have This means that the entropy of the internal system $S_i$ is always increasing. There is no explicit expression like Eq.(7.5) in the theory of Prigogine, which is based on and traces its origin back to the school of De Donder [102,103,104]. As was discussed in Section II, the above equation of Prigogine leads to confusion and a mistake when one considers irreversible cyclic processes. This is because, in the viewpoint of De Donder's school, the entropy of the cycle can vanish only when the cycle is reversible; otherwise it must be positive such that However, as Motoyosi Sugita discussed long ago, no matter what irreversible process is taken into account, the following must be satisfied when the internal process is cyclic, since the final state must come back to the initial state after one cycle:

Relationship between the Boltzmann's H-function and the Dissipation Function
This argument can be generalized to the systems of flow dynamics or fluid dynamics. In this case there is matter exchange between the reservoir e and the system i. Let us denote by $G_e$ the external part of G and by $G_i$ the internal part of G, respectively. In the stationary state of the internal system i, the time derivative of $G_i$ vanishes (i.e., $\dot{G}_i = 0$). So, we have

Relationship between Motoyosi Sugita's Theory and Lars Onsager's Theory
Following the idea of Lars Onsager [100,101,118,119], the entropy change dS can be divided into two parts such that $dS = dS_e + dS_i$, (7.17) where $dS_i$ means the entropy change inside the system and $dS_e$ means the entropy change due to the interaction between the system and the environment. Let us define the state variables $x \equiv (x_1, \cdots, x_n)$. The changes in entropy are supposed to be represented in terms of the x's. If one can expand the stationary entropy around the equilibrium entropy inside the system with respect to the x's, then we must have where σ stands for the entropy production in the system at time t and τ means an infinitesimal time. If the process that we are considering is an irreversible process of the states $x_i$, then the total derivative in time of $S_i$ provides Eq.(7.19) together with Eq.(7.18) yields where Here $J_k$ are called the "generalized flows", while $X_k$ are called the "generalized forces". Considering Eq.(7.20) together with Eq.(7.19), Onsager postulates the following variational principle: Or equivalently, Here Onsager assumes that the quadratic dissipation function Φ is given by where ϕ is called the Rayleigh's dissipation function. By the variational principle for Eq.(7.22) or Eq.(7.22′), we have to consider the following variational equation: From this, we obtain Substituting Eq.(7.23) into Eq.(7.25), we obtain the famous linear relation: where the coefficients $R_{ij}$ satisfy $R_{ij} = R_{ji}$. (7.27) This is Onsager's reciprocal theorem. Solving Eq.(7.26) for $J_i$, we obtain where $L_{ij} = (R^{-1})_{ij} = L_{ji}$; the reciprocity holds true for $L_{ij}$. This yields for the dissipation function: The variational principle of the above equation is given as Apart from the energy dissipation $\dot{S}^*(J_n)$ through the surface, we finally obtain Since for the isothermal system the internal energy is kept constant, the rate of the Gibbs free energy Ġ is related to the entropy change $\dot{S}_i$ such that $\dot{G} = -T\dot{S}_i$.
Hence, we obtain $\dot{G} + 2\phi = 0$ (7.32) as expected, where ϕ is the Rayleigh's dissipation function. Now I would like to note that the variational principle of either Eq.(7.24) or Eq.(7.30) falls into Motoyosi Sugita's maximum principle discussed before [see Eq.(7.16)]. In the above case of Onsager's minimum or maximum principle, Onsager implicitly assumed that there exists a constant M such that M = |Ġ|. Instead of showing that, Onsager also implicitly assumed that $X_k$ = const. for the variation of Eq.(7.24) and $J_k$ = const. for the variation of Eq.(7.30), respectively. Since the constraint either $X_k$ = const. or $J_k$ = const. is assumed, the extremum of either $J_k$ = extremum or $X_k$ = extremum after the variation is also constant. Hence, M = |Ġ| = |2ϕ| is constant as well. Thus, Onsager's principle of the least dissipation of energy falls into Motoyosi Sugita's maximum principle as a special case.
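The linear relations and the factor 2 in front of the dissipation function can be checked on a two-variable toy system. The sketch assumes the standard forms $X_i = \sum_j R_{ij} J_j$ and $\Phi = \frac{1}{2}\sum_{ij} R_{ij} J_i J_j$, which I take to be what Eq.(7.26) and Eq.(7.23) express; the numerical values of the resistances and forces, and the helper name `solve_2x2`, are hypothetical.

```python
def solve_2x2(R, X):
    """Invert a 2x2 matrix R and return (J, L) with L = R^{-1} and J = L X."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    L = [[R[1][1] / det, -R[0][1] / det],
         [-R[1][0] / det, R[0][0] / det]]
    J = [L[0][0] * X[0] + L[0][1] * X[1],
         L[1][0] * X[0] + L[1][1] * X[1]]
    return J, L

R = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical symmetric resistances, R_ij = R_ji
X = [1.0, 2.0]                # hypothetical generalized forces

J, L = solve_2x2(R, X)

# Bilinear form sum_k J_k X_k and quadratic dissipation function (1/2) R J J.
entropy_rate = sum(J[k] * X[k] for k in range(2))
phi_dissipation = 0.5 * sum(R[i][j] * J[i] * J[j]
                            for i in range(2) for j in range(2))

print(L[0][1] == L[1][0])                              # reciprocity of L
print(abs(entropy_rate - 2.0 * phi_dissipation) < 1e-12)
```

With the flows related linearly to the forces, the bilinear form equals twice the dissipation function, which is the content of Eq.(7.32) up to the factor T and the sign convention for Ġ.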
Next let us consider the relationship between Motoyosi Sugita's $\mu$-field and the above theory of Onsager. As shown above, Onsager's relations between the generalized forces and the generalized flows (or currents) are linear [see Eq.(7.26) and Eq.(7.28)]. In Motoyosi Sugita's theory, however, they are nonlinear. Going back to Eq.(6.10), we have the following relation: where from Eq.(5.31) or Eq.(6.17), $J_s$ is given by Let us suppose that the system is nearly in thermodynamic equilibrium, as was considered by Onsager. If we assume $\mu_i = \mu_{0i} + kT \ln c_i$, (7.35) where $\mu_{0i}$ is the enthalpy and $c_i$ the concentration (or activity) of the component $i$, then where $k_{fs}$ and $k_{bs}$ are the reaction constants for the forward and backward processes given by respectively. Then, Or inversely, Formally solving the above for $\Delta\mu_s / kT$, we obtain which can be written in the quadratic form like Eq.(7.23) if the terms of higher order in $J_s$ are neglected and $J_s$ is transformed into $X_k$ by where $\gamma_{ks}$ are constants. Then the reciprocal relation of the coefficients of $\dot{x}_i \dot{x}_j$ is easily derived. Thus, Onsager's functional and the linearity can be derived from Motoyosi Sugita's $\mu$-field theory as a special limit [77,78,79].
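The passage from a nonlinear force-flow relation to Onsager's linear one can be illustrated with a small numerical sketch. It assumes a generic mass-action flux, $J = J_f (1 - e^{-\Delta\mu/kT})$ with a forward rate $J_f$, which is a standard textbook form and not necessarily the exact expression used by Motoyosi Sugita; the function names and the unit forward rate are hypothetical.

```python
import math

def flux(dmu_over_kT, forward_rate=1.0):
    # Assumed mass-action form: net flux = forward - backward,
    # with backward/forward = exp(-dmu/kT).
    return forward_rate * (1.0 - math.exp(-dmu_over_kT))

def force_from_flux(j, forward_rate=1.0):
    # Exact inverse of flux(): dmu/kT = -ln(1 - J/J_f).
    return -math.log(1.0 - j / forward_rate)

def linear_force(j, forward_rate=1.0):
    # Leading term of the series -ln(1 - y) = y + y^2/2 + ...
    return j / forward_rate

def relative_error(j):
    # Relative error made by keeping only the linear term.
    exact = force_from_flux(j)
    return abs(exact - linear_force(j)) / exact
```

For a small flux such as `relative_error(0.01)` the linear term is accurate to better than one percent, whereas for `relative_error(0.5)` the neglected higher-order terms are no longer small. This is the sense in which the linear theory appears as a near-equilibrium limit of a nonlinear one.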

Relationship between Motoyosi Sugita's Theory and Ilya Prigogine's Theory
Prigogine [102,103,104] assumes that the entropy increase of the system $\dot{S}_i$ is given by Eq.(7.43), where Prigogine's $A_k$ and $\xi_k$ are the affinity and the degree of advancement of the process introduced by De Donder [102], and they correspond to Motoyosi Sugita's $\Delta\mu_k$ and $\Delta n$, respectively. Generalizing this idea to more general chemical reactions, Prigogine [104] formulated almost the same method as Motoyosi Sugita had done long before. Prigogine wrote Eq.(7.44), while the affinity $A_k$ can be rewritten as Eq.(7.45), where $R = N_A k$ is the gas constant as before and $T$ the temperature of the system. Substituting Eq.(7.44) and Eq.(7.45) into Eq.(7.43), we obtain Eq.(7.46). Thus I can conclude that Motoyosi Sugita succeeded in formulating the theory of non-equilibrium thermodynamics long before it was reformulated and intensively applied by Prigogine's school. Unfortunately, since these papers were first written in Japanese and published in Japanese journals such as the Bulletin of Kobayasi Institute and the Journals of the Hitotsubashi University, the contents of his theory have never been appreciated worldwide. This is really unfortunate for those of us who study his theories, and it is one of the reasons why I am writing this paper.

The Relationship between the Motoyosi Sugita's Maximum Principle and the Pontryagin's Maximum Principle
Now I would like to prove Motoyosi Sugita's conjectured maximum principle, using optimal control theory [135,136,137,138,139,140,141]. From this, the relationship between Motoyosi Sugita's maximum principle and Pontryagin's maximum principle becomes clear. I will conclude that Motoyosi Sugita's maximum principle is nothing but Pontryagin's maximum principle in the theory of optimal control.
As I have written in the introduction, I became aware of Motoyosi Sugita's work in the spring of 2016. Two years earlier, I had written a couple of papers on the application of optimal control theory to thermodynamics [142,143]. At that time I did not know the research work of Motoyosi Sugita at all. However, once I became familiar with his work on the maximum principle given in Eq.(7.16), I became sure that his maximum principle is nothing more than that of Pontryagin [135,136], so that I can prove it using the theory of optimal control. This approach will be a generalization of Motoyosi Sugita's proof given in Section 6, and it will supply the rigorous proof that his conjecture has lacked.
Before doing so, I would like to present my philosophy for the problem, since my motivation comes from a viewpoint very different from that of Motoyosi Sugita. I show it first in the next subsection.

Attractiveness of the Formulation of Classical Mechanics
What is most attractive in the theoretical framework of classical mechanics is as follows: We believe that the energy is conserved in any mechanical problem, as long as there is no dissipation of energy. This is the energy conservation law. Based upon this law, we assume that the initial energy is given in the problem for a mechanical system such as a pendulum or a spring. So, as long as there is no dissipation of energy, once the initial energy is given to the system, it moves automatically and forever. This is our understanding of the physics of macroscopic mechanical objects.
As we know in classical mechanics, all variables in the system are mechanical variables such as the coordinates, whose vector is given as $\vec{x}$, and the momenta, whose vector is given as $\vec{p}$. The set of the vectors $(\vec{x}, \vec{p})$ forms the so-called phase space for the Hamiltonian dynamics, which is given by the Hamilton equations of motion.
On the other hand, in our problem of non-equilibrium thermodynamics for the systems of life or living things, the system is described by the dynamical change of the densities of ions, atoms, molecules, etc. under chemical reactions. So, the densities are given as sums over sets of classical particles or objects. At a given time the system is determined by the instantaneous values of the densities in the system. Since the system is governed by the densities, we may call them the state variables. Thus, we have to treat the macroscopic state variables as a new type of mechanical variables in the dynamical systems.
This means that we regard the biologically living macroscopic system as a classical mechanical system by regarding the state variables as the mechanical variables. This point of view is interesting, since we can regard the living objects as classical mechanical objects. Just as a pendulum moves automatically following the energy conservation law, the macroscopic biological system would move automatically following some unknown law of physics. If such a new type of law exists, it will be very nice. I would like to find such a new conservation law. This is my goal here.

Modern Control Theory and Pontryagin's Maximum Principle
The above vision of mine seems quite similar to that of Motoyosi Sugita. What I stated as the unknown law of physics is exactly what Motoyosi Sugita stated as the 4th law of thermodynamics long ago. Since there are many detailed mathematical proofs of Pontryagin's maximum principle, I skip such proofs in this paper and only show the essence of the proof. For the full proofs, I recommend consulting other books [135,136,137,138,139,140,141] and my papers [142,143].
In this section, we are going to consider the essential concepts and the formalism of the so-called Pontryagin's theory of optimal control [135,136] for later purposes. This theory is a totally new type of extension of the standard control theory [144], which was based upon negative feedback mechanisms before 1960. Since then, Pontryagin's theory has been called the modern control theory, while the old control theory has been called the classical control theory. This reminds us of what happened in the discovery of quantum mechanics.
On the other hand, theoretically speaking, Pontryagin's theory of optimal control is the natural extension of the formalisms of Hamilton's principle and the least action principle in classical mechanics [134]. It was a total revolution in theoretical physics as well. However, it has long been not so well known in the physics community. This seems to be because the revolution occurred in optimal control theory and automatic control theory in the engineering community around the year 1960, and because the work of scientists of the USSR was intentionally and absolutely ignored by the Western scientists at that time, in the era of the cold war between the USSR and the USA. This was very unfortunate.

Equations of Motion for the Open Dynamical System
Let us denote by $x = (x_1, \ldots, x_n)$ the $n$-dimensional state vector for the state variables $x_i$. Let us denote by $u = (u_1, \ldots, u_r)$ the $r$-dimensional control vector for the control variables $u_i$. The equation of motion for the dynamical system is given by
$$\begin{aligned} \dot{x}_1 &= f_1(x_1, \cdots, x_n, u_1, \cdots, u_r, t), \\ &\;\;\vdots \\ \dot{x}_n &= f_n(x_1, \cdots, x_n, u_1, \cdots, u_r, t). \end{aligned}$$
As in the case of classical mechanics, once we regard the state variables $x$ as the classical variables, we can define a Hamiltonian. Let us denote by $\psi = (\psi_1, \ldots, \psi_n)$ the adjoint vector for the adjoint variables $\psi_i$. Let us define the Hamiltonian: According to Pontryagin's theory of optimal control [135,136], we can prove the Hamilton equation: In order to avoid confusion between the standard Hamiltonian in classical mechanics due to Hamilton and Pontryagin's Hamiltonian in optimal control theory, we would like to use the name Pontryaginian or Pontryagin's Hamiltonian for the latter. This is because they are totally different from each other in physical units. Hamiltonian is given in units of energy Substituting this in the above original Pontryaginian of $H_0$, we obtain This is nothing but the standard time-derivative of the Gibbs free energy: Since $G = \sum_j \mu_j N_j$, if we rewrite the set $(N_1, \cdots, N_n)$ as $x \equiv (N_1, \cdots, N_n)$ and $\mu \equiv (\mu_1, \cdots, \mu_n)$, we obtain

Proof of the New Conservation Law
The general proof of the conservation of Pontryagin's Hamiltonian is quite complex, and it is not convenient to describe the details in a short space here. Since the proof is given in the textbook of Pontryagin et al. [135,136], we skip the details and I describe only the essence of the proof.
As before, we start with the dynamics given by Eq.(8.1) [or Eq.(8.2)]. Let us find the equilibrium state taking the variation $\delta x_i$ such as $(i = 1, \ldots, n)$, (8.9) where $\varepsilon$ is a small positive value and we assume the initial condition for $\delta x_i$ such that it starts with the value: Substituting the above into Eq.(8.1), we can expand the original dynamical equations with respect to $\varepsilon$. Then, we can obtain the linearized equations of motion: The matrix $\hat{J} \equiv (J_{ij})$ is called the Jacobian or Jacobi matrix in the linear stability analysis [103,104] [see Eq.(6.29) in Section 6].
Next, let us define the adjoint matrix of $\hat{J}$: And let us define the following dynamical equations for the new functions $\psi_i$: (8.14) Let us prove that the above Pontryaginian is a constant of motion in the nonlinear dynamical systems for the state variables. Please do not confuse this problem with a problem for mechanical variables in classical mechanics. Although Pontryagin's Hamiltonian is mathematically analogous to the Hamiltonian, it is not the same physical quantity; in our choice the former represents the work rate (i.e., the power) and the latter the energy, as was mentioned before.
Differentiating with respect to time, we have Here we have assumed that the extremum condition for f i (x, u) with respect to u j such that . . . , r). (8.18) By definition, this is equivalent to the following optimal condition: . . . , r), (8.19) where the maximum condition [141] is also given by = 1, . . . , r). (8.20) If the condition is for the minimum then the inequality has to be reversed. Let us now impose which is nothing but Eq.(8.13), since if we take its transpose then we have This Pontryaginian in the nonlinear systems with the state variables plays an important role of the Hamiltonian in classical mechanics. Physically speaking, this means that as long as the Power is fixed as a conserved quantity, there exists an optimal process that preserves the power.
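The conservation of the Pontryaginian can be checked numerically on a toy system. The sketch below makes several assumptions that are not in the text: it takes the standard forms $H = \sum_i \psi_i f_i$ and $\dot{\psi}_i = -\sum_j \psi_j \,\partial f_j / \partial x_i$, uses an invented two-variable nonlinear system, and holds the control $u$ fixed at a constant value, so it exercises only the adjoint-equation part of the argument.

```python
# Toy check that H = psi . f(x, u) stays constant along a trajectory
# when psi obeys the adjoint equation. The system below is hypothetical.

def rhs(state, u=1.0):
    x1, x2, p1, p2 = state
    f1 = u * x1 - x1 * x2          # dx1/dt
    f2 = x1 * x2 - x2              # dx2/dt
    # Adjoint equations: dpsi_i/dt = -sum_j psi_j * d f_j / d x_i
    dp1 = -(p1 * (u - x2) + p2 * x2)
    dp2 = -(p1 * (-x1) + p2 * (x1 - 1.0))
    return [f1, f2, dp1, dp2]

def pontryaginian(state, u=1.0):
    x1, x2, p1, p2 = state
    return p1 * (u * x1 - x1 * x2) + p2 * (x1 * x2 - x2)

def rk4_step(state, dt):
    # One classical fourth-order Runge-Kutta step.
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, dt=0.001, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

start = [1.5, 0.5, 1.0, -0.5]      # (x1, x2, psi1, psi2), invented values
end = integrate(start)
h_start = pontryaginian(start)
h_end = pontryaginian(end)
```

Up to the integration error, `h_end` agrees with `h_start` although the state and adjoint variables themselves change along the trajectory.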

Comparison with the Prigogine's Method
The above approach is quite analogous to Prigogine's method for nonlinear systems [103,104]. Prigogine's method for the stability of nonlinear dynamics is nothing more than Lyapunov's method in mathematics.
In this method, we first assume that the left hand sides of Eq.(8.1) or Eq.(8.2) are all zero. This provides the following:
$$\begin{aligned} 0 &= f_1(x_1, \cdots, x_n, u_1, \cdots, u_r, t), \\ 0 &= f_2(x_1, \cdots, x_n, u_1, \cdots, u_r, t), \\ &\;\;\vdots \end{aligned}$$
If the control variables $u_k$ $(k = 1, \ldots, r)$ are constants, we obtain the dynamically equilibrium states or stationary states: Similar to Eq.(8.9), we expand the state vector as Substituting this into Eq.(8.2), we similarly obtain By investigating the characteristics of $\omega$, we can find the stability condition of the equilibrium state: if all real parts of the eigenvalues are negative, then the system is stable. This approach is the essence of Prigogine's method. Therefore, it is nothing more than the Lyapunov method in the linear stability analysis in mathematics. This was also discussed by Motoyosi Sugita long ago, for example in subsection 6.3.
In this way, we can understand that Pontryagin's method in optimal control theory is a natural generalization of Prigogine's method in nonlinear theory.
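The stability criterion stated above (all eigenvalues of the Jacobi matrix have negative real parts) can be sketched for the two-variable case as follows. The two example matrices are invented for illustration and do not come from any system discussed in this paper.

```python
import cmath

def eigenvalues_2x2(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = jac
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)

def is_linearly_stable(jac):
    """True if every eigenvalue has a negative real part."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(jac))

damped = [[-1.0, 2.0], [-2.0, -1.0]]   # eigenvalues -1 + 2i and -1 - 2i
saddle = [[1.0, 0.0], [0.0, -3.0]]     # eigenvalues 1 and -3
```

Here `is_linearly_stable(damped)` holds because both eigenvalues have real part $-1$, while `saddle` has one positive eigenvalue and is therefore unstable.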

Generalization of the Pontryagin's Hamiltonian to the System with a Constraint
In the above, we have proven that Pontryagin's Hamiltonian with state variables in nonlinear dynamics plays the role of the Hamiltonian of mechanical variables in classical dynamics. We have also shown that Pontryagin's Hamiltonian is a constant of motion of the dynamical system, i.e., a conserved quantity. However, we have not yet proven that Pontryagin's Hamiltonian takes its maximum value in the region of the admissible control parameter vectors, and we have not yet shown that the principle also works when there is a constraint on the system. This constraint is analogous to the one that we know as the least action principle through the Lagrangian $L$ in classical mechanics. We are now going to consider these problems.
Suppose that there is a constraint in the system such as (8.32) where the time-development of the system obeys Eq.(8.1) and x(t) is the n-dimensional state vector and u(t) the r-dimensional control vector.
Let us now impose that this constraint takes the minimum value in the course of the time-development of the system between $t_0$ and $t_1$. In other words, we expect to be able to find the control parameter vector $u(t)$ such that the constraint is always minimized in the course of the time-development of the system between $t_0$ and $t_1$. This simply means $\delta J = 0$. (8.33) The physical meaning of this is the following: We evaluate the functional $J$ of the state variables $x_i$ $(i = 1, \ldots, n)$ as if it were the action functional $S$ in classical mechanics. Then, we expect that the value of the functional is always the minimum possible in the course of the time-development. This constraint provides an extremum problem. In this context the functional $J$ is sometimes called the evaluation functional or the performance index (PI) in the theory of optimal control [135,136]. So, we have to find the orbit of the state variables that obey the nonlinear dynamics Eq.(8.1) such that the PI-functional $J$ takes its minimum under the condition that the admissible control variables $u_k$ $(k = 1, \ldots, r)$ provide the maximum for Pontryagin's Hamiltonian. This is analogous to the least action principle for the Lagrangian under the Hamilton dynamics for mechanical variables in classical mechanics.
In this case, which is more general than the previous one, we can define Pontryagin's Hamiltonian $H_1$ as Let us now introduce the following new variable $x_0(t)$ by (8.38) However, this time the system must obey the following nonlinear dynamics: As before, we then have the equations of motion similar to Eq.(8.4) and Eq.(8.5): where $x(t) \equiv (x_0(t), x_1(t), \cdots, x_n(t))$. Here we would like to note that the first equation of Eq.(8.41) for $\psi_0$ reduces to $\dot{\psi}_0 = 0$, since $f_0(\vec{x}, \vec{u}, t)$ does not depend upon $x_0$ at all. We also have the following constraints as before: $\frac{\partial H_1}{\partial u_j} = 0$, $(j = 1, \ldots, r)$, (8.42) where the maximum condition [141] is also given by $(j = 1, \ldots, r)$. (8.43) If we take $\psi_0 = 1$, then the above condition for the maximum principle turns out to be the one for the minimum principle.

Pontryagin's Maximum Principle
Now we can summarize the very important theorem which is known as the Pontryagin's maximum principle in the optimal control theory [135,136]. This theorem is described as follows: Theorem 6 (Pontryagin's Maximum Principle). Let us suppose that the dynamical system is described by the nonlinear dynamical equations:ẋ Let u(t) be an admissible r-dimensional control vector in the admissible region of U given in the time interval t 0 ≤ t ≤ t 1 such that the solution x(t) starts from the initial vector x(t 0 ) = x 0 at time t 0 and passes a point in the line Π at time t 1 . Here the line Π is defined as a line that is parallel to the x 0 -axis and passes the point (0, x 1 ) in (n + 1)-dimensional phase space X.
One necessary condition that control u(t) and trajectory x(t) are optimal is that according to the functions u(t) and x(t) there must exist the following non-zero continuous vectors ψ(t) = (ψ 0 (t), ψ 1 (t), · · · , ψ n (t)): (1) For all t in time interval t 0 ≤ t ≤ t 1 , the function of variables u in the admissible region U (u ∈ U ), H 1 (ψ(t), x(t), u) takes the maximum at u = u(t); namely, (M 2) (2) ψ(t) also satisfies the following condition:

In practice, if ψ(t), x(t), u(t) satisfy the coupled equations (M1) and
The proof of Pontryagin's maximum principle is very complicated but is given in detail in the literature [135,136], so we have omitted it here. However, the result is simple enough for us to apply to physical problems.
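How the maximum condition is used in practice can be sketched on a standard textbook example that is not part of this paper: the time-optimal double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with $|u| \le 1$. Assuming $f_0 = 1$ and $\psi_0 = -1$, the Hamiltonian reads $H_1 = -1 + \psi_1 x_2 + \psi_2 u$, and the admissible control that maximizes it at each instant is searched for on a grid. All names in the code are hypothetical.

```python
def hamiltonian(psi, x, u, psi0=-1.0):
    # H1 = psi0*f0 + psi1*f1 + psi2*f2 with f0 = 1, f1 = x2, f2 = u
    psi1, psi2 = psi
    _, x2 = x
    return psi0 * 1.0 + psi1 * x2 + psi2 * u

def maximizing_control(psi, x, n_grid=201):
    # Scan the admissible region U = [-1, 1] and keep the control
    # that gives the largest value of the Hamiltonian.
    candidates = [-1.0 + 2.0 * k / (n_grid - 1) for k in range(n_grid)]
    return max(candidates, key=lambda u: hamiltonian(psi, x, u))
```

Because $H_1$ is linear in $u$ here, the maximizing control sits on the boundary of the admissible region and follows the sign of $\psi_2$, for example `maximizing_control((0.3, 2.0), (0.0, 1.0))` gives `1.0`. A grid search is used only to keep the sketch generic; for this particular system the answer is known in closed form.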
Let us go back to the case of the nonlinear dynamics with a constraint J in the subsection 8.6. In this case, ψ 0 = −1 is taken. Since this is nothing but the first condition in Eq.(M6), we hold the second condition: Hence, we have the following Pontryagin's maximum principle for this case: at some control vector value  (8.44) where the state variables are assumed to obey the following dynamical equations:

The 4th Law of Thermodynamics as the Motoyosi Sugita's Maximum Principle
$\dot{x}_i = f_i(x, u, t)$, $(i = 0, 1, \ldots, n)$. (8.45) Then, if the system advances under the optimal control of the control variables $u$, there exists a maximum of the Hamiltonian $H_1$ such that $|H_1|_{\max} = 0$. (8.46) And at this moment, the following equations always hold: The equations of motion:
$$\dot{x}_i = \frac{\partial H_1}{\partial \psi_i}, \qquad \dot{\psi}_i = -\frac{\partial H_1}{\partial x_i}, \qquad (8.47)$$
for $i = 1, \ldots, n$, the optimality condition:
$$\frac{\partial H_1}{\partial u_j} = 0, \quad (j = 1, \ldots, r), \qquad (8.48)$$
and the maximum condition [141]:
$$\frac{\partial^2 H_1}{\partial u_j^2} \le 0, \quad (j = 1, \ldots, r). \qquad (8.49)$$
Here in Eq.(8.49) if $\psi_0 = 1$ then we change the inequality to $\ge 0$ such that the maximum condition becomes the minimum condition. I believe that this principle is exactly the 4th law of thermodynamics expressed in the language of modern control theory.
The above approach of Pontryagin's maximum principle is very general and therefore it is not restricted to thermodynamics. Here, however, I would like to see the relationship between Motoyosi Sugita's maximum principle and Pontryagin's maximum principle.
Let us apply the above method to thermodynamics, especially to the isothermal system where $T = \text{const.}$ In order to do so, we must assume that the energy dissipated from the system becomes the virtual heat such that $\dot{S} = 2\Phi$, (8.50) where $\Phi$ is the dissipation function of Rayleigh [133]. This is the heart of Onsager's principle of the least dissipation of energy. It plays the role of the Lagrangian in classical mechanics, because when we require that the conversion of the dissipated energy into the virtual heat be as fast as possible, we take the variation of it. This restriction imposes Onsager's variation principle: Therefore, once we regard this variational constraint as $f_0$ in Eq.(8.44), we obtain Pontryagin's Hamiltonian for the isothermal system as where we have taken $\psi_0 = -T$. If we adjust to the definition of the Gibbs free energy, then we must regard the Hamiltonian $H_1$ on the left hand side as the power $P = \frac{dE}{dt}$. This yields This Hamiltonian was first found by the author two years ago [142,143]. Thus, as long as we take the extremum using Eq.(8.46), we obtain the following simple relation: (8.55) This is the most general expression for the Gibbs equation, generalized from the standard one in the textbooks of thermodynamics: Furthermore, if we impose the quadratic relation for the dissipation function of Eq.(8.50), then we substitute it into the above. We finally obtain the following relation:

Relationship between the Pontryagin's Maximum Principle and the Bellman's Principle of Optimality
Now I would like to make a comment on the relationship between Pontryagin's maximum principle and Bellman's principle of optimality [see the details in Appendix A]. Although both theories seem to treat the same kind of optimal problem, the results look very different. Even though I can say that they are almost equivalent concepts, the equivalence is far from trivial. Therefore, I would like to clarify this problem. This was first done by Pontryagin et al. [135,136]. Bellman simply assumes that there is a dynamical process whose time development is parametrized by a time $t$ between the initial time $t = t_0$ and the final time $t = t_1$. Then, he divides the interval into two regions, from $t = t_0$ to $t$ and from $t$ to $t = t_1$, to establish the principle of optimality. However, this is not trivial. Rather, it remains unknown until we can solve the system of nonlinear differential equations Eq.(8.2). In general, solving nonlinear equations is very difficult. Therefore, it becomes a challenging problem in physics. Suppose that the system of nonlinear equations Eq.(8.2) is solved under the optimal control $u$ for $t_0 \le t \le t_1$. This gives us the time interval $T = t_1 - t_0$ as a function of the initial state of the system defined by the $n$-dimensional vector $x_0$ such as Differentiating this with respect to $t$, we then derive the following: For the optimal control, we have to take the optimal condition for the control parameter $u$. So, we finally obtain the optimality relation: where $U$ stands for the space of admissible controls. This is the result obtained when we apply the principle of dynamic programming to the system of nonlinear equations. Next, let us define the function $g(x, u)$: $\frac{\partial w(x(t))}{\partial x_j} f_j(x(t), u(t))$. (8.62) Differentiating this with respect to $x_i$, we obviously find where we have used the trivial relation $g(x, u) = 1$ by Eq.(A.79).
From this we have ∂xi∂xjẋ j , after some manipulation using Eq.(8.64), we obtain d dt Then, if we define On the other hand, the Eq.(8.61) can be written as Since the left hand side of the above equation is nothing but the Pontryagin's Hamiltonian H 0 , hence we can prove the Pontryagin's maximum principle: from using the Bellman's dynamic programming. Thus, as was shown by the Pontryagin's group [135,136], the maximum principle of Lev Semyonovich Pontryagin in the modern control theory is essentially equivalent to the optimality principle of Richard Bellmann in the modern control theory.
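Bellman's principle of optimality for a minimum-time problem can be sketched on a deliberately simple discrete example. This toy model is an assumption made for illustration only: the state is an integer, the admissible controls are steps of $-1$, $0$, $+1$, each step costs one unit of time, and the target is the state $0$. The table $w(x)$ of minimum times then satisfies $w(x) = 1 + \min_u w(x + u)$ with $w(0) = 0$.

```python
def minimum_time_table(max_state=5, controls=(-1, 0, 1)):
    """Value iteration for w(x) = 1 + min_u w(x + u), with w(0) = 0."""
    states = range(-max_state, max_state + 1)
    inf = float("inf")
    w = {x: (0 if x == 0 else inf) for x in states}
    changed = True
    while changed:
        changed = False
        for x in states:
            if x == 0:
                continue  # the target state costs no further time
            # Principle of optimality: one step now, optimal thereafter.
            best = 1 + min(w.get(x + u, inf) for u in controls)
            if best < w[x]:
                w[x] = best
                changed = True
    return w
```

For this toy problem the table reproduces the obvious answer, the minimum time being the distance to the target. The continuous-time relation discussed in this subsection plays the same role for the nonlinear system Eq.(8.2), with the sum over one step replaced by a derivative along the trajectory.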

The Motoyosi Sugita's Theory of Metabolism: The First Application of the Maximum Principle to Life
Around the year 1951 Motoyosi Sugita conceived the idea of applying his theory of the maximum principle to the theory of metabolism of life [38]. This was intensively studied and published in Japanese journals [30,31,32,33,34,41,42,43,44,45] as well as in English [61,77,78,79]. Much later he generalized the idea to more complex systems of life where control or regulation comes into the system, and fortunately these works were published in English [62,80,81,82,83,84,85,86,87].
In this section I would like to introduce to you the earliest version of his theory of metabolism as the application of the maximum principle.

Combined Chemical Reactions
Let us denote by $n_i$ the mole number for the species $i$ of the molecule inside the body. Let us denote by $n_a$ the mole number for the species $a$ of the molecule outside the body. Denote by $q_s$ the reaction coordinate for the chemical reaction $s$ in the detailed balance. Then, we define where $r$ is the number of chemical reactions and $\Gamma_{as}$ ($\Gamma_{is}$) is the matrix element that represents the production of the molecule $a$ ($i$) from the chemical reaction $s$. If the living being is in the stationary state (or steady state), then This has to mean $\dot{q}_s = 0$. However, in general the state is progressing so that it is not in the stationary state; hence, $\dot{q}_s \neq 0$. Therefore, we can assume that isothermal chemical reactions are performed in the life phenomena.
If we denote by $\dot{n}'_a$ the molecules flowing into the body from outside and by $\dot{n}''_a$ the molecules flowing out of the body, the chemical motion of matter in life is not like a water flow, which simply flows in and flows out. Rather, it is like a complex circulation of matter inside the body, in which there are reverse chemical reactions and matter flows out of the body. This is schematically shown in Figure 7.
This is the characteristic of life. When the system is not in the stationary state, life itself automatically performs the cycle and adapts itself by adjusting this cycle through interaction with the external systems. Since the system is not stationary in this case, we have $\dot{n}_i \neq 0$.
On the other hand, when we consider the stationary state, since it is not in equilibrium, local entropy production exists due to $\dot{q}_s$. Therefore there must exist dissipation of the Gibbs free energy. Let us denote by $G_K(n_a, n_i)$ the total Gibbs free energy of the system including the external system. Now we can write where $m_a$ and $m_i$ are the mole numbers of the species of molecules for the outside and inside of the system, respectively. Now if we assume that there is no dissipation outside the living body, then the right hand side of Eq.(9.2) represents the dissipation of the Gibbs free energy. The first term means supply from the outside to the system and waste from the system to the outside. If the system of life is in the stationary state, then they compensate each other, because $\dot{n}_i = 0$ and therefore $\sum_i \mu_i \dot{n}_i = 0$. However, since the chemical cycle is performed, we divide this into two parts as where the first term represents the catabolism and the second term the anabolism.
In chemical reactions in life there seem to exist three types of chemical reactions such as (i) consumption, (ii) supply, and (iii) reproduction. Thus we would like to represent them as follows: where A i and B j are reaction components and x i and y j are all integers. B ′ j may be C k . A i and B j may be the same components as well. The chemical reactions that can be regarded as the path of reactions (ii) for dissolution correspond to the main path coming from the external of the body and going out to the external of the body. The reactions regarded as (i) provide the activation Gibbs free energy and hence they are dissipative reactions. At the same time the dissipation can be compensated by reactions of (iii).
In the system of life, the above reaction are not performed independently, but they should be performed at the same time. Therefore, we can assume the following reactions: Now we assume that the reaction rates (speeds) are defined by x iqs , y jqs and z kqs and so forth. We are able to assume the following expression forq s :q and R s is the chemical resistance for the chemical reaction labelled by s and µ i , µ j , and µ k (µ ′ i , µ ′ j , and µ ′ k ) are the chemical potentials for A i , B j and C k (A ′ i , B ′ j and C ′ k ), respectively. Thus, once we consider in this way, the reactions are performed automatically such that the Gibbs free energies ejected from the reactions of (ii) are used for the reproduction reactions of (iii) and the dissipations of the Gibbs free energies from reproductions (namely, the Gibbs free energy produced by catabolism) are used for the progression of this combination of the chemical reaction of Eq.(9.5).

Reactions of Metabolism and Maximum Principle
For the sake of simplicity, we assume the following: This means that the Gibbs free energies produced from (ii) supply are all used for (iii) reproduction. From this situation, large energy and negative entropy are produced and they become the energy and entropy for activation. That is, so-called negative entropy has to be interpreted as activation entropy. In order that smart give and receive of the Gibbs free energies can be performed, it is not possible if the activated complex of the reacting components of the materials of (i), (ii) and (iii) takes common feature with large entropy, but it can be possible if it makes the organization with small entropy. Here is the role of proteins. Therefore, such materials like proteins thermodynamically collapse and they proceed the reactions of Eq.(9.5) by give and take of the Gibbs free energy before their collapse, and constantly reproduce the materials with holding the materials that collapse are going to collapse. We have to interpret the concept of negative entropy in this meaning. The point of view of Schrödinger [147] and Brillouin [148] has a danger to recognize the negative entropy as a permanent existence. According to Eq.(9.10), the consumption (or exhaustion) per second associated withq s is given by On the other hand, the reproduction per second is given by Now when the system is in the stationary state, we have to hold the following relation: As is discussed in the section 7, Motoyosi Sugita applied the method of maximum principle of Eq.(7.16) to the above problem. Assuming that M = cont. and giving the variation of δq s , we seek for the condition that |Ġ K | becomes the maximum. Let us define by λ a Lagrange multiplier (i.e., unknown parameter of Lagrange). We obtain Now the relationship between i andq s is given by Eq.(9.6). We can assume that Eq.(9.11) can be written in the quadratic form in the first approximation as was discussed before [Eq. (7.41) in Section 7]. 
Within this assumption, for each s the following relation has to be satisfied Here multiplyingq s for both sides of Eq.(9.15) and take the sum for s, we find Comparing Eq.(9.16) with Eq.(9.13), we obtain λ = −1.
This is the detailed balance equation in the process of the chemical reaction s. From Eq.(9.18) together with the help of Eq.(9.1 ′ ), if we can know the ratios ofq s between different s, then we can know the relationship between µ i . These µ i are the chemical potentials in the living state, and therefore, they usually cannot be measured biochemically; i.e., they are the quantities that dominate the living function biochemically.
To understand this point, let us assume that there is a material K such that it is produced by the rate of z kqs and it is consumed by the ratio of x jkqj . In the stationary state, the balance of income and outgo of the material K is given by As is shown in Figure 8, suppose that z kqk occurs as a combination with x kjqk . If the difference of chemical potential in each reaction is written as ∆µ k , then corresponding to Eq. Similarly to what was discussed on Eq.(9.18), what we can say about Eq.(9.21) is the following: If we are able to know the ratio between each reaction rate in the metabolism in the body with keeping the life body in the stationary state, then we are able to find the relation between the chemical potentials. From this we are able to measure the negentropy in the living state by the kinematical method. On the other hand, if we want to measure it by the usual chemical method then we have to make the living system in the equilibrium state and hence inevitably we have to kill the life. However, if we seek for the inter-relationship between the chemical rates with keeping the living chemical reactions, then we can measure the free energy of the system that is in the living state.
As a special example of the above theory, Motoyosi Sugita discussed the combined chemical reactions of ATP and proteins. Suppose that all protein reactions are conceptually averaged over all proteins, so that we can assume there is only one protein. Denote by $x_{PA}\dot{q}_P$ the speed of decomposition of ATP to produce the protein, where the Gibbs free energy of ATP decomposed for the production of protein is given by $\Delta\mu_A\, x_{PA}\dot{q}_P$. On the other hand, denote by $x_{AP}\dot{q}_A$ the speed of decomposition of protein to produce ATP, where the Gibbs free energy of the protein decomposed for the production of ATP is given by $\Delta\mu_P\, x_{AP}\dot{q}_A$. Then in the stationary state we have
$$x_{AP}\dot{q}_A\,\Delta\mu_P = x_{PA}\dot{q}_P\,\Delta\mu_A, \qquad (9.23)$$
where $\Delta\mu_P$ ($\Delta\mu_A$) is the chemical potential difference between decomposition and reproduction of protein (ATP) [see Figure 9]. Therefore, if we know the value of $\Delta\mu_A$ then we can find $\Delta\mu_P$, and vice versa. The free energy of proteins builds the complex organization of life and makes the smooth delivery and receipt of free energy possible. Therefore, once the protein is cut off and taken outside, the meaning of $\Delta\mu_P$ is lost. Now, according to experiments, the function of a protein depends on the speeds of its production and decomposition.
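The stationary-state relation (9.23) can be illustrated with a small numerical sketch. The function name and the sample numbers below are hypothetical and serve only as an illustration; they are not values from Motoyosi Sugita's papers.

```python
def protein_potential_difference(rate_protein_to_atp, rate_atp_to_protein, delta_mu_atp):
    """Solve Eq. (9.23) for the protein chemical potential difference.

    rate_protein_to_atp : x_AP * q_A (speed of protein decomposition producing ATP)
    rate_atp_to_protein : x_PA * q_P (speed of ATP decomposition producing protein)
    delta_mu_atp        : chemical potential difference of ATP
    """
    if rate_protein_to_atp == 0:
        raise ValueError("x_AP * q_A must be non-zero in the stationary state")
    return rate_atp_to_protein * delta_mu_atp / rate_protein_to_atp


# Hypothetical numbers in arbitrary units, for illustration only.
delta_mu_protein = protein_potential_difference(2.0, 5.0, 30.0)
print(delta_mu_protein)
```

The sketch only rearranges Eq. (9.23); it shows that knowing the two reaction speeds and one potential difference fixes the other one.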
Thus, when we think in this way, the negentropy (= negative entropy) does not exist permanently; it is constantly produced by reaction (iii), while entropy constantly increases through reaction (i). If one wonders why entropy does not increase in a living body, one is forgetting the combination of reactions (iii) and (i). Even though $\dot{n}_i = 0$, as long as $\dot{q}_s \neq 0$ the local production of entropy is carried out in the living body.

Analogy Between Thermodynamics in the Transient Phenomena and Theory of Metabolism
In 1953, Motoyosi Sugita noticed an analogy between the thermal engine and the chemical engine [30,31,32,33,34,41,42,43,44,45,77,78,79]. He recognized that it is more convenient to consider the Gibbs free energy than to consider the energy. This is because, when we consider the balance of chemical energies in life phenomena, energy is always conserved but free energy can be dissipated. He thought that this point is very important for life.
Suppose that there is a living system that takes in food and discharges excreta. Let us denote by $Z_1$ the Gibbs free energy that the system takes in from the outside, by $Z_2$ the Gibbs free energy that it excretes to the outside, by $D$ the Gibbs free energy that is consumed within the system, by $G_K$ the Gibbs free energy that is accumulated within the system, and by $\dot{G}_K$ its time derivative. The balance of the Gibbs free energy must then satisfy the following relation:
$$\dot{G}_K = Z_1 - Z_2 - D. \qquad (9.24)$$
This equation corresponds to Eq. (2.6) or Eq. (3.19) in the thermal system, such that $Z_1 - Z_2$ and $D$ correspond to $\dot{U} + P\dot{V}$ and $T\dot{S}$, respectively. On the other hand, the Gibbs free energy change of the external system (outside of the system), $\dot{G}_e$, is given by
$$-\dot{G}_e = Z_1 - Z_2. \qquad (9.25)$$
This corresponds to the Gibbs free energy change in the thermal reservoir for the thermal system, such that $Z_1 - Z_2$ corresponds to $\dot{U} + P\dot{V}$. Hence, the total change of the Gibbs free energy is given by
$$\dot{G}_K + \dot{G}_e = -D. \qquad (9.26)$$
Since $D$ in the living system corresponds to $T\dot{S}$ in the thermal system, Eq. (9.26) expresses the 2nd law of thermodynamics for the living system. From this point of view, Motoyosi Sugita intensively studied the theory of life using the analogy.
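The bookkeeping of the Gibbs free energy described above can be sketched numerically. The sketch assumes that the balance takes the form $\dot{G}_K = Z_1 - Z_2 - D$ (intake minus excretion minus internal consumption), which is the form consistent with Eq. (9.25) and with the steady state discussed next; the function name and the numbers are hypothetical.

```python
def gibbs_free_energy_rates(Z1, Z2, D):
    """Return (dG_K/dt, dG_e/dt, total rate) for intake Z1, excretion Z2, dissipation D."""
    dG_K = Z1 - Z2 - D       # assumed balance: accumulation = intake - excretion - consumption
    dG_e = -(Z1 - Z2)        # Eq. (9.25): the external system loses what the body takes in
    return dG_K, dG_e, dG_K + dG_e


# Hypothetical rates in arbitrary units.
growing = gibbs_free_energy_rates(Z1=10.0, Z2=3.0, D=5.0)
steady = gibbs_free_energy_rates(Z1=10.0, Z2=3.0, D=7.0)
print(growing)   # the total rate equals -D
print(steady)    # dG_K/dt vanishes when Z1 - Z2 = D
```

In both cases the total rate is $-D$, so under this assumption the total Gibbs free energy decreases whenever $D > 0$.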
Let us now take the steady state into consideration. In the steady state, $\dot{G}_K = 0$, and we have
$$Z_1 - Z_2 = D. \qquad (9.27)$$
Let us consider the detailed balance of the Gibbs free energy. A part of the Gibbs free energy $Z_1 - Z_2$ is used to do the work of the muscles or of digestion, absorption and excretion. Let us denote this by W. The other part is used as pump action to promote the synthetic reaction, or anabolism. The Gibbs free energy of our body is reproduced by this reaction. Let us denote by R the rate of reproduction. There is a loss of the Gibbs free energy during the work. Let us denote it by W. There is a loss of the Gibbs free energy during the pump action. Let us denote it by R. The rest of them is $D_f$. The relation between these quantities is shown in Figure 10.
Thus, we must hold the following: Let us define the following relation: Then we haveĠ

Balance of Substances in Life
As a generalization of the concept of metabolic reaction, Motoyosi Sugita considered the network of the flows of substances in living systems. Let us denote by $n_i(X)$ the concentration of the chemical substance $X$ in the chemical state of molecule $i$ in the living system. For example, $X$ stands for atomic elements such as C, N, etc., while $i$ stands for complex molecules such as amino acids, fat, etc.
Catabolism proceeds from more complex molecules to simpler molecules, while anabolism proceeds from simpler molecules to more complex molecules. For the sake of simplicity, let us write the anabolic direction as a series $\cdots \to i-1 \to i \to i+1 \to \cdots$, while the catabolic direction is the reverse. Let us denote by $Q_{i-1,i}(X)$ the anabolic reaction from state $i-1$ to state $i$, and by $q_{i,k}(X)$ the catabolic reaction from state $i$ to state $k$, where $i > k$. Now we can write the balance equation of substances for $\dot{n}_i(X)$ [Eq. (9.32)]. This network is shown schematically in Figure 11. In the stationary state we must have the following as usual: Now let us consider the weight increase $\dot{w}$ of the body. This will be given by $\sum_X \sum_i \dot{n}_i(X)$. Since the system is an open system, if we denote by $Y_1$ the total quantity of the substances that are taken in from the outside of the body and by $Y_2$ the total quantity of the substances that are excreted to the outside of the body, then we must have
$$\dot{w} = Y_1 - Y_2,$$
where $w$ is the weight of the body. Let us denote by $q_{ei}(X)$ the quantity of $X$ flowing from the outside into the state $i$ in the system, and by $q_{ke}(X)$ the quantity of $X$ flowing out from the state $k$ to the outside. By definition, $Y_1$ and $Y_2$ are given by
$$Y_1 = \sum_X \sum_i q_{ei}(X), \qquad Y_2 = \sum_X \sum_k q_{ke}(X),$$
respectively.

Gibbs Free Energy of Life
As described before, let us denote by $G_K$ the Gibbs free energy of the living system and by $G_e$ the Gibbs free energy of the external system. Hence, the total Gibbs free energy, which includes that of the living system itself, is given by $G_K + G_e$. The living system is a system in which the total Gibbs free energy $G_K + G_e$ is decreasing according to Eq. (9.26). Let us relate the quantities given by Eqs. (9.24)-(9.26) to the quantities given by Eqs. (9.32)-(9.36). Let us define the chemical potential of $X$ in the state $i$ by
$$\mu_i(X) = \frac{\partial G_K}{\partial n_i(X)}. \qquad (9.37)$$
From this we can define the rate of change of the Gibbs free energy of the living system by
$$\dot{G}_K = \sum_X \sum_i \mu_i(X)\,\dot{n}_i(X). \qquad (9.38)$$
Substituting Eq. (9.32) for $\dot{n}_i(X)$ in Eq. (9.38), we obtain Eq. (9.39). This can be converted into the form of Eq. (9.40), where $Z_1$ and $Z_2$ are given by Eqs. (9.41) and (9.42), respectively. In Eq. (9.40), $Q_{i-1,i}(X)$ represents the anabolic reaction when $\mu_i(X) > \mu_{i-1}(X)$, while $q_{ik}(X)$ represents the catabolic reaction when $\mu_i(X) > \mu_k(X)$. $Z_1$ is the Gibbs free energy of the intake of food, while $Z_2$ is that of the waste from the body. Thus we can consider that the food is absorbed with the chemical potential $\mu_i(X)$ and the waste is ejected with the chemical potential $\mu_k(X)$. Hence, the difference $Z_1 - Z_2$ is equivalent to the metabolic energy. Now if we define $D$ as in Eq. (9.43), then we obtain the corresponding expression for $\dot{G}_K$. This is nothing but Eq. (9.29). In the stationary state it provides the following: Thus, we have been able to represent the quantities of the living system, such as $G_K$, $D$, $Z_1$ and $Z_2$, in terms of the quantities of the biochemical network, such as $n_i(X)$, $\mu_i(X)$, $Q_{i-1,i}(X)$ and $q_{ik}(X)$. This is the essence of Motoyosi Sugita's theory of the metabolic network system, which was published under the title Metabolic Turnover of Entropy and Energy and its Mathematical Analysis in Life I-V.
It was first published in Japanese in the Busseiron Kenkyu [30,31,32,33] and was also published in English in the Journal of Physical Society of Japan [61,77,78,79]. Later, the generalized version of the theory was published in Japanese in the Bulletin of Kobayasi Institute as well as in the Busseiron Kenkyu [34,41,42,43,44,45]. I would strongly recommend that Western readers read his theory in the English papers [61,77,78,79].

The Birth of Network Thermodynamics
The above theory of the metabolic network in living systems by Motoyosi Sugita predates by about 20 years the network thermodynamics founded by Aharon Katchalsky's group [106,108]. Aharon Katchalsky, a coworker of Lars Onsager, together with his students George F. Oster and Alan S. Perelson, independently extended Onsager's theory of irreversible processes to the theory of living systems. They found the relationship between the network of chemical reactions in biosystems and that of electrical circuits in electronics. Hence, they named their theory network thermodynamics.
The starting point of Aharon Katchalsky's group is Eq. (7.32), rewritten here as Eq. (9.46), where, in order to match their notation, I have used $\phi$ instead of $2\phi$ in Eq. (7.32). They took this as the fundamental equation of their theory of network thermodynamics. Let us suppose that there is a biochemical network system that is represented by a network graph such as Figure 11. Let us define the chemical potential $\mu_i$ on the $i$-th node of the graph and the current $J_{ij}$ on the link $ij$ of the graph. Let us assume that there is a standard chemical potential $\mu$ that sets the reference level for the chemical potentials. This is analogous to an electrical circuit, where the ground point plays the role of the reference level for all the potentials.
With this problem setting, they found Kirchhoff's laws in biosystems: Kirchhoff's current law (KCL) and Kirchhoff's voltage law (KVL). The KCL is the conservation law of the currents at a node: the total amount of current coming into the node must equal the total amount going out of it. The KVL is the conservation law of the voltages (or the potentials): the voltage differences summed along any closed circuit in the network graph vanish. These are common knowledge in elementary circuit theory and elementary electromagnetism, so there is no need to explain them further. There is, however, a very famous theorem in circuit theory called Tellegen's theorem [149,150,151,152,153,154]. It is not so well known among physicists, except for some experts in electronics, although it is a natural generalization of the theorem of Joule's least heat in the steady current or of Thomson's theorem in the static electric field [155].
Tellegen's theorem states the following. Let us denote by $J_{ij}$ the current flowing out from the node $i$ and into the node $j$ along the link $ij$. Let us denote by $V_{ij}$ the voltage difference between the node $i$ and the node $j$. When both the KCL and the KVL are satisfied at the same time for an arbitrary network graph of an electrical circuit, if we collect the currents and the voltages of all links into the current vector $J$ and the voltage vector $X$ [Eq. (9.48)], then the following relation must be satisfied:
$$X^t J = 0, \qquad (9.49)$$
where $t$ stands for the transpose of the vector. This simple and seemingly trivial relation is called Tellegen's theorem. It is a very powerful tool in electrical circuit theory [149,150,151,152,153,154].
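As a hedged illustration of Eq. (9.49), the following sketch checks Tellegen's theorem on a small directed graph. The graph, the node potentials and the loop currents are all hypothetical choices made for this example; the voltages are built from node potentials (so that the KVL holds) and the currents from superposed loop currents (so that the KCL holds).

```python
# Hypothetical network: 4 nodes, 5 directed links (i, j) meaning "from i to j".
links = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 2)]

# KVL: voltages derived from node potentials automatically sum to zero
# around every closed loop.
potentials = [0.0, 1.5, -0.7, 2.2]
V = [potentials[i] - potentials[j] for i, j in links]

# KCL: a superposition of loop currents conserves current at every node.
loop_a = [1, 1, 1, 0, 0]   # loop 0 -> 1 -> 2 -> 0
loop_b = [0, 0, 1, 1, 1]   # loop 0 -> 3 -> 2 -> 0
J = [3.0 * a - 1.2 * b for a, b in zip(loop_a, loop_b)]


def kcl_residuals(links, currents, n_nodes):
    """Net current arriving at each node; all entries vanish if the KCL holds."""
    net = [0.0] * n_nodes
    for (i, j), current in zip(links, currents):
        net[i] -= current
        net[j] += current
    return net


residuals = kcl_residuals(links, J, 4)
tellegen_sum = sum(v * c for v, c in zip(V, J))   # X^t J of Eq. (9.49)
print(residuals, tellegen_sum)
```

Note that the voltages and the currents are chosen independently of each other; the sum vanishes (up to rounding) only because both Kirchhoff laws hold on the same graph.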
What is important here is that Aharon Katchalsky's group found that Eq. (9.46) in the biochemical network system is equivalent to Eq. (9.49) in electronic circuit systems. Let us represent the $\phi$ in Eq. (9.46) in terms of the differences in the chemical potentials $\mu_{ij} \equiv \mu_j - \mu_i$, where the level chemical potential is denoted by $\mu_e$. It is given by Eq. (9.50). The rate of the Gibbs free energy of the system, $\dot{G}_K$, is given by Eq. (9.51). From Eq. (9.46) we obtain Eq. (9.52) for $\dot{G}_K$. If we explicitly write down the KCL and the KVL for a given network graph and use these relations, then we can prove that the above Eq. (9.52) can be converted into the form of Eq. (9.49). This is the finding of Aharon Katchalsky's group [106,107,108,109,110,111,112].
I would like to add a comment here. There is a difference between Motoyosi Sugita's approach and Katchalsky's approach. In the former, Motoyosi Sugita used the concept of the field of chemical potential, and therefore the current flow $J_{ij}(X)$ is given by where $T$ is the temperature of the system, $k$ is the Boltzmann constant, $R_{ij}(X)$ is the chemical resistance, and $\mu_i(X)$ is the chemical potential on the $i$-th node of the network graph. According to the chemical reaction: the above chemical potentials are defined as
Very sadly, Aharon Katchalsky (Aharon Katzir) was killed in the terrorist attack by the Japanese Red Army at Ben Gurion International Airport in 1972. He was one of the 26 people killed in the attack. I feel very sorry for that, and I would like to say "Rest in peace" to him and to the other victims. Therefore, the year 1972 is effectively the end of his theory. Later on, Perelson and Oster straightforwardly followed the line of Katchalsky and extended the theory to more complex network systems [106,107,108,109,110,111,112]. However, their theory became so purely mathematical that hardly anyone could easily follow it. Hence, the Katchalsky theory seemed dead at that time.
However, recently some people have noticed the importance of Katchalsky's work and applied it to the membrane systems of life [156,157] and to complex metabolic networks [158,159], as Motoyosi Sugita did some 65 years ago. This may be regarded as a revival of their theory.
As was discussed in Section 7, the reason why we must have Eq.

The Phenomena of Life and its Analogy to Social Economics
In the English paper of 1954 [61], Motoyosi Sugita pointed out the resemblance of the phenomena of life and the social economics, since the life phenomena can be regarded as the society of molecules in cells or that of cells as follows.
(i) Our body consumes various organic and inorganic substances, some of which are produced in our body, like hormones, enzymes, proteins, nucleic acids, fats and others, and some of which are taken from the external world by the work of our muscles and digestive organs, like inorganic salts, vitamins, amino acids and others.
(ii) These substances are useful to maintain life. The idea of nutritive value is well known but quantitatively the value of caloric units is mainly taken into account. The nutritive value of vitamins, iron and other inorganic substances and some amino acids is also taken into account but only qualitatively. There may exist the idea corresponding to utility or welfare function in economics which may be treated analytically and quantitatively.
(iii) There is consumption of Gibbs free energy (F. E. for short) to produce or absorb the necessary substances; that is, consumption is required for production and intake. Even the absorption of glucose, which is the last stage of digested starch, is carried out by the "investment" of the F. E. of ATP, a high-energy ester of phosphoric acid. Therefore, in the case of famine or when ill-fed, our organs lose the power to digest or absorb nutritious substances due to the lack of F. E., which corresponds to the initial cost. On the contrary, the function of intestinal absorption will be dangerously damaged if over-fed.
The above is shown in Figure 12, in which the energy is fed back to take the chemical energy from the external world. This "feeding back" is similar to business life, in which an enterprise is sometimes suppressed by the lack of the initial cost. Indeed our body corresponds to a factory and ATP to capital.
(iv) There is a balance of need and supply. Superfluous protein, for instance, loses its amino group and changes into carbohydrates, corresponding to consumer's goods. On the other hand, the protein of our tissues, which corresponds to producer's goods, is destroyed by a lack of protein, and the material is used to construct other necessary parts.
According to Professor Kida the relatively short legs of the Japanese are due to the lack of protein of high quality in food during growth. The body seems to lack protein to build legs, for we must use the material to construct the necessary part of our organs. Medical science may be considered good management in the balance of matter and F. E.
(v) Our body corresponds to our system of industry. Various substances are produced in every part of our body and supplied to other parts. On the other hand, the parts are also supplied from other parts. There is an exchange and economy of matter and energy. For instance, the production of protein corresponds to the first department of producer's goods. In this case ATP as well as protein is consumed. The consumption of the latter corresponds to the depreciation of producer's goods, in this case the chemical apparatus made of protein.
The ATP which is consumed is reproduced again in our body, and carbohydrates, protein and ATP are consumed for its reproduction. Here, the carbohydrates correspond to consumer's goods and the reproduction of ATP to the second department of economics. Therefore, there is a close analogy between the two fields. For instance, labour is reproduced by the consumption of goods, just as ATP is by the carbohydrates in food. This fact is important from the point of view of methodology (see Figure 9).
On the other hand the consumption of protein which is an example of "catabolism", corresponds to depreciation which is repaired by "anabolism".
(vi) Depreciation and the repair is the general aspect of life. For instance, reproduction is the turnover of the body itself, which depreciates during life, especially by reproduction itself.
If we take, however, the history of man-kind into account, depreciation in the individual body is repaired by other bodies. Therefore, those who enjoy youth enjoy the turnover of the individual body.
Therefore, one of the most prominent aspects of life is the turnover of molecules of cells and of the individual body, so that the world of the living organism is repaired and steadily maintained. This is very important from the point of view of thermo-dynamics, for the F. E. on earth is constantly consumed by organisms.
The steadiness is similar to that of the river, which consumes the potential energy of water and also maintains steadiness on the "balance" of water.
In a manner similar to the depreciation of the apparatus of a chemical plant, the value of N. E. (negative entropy) of our body is also depreciated (cf. Section 9). On the other hand, this value of N. E. regulates the value of the F. E. of activation of biochemical reactions. Therefore, the "catalytic action" of the organs, corresponding to the function of the chemical plant, is also depreciated and repaired. In this respect the writer has introduced the idea of the metabolic turnover of N. E., corresponding to the depreciation and the repair of producer's goods in economics.
Figure 13: Schematic diagram of the circulation of phosphates of adenosine in our body [160]. ATP: Adenosine triphosphate; ADP: Adenosine diphosphate; AMP: Adenosine monophosphate.
(vii) Besides the "feed back" of F. E. there is the circulation of matter in our body, for instance the chemical cycle of ATP⇄ADP or the reduction and oxidation of enzymes. Figure 13 shows the circulation of phosphates of adenosine in which ATP is included. The circulation is very complicated, in general, but is schematized in Figure 7. This is similar to the circulation of paper money in our society. In a similar manner, the matter of high energy is taken from the external world and excreted, so that our body corresponds to a pipe and is called an open system. But it is not an open system like the pipe through which the water of a tank flows.
(viii) The "feedback" of matter and energy is very similar to the management of our social life. Chemical processes in our body are combined like a system of gears (cf. Section 9), and, if we wish to promote a process, the effect is fed back and sometimes produces unexpected results. Here lies the difference between a biochemical change and a change in vitro. Therefore, if the knowledge of chemistry in vitro is applied mechanically, the effect may be contrary to expectation, as in a controlled social economy.
There is bad circulation in our body. For instance, the appetite is diminished, if health is destroyed, and health is disturbed if the appetite is diminished. Good management by the physician will eliminate bad circulation.
(ix) There is a balance and stability of matter and F. E. in our metabolism. If the balance is disturbed, the function of our body is disturbed. We have seen that our body resembles a pipe, through which matter of high chemical energy flows in and matter of low chemical energy is excreted. The balance seems to be favorable to the flow of matter (see Figure 14). In social life the balance of production and consumption is favourable to the movement of goods. There is a recovering action in our body as well as in our society. If the balance is disturbed and cannot be recovered, a catastrophe occurs, and finally the death of our body.
, (10.1) was very easy to adjust with cybernetics. Here the chemical resistance $\sigma_c \equiv 1/R_c$ plays the role of the regulation nozzle in the regulator process.
(v) Fifth, the concept of cybernetics is so broad that it fascinated him very much, since he was a full-time professor of economics and economic management at Hitotsubashi University, which is one of the top national universities for the humanities in Japan.
(vi) Sixth, his time was a little earlier than the era of modern optimal control theory. I believe that this was the main reason.
Cybernetics and the feedback control of Norbert Wiener belong to classical control theory, whereas nowadays we also know modern control theory, such as L. Pontryagin's maximum principle [see Section 8] and Richard Bellman's principle of optimality [see Appendix A]. Motoyosi Sugita was born in the age of classical control theory, much earlier than that of modern control theory. Therefore, he seems to have been unfamiliar with what happened in control theory around 1960, when Pontryagin's revolution occurred. So he wanted to adopt Norbert Wiener's theory for his thermodynamic theory of life.
Although this research theme is very fascinating, it is beyond the scope of this paper; otherwise, several hundred more pages would be needed. This work is left for you, the readers, probably not for me, I hope!

Conclusion
In conclusion, I have introduced the personal history of Motoyosi Sugita in Section 1. Here I summarized his birth, education, working, marriage, visits, publications, as well as his research history, etc.
In Section 2, I have briefly summarized his bright ideas: the concept of broad quasi-static change, the concept of virtual heat, and the concept of the irreversible cycle. I have also shown his application of them to a classical phenomenon in physics, namely Kelvin's thermoelectric effect.
In Section 3, as another example of his application to classical phenomena in physics, I have discussed diffusion phenomena. Here the Langevin equation, the mixing entropy and free energy, and the number of partitions have been discussed.
In Section 4, the theory of condensation in the supersaturated state developed by Becker-Döring, Volmer, and Frenkel has been discussed. I explained that this theory prompted Motoyosi Sugita to recognize the importance of the concept of the field of chemical potential.
In Section 5, I have summarized Motoyosi Sugita's thermodynamics of transient phenomena. Here for the concrete understanding, the theory has been applied to the chemical reactions only. Considering the system of chemical reaction network, the concept of the field of chemical potential, the relationship between cooperative phenomena and the chemical potential have been discussed.
In Section 6, I summarized the maximum principle of Motoyosi Sugita in the transient phenomena. Here I have discussed |Ġ| = max conjecture, the existence of the 4th law of thermodynamics, the relationship between the Boltzmann's H-theorem and the µ-field (i.e., the field of chemical potential). Also, the proof of the conjecture was shown for some special situation.
In Section 7, I have argued the relationship between the Motoyosi Sugita's theory and the theories of Lars Onsager and Ilya Prigogine. Here in general both of them are almost identical and therefore they established the same type of theory independently. However, while the Onsager-Prigogine's theory was limited within the linear theory, Motoyosi Sugita's theory went far beyond the linear theory using the concept of the field of chemical potential.
In Section 8, I have discussed the relationship between the maximum principle of Motoyosi Sugita and that of Pontryagin in modern optimal control theory. I have shown that Motoyosi Sugita's approach can be absorbed into the broader category of Pontryagin's theory. Therefore, Pontryagin's maximum principle can be regarded as a key to proving his conjecture, that is, the existence of the 4th law of thermodynamics, when it is applied to non-equilibrium thermodynamics in transient phenomena. I have also presented Bellman's principle of optimality [see Appendix A].
In Section 9, I have briefly summarized the first application of Motoyosi Sugita's maximum principle to the theory of metabolism. Here I have discussed, in turn, the combined chemical reactions, the reactions of metabolism and the maximum principle, the analogy between thermodynamics in transient phenomena and the theory of metabolism, the balance equation of substances, the Gibbs free energy of life, and the birth of network thermodynamics.
Looking back at my long journey to introduce the "widely unknown" Japanese thermodynamicist Motoyosi Sugita to the Western world (namely, the English-reading people) as well as to the young generations of biophysical scientists in Japan, what I have done here seems far from perfect. However, being insufficient is much better than doing nothing. All that glitters is not gold. If what I have done here helps you to understand the great work of Motoyosi Sugita, then I can sleep well. This is only a beginning for the construction of the theory of life.

Appendix: The Bellman's Theory of Optimality Principle
Richard Bellman's work in engineering mathematics and control theory is also not well known in the physics community [137,138]. However, like Pontryagin's theory, it is very important for our purpose. As shown before, Pontryagin's theory is the generalization of Hamilton's principle in classical mechanics to the theory of optimal control. Bellman's theory, on the other hand, is the generalization of the Hamilton-Jacobi theory in classical mechanics to the theory of optimal control; it is called dynamic programming in control theory [137,138]. From the quality and flavor of Bellman's work and the period in which it was done, I feel that he is the "Richard Feynman" [145] of engineering mathematics. Even their faces resemble each other.

Multistage Processes
The term multistage process is used in optimal control theory and in applied mathematics. It is nothing more than a recursive process or an iteration process in physical terminology. Let us denote by a state vector $x$ a point in an $n$-dimensional space. Next, let us consider that the point is transformed into another point $x_1$ by the transformation function $R(x)$. Namely,
$$x_1 = R(x).$$
By repeating this process, the transformation from the $n$-th stage to the $(n+1)$-th stage is written as
$$x_{n+1} = R(x_n).$$
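A minimal sketch of such a multistage process is given below. The helper name `iterate` and the halving map used as $R$ are hypothetical choices for illustration only.

```python
def iterate(R, x, n_stages):
    """Return the sequence [x, x_1, ..., x_N] generated by x_{n+1} = R(x_n)."""
    sequence = [x]
    for _ in range(n_stages):
        sequence.append(R(sequence[-1]))
    return sequence


# Hypothetical transformation: each stage halves the state.
halve = lambda x: 0.5 * x
trajectory = iterate(halve, 8.0, 3)
print(trajectory)   # [8.0, 4.0, 2.0, 1.0]
```

The same helper works for vector-valued states as long as `R` maps a state to a state.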

N -stage Processes and Reduction of Data
Since we human beings control the system in optimal control theory, we consider the case of finite time. Therefore, we assume that there is an upper limit $N$ on the number of stages in the sequence. That is, we consider only the case $[x, x_1, x_2, \cdots, x_N]$. Then, later, we take the limit $N \to \infty$. If we write it in a mathematically more rigorous way, we have Now, we consider that there exists a function on this multistage process. In other words, we consider a function $G$ written as follows:
$$G = G(x, x_1, x_2, \cdots, x_N).$$
This is thought of as a mapping from the coordinates $(x, x_1, x_2, \cdots, x_N)$ in a very high-dimensional space onto the scalar $G(x, x_1, x_2, \cdots, x_N)$. It is a projection, which discards much of the data. Therefore, it reduces the data in a sense. Representing the multistage process by a function means that we have performed a reduction of the data.
As explicit examples, the following forms can be considered:

N -stage Deterministic Process and Mathematical Representation of Policy
Let us consider the $N$-stage deterministic process in optimal control theory and in applied mathematics [137,138]. As before, it is nothing more than a recursive or iteration process in physical terminology. Let us denote by a state vector $x$ a point in an $n$-dimensional space, and by $u$ the $r$-dimensional vector of control variables that we determine at each stage. Then, an $N$-stage deterministic process is given as follows:
$$x_{n+1} = R(x_n, u_n), \quad n = 0, 1, \cdots, N, \qquad (A.10)$$
where we have defined $x_0 = x$. This produces a vector sequence: Let us now define the evaluation function or the performance index function by $G$: In the theory of optimal control [135,136], the above external vector $u$ is called the control vector or the control function. On the other hand, in the theory of dynamic programming [137,138], it is called the policy or decision.

Mathematical Representation of Policy
The evaluation function $G$ is a function of the state vectors $x_i$ and the policy vectors $u_i$ from the initial state to the final state. But in order to decide the policy at any stage, the policy itself has to be evaluated by them. Therefore, the policy $u_k$ must be thought of as a function of the past state vectors and the past policy vectors, such as
$$u_k = u_k(x, x_1, \cdots, x_k; u_0, u_1, \cdots, u_{k-1}). \qquad (A.13)$$
This is called the policy function. When the policy makes the evaluation function optimal, we call it the optimal policy. The optimality problem is to determine the optimal policy for the multistage process.
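To make the notions of policy function and optimal policy concrete, here is a minimal sketch of dynamic programming for a toy problem. It rests on assumptions that are not taken from the text: the policy depends only on the current state, the evaluation function is a sum of stage costs, "optimal" means minimal, and the states, controls, transformation `R` and cost `g` are hypothetical.

```python
def dynamic_programming(states, controls, R, g, n_stages):
    """Backward recursion f_k(x) = min_u [ g(x, u) + f_{k-1}(R(x, u)) ].

    Returns the optimal values for n_stages remaining stages and a list
    `policies`, where policies[k][x] is the decision u_k(x) at stage k.
    """
    value = {x: 0.0 for x in states}      # f_0: no stage left, nothing to pay
    policies = []
    for _ in range(n_stages):
        new_value, policy = {}, {}
        for x in states:
            best_u = min(controls, key=lambda u: g(x, u) + value[R(x, u)])
            policy[x] = best_u
            new_value[x] = g(x, best_u) + value[R(x, best_u)]
        value = new_value
        policies.append(policy)
    policies.reverse()                    # computed last-stage first
    return value, policies


def rollout(x, policies, R, g):
    """Apply the policy functions stage by stage and accumulate G."""
    total = 0.0
    for policy in policies:
        u = policy[x]
        total += g(x, u)
        x = R(x, u)
    return total


# Hypothetical toy problem: integer states, unit moves, quadratic stage cost.
states = range(-3, 4)
controls = (-1, 0, 1)
R = lambda x, u: max(-3, min(3, x + u))   # transformation x_{n+1} = R(x_n, u_n)
g = lambda x, u: x * x + u * u            # stage cost to be minimized

value, policies = dynamic_programming(states, controls, R, g, n_stages=3)
print(value[2], rollout(2, policies, R, g))
```

Following the computed policy functions from the state 2 reproduces the optimal value found by the backward recursion, which is the content of Bellman's principle of optimality in this simple setting.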
The above case of Eq. (A.13) is the most general, since it includes all the information of the past. It is therefore very complicated, since the policies in the past determine the present policy. So we simplify the policy representation by restricting ourselves to the case in which the policy is determined by the state just before the present time, in the spirit of the law of causality. In this restricted case, we have
$$u_k = u_k(x_k). \qquad (A.14)$$
Or, for a slightly more complicated system, it is given as
$$u_k = u_k(\pi_k), \qquad (A.15)$$
where $\pi_k$ is defined by
$$\pi_k = [x_k, x_{k-1}, \cdots]. \qquad (A.16)$$
When we adopt a condition such as Eq. (A.14) or Eq. (A.15), the evaluation function $G$ can be written as a sum or a product of functions of local specific variables. In this case we say that the separability of the evaluation function is realized. The evaluation function can be written in the following:

Independency from the Past and Mathematical Representation of the Law of Causality
The multistage process (that is, the recursive process) always depends only upon the state one step before. Although the past, the present and the future are all connected in a time series, the present state is determined by the state one step before. This is the concept of the multistage process. Therefore, the present has nothing to do with any of the past earlier than one step before. In this sense, the present is independent of the past. Mathematically, it is given by This is the so-called mathematical representation of the law of causality.

Mathematical Representation of the Law of Causality
We now define the mathematical representation of causality. Suppose that the state of the system at time t is $x = f(x_0, t)$, where $x_0$ is the initial state of the system. Suppose further that the system progresses from the state $x = f(x_0, t)$ to the final state $x' = f(x_0, t + s)$ at time t + s. We can then separate the whole interval from the initial time 0 to the final time t + s into two intervals: the first runs from 0 to t, and the second from t to t + s.
In the first interval, the system starts in the state $x_0$ at time 0 and reaches the state $x = f(x_0, t)$ at time t. In the second interval, the system progresses from the state $x = f(x_0, t)$ at time t to the final state $x' = f(x_0, t + s)$ at time t + s. However, this final state must equal the state reached by a system that starts from x and evolves for a time s, namely $x' = f(x, s) = f(f(x_0, t), s)$. Thus, we adopt the causality condition $f(x_0, t + s) = f(f(x_0, t), s)$. (A.21)
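The causality condition of Eq. (A.21) can be checked numerically on a simple example. The following minimal sketch assumes, purely for illustration, the evolution law dx/dt = -kx, whose solution is f(x_0, t) = x_0 exp(-kt); the rate k and the sample values are hypothetical choices, not taken from the text. A time-dependent law (dx/dt = 2t) is included for contrast, since it does not satisfy the condition.

```python
import math

def f(x0: float, t: float, k: float = 0.5) -> float:
    """State at time t starting from x0, for the assumed law dx/dt = -k*x."""
    return x0 * math.exp(-k * t)

def f_time_dependent(x0: float, t: float) -> float:
    """Solution of dx/dt = 2t started at time 0; this law depends on time explicitly."""
    return x0 + t * t

x0, t, s = 2.0, 1.3, 0.7

direct = f(x0, t + s)        # evolve for t + s in one go
two_step = f(f(x0, t), s)    # evolve for t, then restart from that state for s
print(math.isclose(direct, two_step, rel_tol=1e-12))  # condition (A.21) holds

# For the time-dependent law, restarting the clock at time t changes the outcome.
print(math.isclose(f_time_dependent(x0, t + s),
                   f_time_dependent(f_time_dependent(x0, t), s)))
```

The first comparison succeeds because the assumed law has no explicit time dependence; the second fails, which is the situation the later time-dependent process with a step-dependent transformation is meant to cover.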

Recursive Processes
When we consider systems in engineering or physics, we evaluate a quantity that is generated by an engineering or physical process. We calculate this quantity along the process by which the state of the system changes step by step. This is our main theme. In control theory such a quantity is called the evaluation function or the performance index. Let us denote it by $f_N(x)$. For example, $f_N(x) = \sum_{i=0}^{N} g(x_i) = g(x) + g(R(x)) + \cdots + g(R^N(x))$. (A.22) For N ≥ 1, the terms after the first satisfy $g(R(x)) + g(R^2(x)) + \cdots + g(R^N(x)) = g(x_1) + g(R(x_1)) + \cdots + g(R^{N-1}(x_1)) = f_{N-1}(x_1)$, (A.23) where $x_1 = R(x)$. Therefore, the first relation becomes $f_N(x) = g(x) + f_{N-1}(R(x))$. (A.24)
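Equations (A.22) and (A.23) together imply the recursion f_N(x) = g(x) + f_{N-1}(R(x)). The sketch below compares the direct sum with this recursion; the particular transformation R and the function g are hypothetical choices made only to have something concrete to evaluate.

```python
import math

def R(x: float) -> float:
    """Assumed transformation function (illustrative only)."""
    return 0.5 * x + 1.0

def g(x: float) -> float:
    """Assumed single-stage contribution (illustrative only)."""
    return x * x

def f_direct(x: float, N: int) -> float:
    """Direct sum g(x) + g(R(x)) + ... + g(R^N(x)), as in Eq. (A.22)."""
    total = 0.0
    for _ in range(N + 1):
        total += g(x)
        x = R(x)
    return total

def f_recursive(x: float, N: int) -> float:
    """Same quantity via f_N(x) = g(x) + f_{N-1}(R(x)), with f_0(x) = g(x)."""
    if N == 0:
        return g(x)
    return g(x) + f_recursive(R(x), N - 1)

print(math.isclose(f_direct(3.0, 10), f_recursive(3.0, 10)))
```

The recursive form evaluates one stage and delegates the rest to the same function with one stage fewer, which is the structure that dynamic programming exploits.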

Infinite Process, Time-dependent Process, and Non-stationary Process
The process with the limit N → ∞ is called an infinite process; it is obtained by taking this limit in all of the above procedures. So far we have assumed that the transformation function R is always the same. However, the transformation function may change at each step. This is the time-dependent process. That is, $x_{n+1} = R_n(x_n)$, $n = 0, \cdots, N$. (A.27) Furthermore, in the non-stationary process $[x_m, x_{m+1}, \cdots, x, x_1, x_2, \cdots, x_n, \cdots]$, (A.28) we have $x_{m+1} = R_m(x_m)$, $x_{m+2} = R_{m+1}(x_{m+1})$, $\cdots$.

Continuous Multistage Process
We have considered discrete multistage processes so far. Their generalization to continuous time gives the continuous multistage processes. In this case, if we divide time into many small steps, $t = 0, \tau, 2\tau, \cdots$, (A.32) we can treat the process as if it were discrete, and finally take the limit τ → 0. The transformation function R(x) is considered up to the term linear in τ.

$R(x) = x + S(x)\tau + O(\tau^2)$. (A.33)
We consider the evaluation function up to the term linear in τ as well. That is, $f_{t+\tau}(x) = g(x)\tau + g(R(x))\tau + \cdots + g(R^n(x))\tau$. (A.34) Therefore, the recursive relation in this case is $f_{t+\tau}(x) = g(x)\tau + f_t(R(x))$. (A.35) Expanding both sides of the equation and comparing the terms linear in τ, we obtain the following: $\frac{\partial f_t(x)}{\partial t} = g(x) + S(x)\cdot\frac{\partial f_t(x)}{\partial x}$. (A.36) This is the functional relation for the continuous multistage process. Physically, it is nothing but the Fokker-Planck equation [146] in the presence of an external force g(x), where S(x) corresponds to the velocity.
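The limiting procedure can be illustrated numerically. The sketch below assumes that the recursive relation is f_{t+τ}(x) = g(x)τ + f_t(R(x)) and that its first-order expansion reads ∂f/∂t = g(x) + S(x) ∂f/∂x. The choices S(x) = -x and g(x) = x are hypothetical and were picked because the continuous result f_t(x) = x(1 - exp(-t)) is then known in closed form. The sum is taken over n stages so that t = nτ, which is a convention of this sketch.

```python
import math

TAU = 1e-4  # small time step (assumed value)

def S(x: float) -> float:
    return -x            # assumed "velocity" field

def g(x: float) -> float:
    return x             # assumed integrand

def R(x: float) -> float:
    return x + S(x) * TAU   # Eq. (A.33) truncated at first order in tau

def f_discrete(x: float, n: int) -> float:
    """Sum of g(R^i(x)) * tau over n stages, i.e. the value at t = n * tau."""
    total = 0.0
    for _ in range(n):
        total += g(x) * TAU
        x = R(x)
    return total

x, n = 2.0, 10_000   # t = n * TAU = 1

# Discrete sum versus the closed-form continuous result.
f_exact = x * (1.0 - math.exp(-n * TAU))
print(abs(f_discrete(x, n) - f_exact))

# Finite-difference check of df/dt = g(x) + S(x) * df/dx.
h = 1e-3
df_dt = (f_discrete(x, n + 1) - f_discrete(x, n)) / TAU
df_dx = (f_discrete(x + h, n) - f_discrete(x - h, n)) / (2 * h)
print(abs(df_dt - (g(x) + S(x) * df_dx)))
```

Both printed differences should be small; the first shrinks as TAU is reduced, which is the sense in which the discrete process approaches the continuous one.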
As discussed before, if we want to obtain the maximum of f, we need to find the solution of the following equation: This is called Bellman's differential equation.

Principle of Optimality and Bellman's Optimality Equation
Bellman introduced the principle of optimality [137,138], which is stated as follows. Principle of optimality: An optimal policy has the property that whatever the initial state and the initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
This principle is very general and universal. We can say that it is the counterpart of the Dirac-Feynman principle for the path integral in physics [145].
For example, we consider the following evaluation function: $G(x, x_1, x_2, \cdots; u_0, u_1, u_2, \cdots) = \sum_{i=0}^{N} g(x_i, u_i)$. (A.38) This quantity provides the data by which we decide whether or not the maximum effect is attained in the multistage deterministic process. We judge whether the process is effective by evaluating this function, and we choose the policy so that the function becomes maximum. We denote by $F_N(x)$ the value of the function at its maximum: $F_N(x) = \max_{u_0, \cdots, u_N} G(x, x_1, \cdots; u_0, u_1, \cdots)$. (A.39) Since we assume that the optimal policy is decided at each step according to the principle of optimality, we have $\sum_{i=0}^{N} g(x_i, u_i) = g(x, u_0) + [g(x_1, u_1) + \cdots + g(x_N, u_N)] = g(x, u_0) + G_{N-1}(R(x, u_0))$. (A.40) Comparing this for N ≥ 1, we get $F_N(x) = \max_{u_0} [g(x, u_0) + F_{N-1}(R(x, u_0))]$. (A.41)
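The principle of optimality leads to the recursion F_N(x) = max over u_0 of [g(x, u_0) + F_{N-1}(R(x, u_0))]. The following minimal sketch compares this recursion with a brute-force search over all policies for the additive evaluation function of Eq. (A.38). The integer state, the three admissible controls, the transformation R and the stage function g are all hypothetical choices made for illustration.

```python
from functools import lru_cache
from itertools import product

CONTROLS = (-1, 0, 1)   # assumed finite set of admissible policies

def R(x: int, u: int) -> int:
    """Assumed transformation: the control shifts the state."""
    return x + u

def g(x: int, u: int) -> int:
    """Assumed stage reward: stay near zero, and moving costs one unit."""
    return -(x * x) - abs(u)

def G(x: int, policy: tuple) -> int:
    """Evaluation function of Eq. (A.38) for a given sequence (u_0, ..., u_N)."""
    total = 0
    for u in policy:
        total += g(x, u)
        x = R(x, u)
    return total

def F_brute(x: int, N: int) -> int:
    """Maximum of G over all 3^(N+1) policy sequences."""
    return max(G(x, p) for p in product(CONTROLS, repeat=N + 1))

@lru_cache(maxsize=None)
def F_bellman(x: int, N: int) -> int:
    """Maximum of G obtained stage by stage from the optimality recursion."""
    if N == 0:
        return max(g(x, u) for u in CONTROLS)
    return max(g(x, u) + F_bellman(R(x, u), N - 1) for u in CONTROLS)

print(F_brute(3, 4) == F_bellman(3, 4))
```

The brute-force search grows exponentially with N, while the memoized recursion only visits each (state, remaining stages) pair once; both give the same maximum because the evaluation function is separable.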

The Meaning of the Principle of Optimality and Dynamic Programming
The meaning of the principle of optimality and dynamic programming is as follows. In general, optimizing the evaluation function is a two-point boundary value problem between the initial and final states, so that we have to consider all processes within the interval.
In classical mechanics, we require, under the principle of least action, that the action function S be minimum between the initial and final states. This yields the Euler-Lagrange equation for the orbit. Conversely, once the Euler-Lagrange equation is written down, we solve it. The solution gives the motion along the orbit between the two boundary times, which guarantees that the action S is always minimum.
On the other hand, in the theory of optimal control, we require, under the principle of optimality, that the evaluation function G be maximum between the initial and final states. This yields Bellman's equation of optimality for the state. Conversely, once Bellman's equation of optimality is written down, we solve it. The solution gives the temporal development of the state between the two boundary times, which guarantees that the evaluation function G is optimal.
In this sense, Bellman's principle of optimality is the natural extension of the principle of least action in classical mechanics. In automatic control engineering and control theory, determining the policy at each stage is called dynamic programming. Bellman's dynamic programming provides the algorithm that gives the optimal evaluation at each stage of the process.
Conversely, we can regard the Euler-Lagrange equation of motion as the algorithm that determines the orbit of a classical object so as to give the optimal action at each time. The evaluation function in optimal control theory plays the same role as the action function in classical mechanics.

Continuous Multistage Deterministic Process
We have considered the discrete multistage deterministic processes so far. They can be generalized to the continuous multistage deterministic processes. In this case, if we divide the time interval into very many small intervals of length τ, we can use the idea of discrete time: $t = 0, \tau, 2\tau, \cdots$. (A.43) We can then treat the process as if it were discrete, and at the end take the limit τ → 0. Let us assume Nτ = T. Suppose that the policy is fixed as u. Then $R(x, u) = x + S(x, u)\tau + O(\tau^2)$, where R(x, u) and S(x, u) are n-dimensional vectors. The evaluation function is given by $G_N = \sum_{i=0}^{N} g(x_i, u_i)\tau$. We denote by $F_N$ the maximum of the evaluation function $G_N$, that is, $F_N = \max_u G_N$. Since t = iτ, we can regard it as a function of continuous time t, $F_N = F(t)$.
We consider the evaluation function of Eq. (A.39) up to the term linear in τ: $F_{t+\tau}(x) = g(x, u)\tau + g(R(x, u))\tau + \cdots + g(R^n(x, u))\tau$. Expanding both sides of the equation to linear order in τ and comparing the coefficients of the linear terms, we obtain the following: $\frac{\partial F}{\partial T} = \max_u \left[ g(x, u) + S(x, u)\cdot\frac{\partial F}{\partial x} \right]$. This is the functional equation for the continuous multistage deterministic process. It is called Bellman's partial differential equation. Physically speaking, it is nothing but the Fokker-Planck equation [146] in the presence of an external force g(x, u), where S(x, u) corresponds to the velocity. In the above we have regarded the final time T as a free parameter. However, we can equally regard the initial time $t_0$ as the free parameter, in which case we simply replace dT = −dt. Then, the corresponding equation becomes $\frac{\partial F}{\partial t} + \max_u \left[ g(x, u) + S(x, u)\cdot\frac{\partial F}{\partial x} \right] = 0$. This corresponds to the Hamilton-Jacobi equation in classical mechanics, since F and the second term on the left-hand side correspond to the action integral S and the Hamiltonian H in classical mechanics, respectively. Therefore, we can call it Bellman's Hamilton-Jacobi equation. This point will be discussed more explicitly later.
For the sake of simplicity, let us consider the following evaluation function with a one-dimensional state vector: where x(0) = q. The maximum value of the evaluation function is given by The state variable is q = x(t) at some fixed time t, and the time left for the process is T − t. In the usual variational problem, choosing the policy amounts to choosing $u(t) = \dot{x}(t)$, which corresponds to the velocity (i.e., the tangent).
Let us consider the problem of determining the initial tangent $\dot{x}(0)$. If we denote it by u, then the interval of integration can be separated into the following two regions: In the first interval, This is the condition to get the maximum. Satisfying the principle of optimality means requiring that this condition always be satisfied. Therefore, if we assume that this condition always holds, we can remove the max on the right-hand side. In this variational problem, the first-order extremum condition alone does not tell us whether the original evaluation function takes a maximum or a minimum. The Legendre condition guarantees it: for a maximum (minimum), we have $\frac{\partial^2 g}{\partial u^2} < 0$ ($> 0$). (A.62) In classical mechanics it is very difficult to restrict the policy, and it is not necessary to do so. In dynamic programming and optimal control theory, however, the policy is subject to various ranges and restrictions.
For example, $|u| = |\dot{x}(t)| \le k$, $0 \le t < T$. (A.63) In such a case, since the principle of optimality still holds, the problem becomes one of seeking the maximum under a constraint. That is, $\frac{\partial F}{\partial T} = \max_{|u| \le k} \left[ g(q, u) + u\,\frac{\partial F}{\partial q} \right]$, $F(q, 0) = 0$. (A.64)
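A constrained problem of this kind can be solved stage by stage. The sketch below assumes that the constrained Bellman equation has the form ∂F/∂T = max over |u| ≤ k of [g(q, u) + u ∂F/∂q] with F(q, 0) = 0, and it uses the hypothetical integrand g(q, u) = q, for which the optimal policy is u = k throughout and the continuous answer is F(q, T) = qT + kT²/2. Restricting the admissible controls to the three values -k, 0 and k is a simplification of this sketch, not a general method.

```python
from functools import lru_cache

K = 1.0      # bound on |u| (assumed value)
TAU = 0.01   # time step (assumed value)
Q0 = 0.5     # initial state q (assumed value)
N = 100      # number of stages, so T = N * TAU = 1
T = N * TAU

@lru_cache(maxsize=None)
def F(m: int, n: int) -> float:
    """Maximum over n remaining stages, starting from q = Q0 + m * K * TAU.

    The state is indexed by the integer m so that memoization is exact.
    """
    if n == 0:
        return 0.0                      # boundary condition F(q, 0) = 0
    q = Q0 + m * K * TAU
    # g(q, u) = q for this example; u = d * K with d in {-1, 0, 1}.
    return max(q * TAU + F(m + d, n - 1) for d in (-1, 0, 1))

continuous = Q0 * T + K * T * T / 2.0
print(F(0, N), continuous)
```

The discrete value lies slightly below the continuous one, and the gap is of order τ, so it closes as the step is refined.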

Geometrical Meaning of Dynamic Programming
The geometrical meaning of dynamic programming is as follows. In classical mechanics we seek the curve x = x(t) corresponding to the orbit of the mechanical system, selecting the orbit so that the action becomes maximum or minimum. The unknown function u is therefore regarded as a point in function space. In dynamic programming, on the other hand, we seek the optimal direction at each instant. The solution is represented by the envelope constructed by collecting the optimal directions selected at each point; namely, the curve is the envelope of its tangents. In the terminology of fluid mechanics, it corresponds to a streamline. In this respect, the variational principle in classical mechanics and the principle of optimality are dual to each other.

Hamilton-Jacobi Equation
When we apply the principle of optimality to classical mechanics, the problem of optimal control reduces to solving the Hamilton-Jacobi equation [137,138,139,140]. Consider the action integral I: $I = \int_{t_0}^{t} L(x, \dot{x}, t)\,dt$, (A.65) where L is the Lagrangian. We denote its minimum value by $S(x_0, t_0; x, t) = \min I$. As before, we divide the whole time interval into the two intervals $(t_0, t_0 + \tau)$ and $(t_0 + \tau, t)$: $S(x_0, t_0; x, t) = \min_{\dot{x}} \left[ L(x_0, \dot{x}, t_0)\tau + S(x_0 + \dot{x}\tau, t_0 + \tau; x, t) \right]$. To first order in τ this gives $\frac{\partial S}{\partial t_0} = H(x_0, p_0, t_0)$ with $p_0 = -\frac{\partial S}{\partial x_0}$, where $p_0$ is the momentum vector at $t = t_0$. Now, if we apply the principle of optimality to the state x not at the initial time $t_0$ but at the final time t, the terms that depend on the time derivative change sign. Therefore, we obtain the following: $\frac{\partial S}{\partial t} + H(x, p, t) = 0$, $p = \frac{\partial S}{\partial x}$. This is again the Hamilton-Jacobi equation in classical mechanics. Thus, when Bellman's principle of optimality is applied to the special case of the action function in classical mechanics, it reproduces the usual principle of least action. In this respect, Bellman's principle of optimality can be regarded as a natural generalization of the principle of least action.
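As a concrete check of the Hamilton-Jacobi equation ∂S/∂t + H(x, ∂S/∂x, t) = 0, the sketch below uses the free particle, for which L = m ẋ²/2, H = p²/(2m) and the minimal action between (x_0, t_0) and (x, t) is the standard result S = m(x − x_0)²/(2(t − t_0)). The mass, the sample point and the finite-difference step are assumed values chosen for illustration.

```python
M = 1.0  # particle mass (assumed value)

def S(x: float, t: float, x0: float = 0.0, t0: float = 0.0) -> float:
    """Minimal action of a free particle travelling from (x0, t0) to (x, t)."""
    return M * (x - x0) ** 2 / (2.0 * (t - t0))

def hj_residual(x: float, t: float, h: float = 1e-5) -> float:
    """Finite-difference estimate of dS/dt + H(x, dS/dx) for H = p^2 / (2 M)."""
    dS_dt = (S(x, t + h) - S(x, t - h)) / (2.0 * h)
    dS_dx = (S(x + h, t) - S(x - h, t)) / (2.0 * h)   # momentum p = dS/dx
    hamiltonian = dS_dx ** 2 / (2.0 * M)
    return dS_dt + hamiltonian

print(hj_residual(1.5, 2.0))   # close to zero up to finite-difference error
```

The residual vanishes because the momentum ∂S/∂x = m(x − x_0)/(t − t_0) is the constant momentum of the straight-line orbit, and −∂S/∂t equals its kinetic energy.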