The Entropy Rate of Thermal Diffusion

The thermal diffusion of a free particle is a random process that generates entropy at a rate equal to twice the particle's temperature, R = 2 k_B T/ħ, in natural units of information per second. The rate is calculated using a Gaussian process whose variance combines quantum and classical diffusion. The solution for the diffusion of a free particle is derived from the kinetic-energy Hamiltonian, with its associated imaginary diffusion constant, together with a real diffusion constant representing classical diffusion. We find the entropy of the initial state is one natural unit, which is the same amount of entropy the process generates over the de-coherence time, ħ/(2 k_B T).


Primary Finding:
When a free particle is at a non-zero temperature, it is composed of a spectrum of frequencies that evolve at different rates, which causes the probability distribution of where the particle can be found to spread. We will show that the entropy rate associated with this spreading of the probability distribution is equal to twice the particle's temperature.
R = 2 k_B T / ħ (1)
The rate, R, is calculated below using the natural logarithm, and thus the units for the rate are natural units of information per second, when the temperature (T) is expressed in kelvin, Boltzmann's constant (k_B) is expressed in joules per kelvin, and hbar (ħ) is Planck's constant divided by 2π, in joule-seconds.
This equation tells us the minimum amount of information we need, each second, in order to track a diffusing free particle to the highest precision that nature requires. By quantifying the amount of information needed to follow a free particle for a certain time, and showing it is finite, we are able to guarantee that a computer (or other discrete state-space machine with finite memory) can store a particle's initial state and trajectory.
What is unique about this result is that there is no dependence on the mass of the particle or any other variable except the temperature.
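As a purely illustrative order-of-magnitude check (the value T = 300 K is our choice, not the paper's), the rate R = 2 k_B T/ħ and the de-coherence time ħ/(2 k_B T) can be evaluated numerically:

```python
# Numerical illustration of equation (1) at room temperature, T = 300 K.
# The temperature is an illustrative choice, not a value from the text.
k_B = 1.380649e-23      # Boltzmann's constant, J/K
hbar = 1.054571817e-34  # reduced Planck constant, J*s

T = 300.0                       # kelvin
R = 2 * k_B * T / hbar          # entropy rate, nats per second
t_c = hbar / (2 * k_B * T)      # de-coherence time, seconds

print(f"R   = {R:.3e} nats/s")  # roughly 8e13 nats/s
print(f"t_c = {t_c:.3e} s")     # roughly 1e-14 s
```

Note that R·t_c = 1 identically: the process generates exactly one nat over one de-coherence time, which anticipates the result of the final section.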

Assumptions:
We prove this primary result by making the following three assumptions: 1) The continuous diffusion of a free particle can be modeled as a discrete process with a time step, dt, that is much smaller than the de-coherence time, t_c = ħ/(2 k_B T), where T is the temperature.

2) Knowing the particle's location at time step n+1 allows one to determine the location of the particle at the previous time step n; i.e., the conditional entropy is zero, H(X_n | X_{n+1}) = 0, where X_n is the random variable that represents where the particle can be found at time step n.
3) At each time step the minimum uncertainty wave-packet is localized around its new location, and thus the conditional entropy of the step, given all previous steps, is the same as the conditional entropy of the first step given the initial state, h(X_n | X_{n-1}, ..., X_0) = h(X_1 | X_0).
These three assumptions taken together are reasonable and give insight into the behavior of the system. Assumption 1 is aided by the analysis found in [1], which shows the time step of discrete diffusion is of order ħ/(mc²); thus for non-relativistic particles, where mc² ≫ k_B T, the assumption holds. Assumption 2 says that there is no entropy beyond the minimum uncertainty wave-packet after a measurement of the particle's location is made.
Assumption 3 says that the vacuum localizes the diffusing particle up to the minimum uncertainty wave-packet at each step in the process. Even though an undisturbed particle's wave-packet will spread, at the vacuum level the particle is re-initialized at each step, as in quantum nondemolition measurements [2].

At t = 0, a free particle in vacuum is initialized into a minimum uncertainty Gaussian wave-packet with a spatial variance equal to σ_x² = (ħ/(2σ_p))². As time increases, so does its variance and thus its entropy.
To calculate the entropy rate of this process, it is helpful to think of time as occurring in discrete units of a small size dt (assumption 1).
We can look at a Venn diagram of this process, figure (1).
X_0 (or X0 in the figure) is a random variable, drawn from f(x, 0), that describes the location where the particle can be found at time t = 0.
X_1 (X1) is a random variable, drawn from f(x, dt), that describes the location where the particle can be found at time t = dt.
X_2 (X2) is drawn from f(x, 2·dt), and so on up to X_n, which is drawn from f(x, t) where t = n·dt.

Figure (1) - Venn diagram of the conditional entropies of the diffusion process
As hinted at in the diagram (but explicitly stated here as assumption 2 and assumption 3), we will assume that the conditional entropy of each step is constant, h(X_n | X_{n-1}, ..., X_0) = h(X_1 | X_0), where h is the differential entropy h(X) = -∫ f(x) ln f(x) dx and f(x) is the distribution which determines X. This essentially means that knowing the location of the particle at any time allows one to calculate where it was in the previous time step, and that the minimum uncertainty wave-packet maintains its coherence as its first moment (or average value) diffuses via a process with a variance as given by equation (2).
In section 5, we show that as time increases, a free particle diffuses such that the variance of where the particle can be found (if localized) is
σ²(t) = σ_x² + (σ_p²/m²)·t² + 2·D·t. (2)
Thus X_n (or simply X) is a Gaussian random variable with variance σ²(n·dt).
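A small sketch of equation (2), using σ_p² = m k_B T (equipartition), σ_x = ħ/(2σ_p) (minimum uncertainty), and D = ħ/(2m), all of which are derived later in the text. The electron mass and T = 300 K are illustrative choices, not values from the paper:

```python
import math

# Sketch of equation (2): sigma^2(t) = sigma_x^2 + (sigma_p^2/m^2) t^2 + 2 D t.
k_B = 1.380649e-23
hbar = 1.054571817e-34
m = 9.1093837015e-31   # electron mass, kg (illustrative)
T = 300.0              # kelvin (illustrative)

sigma_p2 = m * k_B * T            # equipartition of energy
sigma_x2 = (hbar / 2) ** 2 / sigma_p2   # minimum uncertainty width squared
D = hbar / (2 * m)                # real diffusion constant

def sigma2(t):
    """Total variance of where the particle can be found at time t."""
    return sigma_x2 + (sigma_p2 / m**2) * t**2 + 2 * D * t

t_c = hbar / (2 * k_B * T)   # de-coherence time
# At t = t_c/2 the classical-diffusion term 2*D*t equals the initial
# Heisenberg variance sigma_x^2, so the spreading is already significant
# within one de-coherence time.
print(sigma_x2, 2 * D * (t_c / 2))
```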

Entropy Rate:
We can calculate the entropy rate of this process using the definition of the entropy rate. We will use the entropy rate, R, as calculated by taking the limit, as the number of steps goes to infinity, of the conditional entropy of the last step given all previous steps, divided by the time step [3]:
R = lim_{n→∞} h(X_n | X_{n-1}, ..., X_0) / dt.
To solve for R, we first notice that since H(X_{n-1} | X_n) = 0 (assumption 2), we can show by induction that
h(X_n | X_{n-1}, ..., X_0) = h(X_n | X_{n-1}). (4)
Due to the symmetric nature of mutual information, I(X_n; X_{n-1}) = h(X_n) - h(X_n | X_{n-1}) = h(X_{n-1}) - h(X_{n-1} | X_n), we can prove the equation below [2]:
h(X_n | X_{n-1}) = h(X_n) - h(X_{n-1}).
Next, we use assumption 3 to re-write the difference in entropy at time steps n and n-1 as equal to the difference in entropy at time step 1 and the initial state:
h(X_n) - h(X_{n-1}) = h(X_1) - h(X_0).
Since the X_n's are Gaussian, we can easily calculate the differential entropy of each step using equation (2) and the differential entropy of the Gaussian distribution [2]:
h(X_n) = (1/2) ln(2πe σ²(n·dt)).
Using equations (31) and (32), this is re-written
R = (1/(2·dt)) ln[1 + (4 k_B T/ħ)·dt + ((2 k_B T/ħ)·dt)²]. (10)
We are assured by assumption 1 that (2 k_B T/ħ)·dt ≪ 1. Thus, we can Taylor expand the logarithm, giving the first term plus terms that are O(dt) or smaller.
Ignoring the terms of O(dt) or smaller, we get our primary result,
R = 2 k_B T / ħ. (12)
The other method to calculate the entropy rate is to take the limit, as n goes to infinity, of the entropy of all the X_n's divided by n times dt [3]. Since we are looking at the rate of generation of the entropy (not the initial conditions), we subtract the entropy of the initial state, h(X_0):
R = lim_{n→∞} [h(X_0, X_1, ..., X_n) - h(X_0)] / (n·dt).
This also assures that R is in the correct units.
By the chain rule, h(X_0, X_1, ..., X_n) = h(X_0) + Σ_{i=1..n} h(X_i | X_{i-1}, ..., X_0). Assumption 3 now lets us rewrite this as
R = lim_{n→∞} n·[h(X_1) - h(X_0)] / (n·dt) = [h(X_1) - h(X_0)] / dt. (18)
We can safely conclude that R = 2 k_B T/ħ. In this view, the temperature acts as an average energy, k_B T, and generates information (or entropy) at a rate equal to twice the average energy divided by ħ.
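The limiting argument can be checked numerically: the discrete-step rate [h(X_1) - h(X_0)]/dt = (1/(2·dt))·ln(σ²(dt)/σ²(0)) should approach 2 k_B T/ħ as dt shrinks below the de-coherence time. The sketch below assumes an electron at 300 K (illustrative values only):

```python
import math

# Numerical check that (1/(2 dt)) ln(sigma^2(dt)/sigma^2(0)) -> 2 k_B T / hbar
# for dt much smaller than the de-coherence time (assumption 1).
k_B = 1.380649e-23
hbar = 1.054571817e-34
m = 9.1093837015e-31   # electron mass, kg (illustrative)
T = 300.0              # kelvin (illustrative)

sigma_p2 = m * k_B * T
sigma_x2 = (hbar / 2) ** 2 / sigma_p2
D = hbar / (2 * m)

def sigma2(t):
    return sigma_x2 + (sigma_p2 / m**2) * t**2 + 2 * D * t

t_c = hbar / (2 * k_B * T)
dt = t_c * 1e-6                  # assumption 1: dt << t_c
R_num = math.log(sigma2(dt) / sigma2(0.0)) / (2 * dt)
R_exact = 2 * k_B * T / hbar
print(R_num / R_exact)           # close to 1
```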

The Variance of X n :
Given wave-particle duality, which states that a free particle is both a wave and a particle, our free particle undergoes both quantum mechanical diffusion of the wave and classical diffusion of the particle. Introducing X_QM, X_C, f_QM(x, t) and f_C(x, t) makes this clearer.
X_QM is a random variable drawn from f_QM(x, t) = |ψ(x, t)|², the probability distribution associated with the quantum mechanical wave-function ψ(x, t), which is the solution to the quantum diffusion equation, equation (33).
X_C is a random variable drawn from f_C(x, t), which is the solution to the real diffusion equation, equation (42).
If X were an observation of where the particle is located, it would be the sum of a sample X_QM drawn from f_QM(x, t) and the uncorrelated sample X_C drawn from f_C(x, t):
X = X_QM + X_C. (20)
Thus the action of f_C(x, t) is to translate the center of the wave function, ψ(x, t), by a sample of X_C.
As we know from probability theory, the resulting distribution, f(x, t), is equal to the convolution of f_QM(x, t) and f_C(x, t) over the x variable [4].
f(x, t) = f_QM(x, t) * f_C(x, t). (21)
Since both f_QM(x, t) and f_C(x, t) are Gaussian distributions, it is easy to show that their convolution is again a Gaussian distribution, with an expected value equal to the sum of the two expected values (which in this case is zero) and a variance equal to the sum of the variances of the individual distributions.
σ²(t) = σ_x² + (σ_p²/m²)·t² + 2·D·t. (26)
In this equation, t is the amount of time that has passed since the particle was initialized in the minimum uncertainty state, σ_x is the standard deviation of the minimum uncertainty state, σ_p is the standard deviation of the minimum uncertainty state in the momentum domain, and m is the mass of the particle. Inserting the Heisenberg Uncertainty principle (32), σ_x·σ_p = ħ/2, into the terms, we can group.
To understand the model, it is helpful to look at equation (26). σ²(t) is the sum of three variances. The first is from the Heisenberg Uncertainty Principle of the initialized state, the second is from the thermal drift of the center of the minimum uncertainty wave-packet moving with a group momentum drawn as a sample from the momentum domain, and the third is from the classical diffusion of the center of the wave-function on top of the other two.
It is also possible to derive equation (27) by assuming no force on the particle, which lets you deduce x(t) = x_0 + (p/m)·t. Squaring and taking the ensemble average is all you need [5].
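The variance-addition property behind equation (21) can be verified directly on a grid: convolving two zero-mean Gaussian densities yields a Gaussian whose variance is the sum of the two variances. The standard deviations 1.5 and 2.0 below are arbitrary illustrative values:

```python
import numpy as np

# Check that the convolution of two Gaussian pdfs has variance equal to the
# sum of the variances (the property used in equation (21)).
dx = 0.02
x = np.arange(-20.0, 20.0 + dx / 2, dx)

def gaussian(x, s):
    return np.exp(-x**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))

f_qm = gaussian(x, 1.5)   # stands in for f_QM(x, t)
f_c = gaussian(x, 2.0)    # stands in for f_C(x, t)

f = np.convolve(f_qm, f_c) * dx            # grid convolution of the pdfs
xs = np.arange(-40.0, 40.0 + dx / 2, dx)   # support of the convolution
var = np.sum(xs**2 * f) * dx               # second moment (mean is zero)
print(var)   # close to 1.5**2 + 2.0**2 = 6.25
```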

The Imaginary Diffusion Equation:
The Kinetic Energy Hamiltonian characterizes the wave-packet of a free particle in one dimension,
H = p²/(2m), (28)
where H is the Hamiltonian, p is the momentum along the x direction, and m is the mass of the particle [6].
Given that the momentum commutes with the Hamiltonian, [H, p] = [p²/(2m), p] = 0, each eigenvalue of the momentum is a constant of motion, and thus the variance in momentum space does not grow with time. It is possible to learn the width of the variance of the momentum by looking at the equipartition of energy [7]. Using the equipartition of energy, we equate the degree of freedom associated with the average kinetic energy to one half the temperature times Boltzmann's constant.
⟨p²⟩/(2m) = k_B T/2. (29)
Since we will assume that the average momentum is zero,
p̄ = 0, (30)
we can solve for the variance of the momentum:
σ_p² = ⟨p²⟩ - p̄² = m k_B T. (31)
Also from the Heisenberg Uncertainty Principle, we can solve for the standard deviation of the wave-function in the spatial domain in terms of its width in momentum space:
σ_x = ħ/(2σ_p). (32)
To solve the imaginary diffusion equation (33), we will begin in the momentum domain ψ(k/2π) and take the inverse Fourier Transform to observe how ψ(x, t) evolves over time [8]. We use k/2π (the wavenumber divided by 2π) as the independent variable because we want both ψ(k/2π) and ψ(x) to be normalizable to one.
ψ(k/2π, 0) = (2π σ_k̃²)^(-1/4) · exp(-(k/2π)²/(4 σ_k̃²)), (37)
where σ_k̃ is the standard deviation in the k/2π domain. Our assumption that the wave-function of the free particle in momentum space is a Gaussian wave-packet is quite reasonable given the nice properties of the Gaussian. Similarly, this assumption is already implicit in the equipartition of energy, which was used to find the width of the initial wave-packet. Because the equipartition theorem is derived from the ideal gas law (where particles are modeled using the binomial distribution, of which the Gaussian is the limit), the Gaussian is the right distribution to start with.
To properly account for the evolution of ψ(x, t) governed by equation (33), exp(i(kx - ω(k)t)) is used as the kernel for the inverse Fourier Transform.
ψ(x, t) = ∫ ψ(k/2π, 0) · exp(i(kx - ω(k)t)) d(k/2π). (38)
Using equation (36), ω = ħk²/(2m), to substitute in for ω, you can solve equation (38) by completing the square to get ψ(x, t) [8]. ψ(x, t) is in Gaussian form; to calculate the variance, we need to take the magnitude squared of the wave-function to get the distribution of the particle, f_QM(x, t) = |ψ(x, t)|².
And
σ_QM²(t) = σ_x² + (σ_p²/m²)·t². (41)
This is, of course, the well-known result from quantum mechanics where the variance of the particle is the sum of the initial variance from the Heisenberg Uncertainty Principle and the variance associated with the momentum domain imparting a thermal group velocity σ_p/m [9].
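The spreading law (41) can be checked by evolving a minimum-uncertainty Gaussian packet with the free-particle dispersion ω = k²/2 (in units where ħ = m = 1) on a grid; the grid sizes, widths, and evolution time below are illustrative choices:

```python
import numpy as np

# Numerical check of equation (41): evolve psi in k-space with
# exp(-i k^2 t / 2) and compare the spread of |psi(x,t)|^2 with
# sigma_x^2 + (sigma_p/m)^2 t^2, using hbar = m = 1.
N, L = 4096, 80.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)

sigma_x = 1.0
sigma_p = 0.5 / sigma_x          # minimum uncertainty: sigma_x*sigma_p = 1/2
psi0 = (2 * np.pi * sigma_x**2) ** -0.25 * np.exp(-x**2 / (4 * sigma_x**2))

t = 3.0
psi_t = np.fft.ifft(np.fft.fft(psi0) * np.exp(-1j * k**2 * t / 2))
prob = np.abs(psi_t) ** 2
var_num = np.sum(x**2 * prob) / np.sum(prob)
var_theory = sigma_x**2 + (sigma_p * t) ** 2
print(var_num, var_theory)   # both close to 3.25
```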

The Real Diffusion Equation:
When the diffusion constant of a diffusion process is real and does not vary with position, the resulting diffusion equation is as below [10].
∂f(x, t)/∂t = D·∂²f(x, t)/∂x². (42)
Of course, the solution to this real diffusion equation is the Gaussian with variance equal to 2Dt [11].
f_C(x, t) = (1/√(4πDt)) · exp(-x²/(4Dt)). (44)
To find D, we will start with the imaginary diffusion operator and, using analytic continuation, perform a Minkowski transformation [6]. The imaginary diffusion equation (33) is
∂ψ(x, t)/∂t = (iħ/(2m))·∂²ψ(x, t)/∂x². (45)
Upon applying the Minkowski transformation, imaginary time is replaced with real time, τ = it [12]. Applied to the imaginary diffusion equation, the Minkowski transformation brings out the real diffusion constant we are looking for.
∂f(x, t)/∂t = (ħ/(2m))·∂²f(x, t)/∂x². (46)
By observation we see that
D = ħ/(2m). (47)
We can also derive D from kinematic arguments, as was shown in [1]. We can calculate the variance of f_C(x, t): σ_C²(t) = 2Dt = ħt/m.
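The classical part of the motion is a random walk, so the 2Dt variance of equation (44) can be checked by Monte-Carlo simulation; the step sizes and path count below are illustrative, and units with ħ = m = 1 (so D = 1/2) are assumed:

```python
import numpy as np

# Monte-Carlo check of sigma_C^2(t) = 2*D*t: sum Gaussian increments with
# per-step variance 2*D*dt and compare the endpoint variance with 2*D*t.
rng = np.random.default_rng(0)
D, dt, steps, n_paths = 0.5, 0.01, 100, 200_000
t = steps * dt

increments = rng.normal(0.0, np.sqrt(2 * D * dt), size=(n_paths, steps))
endpoints = increments.sum(axis=1)
print(endpoints.var())   # close to 2*D*t = 1.0
```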

Entropy at t = 0:
It is important to ask what the entropy of the initial state is. We find that at t = 0 the entropy is 1 natural unit. Since the wavefunction and associated probability distribution are continuous, we calculate the entropy using the equation for differential entropy. One might object that the differential entropy is only accurate up to a scale factor. However, I argue (as did Hirschman [13]) that if you add the differential entropy in the dual domain, the scale factor cancels out because of the scale property of the Fourier Transform, and the result is an absolute measure. As before, we will use the position x and the wavenumber divided by 2π, k/2π, as the dual domains:
S = h(f(x)) + h(f(k/2π)). (49)
Hirschman [13] showed that this entropy for any wavefunction is ≥ ln(e/2), with equality when the wavefunction is Gaussian.
While we are working with a Gaussian initial state, the answer appears to be a little more complex than just ln(e/2). We learn from [1] that, when solving for the quantum and relativistic length scales of dark particles, particles come in pairs. With only one particle and no reference frame, there is no way of knowing the position or the momentum, even if there were universal measuring sticks. We get around this with two particles and a measuring stick/clock, by determining the relative displacement and speed. Thus we need to look at the entropy of the relative difference of position and momentum of the two particles.
Define f_1(x) = |ψ_1(x, 0)|² as the probability distribution on the location of particle 1 at t = 0, and similarly f_2(x) for particle 2. For the momentum space, define f_1(k/2π) as the probability distribution on the wavenumber divided by 2π for the first particle, and f_2(k/2π) for the second particle. The probability distributions on the relative displacement and wavenumber, Δx and Δk/2π, are f(Δx) and f(Δk/2π), respectively, and will be Gaussian assuming both the reference particle and the initial particle have Gaussian wave-functions. Since differential entropy is invariant to the first moment, we can assume, without loss of generality, that the first moment of the reference wave-function is zero. The second moments of f(Δx) and f(Δk/2π) will be the sum of the respective second moments of the particle and reference particle if the two are not correlated.
We can go even further and show that the reference particle should have the same second moments as the particle we are measuring if we minimize the entropy. Thus, we arrive at the distributions for both domains, with variances 2σ_x² for Δx and 2σ_k̃² for Δk/2π. Thus, the total absolute entropy of the initial state, S_0, is
S_0 = h(f(Δx)) + h(f(Δk/2π)) (51)
= (1/2)·ln(2πe·2σ_x²) + (1/2)·ln(2πe·2σ_k̃²) (52)
= ln(e/2) + ln 2 = 1. (53)
There are two things of note relative to the rate, R, calculated above. First, since the rate, R, is the difference between the entropy at two times, the impact of the wider distribution of Δx vs. x is negated; thus we could have done the analysis above using ΔX_n instead of X_n and the result would be the same. Second, we see that the entropy of the initial state is equal to the additional entropy generated by the diffusion process during the de-coherence time, ħ/(2 k_B T).
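The one-nat result of equation (53) can be checked in closed form: for a minimum-uncertainty Gaussian, σ_x·σ_k̃ = 1/(4π) (since σ_x·σ_p = ħ/2 and k/2π = p/h), and doubling both variances for the two-particle relative coordinates makes the summed differential entropies come to exactly 1 nat, independent of the particular width. The value of σ_x below is arbitrary:

```python
import math

# Check of equations (51)-(53): sum of the differential entropies of the
# relative-coordinate Gaussians (variances doubled) equals 1 nat.
sigma_x = 0.37                         # arbitrary illustrative width
sigma_k = 1 / (4 * math.pi * sigma_x)  # conjugate width in the k/2pi domain

def h_gauss(var):
    """Differential entropy of a Gaussian with the given variance, in nats."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

S0 = h_gauss(2 * sigma_x**2) + h_gauss(2 * sigma_k**2)
print(S0)   # 1.0 (up to floating point)
```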

Conclusion:
We have seen that, by making three assumptions about the thermal diffusion of a free particle, we are able to show that entropy is generated at a rate equal to twice the particle's temperature (when expressed in the correct units). This result should be applicable to studies of free particles and other environments governed by similar equations. A myriad of applications also exist in computer modeling, including but not limited to: finite-difference time-domain methods, Bloch's equations for nuclear magnetic resonance imaging, and plasma and semiconductor physics.
To check the primary result, one would perform a quantum non-demolition measurement on the quantum state of an ensemble of free particles. The minimum bit rate needed to describe the resulting string of numbers describing the trajectories would be the entropy rate, and should be equal to twice the temperature.
However, even before an experiment can be conducted, this result is useful in suggesting the use of information-theoretic techniques to examine problems involving de-coherence, and it might give a different perspective on the meaning of temperature.
This result is interesting as a stand-alone data point: the entropy rate is equal to twice the temperature. However, if we could go further and say, more generally, that temperature is the same as entropy rate, it would change the way we view temperature and entropy.

Acknowledgements
JLH thanks Thomas Cover for sharing his passion for the Elements of Information Theory, and McKinsey & Co. for the amazing environment where this article was written.