1 Introduction

In earlier investigations, it has been suggested that time may emerge thermally [1,2,3]. Among the many discussions regarding the origin or emergence of time [4, 5], there exists a novel viewpoint that time appears as a statistical or thermodynamic entity. The main purpose of this paper is to illustrate a simple model that embodies this notion. The key idea is as follows. We empirically recognize that temperature is a coarse-grained, macroscopic parameter that is estimated statistically through measurements of certain phenomena in our world. Then, instead of presuming that time acts as an independent, a priori variable, we regard it as a coarse-grained parameter inferred a posteriori, just like temperature. To this end, we consider a nonequilibrium relaxation process for which the time and temperature parameters are analyzed in the framework of information geometry associated with certain statistical events.

In the literature, the idea that time may emerge as a statistical, macroscopic (coarse-grained) quantity has often been advocated in the context of quantum mechanics and general relativity. In his “thermal time hypothesis” [1,2,3, 6], Rovelli first “forgets” a priori the notion of time from the standpoint of generally covariant quantum field theory, and then introduces a physical time flow associated with the thermodynamic state of the system in the framework of von Neumann algebras. “Time” is thus recovered in a system with a large number of degrees of freedom, where the Hamiltonian that generates the temporal evolution can be inferred macroscopically in terms of the statistical state operator. In other theories on the emergence of time [5], the importance of the collapse of the quantum wave function [7] has been noted in relation to decoherence, which yields the irreversibility of time [4]. Penrose [8] also suggested that gravity would be the relevant contextual feature associated with the wave function collapse.

In contrast, the present study discusses the emergence of time in the framework of non-relativistic, classical physics without resorting to quantum theory. Since we rely on the simplest physical model conceivable for the description of thermodynamic, irreversible processes, we can obtain explicit mathematical expressions for the relationship between time and temperature. Our aim is to derive an equation (or a “structure”) that connects “time” with “temperature” in the context of information theory, thus providing a novel insight into the notion of a thermal origin of time.

In the present modeling, we consider a phenomenon represented by a dynamical variable x that behaves according to a simple stochastic relaxation process. In particular, we focus on the linear relaxation dynamics of x in a harmonic potential at temperature T, which is known as the Onsager–Machlup (OM) process [9]. Assuming an initial condition in which x is localized at a point \(x_0\), we can solve the Fokker–Planck equation for this relaxation process to find an analytical expression [10,11,12] for the time-dependent probability distribution function \(P(x,t)\). Then, we rewrite this distribution function as \(P(x;t,\beta )\) or \(P(\{x\}|t,\beta )\) with \(\beta =1/k_{B}T\) (\(k_{B}\) is the Boltzmann constant) to indicate that the distribution of x is specified by two parameters, time t and (inverse) temperature \(\beta\). Here, we know that the temperature is a coarse-grained parameter determined through measurements on the statistical behavior of the variable x, and we further presume that this may also be the case for time. In the terminology of information theory [13], \(P(\{x\}|t,\beta )\) expresses the likelihood of the two parameters t and \(\beta\) for a given dataset \(\{x\}\). The Fisher information metric [14, 15] for t and \(\beta\) can then yield a geometrical relationship between the two parameters in terms of the statistical distribution or measurement of x. Applying two-dimensional differential geometry to the \((\beta ,t)\) space, we are led to tensor expressions that describe the interrelated behaviors of t and \(\beta\). We thus find in this case a simple differential equation represented by the scalar curvature, \(R = -1\), which we regard as a fundamental equation governing the two coarse-grained parameters \(\beta\) and t associated with the OM relaxation process.

In Sect. 2, we first illustrate how two coarse-grained parameters, temperature and time, can be statistically estimated in a linear relaxation process of the OM type. In Sect. 3, the Fisher information metric for time and temperature is derived on the basis of the OM relaxation process. Furthermore, after some algebra in differential geometry, we are led to a simple expression for the scalar curvature that characterizes the two-dimensional information space of time and temperature; the mathematical details of the derivation are found in the author’s earlier paper [12] and the Appendix. In Sect. 4, the logic is reversed so that one can see the mutual relation between time and temperature as a solution to the basic equation for the curvature tensors. Conclusions of the present study are given in Sect. 5.

2 Statistical Inference of Temperature and Time in the Onsager–Machlup Process

As a model system, this study considers a temporal relaxation process of dynamical variable x described by an overdamped Langevin equation at temperature T [16,17,18,19,20,21],

$$\begin{aligned} {\dot{x}}=-\frac{D}{k_{B}T}U'(x)+\eta (t), \end{aligned}$$
(1)

where \(k_{B}\), D and \(U'(x)\) refer to the Boltzmann constant, diffusion coefficient and the derivative of potential energy, respectively. \(\eta (t)\) is Gaussian noise with zero average satisfying the fluctuation-dissipation relation,

$$\begin{aligned} \langle \eta (t)\eta (t')\rangle = 2D\delta (t-t'), \end{aligned}$$
(2)

where \(\langle \quad \rangle\) means the statistical average. The time-dependent probability distribution function \(P(x,t)\) sampled by the stochastic differential equation (1) then obeys the Fokker–Planck–Smoluchowski equation [16, 17, 20],

$$\begin{aligned} \frac{\partial }{\partial t}P(x,t) = D\frac{\partial }{\partial x}\left[ \frac{\partial }{\partial x}P(x,t) + \beta U'(x)P(x,t)\right] \end{aligned}$$
(3)

with \(\beta =1/k_{B}T\).

In this study we focus on a (linear) dynamics under the harmonic potential,

$$\begin{aligned} U(x) = \frac{1}{2}kx^{2}, \end{aligned}$$
(4)

with the spring constant k. Then, assuming a localized initial distribution, \(P(x,0) = \delta (x-x_{0}) \ (x_{0} \ne 0)\), a relaxation process of the Onsager–Machlup (OM) type [9] is realized. With \(\gamma = \beta kD\), an explicit expression for the time-dependent probability distribution function is found to be [11, 12]

$$\begin{aligned} P(x,t) = \sqrt{\frac{\beta k}{2\pi (1-e^{-2\gamma t})}}\exp \left[ -\frac{\beta k(x-x_{0}e^{-\gamma t})^{2}}{2(1-e^{-2\gamma t})}\right] , \end{aligned}$$
(5)

which is a well-known expression [10]. Here, we see that the statistical distribution of the “phenomenon” x is governed by the time and temperature parameters t and \(\beta\), along with the model-specific parameters k, D and \(x_0\). Alternatively, Eq. (5) mathematically represents a dynamical model structure that our world possesses approximately. To make this viewpoint explicit, we express the distribution function as \(P(x;t,\beta )\) or \(P(\{x\}|t,\beta )\). In the limit of \(t \rightarrow \infty\), \(P(x;t,\beta )\) approaches the Boltzmann distribution for the harmonic potential U(x).
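Equation (5) is a Gaussian with mean \(x_{0}e^{-\gamma t}\) and variance \((1-e^{-2\gamma t})/\beta k\), so it can be compared directly against sample paths of Eq. (1). The following is a minimal sketch of such a comparison, assuming a plain Euler–Maruyama discretization; the function names, step size, number of paths and seed are illustrative choices and are not taken from the paper.

```python
import math
import random


def simulate_om(beta, k, D, x0, t_final, n_paths=4000, dt=0.005, seed=0):
    """Euler-Maruyama integration of Eq. (1) with U(x) = k x^2 / 2.

    Returns the sample mean and sample variance of x at time t_final.
    The discretization and all default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_final / dt))
    gamma = beta * k * D                   # relaxation rate, gamma = beta k D
    noise = math.sqrt(2.0 * D * dt)        # from <eta(t) eta(t')> = 2 D delta(t - t')
    finals = []
    for _ in range(n_paths):
        x = x0                             # localized initial condition P(x,0) = delta(x - x0)
        for _ in range(n_steps):
            x += -gamma * x * dt + noise * rng.gauss(0.0, 1.0)
        finals.append(x)
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / (n_paths - 1)
    return mean, var


def exact_moments(beta, k, D, x0, t):
    """Mean and variance read off from the Gaussian in Eq. (5)."""
    gamma = beta * k * D
    mean = x0 * math.exp(-gamma * t)
    var = (1.0 - math.exp(-2.0 * gamma * t)) / (beta * k)
    return mean, var


if __name__ == "__main__":
    print("simulated:", simulate_om(1.0, 1.0, 1.0, 1.0, 0.5))
    print("Eq. (5):  ", exact_moments(1.0, 1.0, 1.0, 1.0, 0.5))
```

The two sets of moments should agree up to sampling noise and a small discretization bias of order \(\gamma \,dt\).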

Now, let us consider a problem of how we can infer the temperature and time when we observe the phenomena x whose dynamics is described in terms of Eq. (5) in our world. Given the conditional probability distribution function \(P(x|\lambda )\) with the parameter \(\lambda\), when we have obtained the data \(\{x_i\}\ (i=1,2,...)\) for the quantity x, we can statistically infer the parameter on the basis of Bayes’ theorem as [13]

$$\begin{aligned} P(\lambda |\{x_i\}) = \frac{P(\{x_i\}|\lambda )P(\lambda )}{P(\{x_i\})}, \end{aligned}$$
(6)

where \(P(\lambda )\) and \(P(\{x_i\})\) refer to the prior probability for \(\lambda\) and the evidence for \(\{x_i\}\), respectively. If we presume these probabilities are constant a priori, we will find

$$\begin{aligned} P(\lambda |\{x_i\}) \propto P(\{x_i\}|\lambda ) = \prod _{i}P(x_i|\lambda ). \end{aligned}$$
(7)

Fig. 1 Plots of the likelihood \(P(\{x_i\}|\lambda ) = \prod _{i}P(x_i|\lambda )\) in Eq. (7) when we use Eq. (5) with \(k, D, x_{0} = 1\) and \(\{x_i\} = \{0.25,0.30,0.35,0.37,0.40\}\). (a) \(\lambda = \beta\) and \(t = 0.5\). (b) \(\lambda = t\) and \(\beta = 5.0\)

Fig. 2 (a) Fisher information metric \(g_{\beta \beta }\) as a function of t for \(\beta = 0.1, 1, 10\). (b) Fisher information metric \(g_{\beta \beta }\) as a function of \(\beta\) for \(t = 0.1, 1, 10\). The model parameters are set to \(k, D, x_{0} = 1\) for illustration

Here, we set \(k, D, x_{0} = 1\) for simplicity, and assume, for example, that we have obtained the data \(\{x_i\} = \{0.25,0.30,0.35,0.37,0.40\}\) at \(t = 0.5\). Employing Eq. (5), we can then plot the right-hand side of Eq. (7) as a function of \(\beta\), as in Fig. 1a. Through the measurement of \(\{x_i\}\), we thus obtain the most probable value of the (inverse) temperature, \(\beta = 5.8\), at which the likelihood is maximal. This may be a way in which we observe the temperature statistically. On the other hand, if we have obtained the data \(\{x_i\} = \{0.25,0.30,0.35,0.37,0.40\}\) at \(\beta = 5.0\), for example, we can plot the right-hand side of Eq. (7) as a function of t, as in Fig. 1b. In this case we infer from the measurement that the most probable value of time is \(t = 0.18\). If we collect many more data points \(\{x_i\}\), we will find a sharper peak in the likelihood \(P(\{x_i\}|\lambda )\).
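The inference described above can be reproduced with a few lines of code. The sketch below evaluates the logarithm of the likelihood in Eq. (7) with Eq. (5) and locates its maximum by a simple grid search; the grid ranges, the resolution and the helper names are assumptions made here for illustration, and the maximizers should land close to the values quoted in the text.

```python
import math

DATA = [0.25, 0.30, 0.35, 0.37, 0.40]


def log_p(x, t, beta, k=1.0, D=1.0, x0=1.0):
    """Logarithm of the Gaussian distribution in Eq. (5)."""
    gamma = beta * k * D
    var = (1.0 - math.exp(-2.0 * gamma * t)) / (beta * k)
    mean = x0 * math.exp(-gamma * t)
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def log_likelihood(data, t, beta):
    """Logarithm of the product in Eq. (7) for independent observations."""
    return sum(log_p(x, t, beta) for x in data)


def argmax_on_grid(f, lo, hi, n=2000):
    """Return the grid point in [lo, hi] at which f is largest."""
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(grid, key=f)


# Case of Fig. 1a: time fixed at t = 0.5, infer beta.
beta_hat = argmax_on_grid(lambda b: log_likelihood(DATA, 0.5, b), 0.1, 20.0)

# Case of Fig. 1b: inverse temperature fixed at beta = 5.0, infer t.
t_hat = argmax_on_grid(lambda t: log_likelihood(DATA, t, 5.0), 0.01, 2.0)

print("most probable beta:", beta_hat)
print("most probable t:   ", t_hat)
```

Working with the log-likelihood rather than the product itself leaves the position of the maximum unchanged and avoids numerical underflow for larger datasets.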

3 Information Geometry of Relaxation Process

In the preceding section, we have seen that the most probable values of \(\beta\) and t can be inferred from the data \(\{x_i\}\) when we assume the relaxation process of the OM type. In this section, we mathematically address the geometry of the parameter \((\beta ,t)\) space which regulates the probability distribution of the phenomena x.

The Fisher information metric is a particular Riemannian metric in the space of probability distributions and plays a central role in information geometry [14, 15]. It can be used to calculate the informational difference between measurements as the infinitesimal form of the relative entropy such as the Kullback-Leibler divergence [13]. In the present study based on the probability distribution function P(x) for the event \(\{x\},\) the Fisher information metric is given by

$$\begin{aligned} g_{\mu \nu }& = {} \langle (\partial _{\mu }\ln P)(\partial _{\nu }\ln P)\rangle \\ & = {} \int dx P(x)\left[ \partial _{\mu }\ln P(x)\right] \left[ \partial _{\nu }\ln P(x)\right] \end{aligned}$$
(8)

in the parameter space represented by \(\mu\) and \(\nu\). In particular, we consider in this study the two-dimensional parameter space formed by the inverse temperature \(\beta = 1/k_{B}T\) and the time t to characterize the nonequilibrium process described by \(P(x;t,\beta )\) of Eq. (5). The (covariant) metric tensor is then calculated to be

$$\begin{aligned} g_{\beta \beta }= & {} \frac{1}{2\beta ^{2}}\left( 1-2\gamma t\frac{\varepsilon }{1-\varepsilon }\right) ^{2} + \frac{kx_{0}^{2}}{\beta }\gamma ^{2}t^{2}\frac{\varepsilon }{1-\varepsilon }, \end{aligned}$$
(9)
$$\begin{aligned} g_{tt}= & {} 2\gamma ^{2}\left( \frac{\varepsilon }{1-\varepsilon }\right) ^{2} + kx_{0}^{2}\beta \gamma ^{2}\frac{\varepsilon }{1-\varepsilon }, \end{aligned}$$
(10)
$$\begin{aligned} g_{\beta t}= & {} g_{t\beta } = -\frac{\gamma }{\beta }\frac{\varepsilon }{1-\varepsilon }\left( 1-2\gamma t\frac{\varepsilon }{1-\varepsilon }\right) + kx_{0}^{2}\gamma ^{2}t\frac{\varepsilon }{1-\varepsilon }, \end{aligned}$$
(11)

with the determinant,

$$\begin{aligned} g = \det (g_{\mu \nu }) = \frac{kx_{0}^{2}\gamma ^{2}\varepsilon }{2\beta (1-\varepsilon )}, \end{aligned}$$
(12)

where \(\varepsilon = e^{-2\gamma t} = e^{-2\beta kDt}\) has been introduced. The contravariant metric tensor, i.e., the inverse matrix of the Fisher information metric, is accordingly obtained as

$$\begin{aligned} g^{\beta \beta }= & {} g_{tt}/g, \end{aligned}$$
(13)
$$\begin{aligned} g^{tt}= & {} g_{\beta \beta }/g, \end{aligned}$$
(14)
$$\begin{aligned} g^{\beta t}= & {} g^{t\beta } = -g_{\beta t}/g = -g_{t\beta }/g. \end{aligned}$$
(15)
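As a numerical sanity check of Eqs. (8)–(15), the closed-form components can be compared with a direct quadrature of the definition in Eq. (8). The following is a sketch under stated assumptions: the scores \(\partial _{\mu }\ln P\) are approximated by central differences and the integral by the trapezoidal rule, with step sizes and integration range chosen here for illustration only.

```python
import math


def log_p(x, t, beta, k=1.0, D=1.0, x0=1.0):
    """Logarithm of the Gaussian in Eq. (5)."""
    gamma = beta * k * D
    var = -math.expm1(-2.0 * gamma * t) / (beta * k)   # (1 - eps) / (beta k)
    mean = x0 * math.exp(-gamma * t)
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def metric_closed_form(t, beta, k=1.0, D=1.0, x0=1.0):
    """Eqs. (9)-(11); returns (g_bb, g_tt, g_bt)."""
    gamma = beta * k * D
    r = 1.0 / math.expm1(2.0 * gamma * t)              # eps / (1 - eps)
    a = 1.0 - 2.0 * gamma * t * r
    g_bb = a * a / (2.0 * beta ** 2) + k * x0 ** 2 * gamma ** 2 * t ** 2 * r / beta
    g_tt = 2.0 * gamma ** 2 * r ** 2 + k * x0 ** 2 * beta * gamma ** 2 * r
    g_bt = -gamma / beta * r * a + k * x0 ** 2 * gamma ** 2 * t * r
    return g_bb, g_tt, g_bt


def det_closed_form(t, beta, k=1.0, D=1.0, x0=1.0):
    """Determinant of the metric, Eq. (12)."""
    gamma = beta * k * D
    r = 1.0 / math.expm1(2.0 * gamma * t)
    return k * x0 ** 2 * gamma ** 2 * r / (2.0 * beta)


def inverse_metric(t, beta, k=1.0, D=1.0, x0=1.0):
    """Eqs. (13)-(15); returns (g^bb, g^tt, g^bt)."""
    g_bb, g_tt, g_bt = metric_closed_form(t, beta, k, D, x0)
    g = det_closed_form(t, beta, k, D, x0)
    return g_tt / g, g_bb / g, -g_bt / g


def metric_numerical(t, beta, k=1.0, D=1.0, x0=1.0, n=4001, h=1e-5):
    """Eq. (8) by the trapezoidal rule over mean +/- 10 standard deviations."""
    gamma = beta * k * D
    mean = x0 * math.exp(-gamma * t)
    sd = math.sqrt(-math.expm1(-2.0 * gamma * t) / (beta * k))
    lo, hi = mean - 10.0 * sd, mean + 10.0 * sd
    dx = (hi - lo) / (n - 1)
    g_bb = g_tt = g_bt = 0.0
    for i in range(n):
        x = lo + i * dx
        w = dx * (0.5 if i in (0, n - 1) else 1.0)     # trapezoidal weights
        p = math.exp(log_p(x, t, beta, k, D, x0))
        # Central-difference approximations of the scores.
        s_b = (log_p(x, t, beta + h, k, D, x0) - log_p(x, t, beta - h, k, D, x0)) / (2.0 * h)
        s_t = (log_p(x, t + h, beta, k, D, x0) - log_p(x, t - h, beta, k, D, x0)) / (2.0 * h)
        g_bb += w * p * s_b * s_b
        g_tt += w * p * s_t * s_t
        g_bt += w * p * s_b * s_t
    return g_bb, g_tt, g_bt


if __name__ == "__main__":
    print("closed form:", metric_closed_form(0.5, 2.0))
    print("quadrature: ", metric_numerical(0.5, 2.0))
```

Writing \(\varepsilon /(1-\varepsilon ) = 1/(e^{2\gamma t}-1)\) with `math.expm1` avoids loss of precision when \(\gamma t\) is small; the price is an overflow for very large \(2\gamma t\), which this sketch does not handle.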

Then, according to standard algebra in differential geometry [22], the Ricci tensor \(R_{\mu \nu }\) is found to satisfy a relation,

$$\begin{aligned} R_{\mu \nu } = -\frac{1}{2}g_{\mu \nu }, \end{aligned}$$
(16)

in the present case [12] (see also Appendix). This corresponds to the scalar curvature given by (with the use of Einstein’s convention)

$$\begin{aligned} R = g^{\mu \nu }R_{\mu \nu } = -\frac{1}{2}g^{\mu \nu }g_{\mu \nu } = -1 \end{aligned}$$
(17)

in two dimensions. The Einstein tensor is then

$$\begin{aligned} G_{\mu \nu } = R_{\mu \nu }-\frac{1}{2}Rg_{\mu \nu } = 0, \end{aligned}$$
(18)

which should be the case due to the symmetries in two dimensions [23] (see also Appendix).

Thus, we have found that the Fisher information metric given by Eqs. (9)–(11), derived from \(P(x;t,\beta )\) for the OM process, gives \(R = -1\) or Eq. (16) in the two-dimensional differential geometry for the parameter space of \(\beta\) and t, irrespective of the values of D, k and \(x_{0}\). We may regard this as a characterization of the geometry formed by the two parameters \(\beta\) and t [12]. For the mathematical details associated with the discussion above, refer to the author’s earlier paper [12] and the Appendix.
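The independence of R from the model parameters can also be made plausible by a change of coordinates. The following is a sketch based on the standard Fisher metric of a one-dimensional Gaussian family, not necessarily the route taken in [12]; the symbols m, \(\sigma\), u and \(\theta ^{\mu }\) are introduced here for illustration only. Since Eq. (5) is a Gaussian with mean \(m = x_{0}e^{-\gamma t}\) and variance \(\sigma ^{2} = (1-\varepsilon )/\beta k\), the line element with \(\theta ^{\mu } = (\beta ,t)\) can be written as

$$\begin{aligned} ds^{2} = g_{\mu \nu }d\theta ^{\mu }d\theta ^{\nu } = \frac{dm^{2} + 2\,d\sigma ^{2}}{\sigma ^{2}} = 2\,\frac{du^{2} + d\sigma ^{2}}{\sigma ^{2}}, \qquad u = \frac{m}{\sqrt{2}}. \end{aligned}$$

This is twice the Poincaré half-plane metric, whose Gaussian curvature is \(-1\), so that the Gaussian curvature here is \(-1/2\) and \(R = -1\) in the sign convention of Eq. (17). Because the scalar curvature does not depend on the choice of coordinates, the same value holds in the \((\beta ,t)\) coordinates wherever the map \((\beta ,t) \rightarrow (m,\sigma )\) is invertible, which requires \(x_{0} \ne 0\) and \(t > 0\) (compare the determinant in Eq. (12)). The parameters k, D and \(x_{0}\) enter only through this map.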

4 Basic Equation for Thermal Time and Its Solution

Here, let us review the analysis presented in the preceding sections. We start with Eq. (5), which describes the temporal (t) evolution of the probability density function \(P(x;t,\beta )\) for a variable x with temperature parameter \(\beta\) and model-specific parameters k, D and \(x_0\) for a nonequilibrium relaxation (Onsager–Machlup) process. We may then interpret this equation as follows: Through measurement of the phenomenon x in our world, whose behavior is mathematically expressed by Eq. (5), we can empirically, experimentally or statistically estimate the two parameters, time t and (inverse) temperature \(\beta\), as shown in Sect. 2. The Fisher information metric expressed as Eqs. (9)–(11) then gives the regulatory relationship between t and \(\beta\) in the two-dimensional parameter space, along with the specific model parameters k, D and \(x_0\) that describe the Onsager–Machlup (OM) process. Given the covariant metric tensor in the \((t,\beta )\) space, we can calculate a variety of associated tensors in two-dimensional differential geometry, thus leading to a simple equation for the scalar curvature, \(R = -1\), irrespective of the specific model parameters.

Now, this logic can be reversed. That is, we can regard this simple relation, \(R = -1\) or Eq. (16), as a basic equation that regulates the two variables t and \(\beta\) involved in the OM process. In general, differential geometry in two dimensions can be characterized by only one degree of freedom (see Appendix), and therefore the equation \(R = -1\) or Eq. (16) can be regarded as a basic equation analogous to the Einstein equation for the Einstein tensor in four-dimensional space-time geometry. We then know a solution to this differential equation under the boundary conditions relevant to a specific physical model, that is, Eqs. (9)–(11). Thus, starting with the “fundamental” equation for a dynamical process, Eq. (16) or (17), we may find the geometry of the parameter (information) space of \((t,\beta )\), as expressed by the Fisher information metric. In passing, we note that the “solution” contains auxiliary parameters such as k, D and \(x_{0}\) that are specific to the OM-type relaxation process. These parameters, in turn, provide a characteristic timescale \(\gamma ^{-1} = (\beta kD)^{-1}\) (see also Eq. (5)) in the considered system, which makes a connection to the real world.

Let us now examine the behavior of the information metric \(g_{\mu \nu }\). In the following we set \(k, D, x_{0} = 1\) for illustration. Figure 2 shows \(g_{\beta \beta }\) as a function of t and of \(\beta\) for various values of \(\beta\) and t, respectively. For \(t \rightarrow 0\), \(g_{\beta \beta }\) vanishes as t/2, as shown in Fig. 2a, which means that the temperature cannot be defined when the “particle” or “event” x is localized at one point \(x_0\) in the initial state. On the other hand, in the limit of \(t \rightarrow \infty\), we find \(g_{\beta \beta } \rightarrow 1/(2\beta ^2)\), corresponding to the thermal equilibrium state in which the temperature is well defined. Looking at the \(\beta\) dependence of \(g_{\beta \beta }\) in Fig. 2b, we observe that \(g_{\beta \beta }\) goes to \(1/(2\beta ^2)\) and \((t^2 + t)/2\) in the low-temperature (\(\beta \rightarrow \infty\)) and high-temperature (\(\beta \rightarrow 0\)) limits, respectively.
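The small-t and high-temperature behaviors of \(g_{\beta \beta }\) both follow from expanding Eq. (9) for small \(\gamma t = \beta kDt\). As a sketch of that expansion, to leading order in \(\gamma t\) for each term,

$$\begin{aligned} \frac{\varepsilon }{1-\varepsilon } &= \frac{1}{e^{2\gamma t}-1} = \frac{1}{2\gamma t} - \frac{1}{2} + \frac{\gamma t}{6} + O\left( (\gamma t)^{3}\right) , \\ 1-2\gamma t\frac{\varepsilon }{1-\varepsilon } &= \gamma t + O\left( (\gamma t)^{2}\right) , \\ g_{\beta \beta } &\simeq \frac{\gamma ^{2}t^{2}}{2\beta ^{2}} + \frac{kx_{0}^{2}\gamma t}{2\beta } = \frac{k^{2}D^{2}t^{2}}{2} + \frac{k^{2}x_{0}^{2}Dt}{2}. \end{aligned}$$

For \(k, D, x_{0} = 1\) this gives \((t^2 + t)/2\) as \(\beta \rightarrow 0\) at fixed t, while for \(t \rightarrow 0\) at fixed \(\beta\) the linear term t/2 dominates (the coefficient of \(t^2\) then receives further \(\beta\)-dependent corrections).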

Fig. 3 (a) Fisher information metric \(g_{tt}\) as a function of t for \(\beta = 0.1, 1, 10\). (b) Fisher information metric \(g_{tt}\) as a function of \(\beta\) for \(t = 0.1, 1, 10\). The model parameters are set to \(k, D, x_{0} = 1\) for illustration

Fig. 4 (a) Fisher information metric \(g_{\beta t}\) as a function of t for \(\beta = 0.1, 1, 10\). (b) Fisher information metric \(g_{\beta t}\) as a function of \(\beta\) for \(t = 0.1, 1, 10\). The model parameters are set to \(k, D, x_{0} = 1\) for illustration

Next, Fig. 3 shows \(g_{tt}\) as a function of t and of \(\beta\) for various values of \(\beta\) and t, respectively. As shown in Fig. 3a, \(g_{tt}\) vanishes in the limit of \(t \rightarrow \infty\), which is consistent with the picture that time loses its meaning in the thermodynamic equilibrium state, while \(g_{tt} \rightarrow 1/(2t^2)\) for \(t \rightarrow 0\). Concerning the \(\beta\) dependence illustrated in Fig. 3b, we see \(g_{tt} \rightarrow 0\) in the low-temperature limit of \(\beta \rightarrow \infty\), implying that time does not exist because all the degrees of freedom for motion are frozen at zero temperature in classical mechanics. In the high-temperature region of \(\beta \rightarrow 0\), on the other hand, we see \(g_{tt} \rightarrow 1/(2t^2)\), which is what we expect for the behavior of the usual “time”. Furthermore, Fig. 4 shows the behavior of \(g_{\beta t}\), in which we observe that \(g_{\beta t}\) vanishes in the limit of \(t \rightarrow \infty\) or \(\beta \rightarrow \infty\), thereby indicating the decoupling between time and temperature. Thus, we see how time emerges thermodynamically in the present model.
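The limiting behaviors discussed in this section can be checked numerically from the closed forms in Eqs. (9)–(11). The sketch below evaluates the metric at a few extreme parameter values with \(k, D, x_{0} = 1\); the function name and the sample points are illustrative choices made here, not values from the paper.

```python
import math


def fisher_metric(t, beta, k=1.0, D=1.0, x0=1.0):
    """Closed forms of Eqs. (9)-(11); returns (g_bb, g_tt, g_bt).

    Uses eps / (1 - eps) = 1 / (exp(2 gamma t) - 1); math.expm1 overflows
    for 2 * gamma * t above roughly 700, which this sketch does not handle.
    """
    gamma = beta * k * D
    r = 1.0 / math.expm1(2.0 * gamma * t)
    a = 1.0 - 2.0 * gamma * t * r
    g_bb = a * a / (2.0 * beta ** 2) + k * x0 ** 2 * gamma ** 2 * t ** 2 * r / beta
    g_tt = 2.0 * gamma ** 2 * r ** 2 + k * x0 ** 2 * beta * gamma ** 2 * r
    g_bt = -gamma / beta * r * a + k * x0 ** 2 * gamma ** 2 * t * r
    return g_bb, g_tt, g_bt


# Sample points standing in for the four limits (illustrative values).
CASES = {
    "t -> 0      (t=1e-4, beta=1)": (1e-4, 1.0),
    "t -> inf    (t=20,   beta=1)": (20.0, 1.0),
    "beta -> inf (t=1, beta=50)": (1.0, 50.0),
    "beta -> 0   (t=1, beta=1e-4)": (1.0, 1e-4),
}

if __name__ == "__main__":
    for label, (t, beta) in CASES.items():
        g_bb, g_tt, g_bt = fisher_metric(t, beta)
        print(f"{label}: g_bb={g_bb:.6g}, g_tt={g_tt:.6g}, g_bt={g_bt:.6g}")
```

Comparing the printed values with \(t/2\), \(1/(2\beta ^2)\), \((t^2+t)/2\) and \(1/(2t^2)\) at the corresponding points gives a quick consistency check of the asymptotic statements.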

Finally, the logic employed in the present paper is reviewed as follows: Starting with the time-dependent distribution function describing the relaxational OM process, we have obtained the Fisher information metric \(g_{\mu \nu }\) in the two-dimensional space of t and \(\beta\) in the framework of information geometry. The standard procedure of two-dimensional differential geometry has then led to the equation \(R = -1\) or Eq. (16), which gives a universal constraint on the interrelation between time and temperature. Here, forgetting the process leading to this simple equation, we take the viewpoint that this equation itself is fundamental for characterizing the properties of (or the “structure” formed by) time and temperature, both of which may be newly introduced (or redefined) in a somewhat abstract manner. This inversion of logic is similar to that employed for the Schrödinger equation in quantum mechanics and the Einstein equation in general relativity. Regardless of how these equations were derived, we regard them as fundamental (due to their simplicity), and the dynamical variables contained in them (e.g., the wavefunction in the former and the space-time coordinates in the latter) are reinterpreted in accord with the real world. In the present case, Eq. (16) or (17) is used to “define” the essential relationship between “time” and “temperature” irrespective of the procedure employed to derive it. By regarding Eqs. (9)–(11) for \(g_{\mu \nu }\) as a solution to this fundamental equation, the interpretation concerning the “thermal time” has emerged, as illustrated above.

5 Conclusion

In this study we have discussed the thermal origin of time on the basis of the mathematical framework of information geometry. We have modeled a nonequilibrium relaxation process of the OM type in terms of a probability distribution function of events \(\{x\}\), \(P(x;t,\beta )\), in which time t and (inverse) temperature \(\beta\) are regarded as coarse-grained parameters. Relying on Bayes’ theorem, temperature and time are inferred statistically through measurements of the events. The relationship between time and temperature is then formulated in the framework of information geometry, in which the Fisher information metric plays an essential role. The usual recipe of differential geometry in the two-dimensional parameter space of \(\beta\) and t leads to a simple equation for the scalar curvature, \(R = -1\). We can then reverse the logic and regard the Fisher information metric \(g_{\mu \nu }\) that characterizes the relaxation process as a solution to this basic equation, \(R = -1\) or \(R_{\mu \nu }=-g_{\mu \nu }/2\). Investigating the global behavior of the metric tensor, we see a thermodynamic emergence of time.

This study has focused on a special case of the OM process for nonequilibrium relaxation phenomena. Therefore, we do not know how large a family of nonequilibrium processes can be described in terms of the present simple relation of \(R = -1\) obtained for the OM process. More generally, we may expect some equations such as dimensionless \(R =\) constant to describe a larger family of dynamical processes.

We have thus found in this work that time and temperature are essentially and intrinsically correlated. In particular, the present analysis provides a simple model that substantiates the concept that time, as a coarse-grained parameter, appears in a thermal manner through the measurement or occurrence of events that take place statistically.