# 7 Thermodynamics and statistical mechanics

## 7.1 The Gibbs ensembles

Before considering ferromagnetism at finite temperature and the related cooperative phenomena, we would like to refresh some basic concepts of statistical mechanics. In this chapter we will develop explicit calculations only for systems of non-interacting particles: the ideal gas and its magnetic counterpart, i.e., the paramagnet. In the approach put forward by Gibbs, the microscopic configuration of a gas is specified by the positions of its particles \(\{r_{\alpha, i}\}\) and the corresponding momenta \(\{p_{\alpha, i}\}\), with \(\alpha =x,y,z\) and \(i\) running over the number \(N\) of particles in the gas.
These canonical coordinates span a \(6N\)-dimensional space called \(\Gamma\)-\(Raum\). A point in this space represents a unique microscopic configuration of the gas. If, starting from a given configuration, the positions and momenta of two particles of the gas are swapped, the new configuration corresponds to a representative point in the \(\Gamma\)-\(Raum\) different from the starting one. However, the two configurations are indistinguishable at the macroscopic level. More generally, a given macroscopic state corresponds to a collection of points in the \(\Gamma\)-\(Raum\). Gibbs found it convenient to associate the macroscopic behavior of a system of many particles with a continuous distribution of representative points in the \(\Gamma\)-\(Raum\), called the density function \(\rho\), defined so that
\[\begin{equation}
\rho(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}, t) d^{3N}r \,d^{3N} p
\end{equation}\]
is the number of representative points that at time \(t\) are contained in the infinitesimal volume
\(d^{3N}r \,d^{3N} p\) of the \(\Gamma\)-\(Raum\).
Macroscopic configurations that are most likely observed shall occupy larger *volumes* in the \(\Gamma\)-\(Raum\).

### Microcanonical ensemble

We sketch in this section the derivation proposed by Boltzmann to estimate the *equilibrium* distribution of velocities for the molecules of an ideal gas.
We consider a gas composed of \(N\) particles enclosed in a box of volume \(V\). We assume that particles make elastic collisions among themselves and against the walls of the box, so that kinetic energy is conserved. The three components of the position coordinates \(r_{\alpha}\) and those of the corresponding momenta \(p_{\alpha}\) (with \(\alpha =x,y,z\)) define a 6-dimensional space. For a given total energy \(E\) and volume \(V\), particles have access only to a finite portion of this space^{23}. Let us imagine dividing this region into \(N_c\) cells, each of volume \(\omega_c = d^3r\,d^3 p\), and associating with the \(i\)th cell the number of particles \(n_i\) that occupy it. These \(n_i\) are called occupation numbers and need to fulfill the constraints:
\[\begin{equation}
\tag{7.1}
\label{microcan-constraints}
\begin{split}
&\sum_{i=1}^{N_c} n_i = N \\
&\sum_{i=1}^{N_c} n_i \epsilon_i = E
\end{split}
\end{equation}\]
where \(\epsilon_i\) is the energy of the particle in the \(i\)th cell^{24}
\[\begin{equation}
\tag{7.2}
\label{microcan-epsilon}
\epsilon_i = \frac{\vec p_i^{\,2}}{2m}\,.
\end{equation}\]
An arbitrary set of integers \(n_i\) defines a distribution function
\[\begin{equation}
\tag{7.3}
\label{microcan-dist-f}
f_i = \frac{n_i}{\omega_c}\,.
\end{equation}\]
This distribution is uniquely determined once we choose a point in the \(\Gamma\)-\(Raum\) but not vice versa. In other words, \(f_i\) is also associated with a finite volume of the \(\Gamma\)-\(Raum\). Following Boltzmann, we denote with \(\Gamma_E\left[\{n_i\} \right]\) the volume associated with a given set of occupation numbers \(\{n_i\}\). This volume shall be proportional to the number of ways of distributing \(N\) distinguishable particles among the \(N_c\) cells, namely
\[\begin{equation}
\Gamma_E\left[\{n_i\} \right] \propto
\frac{N!}{n_1!n_2! \dots n_{N_c}!}
\end{equation}\]
Taking the logarithm of this expression we obtain
\[\begin{equation}
\tag{7.4}
\label{gamma-E}
\begin{split}
\ln\Gamma_E\left[\{n_i\} \right] &=
\ln(N!) - \sum_{i=1}^{N_c} \ln(n_i!) + \text{constant} \\
&\simeq N\left[\ln(N)-1\right] - \sum_{i=1}^{N_c} n_i \left[\ln(n_i)-1\right] + \text{constant}
\end{split}
\end{equation}\]
where in the second row we made the approximation \(\ln(n_i!) \simeq n_i\left[\ln(n_i) -1\right]\) (and \(\ln(N!) \simeq N\left[\ln(N) -1\right]\)) under the assumption that the relevant \(n_i\) be very large.
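The accuracy of this truncated Stirling approximation is easy to check numerically. In the following Python sketch (an aside of ours, not part of the derivation), \(\ln(n!)\) is evaluated exactly through `math.lgamma`:

```python
import math

def stirling(n):
    """Truncated Stirling approximation: ln(n!) ~ n [ln(n) - 1]."""
    return n * (math.log(n) - 1.0)

# ln(n!) evaluated exactly as lgamma(n + 1); the relative error of the
# approximation decreases as the occupation numbers grow
for n in (10, 100, 1000):
    exact = math.lgamma(n + 1)
    print(n, abs(stirling(n) - exact) / exact)
```

Already for \(n\sim 10^3\) the relative error is below \(10^{-3}\), which justifies the approximation for macroscopic occupation numbers.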
The equilibrium distribution corresponds to the set \(\{n_i\}\) that maximizes the volume occupied in the \(\Gamma\)-\(Raum\) under the constraints given in Eqs. (7.1). Mathematically, this can be computed with the method of Lagrange multipliers, which consists in maximizing the functional
\[\begin{equation}
\Omega = -\sum_{i=1}^{N_c} n_i \left[\ln(n_i)-1\right]
- \lambda_\mu \left(\sum_{i=1}^{N_c} n_i - N\right)
- \lambda_\beta \left(\sum_{i=1}^{N_c} n_i\epsilon_i - E\right)
+ \text{constant}
\end{equation}\]
where \(\lambda_\mu\) and \(\lambda_\beta\) are Lagrange multipliers.

The condition \(\partial \Omega /\partial n_j = 0\) yields
\[\begin{equation}
\label{largest-gamma-n}
\begin{split}
&-\ln(n_j) - \lambda_\mu - \lambda_\beta \epsilon_j = 0 \\
&\quad \Rightarrow \quad
\bar n_j = C' {\rm e}^{-\lambda_\beta \epsilon_j}
\end{split}
\end{equation}\]
The Hessian matrix of \(\Omega\) evaluated for \(\{\bar n_j\}\) reads
\[\begin{equation}
\frac{\partial^2 \Omega}{\partial{n_k} \partial{n_j} } = -\frac{\delta_{k,j}}{\bar n_k} \,,
\end{equation}\]
which indicates that the set \(\{\bar n_j\}\) indeed maximizes the volume \(\Gamma_E\). Setting to zero the derivatives w.r.t. the Lagrange multipliers gives back the constraints in Eqs. (7.1). Formally, this means that the specific values of \(\lambda_\mu\) and \(\lambda_\beta\) need to be adjusted so as to fulfill those constraints. We will come back to this point later.

Perhaps Boltzmann’s most famous contribution was to establish a relation between the volume occupied in the \(\Gamma\)-\(Raum\) by a microcanonical ensemble and the corresponding entropy \[\begin{equation} \tag{7.5} \label{Boltzmann-grave-eq} S = k_B \ln\Gamma_E \,. \end{equation}\] This equation (reproduced on Boltzmann’s grave!) establishes the connection between the statistical description of microscopic states – defined within the framework of a given model – and macroscopic thermodynamics.

### Thermodynamic and information-theory entropy

We now make a small digression and rewrite Eq. (7.5) in a different way. Let us replace the second row of Eq. (7.4) into the expression of the Boltzmann entropy, without imposing the constraints (7.1) and neglecting irrelevant constants:
\[\begin{equation}
\tag{7.6}
\label{entropy-to-info}
\begin{split}
S &= k_B \left\{N\left[\ln(N)-1\right] - \sum_{i=1}^{N_c} n_i \left[\ln(n_i)-1\right] \right\} \\
&= k_B \left\{N\ln(N) - \sum_{i=1}^{N_c} n_i\ln(n_i) -N +\sum_{i=1}^{N_c} n_i \right\} \\
&= k_B \left\{\sum_{i=1}^{N_c} n_i\ln(N) - \sum_{i=1}^{N_c} n_i\ln(n_i)\right\} \\
& = - k_B N \sum_{i=1}^{N_c} \frac{n_i}{N} \ln\left(\frac{n_i}{N}\right) \,,
\end{split}
\end{equation}\]
where to go from the second to the third row we used twice the equivalence \(\sum_{i=1}^{N_c} n_i=N\).
For a given set \(\{n_i\}\), the ratio \(n_i/N\) gives the relative occupancy of the \(i\)th cell. If one understands the \(n_i\) as random variables, those ratios represent the probability \(p_i\) that the canonical coordinates of a particle fall into the \(i\)th cell. In terms of these probabilities, the entropy per particle can then be expressed as
\[\begin{equation}
\label{Gibbs-entropy}
S_{\rm s.p.} = - k_B\sum_{i=1}^{N_c} p_i \ln(p_i)\,.
\end{equation}\]
(where we replaced \(n_i/N\) by \(p_i\) in Eq. (7.6) and divided by \(N\)). For a classical system with a discrete set of microstates characterized by the energies \(\epsilon_{i}\), \(p_{i}\) can be interpreted as the probability of occurrence of a given microstate. The quantity \(S_{\rm s.p.}\) is named the *Gibbs entropy* and remains meaningful even when the considered system is not at thermal equilibrium.

A further generalization was (nearly unconsciously) made by C. Shannon, who defined the same quantity roughly 70 years after Boltzmann in the context of information theory

\[\begin{equation}
\label{Shannon-entropy}
H[p(x)] = - \sum_{x} p(x) \log_2 p(x)
\end{equation}\]
where \(p(x)\) is a generic probability distribution and \(x\) a random variable that may also take continuous values (in which case the sum must be replaced with an integral). As it is not bound to a physical interpretation, the Shannon entropy \(H[p(x)]\) lacks a proper unit. In fact, changing the base of the logarithm is equivalent to changing the unit, and in information theory \(\log_2\) is normally used because with this choice \(H[p(x)]\) gives the number of bits necessary to store a certain piece of information. The quantity
\[\begin{equation}
\label{information}
I(x)=-\log_2 p(x)
\end{equation}\]
is called information, because it measures the knowledge we gain by identifying the outcome of a random trial. For instance, by tossing a coin and getting a head outcome, we gain an amount of information equal to \(\log_2(2)\); by casting a die and getting 3 as an outcome, we gain an amount of information equal to \(\log_2(6)\), etc. The Shannon entropy is the expected information over all possible events and is a property of the probability distribution \(p(x)\) itself.
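Both definitions translate directly into code; a minimal Python sketch (the function names are our own):

```python
import math

def information(p):
    """Information I(x) = -log2 p(x), in bits, gained from an outcome of probability p."""
    return -math.log2(p)

def shannon_entropy(probs):
    """Shannon entropy H = -sum_x p(x) log2 p(x): the expected information."""
    return sum(p * information(p) for p in probs if p > 0)

coin = [0.5, 0.5]   # fair coin: 1 bit per toss
die = [1 / 6] * 6   # fair die: log2(6) ~ 2.585 bits per cast
print(shannon_entropy(coin), shannon_entropy(die))
```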

The above derivation clearly shows the analogy between information-theory entropy and thermodynamic entropy^{25}. In this regard, it is worth quoting what Shannon
said about the entropy named after him:

*My greatest concern was what to call it (…). When I discussed it with John von Neumann (…) he told me, ``You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one knows what entropy really is, so in a debate you will always have the advantage.’’*

But can we learn something about thermodynamic entropy from Shannon entropy? In the attempt to get some insight into this question, suppose we are dealing with a system of 3 spins 1/2. The corresponding Hilbert space consists of \(2^3=8\) states that we associate with the microstates of the system. Putting aside any physical meaning or energetic consideration, we may map those states into 8 different cards. The probability of picking a card \(i\) at random shall be \(p_i=1/8\), which yields the Shannon entropy
\[\begin{equation}
\label{Shannon-entropy-cards}
H[p(x)] = - \sum_{x=1}^{8}\frac{1}{8} \log_2\left(\frac{1}{8}\right)
= \log_2 8 = 3\,.
\end{equation}\]
This reflects the fact that, if someone chooses at random one card out of 8, I need to pose *at least* 3 questions in order to identify the chosen card, without having any prior knowledge (or using any trick).
For this trivial example, we see that the Shannon entropy of the cards problem is equivalent to the Gibbs entropy of a system of 3 non-interacting spins 1/2, with all degenerate eigenvalues and at thermodynamic equilibrium (provided we change the unit with which entropy is measured by a factor \(k_B/\log_2 e\)).
Let us now assume that our 3 spins are coupled through a ferromagnetic exchange interaction of Heisenberg type; we know that the ground state is realized by the multiplet with maximal spin \(S^{\rm T}=3/2\). Then, if the strength of the exchange interaction is much larger than the thermal energy, we can assume that only the ground state is populated. This restricts the accessible states to 4, and accordingly the Shannon entropy associated with those states reduces to

\[\begin{equation}
\label{Shannon-entropy-gs}
H[p_i] = - \sum_{x=1}^{4}\frac{1}{4} \log_2\left(\frac{1}{4}\right)
= \log_2 4 = 2\,.
\end{equation}\]
If we then apply a magnetic field and hypothetically achieve a Zeeman splitting much larger than the thermal energy (\(\mu_B B \gg k_BT\)), only one state of the multiplet \(S^{\rm T}=3/2\) will be significantly populated: the corresponding entropy is thus suppressed to zero.

From this conceptual experiment we learn that introducing physical knowledge about a system reduces the Shannon entropy or, equivalently, increases our knowledge about the microstates of the system.
In this perspective, we may regard the entropy of a system as the amount of ``missing'' information needed to determine a microscopic state from the knowledge of its state at the level of macroscopic thermodynamics^{26}.

In the quantum-mechanical description the Gibbs entropy is replaced by the von Neumann entropy defined in terms of the density matrix.

### Canonical ensemble

Following the textbook by K. Huang, we now ask ourselves which ensemble is appropriate to describe a system that is not isolated but in thermal equilibrium with a larger one. The goal is to determine the probability of finding this system at a given point of the \(\Gamma\)-\(Raum\), namely to define the functional dependence of the density function \(\rho(\{p_{\alpha, i}\}, \{r_{\alpha, i}\})\) (with \(i \, \in\) system 1). To this purpose, we consider a system described by the microcanonical ensemble with energy \(E\) and split it into two subsystems 1 and 2. These two subsystems can exchange energy, but we shall assume that their average energies \(\bar E_1\) and \(\bar E_2\) are well defined and that \(\bar E_1 \ll \bar E_2\).

The probability of system 1 being represented by a point in the elementary volume \(d\Gamma_1\) of the \(\Gamma\)-\(Raum\) characterized by the energy \(E_1\) is proportional to the volume occupied by all the microstates of system 2 associated with the energy \(E_2=E - E_1\) (by definition of \(E\)). In formulas, we express this fact as
\[\begin{equation}
\rho(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}) \propto \Gamma_2(E_2) = \Gamma_2(E-E_1) \,.
\end{equation}\]
Equation (7.5) relates \(\Gamma_2(E-E_1)\) to the entropy \(S_2(E-E_1)\) associated with system 2. Since we expect only values \(E_1\simeq \bar E_1\) to be important and we have assumed that \(\bar E_1 \ll \bar E_2\), we can expand the entropy about the value \(E\simeq E_2\) to obtain the relation
\[\begin{equation}
k_B \ln \left[\Gamma_2(E-E_1) \right] = S_2(E-E_1)\simeq
S_2(E) - E_1 \left[\frac{\partial S_2}{\partial E}\right]_{E=E_2} \,.
\end{equation}\]
The derivative of the entropy w.r.t. the energy defines the inverse temperature in the microcanonical ensemble; in this specific case it is the inverse temperature \(1/T_2\) of the larger system. Therefore, neglecting the first term, which does not depend on \(E_1\), the equation above implies that
\[\begin{equation}
\Gamma_2(E-E_1) \propto \exp\left[-\frac{E_1}{k_B T_2}\right]\,.
\end{equation}\]
Since we have assumed that the systems are in thermal equilibrium, \(T_2=T_1=T\). Moreover, we know that for a classical system \(E_1 = \mathcal{H}(\{p_{\alpha, i}\}, \{r_{\alpha, i}\})\), thus the sought-after density function \(\rho\) is
\[\begin{equation}
\tag{7.7}
\rho(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}) = {\rm e}^{-\beta\mathcal{H}(\{p_{\alpha, i}\}, \{r_{\alpha, i}\})}
\end{equation}\]
with \(\beta = 1/(k_BT)\).
Note that the temperature \(T\) (contained in \(\beta\)) summarizes all the information about the degrees of freedom of the larger system 2, which acts as a heat reservoir. The ensemble defined by the density function in Eq. (7.7) is called the *canonical ensemble*. The occupation of the energy levels of a system at thermal equilibrium follows the Boltzmann statistics expressed by this density function. In the following, we will actually use this as the definition of *thermal equilibrium*.
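For a system with a discrete set of levels, the density function (7.7), once normalized, assigns to each level \(i\) the weight \(p_i={\rm e}^{-\beta E_i}/\mathcal{Z}\). A Python sketch (the three-level spectrum is an arbitrary example of ours):

```python
import math

def boltzmann_probs(energies, beta):
    """Normalized canonical weights p_i = exp(-beta E_i) / Z."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)            # the partition function
    return [w / z for w in weights]

levels = [0.0, 1.0, 2.0]        # arbitrary spectrum, in units of k_B T
probs = boltzmann_probs(levels, beta=1.0)
```

The ratio of two weights depends only on the energy difference, \(p_j/p_i={\rm e}^{-\beta(E_j-E_i)}\), as expected for Boltzmann statistics.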

## 7.2 Thermal averages

In the remainder of the course we will largely use the Boltzmann distribution (7.7) to compute thermal averages in the canonical ensemble. In this approach a fundamental quantity is the partition function, which for a gas reads

\[\begin{equation}
\tag{7.8}
\label{Classical-Z-gas-N}
\mathcal{Z}= \frac{1}{N!} \int \frac{d^{3N}r d^{3N} p}{(2\pi \hbar)^{3N} }
e^{-\beta \mathcal{H}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right)}\,,
\end{equation}\]
where the factor \(N!\) accounts for the ``correct Boltzmann counting''. This correction is necessary only when considering a system of particles that are not distinguishable at the macroscopic level.
\(\mathcal{Z}\) is related to \(a\) thermodynamic potential, \(\mathcal{F}\), via the general relation
\[\begin{equation}
\label{Free-energy-Z}
\mathcal{F}=-\frac{1}{\beta}\ln\mathcal{Z}\,.
\end{equation}\]
The average of any observable \(\mathcal{O}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right)\) can be computed as
\[\begin{equation}
\label{Calssical-average-1}
\langle \mathcal{O}\rangle= \frac{1}{\mathcal{Z}} \frac{1}{N!} \int \frac{d^{3N} q d^{3N} p}{ (2\pi \hbar)^{3N} }
\mathcal{O}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right) e^{-\beta \mathcal{H}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right)}\,.
\end{equation}\]
A useful trick is that of adding to the Hamiltonian a term in which a (conjugated) field \(\phi\) is coupled with the observable one would like to compute:

\[\begin{equation}
\mathcal{H}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right) \quad \Rightarrow \quad
\mathcal{H}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right) + \phi \mathcal{O}\left(\{p_{\alpha, i}\}, \{r_{\alpha, i}\}\right) \,.
\end{equation}\]
Given the equivalence
\[\begin{equation}
\left[\frac{ \partial \, e^{-\beta \left(\mathcal{H} + \phi \,\mathcal{O}\right)}}{\partial \phi} \right]_{\phi=0}
= -\beta \mathcal{O}e^{-\beta \mathcal{H} }
\end{equation}\]
one has that
\[\begin{equation}
\tag{7.9}
\label{Calssical-average-2}
\langle \mathcal{O}\rangle= - \frac{1}{\beta}\frac{1}{\mathcal{Z}} \frac{\partial\mathcal{Z}} {\partial \phi}
= -\frac{1}{\beta} \frac{\partial \ln \mathcal{Z}} {\partial \phi} \,.
\end{equation}\]
If one is interested in the thermal average of the Hamiltonian itself, the relevant relation is

\[\begin{equation}
\mathcal{H}\, e^{-\beta \mathcal{H}} = -\frac{ \partial \, e^{-\beta \mathcal{H}} }{\partial \beta}
\end{equation}\]
and thus
\[\begin{equation}
\tag{7.10}
\label{Calssical-average-Ham}
\langle \mathcal{H}\rangle= - \frac{\partial \ln \mathcal{Z}} {\partial \beta} \,.
\end{equation}\]
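Relation (7.10) can be verified numerically for any model with a discrete spectrum; in the Python sketch below (the three-level spectrum and the finite-difference step are arbitrary choices of ours), the direct thermal average of the energy is compared with the logarithmic derivative of \(\mathcal{Z}\):

```python
import math

def partition(beta, energies):
    return sum(math.exp(-beta * e) for e in energies)

def avg_energy_direct(beta, energies):
    """<H> computed as a Boltzmann-weighted average over the levels."""
    z = partition(beta, energies)
    return sum(e * math.exp(-beta * e) for e in energies) / z

def avg_energy_logderiv(beta, energies, h=1e-6):
    """<H> = -d ln Z / d beta, via a central finite difference."""
    return -(math.log(partition(beta + h, energies))
             - math.log(partition(beta - h, energies))) / (2 * h)

levels = [0.0, 1.0, 2.5]        # arbitrary spectrum
```

The same strategy verifies Eq. (7.9): couple a field \(\phi\) to the observable, differentiate \(\ln\mathcal{Z}\) numerically with respect to \(\phi\), and evaluate at \(\phi=0\).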

The thermodynamic potential \(\mathcal{F}\) will inherit from the partition function \(\mathcal{Z}\) the dependence on external parameters.
For an ideal gas these parameters are \((N,V,T)\), and the corresponding potential is called the Helmholtz free energy.
For a magnetic system the typical external parameters are \((N,B,T)\) and the associated potential is called Gibbs free energy, usually indicated with \(G\). To avoid proliferation of notation, hereafter we shall use the symbol \(\mathcal{F}\) and call it just *free energy*, implicitly assuming that it represents the thermodynamic potential dependent on the parameters of our model.

Further details about how to compute thermal averages in the canonical ensemble are given in the Appendix.

### Equipartition theorem for an ideal gas

For an ideal gas the partition function given in Eq. (7.8) reduces to the product of \(N\) independent, single-particle partition functions:
\[\begin{equation}
\label{Classical-Z-gas-N-sp}
\mathcal{Z}= \frac{1}{N!} \mathcal{Z}_{\rm s.p.}^N
\end{equation}\]
with
\[\begin{equation}
\label{Classical-Z-gas-sp}
\mathcal{Z}_{\rm s.p.} = \frac{V}{(2\pi \hbar)^{3}}
\prod_\alpha \int \exp\left[-\frac{\beta p_\alpha^2}{2m} \right] d p_\alpha
\end{equation}\]
and \(\alpha=x,y,z\).
This is the product of three Gaussian integrals, each one of the form

\[\begin{equation}
\label{Gaussian-1}
\tag{7.11}
\int {\rm e}^{-\alpha u^2}\, d u = \sqrt{\frac{\pi}{\alpha}}\,,
\end{equation}\]
and with second moment
\[\begin{equation}
\tag{7.12}
\label{Gaussian_2}
\int u^2 \, {\rm e}^{-\alpha u^2}\, d u =
-\frac{\partial}{\partial\alpha} \int {\rm e}^{-\alpha u^2}\, d u =
\frac{1}{2}\frac{1}{\alpha} \sqrt{\frac{\pi}{\alpha}}\,.
\end{equation}\]
Using Eq. (7.11) one obtains, for each Cartesian component (so that \(\mathcal{Z}_{\rm s.p.}=\mathcal{Z}_1^3\)),

\[\begin{equation}
\mathcal{Z}_1= \frac{V^{1/3}}{2\pi \hbar} \int \,\exp\left[-\frac{\beta}{2m} (p_\alpha)^2\right] \, d p_\alpha = \frac{V^{1/3}}{2\pi \hbar} \sqrt{\frac{2\pi m}{\beta}}\,,
\end{equation}\]
which, combined with Eq. (7.12), yields the thermal average of each quadratic term \(p_\alpha^2/(2m)\) appearing in the Hamiltonian:

\[\begin{equation}
\label{equipartition-thm}
\begin{split}
\left\langle \frac{p_\alpha^2}{2m} \right\rangle
&=\frac{1}{\mathcal{Z}_1} \frac{V^{1/3}}{2\pi \hbar} \int \frac{(p_\alpha)^2}{2m}
\exp\left[-\frac{\beta}{2m} (p_\alpha)^2\right] \, d p_\alpha = \\
&=\frac{1}{2}\frac{1}{2m}\frac{2m}{\beta} = \frac{1}{2} k_B T
\end{split}
\end{equation}\]
This is the content of the celebrated *equipartition theorem*. The beauty of this theorem is that it applies to
every Hamiltonian that can be decoupled into a sum of independent quadratic degrees of freedom:
each such degree of freedom contributes a term \(k_BT/2\) to the total average energy.
In a later chapter, we will use this powerful result to prove that cooperative models of magnetism with continuous symmetry cannot sustain long-range order on lattices of dimension 1 or 2.

For the specific case of an ideal gas composed of \(N\) particles, the total quadratic degrees of freedom – associated with
the kinetic energy – are \(3N\) and thus the average of the total energy reads
\[\begin{equation}
\label{Calssical-average-Ham-gas}
\langle \mathcal{H}\rangle= \frac{3N}{2} k_BT \,.
\end{equation}\]
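The equipartition result can also be recovered by direct sampling: for a quadratic degree of freedom, the Boltzmann weight is a Gaussian in \(p_\alpha\) of variance \(m k_B T\). A Monte Carlo sketch in Python, in units where \(k_B=1\) (sample size and seed are arbitrary choices of ours):

```python
import math
import random

def mean_kinetic_per_dof(T, m=1.0, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <p^2/(2m)> with p drawn from exp[-p^2/(2 m T)]."""
    rng = random.Random(seed)
    sigma = math.sqrt(m * T)          # Gaussian width of the momentum distribution
    acc = 0.0
    for _ in range(n_samples):
        p = rng.gauss(0.0, sigma)
        acc += p * p / (2.0 * m)
    return acc / n_samples            # should approach T / 2
```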

### Microcanonical vs canonical ensemble

While deriving the equilibrium distribution of occupation numbers \(\{\bar n_j\}\) in the microcanonical ensemble we found that \[\begin{equation} \bar n_j = C' {\rm e}^{-\lambda_\beta \epsilon_j}\,. \end{equation}\] Inserting Eqs. (7.2) and (7.3) into the equation above, one obtains the distribution \[\begin{equation} \tag{7.13} \label{most-prob-microcan} f_i = C {\rm e}^{-\lambda_\beta \frac{\vec p_i^{\,2}}{2m}} \,. \end{equation}\] In that context we wrote that the Lagrange multiplier \(\lambda_\beta\) is to be adjusted to fulfill the constraint that the total energy \(E\) take a predefined value. If the descriptions given in the microcanonical and in the canonical ensemble are to be equivalent, the average energy per particle in the microcanonical ensemble must equal the thermal average \(3 k_BT/2\) obtained in the canonical ensemble. This requirement leads us to identify the Lagrange multiplier \(\lambda_\beta\) with \(\beta =1/(k_BT)\). Note that with this choice the probability distribution \(f_i\) corresponds to the Maxwell-Boltzmann distribution of velocities.

Taking a second look at the role played by \(\phi\) and \(\beta\) in Eqs. (7.9) and (7.10), the reader will note that the Lagrange multipliers introduced in the microcanonical ensemble to impose constraints are mapped into conjugated fields used to compute thermal averages in the canonical or in the grand canonical ensemble^{27}.

As a small concluding remark, we note that to derive the density function in Eq. (7.7) we assumed the coupling between the two parts 1 and 2 of the whole system to be small. This implicitly relies on the interaction between particles being sufficiently short-ranged. For instance, for a system of charged particles interacting via the Coulomb interaction this is not true, and the physics described in the canonical ensemble may differ from that given in the microcanonical ensemble: the latter is in such cases the only meaningful framework to describe the whole system.

### The Bohr–van Leeuwen theorem

We now turn our attention to the problem of \(N \sim 10^{23}\) non-interacting electrons in a solid experiencing a constant magnetic field.
We will treat them as a gas of classical particles and associate with each electron a single-particle Hamiltonian of the form
\[\begin{equation}
\label{Ham-charged-p}
\mathcal{H} = \frac{1}{2m}\left(\vec p - q_{\rm e}\vec A\right)^2 + q_{\rm e} \phi(\underline{r})
\end{equation}\]
with \(q_{\rm e}=-e<0\). The electrostatic potential \(\phi(\underline{r})\) is generated by the nuclei and other charges in the crystal, but it is assumed to be independent of the relative positions of the electrons^{28}. We would like to compute the thermal average of the magnetic moment per electron in response to the application of an external field or, in other words, the average polarization of the electron gas at finite temperature. For classical particles the partition function in the canonical ensemble is given by Eq. (7.8).
The average magnetic moment along the applied field is obtained as the derivative of the free energy w.r.t. the applied field
\[\begin{equation}
\tag{7.14}
\label{mu-ave-ch1}
\langle \mu^\alpha\rangle=-\frac{1}{N}\frac{\partial \mathcal{F}}{\partial B^\alpha}=\frac{k_{\rm B} T}{N} \frac{1}{\mathcal{Z}}\frac{\partial \mathcal{Z}}{\partial B^\alpha} \,,
\end{equation}\]
with \(\alpha=x,y,z\).
As done for the ideal gas, we can limit ourselves to considering the partition function produced by a single particle:

\[\begin{equation}
\label{Classical-Z-el-Bohr-vonL}
\mathcal{Z}= \frac{1}{(2\pi \hbar)^{3} } \int d^{3} p \,\exp\left[- \frac{\beta}{2m}\left(\vec p - q_{\rm e}\vec A\right)^2\right]\,
\int d^{3} r \,\exp\left[-\beta q_{\rm e} \phi(\underline{r})\right]\,.
\end{equation}\]
The integration with respect to \(d^{3} p\) can be transformed into three standard Gaussian integrals with a shift of the origin

\[\begin{equation}
\begin{cases}
&u_x = p_x - q_{\rm e} A_x \\
&u_y = p_y - q_{\rm e} A_y \\
&u_z = p_z - q_{\rm e} A_z
\end{cases}
\end{equation}\]
After this transformation of coordinates (see the Appendix for further details) one obtains
\[\begin{equation}
\label{Classical_Z_el1_ch1}
\mathcal{Z}= \frac{1}{ \hbar^{3} }\left(\frac{ m}{2\pi\beta}\right)^{3/2} \,
\int d^{3} r \,\exp\left[-\beta q_{\rm e} \phi(\underline{r})\right]
\end{equation}\]
(where we have made use of the result (7.11)).
Since the integral with respect to \(d^{3} r\) – the so-called *configuration integral* – does not depend on the magnetic field \(\vec B\), the whole partition function is also independent of \(B\).
In line with Eq. (7.14), this means that the average magnetic moment (i.e., the polarization of the electron gas) vanishes. In other words, if statistical mechanics and classical mechanics are applied consistently, the average magnetic moment per electron is always zero. This important result of statistical physics is named the Bohr–van Leeuwen theorem, after Niels Bohr and Hendrika Johanna van Leeuwen, who proved it independently in the 1910s. The importance of this theorem was acknowledged only considerably later by van Vleck (1932):

*``At any finite temperature, and in all finite applied electric or magnetic fields, the net magnetization of a collection of electrons in thermal equilibrium vanishes identically.''*

Practically, this theorem marks a conclusive statement about the need for quantum mechanics to account for magnetic phenomena, like diamagnetism, paramagnetism or ferromagnetism.
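The mechanism behind the theorem, namely that the vector potential merely shifts the origin of the momentum integration without affecting its value, can be illustrated numerically in one dimension (discretization parameters and the unit choice \(\beta=m=1\) are arbitrary conventions of ours):

```python
import math

def momentum_integral(beta, m, qA, p_max=50.0, n=20001):
    """Trapezoidal estimate of the integral of exp[-beta (p - qA)^2 / (2m)] dp."""
    dp = 2.0 * p_max / (n - 1)
    total = 0.0
    for i in range(n):
        p = -p_max + i * dp
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(-beta * (p - qA) ** 2 / (2.0 * m)) * dp
    return total

# the result does not depend on the shift qA, i.e. on the applied field
vals = [momentum_integral(1.0, 1.0, qA) for qA in (0.0, 1.0, 3.0)]
```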

## 7.3 Appendix: averages and thermodynamic potentials

### Classical models

In the canonical ensemble, the partition function is given by

\[\begin{equation}
\label{Classical_Z}
\mathcal{Z}= \int \frac{d^{3N} q d^{3N} p}{(2\pi \hbar)^{3N} }
e^{-\beta \mathcal{H}\left(p,q\right)}\,,
\end{equation}\]
\(\mathcal{H}\) being the Hamiltonian of the system and \(\beta=1/(k_B T)\).
\(\mathcal{Z}\) is related to \(a\) thermodynamic potential, \(\mathcal{F}\), via the general relation
\[\begin{equation}
\tag{7.15}
\mathcal{F}=-\frac{1}{\beta}\ln\mathcal{Z}\,.
\end{equation}\]
The average of any observable \(\mathcal{O}\left(p,q\right)\) can be computed as
\[\begin{equation}
\label{Calssical_average}
\langle \mathcal{O}\rangle= \frac{1}{\mathcal{Z}}\int \frac{d^{3N} q d^{3N} p}{ (2\pi \hbar)^{3N} }
\mathcal{O}\left(p,q\right) e^{-\beta \mathcal{H}\left(p,q\right)}\,.
\end{equation}\]
Classically, the trace operator is defined as
\[\begin{equation}
\label{Calssical_trace}
\mathcal{T}r= \int \dots \frac{d^{3N}q d^{3N}p}{ (2\pi \hbar)^{3N} }\,,
\end{equation}\]
which allows defining
\[\begin{equation}
\tag{7.16}
\mathcal{Z}= \mathcal{T}r\left\{e^{-\beta \mathcal{H}\left(p,q\right)}\right\}
\quad\text{and}\quad
\langle \mathcal{O}\rangle=\frac{1}{\mathcal{Z}}
\mathcal{T}r\left\{\mathcal{O}\left(p,q\right) e^{-\beta \mathcal{H}\left(p,q\right)}\right\}\,.
\end{equation}\]

### Quantum models

Assume that \(\{|\psi_\alpha\rangle\}\) is a complete basis of the Hilbert space on which the Hamiltonian of the model is defined. Quantum-mechanically, the trace is then given by
\[\begin{equation}
\label{Quantum_trace}
\mathcal{T}r= \sum_\alpha \langle \psi_\alpha | \dots |\psi_\alpha\rangle \,.
\end{equation}\]
By analogy with (7.16), the partition function and thermal averages are accordingly defined
\[\begin{equation}
\tag{7.17}
\begin{split}
&\mathcal{Z}= \mathcal{T}r\left\{e^{-\beta \mathcal{H}}\right\}
=\sum_\alpha \langle \psi_\alpha | e^{-\beta \mathcal{H}} |\psi_\alpha\rangle\\
&\langle \mathcal{O}\rangle=\frac{1}{\mathcal{Z}}
\mathcal{T}r\left\{\mathcal{O} e^{-\beta \mathcal{H}}\right\}
=\frac{1}{\mathcal{Z}} \sum_\alpha \langle \psi_\alpha | \mathcal{O} e^{-\beta \mathcal{H}} |\psi_\alpha\rangle\,.
\end{split}
\end{equation}\]
In a few advanced computations one stops at this level. Generally, the trace is
evaluated on a complete basis of eigenstates of \(\mathcal{H}\):
\[\begin{equation}
\label{Eigenstates}
\mathcal{H}|\varphi^{i}\rangle=E^i |\varphi^{i}\rangle\,.
\end{equation}\]

The computation of (7.17) is, consequently, simplified:

\[\begin{equation}
\tag{7.18}
\begin{split}
&\mathcal{Z}= %\mathcal{T}r\left\{e^{-\beta \mathcal{H}}\right\}
\sum_i \langle \varphi^i | e^{-\beta \mathcal{H}} |\varphi^i\rangle
=\sum_i e^{-\beta E^i}\\
&\langle \mathcal{O}\rangle=\frac{1}{\mathcal{Z}}
\mathcal{T}r\left\{\mathcal{O} e^{-\beta \mathcal{H}}\right\}
=\frac{1}{\mathcal{Z}} \sum_i \langle \varphi^i |\mathcal{O} |\varphi^i\rangle
e^{-\beta E^i}\,.
\end{split}
\end{equation}\]
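As a minimal example of Eqs. (7.18), consider two spins 1/2 coupled by a Heisenberg exchange \(\mathcal{H}=-J\,\hat{\mathbf S}_1\cdot \hat{\mathbf S}_2\). Using \(\hat{\mathbf S}_1\cdot \hat{\mathbf S}_2 = \frac{1}{2}\left[S^{\rm T}(S^{\rm T}+1)-\frac{3}{2}\right]\), the spectrum consists of a triplet at \(-J/4\) and a singlet at \(+3J/4\), and the sums in Eqs. (7.18) can be written down directly (a sketch of ours):

```python
import math

def partition_two_spins(beta, J):
    """Z for H = -J S1.S2 (two spins 1/2): a triplet (3 states) at
    E = -J/4 and a singlet at E = +3J/4."""
    return 3.0 * math.exp(beta * J / 4.0) + math.exp(-3.0 * beta * J / 4.0)

def avg_energy_two_spins(beta, J):
    """<H> as the Boltzmann-weighted average over the four eigenstates."""
    z = partition_two_spins(beta, J)
    return (3.0 * (-J / 4.0) * math.exp(beta * J / 4.0)
            + (3.0 * J / 4.0) * math.exp(-3.0 * beta * J / 4.0)) / z
```

For \(J>0\) and \(\beta J \gg 1\) the average energy approaches the triplet value \(-J/4\), while for \(\beta \to 0\) all four states are equally populated and \(\langle\mathcal{H}\rangle \to 0\).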

### Spin models

Limiting ourselves to a Hamiltonian of the type
\[\begin{equation}
\tag{7.19}
\mathcal{H}=-\frac{1}{2}J\sum_{|{\underline n}-{\underline n}'| } \hat{\mathbf S}({\underline n})\cdot \hat{\mathbf S}({\underline n}')
+g \mu_B B^{\rm ext} \sum_{{\underline n}} \hat{S}^z({\underline n})
\end{equation}\]
one possible choice for the basis of the Hilbert space is the following: \(|\psi_\alpha\rangle\)=\(|M_1,M_2,\dots M_N \rangle\)=\(|M_1\rangle\otimes|M_2\rangle\dots\otimes|M_N\rangle\)
with \(\hat{S}^z(n)|M_n\rangle\)=\(M_n|M_n\rangle\) and \(n\) labeling the lattice site.
Note that the Hamiltonian in Eq. (7.19) is not diagonal in this basis. After having diagonalized it, thermal averages can be computed according to Eqs. (7.18).
For many problems in magnetism, substituting the quantum-mechanical operators \(\hat{\mathbf S}({\underline n})\)
by classical vectors is legitimate:

\[\begin{equation}
\label{Quantum-to-classical-spins_app}
\hat{\mathbf S}({\underline n}) \rightarrow \vec{S}({\underline n})
\equiv S_0 \left(\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta\right)
\end{equation}\]
where \(S_0^2=S\,(S+1)\) (more often \(S_0=1\)). The partition function then reads
\[\begin{equation}
\label{Classical_spin_Zeta_app}
\mathcal{Z}= \int d\Omega_1\int d\Omega_2\dots \int d\Omega_N
e^{-\beta \mathcal{H}\left(\left\{ \vec{S}({\underline n}) \right\} \right)}\,,
\end{equation}\]
with \(d\Omega_{\underline n}=\sin\theta_{\underline n} d\theta_{\underline n} d\varphi_{\underline n}\)
being the solid-angle element of the spin located at site \({\underline n}\).
Both in the quantum and in the classical case, \(\mathcal{Z}\) depends on \(T\) and on the applied field \(B^{\rm ext}\).
Therefore, the thermodynamic potential obtained from the logarithm of \(\mathcal{Z}\) is the Gibbs free energy of macroscopic thermodynamics (see Eq. (7.15)). In this course, it will generally be indicated with \(\mathcal{F}\) (not with \(G\)!).
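As a concrete illustration of such a classical-spin partition function, the single-spin case with Boltzmann weight \({\rm e}^{x\cos\theta}\) (where the dimensionless \(x\) collects \(\beta\), the field, and \(S_0\); the sign convention is ours) can be integrated numerically over the solid angle. The resulting polarization reproduces the Langevin function \(L(x)=\coth(x)-1/x\) of classical paramagnetism:

```python
import math

def avg_cos_theta(x, n=100_000):
    """<cos(theta)> for the weight exp(x cos(theta)); the substitution
    u = cos(theta) makes the solid-angle measure uniform on [-1, 1]
    (midpoint rule)."""
    du = 2.0 / n
    num = den = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du
        w = math.exp(x * u)
        num += u * w
        den += w
    return num / den

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x."""
    return 1.0 / math.tanh(x) - 1.0 / x
```

The substitution \(u=\cos\theta\) turns \(\sin\theta\,d\theta\,d\varphi\) into a uniform measure on \([-1,1]\), which is what the midpoint rule above integrates.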

## 7.4 Appendix: Transformation of coordinates in the Bohr–van Leeuwen theorem

The calculation of the partition function involved in the Bohr–van Leeuwen theorem
\[\begin{equation}
\tag{7.20}
\mathcal{Z}= \frac{1}{(2\pi \hbar)^{3} } \int d^{3} p \,\exp\left[- \frac{\beta}{2m}\left(\vec p - q_{\rm e}\vec A\right)^2\right]\,
\int d^{3} r \,\exp\left[-\beta q_{\rm e} \phi(\underline{r})\right]
\end{equation}\]
encompasses the following transformation of variables
\[\begin{equation}
\tag{7.21}
\begin{cases}
&u^x = p^x - q_{\rm e} A^x \\
&u^y = p^y - q_{\rm e} A^y \\
&u^z = p^z - q_{\rm e} A^z
\end{cases}
\end{equation}\]
In the Coulomb gauge (\(\vec \nabla\cdot\vec A=0\)) the vector potential for a constant \(\vec B\) field reads
\[\begin{equation}
\tag{7.22}
\vec A = \left( -\frac{yB}{2},\frac{xB}{2}, 0\right)
\end{equation}\]
Formally, the transformation of variables (7.21) involves the six coordinates
\[\begin{equation}
\tag{7.23}
(p^x, p^y, p^z, x, y, z) \Rightarrow (u^x, u^y, u^z, x', y', z')
\end{equation}\]
where the spatial coordinates \((x, y, z)\) actually remain the same. The corresponding Jacobian
\[\begin{equation}
\mathbf{J} = \frac{\partial (p^x, p^y, p^z, x, y, z)}{\partial (u^x, u^y, u^z, x', y', z') }
\end{equation}\]
is a six-by-six matrix and can be computed explicitly by inverting Eq.(7.21) and using the expression (7.22) for the vector potential:
\[\begin{equation}
\tag{7.24}
\begin{cases}
&p^x = u^x - \frac{q_{\rm e} yB}{2} \\
&p^y = u^y + \frac{q_{\rm e} xB}{2} \\
&p^z = u^z \\
&x = x' \\
&y = y' \\
&z = z'
\end{cases}
\end{equation}\]
All the diagonal elements of \(\mathbf{J}\) equal one; all of its off-diagonal elements vanish apart from
\[\begin{equation}
\tag{7.25}
\begin{cases}
&\frac{\partial p^x}{\partial y'} = - \frac{q_{\rm e} B}{2} \\
&\frac{\partial p^y}{\partial x'} = \frac{q_{\rm e} B}{2}
\end{cases}
\end{equation}\]
Since the spatial coordinates do not depend on the momenta, \(\partial r/\partial u = \mathbb{0}\), and the Jacobian associated with this transformation has the block form

\[\begin{equation}
\mathbf{J} =
\begin{pmatrix}
\mathbb{I} & C \\
\mathbb{0} & \mathbb{I}
\end{pmatrix}
\end{equation}\]
with
\[\begin{equation}
C =
\begin{pmatrix}
0 & -\frac{q_{\rm e} B}{2} & 0 \\
\frac{q_{\rm e} B}{2} & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\end{equation}\]
To define the infinitesimal volume of integration in the new variables, we need to evaluate the determinant of \(\mathbf{J}\)
\[\begin{equation}
d^3 p \, d^3 {r} = \text{det}(\mathbf{J})\, d^3 u \, d^3 {r'}
\end{equation}\]
From the general property of block matrices
\[\begin{equation}
\text{det}
\begin{pmatrix}
A & B \\
C & D
\end{pmatrix}
= \text{det}\left(D\right)\times\text{det}\left(A -BD^{-1}C \right)
\end{equation}\]
it follows that
\[\begin{equation}
\text{det}(\mathbf{J}) =
\text{det}
\begin{pmatrix}
\mathbb{I} & C \\
\mathbb{0} & \mathbb{I}
\end{pmatrix}
= \text{det}\left(\mathbb{I}\right)\times\text{det}\left(\mathbb{I} \right) =1
\end{equation}\]
and, therefore, \(d^3 p \, d^3 {r} = d^3 u \, d^3 {r'}\), as used in the main text.
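The unit Jacobian can also be checked numerically. The sketch below (illustrative values for \(q_{\rm e}\), \(B\), and the evaluation point) applies the momentum shift \(\vec u = \vec p - q_{\rm e}\vec A(\vec r)\) in the gauge of Eq.(7.22) and computes the determinant of the transformation by finite differences:

```python
# Sketch: numerical verification that the shift u = p - q_e*A(r), with
# A = (-yB/2, xB/2, 0), has unit Jacobian. All values are illustrative.
import numpy as np

qe, B = 1.0, 2.0

def forward(v):
    """(p^x, p^y, p^z, x, y, z) -> (u^x, u^y, u^z, x', y', z')."""
    px, py, pz, x, y, z = v
    ux = px - qe * (-y * B / 2.0)
    uy = py - qe * ( x * B / 2.0)
    uz = pz
    return np.array([ux, uy, uz, x, y, z])   # spatial coordinates unchanged

# Central-difference Jacobian of the forward map at an arbitrary point
v0 = np.array([0.3, -1.2, 0.7, 0.5, 2.0, -0.4])
eps = 1e-6
Jmat = np.column_stack([
    (forward(v0 + eps * e) - forward(v0 - eps * e)) / (2 * eps)
    for e in np.eye(6)
])
print(np.linalg.det(Jmat))   # -> 1.0 (up to floating-point error)
```

Because the map is linear in the phase-space coordinates, the finite-difference Jacobian is exact up to rounding, and the determinant equals one independently of the field strength.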

---

1. Technically, this corresponds to the *microcanonical ensemble*.
2. Note that in the presence of a conservative force one could still define \(\epsilon_i = \vec p_i^{\,2}/(2m) + \varphi(\vec r_i)\), with \(\varphi(\vec r_i)\) being the potential associated with the force field.
3. See the manuscript *CourseLibrary/Articles/Aikaike_Prediction_and_Entropy.pdf* for a historical account.
4. The interested reader is encouraged to read the article by E. T. Jaynes, Phys. Rev. **106**, 620 (1957), available in the *CourseLibrary*, in which a reversed viewpoint is taken: entropy and its maximization are assumed as the starting concept; the usual results of statistical mechanics are then derived by imposing different physical constraints case by case, by means of the Lagrange-multipliers approach.
5. The role of the Lagrange multiplier \(\lambda_\mu\) introduced in the microcanonical ensemble is taken by the chemical potential in the grand canonical ensemble, where it is used to compute the average number of particles in the system.
6. Note that a realistic Hamiltonian would depend on the specific positions of all the other electrons in the solid, while \(\phi(\underline{r})\) is assumed to depend only on the position of the considered electron.