# 1 The electron in a static electromagnetic field

## 1.1 Magnetism before quantum physics

The fact that some solids attract or repel other materials by virtue of what nowadays would be called magnetic interaction was already evident to mankind in ancient times.
One of the first references to the magnetic properties of magnetite (Fe\(_3\)O\(_4\), lodestone) was reported by the Greek philosopher Thales of Miletus in the 6th century BC. The name “magnet” probably comes from the lodestones found in Magnesia.
Although the use of lodestone in compasses for navigation in China dates back to the Middle Ages, humans kept perceiving magnetism as a kind of magic
until the 19th century, when a clear relationship between this phenomenon and moving electric charges was established^{1}.

This observation opened the door to the modern understanding of magnetism in solids in terms of the self-organization of individual atomic magnetic moments.

The first important observations toward a rationalization of magnetism and its origin came with the experiments of Ampère and Oersted, in the early decades of the 19th century. They demonstrated that i) a current is able to influence a magnetic needle (Oersted) and ii) a mechanical force exists between two wires in which an electric current circulates (Ampère). Later, Faraday completed our knowledge of magnetic fields by discovering that time-dependent magnetic fields can create an electromotive force. The current description of electromagnetic fields was provided by J. C. Maxwell through the well-known set of equations named after him. The origin of magnetism in matter – more precisely, of magnetic moments – was more debated: Ampère postulated that magnetism in atoms originates from the existence of closed atomic-sized currents. Poisson and later Maxwell, instead, ascribed the origin of atomic magnetic moments to magnetic charges that always appear coupled into dipoles.

Today we know that part of the magnetism of atoms is due to the motion of electrons about the nuclei. This orbital contribution is in line with the intuition of elementary currents put forward by Ampère.
Another contribution is given by the *intrinsic* magnetic moment of electrons (spin contribution).

We will start our course by showing that electrons *must* have an intrinsic magnetic moment in order for their quantum-mechanical description to fulfill the same invariance properties as the Maxwell equations^{2}. Moving further in the course, you will appreciate how magnetic materials have been used over the years as a benchmark to test our understanding of fundamental achievements in theoretical physics: from quantum mechanics to the theory of critical phenomena.

## 1.2 The Dirac equation and the electron spin

### Lorentz covariance

While Newton’s laws of classical mechanics are invariant under Galilean transformations, Maxwell equations are not. The latter are rather invariant (covariant) w.r.t. Lorentz transformations^{3}.
If \(\vec r\) and \(t\) are the position vector and the time measured in a certain reference frame, in a second reference frame that moves with velocity
\[\begin{equation}
\vec v = c \, \tanh \alpha\, \hat n
\end{equation}\]
w.r.t. the first one they transform as
\[\begin{equation}
\tag{1.1}
\begin{cases}
&\vec r ' = \vec r + \left[ \left(\cosh \alpha -1 \right) (\vec r\cdot\hat n) - \sinh\alpha \, (ct )\right] \, \hat n \\
& ct' = \cosh \alpha \, (c t ) - \sinh\alpha \, (\vec r\cdot\hat n)
\end{cases}
\end{equation}\]
with \(c\) being the speed of light and \(\alpha= {\rm atanh}(v/c)\). The important takeaway from the Lorentz transformation of coordinates in Eq.(1.1) is that it mixes time and space coordinates.
For simplicity we limit ourselves to a Lorentz boost along the \(x\) direction in which case Eq.(1.1) simplifies as
\[\begin{equation}
\tag{1.2}
\begin{cases}
&x ' = \cosh \alpha \, x- \sinh\alpha \, (ct )\\
& ct' = \cosh \alpha \, (c t ) - \sinh\alpha \, x \,.
\end{cases}
\end{equation}\]
It is straightforward to verify that the difference between the squared time and space intervals is the same in both reference frames
\[\begin{equation}
(c\Delta t)^2 - (\Delta x)^2 = (c\Delta t')^2 - (\Delta x')^2 \,.
\end{equation}\]
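The invariance stated above can be checked numerically. The following is a minimal sketch, assuming two arbitrary events and a boost velocity \(v/c = 0.6\) (all numerical values and the helper names `boost_x` and `interval2` are illustrative choices, not part of the notes):

```python
import math

def boost_x(x, ct, v_over_c):
    """Apply the boost along x of Eq. (1.2); returns (x', ct')."""
    alpha = math.atanh(v_over_c)  # rapidity, alpha = atanh(v/c)
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return ch * x - sh * ct, ch * ct - sh * x

def interval2(dx, dct):
    """Squared interval (c dt)^2 - (dx)^2 between two events."""
    return dct**2 - dx**2

# Two arbitrary events (x, ct), in arbitrary length units
x1, ct1 = 0.3, 1.0
x2, ct2 = 2.0, 4.5
v_over_c = 0.6

x1p, ct1p = boost_x(x1, ct1, v_over_c)
x2p, ct2p = boost_x(x2, ct2, v_over_c)

s2 = interval2(x2 - x1, ct2 - ct1)
s2_prime = interval2(x2p - x1p, ct2p - ct1p)
print(s2, s2_prime)  # the two values agree up to rounding errors
```

The individual differences \(\Delta x\) and \(c\Delta t\) change under the boost, while their combination in the squared interval does not.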
In general, spacetime intervals
\[\begin{equation}
\tag{1.3}
(\Delta s)^2 = (c\Delta t)^2 - (\Delta \vec r)^2
\end{equation}\]
are invariant, i.e., they take the same value in all frames related by a Lorentz transformation. This leads to the introduction of the 4-vector formalism
\[\begin{equation}
\tag{1.4}
x^\mu = (ct, \vec r)
\end{equation}\]
with the metric tensor
\[\begin{equation}
\tag{1.5}
g_{\mu \nu} = g^{\mu \nu} =
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & -1 & 0 & 0\\
0 & 0 & -1 & 0\\
0 & 0 & 0 & -1\\
\end{pmatrix}
\end{equation}\]
so that
\[\begin{equation}
\tag{1.6}
(\Delta s)^2 = g_{\mu \nu} \Delta x^\mu \Delta x^\nu = \Delta x^\mu \Delta x_\mu \,.
\end{equation}\]
The invariant spacetime interval \(\Delta s\) is equal to the product of the speed of light and
the *proper time* interval \(\Delta\tau\), which fulfills the property
\[\begin{equation}
\tag{1.7}
\Delta \tau = \gamma^{-1} \Delta t
\qquad \qquad
\text{with} \qquad
\gamma^{-2} = 1 - \frac{v^2}{c^2} \,.
\end{equation}\]
With these few concepts at hand, one can proceed to define the 4-velocity
\[\begin{equation}
\tag{1.8}
u^\mu= \frac{d x^\mu}{d\tau } = \frac{dt}{d \tau}\left(c, \vec v \right) =
\gamma \left(c, \vec v \right)
\end{equation}\]
and from this the 4-momentum
\[\begin{equation}
\tag{1.9}
p^\mu = m u^\mu = m \gamma \left(c, \vec v \right)
\end{equation}\]
which has invariant length \(p^\mu p_\mu = m^2 c^2\) (using the metric tensor in Eq.(1.5)). Since energy and momentum in special relativity are defined as
\[\begin{equation}
\tag{1.10}
\begin{split}
&\vec{p} = \gamma m \vec v \\
&E = \gamma m c^2
\end{split}
\end{equation}\]
the 4-momentum can also be written as follows
\[\begin{equation}
p^\mu = \left(\frac{E}{c}, \vec p \right)
\end{equation}\]
which leads to the famous energy-momentum relation
\[\begin{equation}
\tag{1.11}
p^\mu p_\mu = \frac{E^2}{c^2} - (\vec{p})^2 = m^2 c^2
\qquad \Rightarrow \qquad
E^2 = (c \vec{p})^2 + m^2 c^4\,.
\end{equation}\]
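As a quick numerical illustration of Eqs.(1.10) and (1.11), the sketch below evaluates energy and momentum of an electron moving at \(v = 0.8\,c\) and compares the two sides of the energy-momentum relation. The velocity and the helper name `energy_momentum` are arbitrary choices made for this example:

```python
import math

c = 299_792_458.0        # speed of light in m/s (exact SI value)
m = 9.1093837015e-31     # electron mass in kg (CODATA 2018)

def energy_momentum(v):
    """Return (E, p) from Eq. (1.10) for a particle of mass m and speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m * c**2, gamma * m * v

E, p = energy_momentum(0.8 * c)

lhs = E**2                          # left-hand side of Eq. (1.11)
rhs = (c * p) ** 2 + m**2 * c**4    # right-hand side of Eq. (1.11)
relative_error = abs(lhs - rhs) / lhs
print(relative_error)               # zero up to floating-point rounding
```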

Using the 4-vector formalism, the Maxwell equations can be written in an elegant and compact way, in which they are manifestly covariant, i.e., invariant under Lorentz transformations. We will limit ourselves to the few ideas presented so far because they suffice to describe the early attempts to develop a quantum-mechanical theory of the atom compatible with special relativity.

### Early attempts

Starting from the quantum Hamiltonian of a non-relativistic free particle
\[\begin{equation}
\tag{1.12}
\mathcal{H} = \frac{\hat{\mathbf p}^2}{2 m}
\end{equation}\]
the mapping of \(\mathcal{H}\) and the momentum to operators
\[\begin{equation}
\tag{1.13}
\begin{split}
& \mathcal{H} \rightarrow i\hbar \frac{\partial}{\partial t} \\
& \hat{\mathbf p} \rightarrow - i\hbar \vec \nabla
\end{split}
\end{equation}\]
leads to the Schrödinger equation for a free particle
\[\begin{equation}
\tag{1.14}
i\hbar \frac{\partial \psi}{\partial t} = - \frac{\hbar^2}{2m} \nabla^2 \psi \,.
\end{equation}\]
Identifying \(E\) in Eq.(1.11) with \(\mathcal{H}\), it seems natural to assume
\[\begin{equation}
\tag{1.15}
\mathcal{H} = \sqrt{(c \vec{p})^2 + m^2 c^4}
\end{equation}\]
as the Hamiltonian of a relativistic free particle and accordingly
\[\begin{equation}
\tag{1.16}
i\hbar \frac{\partial \psi}{\partial t} =
\sqrt{-\hbar^2 c^2 \nabla^2 + m^2 c^4}\psi
\end{equation}\]
as the relativistic equivalent of the Schrödinger equation (1.14).
The reader is immediately faced with the problem of handling the square-root operator,
as scientists in the 1920s were.
One way to overcome this problem is to square the operators on both sides of the equation, which yields the so-called Klein-Gordon equation
\[\begin{equation}
\tag{1.17}
-\hbar^2 \frac{\partial^2 \psi}{\partial t^2} =
\left( -\hbar^2 c^2 \nabla^2 + m^2 c^4 \right) \psi \,.
\end{equation}\]
Plane waves
\[\begin{equation}
\tag{1.18}
\psi = N {\rm e}^{i (\vec x \cdot \vec p - E t)/\hbar}
\end{equation}\]
are solutions to the Klein-Gordon equation with eigenvalues
\[\begin{equation}
\tag{1.19}
E = \pm \sqrt{(c p)^2 + m^2 c^4}
\end{equation}\]
Such eigenvalues indeed fulfill the energy-momentum relation in Eq.(1.11).
However, about a century ago it was not obvious how to interpret negative-energy solutions.
Nowadays we know that they are associated with antiparticles. The final blow for this approach to a relativistic formulation of quantum mechanics is the fact that the conserved density that should play the role of the *probability density* (the squared modulus of the wave function in the Schrödinger case) is not positive definite when it is built from the solutions to the Klein-Gordon equation.
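Both statements can be made explicit with a short worked check. As a sketch, assume a plane wave with its time dependence written out, \(\psi = N {\rm e}^{i(\vec p\cdot\vec x - Et)/\hbar}\). Acting on it with the two sides of Eq.(1.17) gives
\[\begin{equation}
\begin{split}
-\hbar^2 \frac{\partial^2 \psi}{\partial t^2} &= E^2\, \psi \\
\left( -\hbar^2 c^2 \nabla^2 + m^2 c^4 \right) \psi &= \left( c^2 p^2 + m^2 c^4 \right) \psi
\end{split}
\end{equation}\]
so that \(E^2 = (cp)^2 + m^2c^4\), which has the two roots of Eq.(1.19). The conserved density usually associated with the Klein-Gordon equation in textbooks (quoted here without derivation, as it is not derived in these notes) evaluates on the same plane wave to
\[\begin{equation}
\rho = \frac{i\hbar}{2 m c^2} \left( \psi^* \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^*}{\partial t} \right) = \frac{E}{m c^2}\, |N|^2
\end{equation}\]
which is negative on the negative-energy branch and therefore cannot be read as a probability density.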

### The Dirac approach

Following the historical path taken by Paul A. M. Dirac in 1928, we list the requirements that a relativistically covariant version of the Schrödinger equation should fulfill:

- for the free-particle case its eigenvalues should be the same as those of the Klein-Gordon equation (1.17);

- time and space derivatives should enter that equation on the same footing;

- its wave function should correspond to a positive-definite probability density.

The first requirement is trivially fulfilled by the Klein-Gordon equation; the second one as well, because only second derivatives with respect to time and space coordinates appear; the third requirement is not fulfilled.
To circumvent this problem – without discarding the positive aspects of the Klein-Gordon equation – Dirac proposed to replace the Schrödinger equation with a set of \(N\) coupled equations *linear* in both the time and space derivatives. Formally, he represented the wave function with a vector (of functions!)

\[\begin{equation}
\tag{1.20}
\Psi =
\begin{bmatrix}
\psi_{1} \\
\psi_{2} \\
\vdots \\
\psi_{N}
\end{bmatrix}
\end{equation}\]
which is a solution of the equation
\[\begin{equation}
\tag{1.21}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\psi_{1} \\
\psi_{2} \\
\vdots \\
\psi_{N}
\end{bmatrix}
= \left[ c \left(
\boldsymbol{\alpha}_x\hat{p}_x +
\boldsymbol{\alpha}_y\hat{p}_y +
\boldsymbol{\alpha}_z\hat{p}_z \right) +
m c^2 \boldsymbol{\beta}
\right]
\begin{bmatrix}
\psi_{1} \\
\psi_{2} \\
\vdots \\
\psi_{N}
\end{bmatrix}
\end{equation}\]
with \(\boldsymbol{\alpha}_x, \,\boldsymbol{\alpha}_y, \, \boldsymbol{\alpha}_z, \,\boldsymbol{\beta}\) being \(N\times N\) matrices. The requirement that the spectrum of the eigenvalues of the Dirac equation be the same as for the Klein-Gordon equation implies that these four matrices must obey the following algebra:

\[\begin{equation}
\tag{1.22}
\begin{split}
&\boldsymbol{\alpha}_h \boldsymbol{\alpha}_k + \boldsymbol{\alpha}_k \boldsymbol{\alpha}_h =
2 \mathbb{I} \; (\text{if } k=h), \quad \boldsymbol{0} \; (\text{otherwise}) \\
&\boldsymbol{\alpha}_k \boldsymbol{\beta} + \boldsymbol{\beta}\boldsymbol{\alpha}_k = \boldsymbol{0} \\
&\boldsymbol{\alpha}_k^2 = \boldsymbol{\beta}^2 = \mathbb{I}
\end{split}
\end{equation}\]
where \(\mathbb{I}\) is the \(N\times N\) identity matrix and \(\boldsymbol{0}\) the \(N\times N\) null matrix. From this algebra it follows that the eigenvalues of the \(\boldsymbol{\alpha}_h\) (\(h=x,\,y,\,z\)) and \(\boldsymbol{\beta}\) matrices can only be \(\pm 1\) and that their trace has to be zero. Both requirements can only be fulfilled if \(N\) is even.
For \(N=2\) at most three mutually anticommuting matrices with vanishing trace can be built, i.e., the Pauli matrices, whereas four are needed here. The smallest dimension in which the algebra of Eqs.(1.22) can be realized is thus \(N=4\).
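The two properties quoted above follow from Eqs.(1.22) in a few lines; the steps below are a sketch of the standard argument. Since each matrix squares to the identity, any eigenvalue \(\lambda\) satisfies \(\lambda^2 = 1\), hence \(\lambda = \pm 1\). For the trace, one uses \(\boldsymbol{\beta}^2 = \mathbb{I}\), the cyclic property of the trace, and the anticommutation of \(\boldsymbol{\alpha}_k\) with \(\boldsymbol{\beta}\):
\[\begin{equation}
{\rm Tr}\, \boldsymbol{\alpha}_k
= {\rm Tr} \left( \boldsymbol{\beta} \boldsymbol{\beta} \boldsymbol{\alpha}_k \right)
= {\rm Tr} \left( \boldsymbol{\beta} \boldsymbol{\alpha}_k \boldsymbol{\beta} \right)
= - {\rm Tr} \left( \boldsymbol{\alpha}_k \boldsymbol{\beta} \boldsymbol{\beta} \right)
= - {\rm Tr}\, \boldsymbol{\alpha}_k
\qquad \Rightarrow \qquad
{\rm Tr}\, \boldsymbol{\alpha}_k = 0 \,.
\end{equation}\]
The same argument with \(\boldsymbol{\alpha}_k^2 = \mathbb{I}\) inserted in front of \(\boldsymbol{\beta}\) gives \({\rm Tr}\, \boldsymbol{\beta} = 0\). A matrix with eigenvalues \(\pm 1\) and vanishing trace must have as many eigenvalues \(+1\) as \(-1\), which is only possible for even \(N\).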
A possible \(4\times 4\) representation of the \(\boldsymbol{\alpha}_h,\) (\(h=x,\,y,\,z\)) and \(\boldsymbol{\beta}\) matrices reads
\[\begin{equation}
\tag{1.23}
\boldsymbol{\alpha}_h =
\begin{pmatrix}
\boldsymbol{0} & \hat{\sigma}_h \\
\hat{\sigma}_h & \boldsymbol{0}\\
\end{pmatrix}
\qquad\text{and}\qquad
\boldsymbol{\beta}=
\begin{pmatrix}
\mathbb{I} & \boldsymbol{0} \\
\boldsymbol{0} & - \mathbb{I} \\
\end{pmatrix}
\end{equation}\]
where \(\hat{\sigma}_h\) are the (\(2\times 2\)) Pauli matrices
\[\begin{equation}
\tag{1.24}
\hat{\sigma}_x =
\begin{pmatrix}
0 & 1 \\
1 & 0 \\
\end{pmatrix}
\qquad
\hat{\sigma}_y =
\begin{pmatrix}
0 & -i \\
i & 0 \\
\end{pmatrix}
\qquad
\hat{\sigma}_z =
\begin{pmatrix}
1 & 0 \\
0 & -1\\
\end{pmatrix}
\end{equation}\]
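That the representation in Eq.(1.23) satisfies the algebra of Eqs.(1.22) can be verified by direct matrix multiplication. The sketch below does so with plain nested lists; the helper functions (`matmul`, `add`, `anticommutator`, `block`) are written for this example only:

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

zero2 = [[0, 0], [0, 0]]
one2 = [[1, 0], [0, 1]]
minus_one2 = [[-1, 0], [0, -1]]
sigma = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

# Eq. (1.23): alpha_h has sigma_h in the off-diagonal blocks, beta = diag(I, -I)
alpha = {h: block(zero2, s, s, zero2) for h, s in sigma.items()}
beta = block(one2, zero2, zero2, minus_one2)

identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
zero4 = [[0] * 4 for _ in range(4)]
two_identity4 = [[2 * x for x in row] for row in identity4]

# Check the three lines of Eq. (1.22)
algebra_ok = (
    all(anticommutator(alpha[h], alpha[k]) == (two_identity4 if h == k else zero4)
        for h in alpha for k in alpha)
    and all(anticommutator(alpha[k], beta) == zero4 for k in alpha)
    and matmul(beta, beta) == identity4
)
print(algebra_ok)
```

All entries are small integers or multiples of the imaginary unit, so the comparisons are exact and no numerical tolerance is needed.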

For an electron at rest with mass \(m_{\rm e}\) the Dirac equation(1.21) reduces to

\[\begin{equation}
\tag{1.25}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\psi_{1} \\
\psi_{2} \\
\psi_{3} \\
\psi_{4}
\end{bmatrix}
= m_{\rm e} c^2
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 &-1 & 0 \\
0 & 0 & 0 &-1\\
\end{pmatrix}
\begin{bmatrix}
\psi_{1} \\
\psi_{2} \\
\psi_{3} \\
\psi_{4}
\end{bmatrix}
\end{equation}\]
whose solutions are given by
\[\begin{equation}
\tag{1.26}
\Psi^1 = e^{-\frac{i m_{\rm e} c^2 t}{\hbar}}
\begin{bmatrix}
1 \\
0 \\
0 \\
0
\end{bmatrix}
\qquad
\Psi^2 = e^{-\frac{i m_{\rm e} c^2 t}{\hbar}}
\begin{bmatrix}
0 \\
1 \\
0 \\
0
\end{bmatrix}
\end{equation}\]
%
\[\begin{equation}
\Psi^3 = e^{\frac{i m_{\rm e} c^2 t}{\hbar}}
\begin{bmatrix}
0 \\
0 \\
1 \\
0
\end{bmatrix}\qquad
\Psi^4 = e^{\frac{i m_{\rm e}c^2 t}{\hbar}}
\begin{bmatrix}
0 \\
0 \\
0 \\
1
\end{bmatrix}
\end{equation}\]
The first two eigenfunctions correspond to positive energy (electron), while the last two correspond to negative energy (positron).

### The emergence of spin in the low-energy limit

We now consider the behavior of an electron with charge \(q_{\rm e}\) and mass \(m_{\rm e}\) in a static electromagnetic field at low energy, namely at energies much smaller than the particle rest energy \(m_{\rm e} c^2\). The non-relativistic Hamiltonian is obtained by adding the electrostatic energy term to the free-particle Hamiltonian and replacing the momentum according to the minimal coupling prescription
\[\begin{equation}
\tag{1.27}
\hat{\mathbf p} \rightarrow \hat{\mathbf p} - q_{\rm e}\vec A \,,
\end{equation}\]
which yields
\[\begin{equation}
\tag{1.28}
\mathcal{H} = \frac{1}{2m_{\rm e}}\left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2 + q_{\rm e} \phi \,.
\end{equation}\]
The Dirac equation acquires similar terms. As we are interested in the non-relativistic limit, it is convenient to split the four components of the wave function into two components \((\tilde\varphi_1,\tilde\varphi_2)\) associated with the positive-energy solutions (electron) and two components \((\tilde\chi_1,\tilde\chi_2)\) associated with the negative-energy solutions (positron). The corresponding Dirac equation reads
\[\begin{equation}
\tag{1.29}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\tilde\varphi_1 \\
\tilde\varphi_2 \\
\tilde\chi_1 \\
\tilde\chi_2
\end{bmatrix}
= c \hat{\boldsymbol{\sigma}} \cdot \left(\hat{\mathbf p} - q_{\rm e}\vec A\right)
\begin{bmatrix}
\tilde\chi_1 \\
\tilde\chi_2 \\
\tilde\varphi_1 \\
\tilde\varphi_2
\end{bmatrix}
+ q_{\rm e} \phi
\begin{bmatrix}
\tilde\varphi_1 \\
\tilde\varphi_2 \\
\tilde\chi_1 \\
\tilde\chi_2
\end{bmatrix}
+ m_{\rm e} c^2
\begin{bmatrix}
\tilde\varphi_1 \\
\tilde\varphi_2 \\
-\tilde\chi_1 \\
-\tilde\chi_2
\end{bmatrix}
\end{equation}\]

Note that in this notation we have already applied the \(\boldsymbol{\alpha}_h\) (\(h=x,\,y,\,z\)) and \(\boldsymbol{\beta}\) matrices: the former swap the positions of the \((\tilde\varphi_1,\tilde\varphi_2)\) and \((\tilde\chi_1,\tilde\chi_2)\) components and introduce the \(\hat{\boldsymbol{\sigma}}\) operators; \(\boldsymbol{\beta}\) just introduces a minus sign in front of the \((\tilde\chi_1,\tilde\chi_2)\) components.

In the low-energy limit we are considering here – in which \(m_{\rm e} c^2\) is much larger than any other energy scale in the problem – the wave function can be factorized into a part varying slowly with time and a fast-varying part (with frequency \(m_{\rm e} c^2/\hbar\)):
\[\begin{equation}
\tag{1.30}
\begin{bmatrix}
\tilde\varphi_1 \\
\tilde\varphi_2 \\
\tilde\chi_1 \\
\tilde\chi_2
\end{bmatrix}
= e^{-\frac{i m_{\rm e} c^2 t}{\hbar}}
\begin{bmatrix}
\varphi_1 \\
\varphi_2 \\
\chi_1 \\
\chi_2
\end{bmatrix}
\end{equation}\]
Defining
\[\begin{equation}
\tag{1.31}
\hat{\boldsymbol{\pi}} = \hat{\mathbf p} - q_{\rm e}\vec A
\end{equation}\]
the slowly varying parts of the wave function should fulfill the equation
\[\begin{equation}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\varphi_1 \\
\varphi_2 \\
\chi_1 \\
\chi_2
\end{bmatrix}
= c \hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}
\begin{bmatrix}
\chi_1 \\
\chi_2 \\
\varphi_1 \\
\varphi_2
\end{bmatrix}
+ q_{\rm e} \phi
\begin{bmatrix}
\varphi_1 \\
\varphi_2 \\
\chi_1 \\
\chi_2
\end{bmatrix}
-2 m_{\rm e} c^2
\begin{bmatrix}
0 \\
0 \\
\chi_1 \\
\chi_2
\end{bmatrix}
\end{equation}\]

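For kinetic energies and field-interaction energies much smaller than \(m_{\rm e} c^2\) the equations for the two \(\chi\) components can be approximated as

\[\begin{equation}
\tag{1.32}
\begin{bmatrix}
\chi_1 \\
\chi_2
\end{bmatrix}
= \frac{\hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}}{2m_{\rm e} c}
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix} \,.
\end{equation}\]
Since \(\hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}\) is a \(2\times 2\) matrix of operators, it generally mixes the two \(\varphi\) components. Note that this equation implies that the amplitude of the \(\chi\) components is smaller than that of the \(\varphi\) components by a factor of order \(v/c\).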
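The approximation behind Eq.(1.32) can be spelled out as follows. As a sketch, write \(\varphi = (\varphi_1, \varphi_2)\) and \(\chi = (\chi_1, \chi_2)\) for the two-component parts of the slowly varying wave function (a shorthand introduced only for this step). The lower two rows of the equation for the slowly varying components read
\[\begin{equation}
i\hbar \frac{\partial \chi}{\partial t} = c\, \hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}\, \varphi + q_{\rm e} \phi\, \chi - 2 m_{\rm e} c^2 \chi \,.
\end{equation}\]
If both \(i\hbar\, \partial \chi/\partial t\) and \(q_{\rm e} \phi\, \chi\) are negligible with respect to \(2 m_{\rm e} c^2 \chi\), what remains is
\[\begin{equation}
0 \simeq c\, \hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}\, \varphi - 2 m_{\rm e} c^2 \chi
\qquad \Rightarrow \qquad
\chi \simeq \frac{\hat{\boldsymbol{\sigma}} \cdot \hat{\boldsymbol{\pi}}}{2 m_{\rm e} c}\, \varphi \,.
\end{equation}\]
Since \(\hat{\boldsymbol{\pi}}\) is of order \(m_{\rm e} v\), the ratio between the two amplitudes is of order \(v/(2c)\).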

Equation(1.32) can be substituted into the equation of the \(\varphi_r\) components to obtain
\[\begin{equation}
\tag{1.33}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
= \left[ \frac{(\hat{\boldsymbol{\sigma}}\cdot\hat{\boldsymbol{\pi}}) (\hat{\boldsymbol{\sigma}}\cdot\hat{\boldsymbol{\pi}})}{2m_{\rm e}}
+ q_{\rm e} \phi \right]
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
\end{equation}\]
Using the properties of the Pauli matrices the first term on the r.h.s. can be written in a more familiar way^{4}
\[\begin{equation}
\tag{1.34}
\begin{split}
&(\hat{\boldsymbol{\sigma}}\cdot\hat{\boldsymbol{\pi}}) (\hat{\boldsymbol{\sigma}}\cdot\hat{\boldsymbol{\pi}}) =
\hat{\boldsymbol{\pi}}^2 + i \hat{\boldsymbol{\sigma}}\cdot\hat{\boldsymbol{\pi}}\times \hat{\boldsymbol{\pi}} = \\
&= \left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2 +
i\hat{\boldsymbol{\sigma}} \cdot
\left(\hat{\mathbf p} - q_{\rm e}\vec A\right)\times \left(\hat{\mathbf p} - q_{\rm e}\vec A\right) \\
&= \left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2 +
i^2 q_{\rm e} \hbar \, \hat{\boldsymbol{\sigma}} \cdot \left(\vec\nabla \times \vec A + \vec A \times \vec\nabla \right) \\
&= \left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2 -
q_{\rm e} \hbar \, \hat{\boldsymbol{\sigma}}\cdot\vec B
\end{split}
\end{equation}\]
to obtain the famous Pauli equation
\[\begin{equation}
\tag{1.35}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
= \left[ \frac{ \left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2}{2m_{\rm e}}
- \frac{q_{\rm e} \hbar }{2m_{\rm e}} \hat{\boldsymbol{\sigma}}\cdot\vec B + q_{\rm e} \phi \right]
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
\end{equation}\]
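The property of the Pauli matrices used in the first line of Eq.(1.34) is the identity \((\hat{\boldsymbol{\sigma}}\cdot\vec a)(\hat{\boldsymbol{\sigma}}\cdot\vec b) = \vec a\cdot\vec b + i\,\hat{\boldsymbol{\sigma}}\cdot(\vec a\times\vec b)\). The sketch below checks it numerically for two ordinary (commuting) vectors chosen arbitrarily; for the operator \(\hat{\boldsymbol{\pi}}\) the same algebra holds provided the order of the factors is preserved, which is why \(\hat{\boldsymbol{\pi}}\times\hat{\boldsymbol{\pi}}\) does not vanish. All helper names are illustrative:

```python
SIGMA = (
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-component vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Two arbitrary real vectors
a = (1.0, -2.0, 0.5)
b = (0.3, 4.0, -1.5)

lhs = matmul2(sigma_dot(a), sigma_dot(b))
s_cross = sigma_dot(cross(a, b))
rhs = [[dot(a, b) * (1 if i == j else 0) + 1j * s_cross[i][j] for j in range(2)]
       for i in range(2)]

max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_error)  # zero up to floating-point rounding
```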
For a uniform magnetic field the vector potential can be written as \(\vec A = (\vec B \times \underline{r})/2\).
After a few elementary steps the Pauli equation takes the form
\[\begin{equation}
\tag{1.36}
i\hbar \frac{\partial}{\partial t}
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
= \left[
\frac{\hat{\mathbf p}^2}{2m_{\rm e}} - \frac{q_{\rm e} \hbar}{2m_{\rm e}} \left(\hat{\mathbf l} + 2 \hat{\mathbf s}\right) \cdot \vec B + \frac{q_{\rm e}^2}{8m_{\rm e}} (\vec B \times \hat{\underline{r}})^2 + q_{\rm e} \phi \right]
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix}
\end{equation}\]
where \(\hat{\mathbf l} = \hat{\underline{r}} \times \hat{\mathbf p}/\hbar\) is the orbital angular momentum and
\(\hat{\mathbf s} = \hat{\boldsymbol{\sigma}}/2\) is the particle spin. We found it convenient to indicate the position operator with \(\hat{\underline{r}}\), for consistency of notation throughout the lecture notes.
Moreover, we adopt the convention of expressing angular momenta in units of \(\hbar\).
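The steps leading from Eq.(1.35) to Eq.(1.36) can be sketched as follows. Expanding the square of the kinetic momentum gives
\[\begin{equation}
\left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2 = \hat{\mathbf p}^2 - q_{\rm e} \left( \hat{\mathbf p}\cdot\vec A + \vec A\cdot\hat{\mathbf p} \right) + q_{\rm e}^2 \vec A^{\,2} \,.
\end{equation}\]
For \(\vec A = (\vec B \times \hat{\underline{r}})/2\) with uniform \(\vec B\) one has \(\vec\nabla\cdot\vec A = 0\), so that \(\hat{\mathbf p}\cdot\vec A = \vec A\cdot\hat{\mathbf p}\) when acting on a wave function, and
\[\begin{equation}
\vec A\cdot\hat{\mathbf p} = \frac{1}{2} (\vec B \times \hat{\underline{r}})\cdot\hat{\mathbf p} = \frac{1}{2} \vec B \cdot (\hat{\underline{r}} \times \hat{\mathbf p}) = \frac{\hbar}{2}\, \vec B \cdot \hat{\mathbf l} \,.
\end{equation}\]
Dividing by \(2m_{\rm e}\) one obtains
\[\begin{equation}
\frac{\left(\hat{\mathbf p} - q_{\rm e}\vec A\right)^2}{2m_{\rm e}} = \frac{\hat{\mathbf p}^2}{2m_{\rm e}} - \frac{q_{\rm e}\hbar}{2m_{\rm e}}\, \hat{\mathbf l}\cdot\vec B + \frac{q_{\rm e}^2}{8m_{\rm e}} (\vec B \times \hat{\underline{r}})^2 \,,
\end{equation}\]
while the spin term of Eq.(1.35) becomes \(-\frac{q_{\rm e}\hbar}{2m_{\rm e}}\, 2\hat{\mathbf s}\cdot\vec B\) upon writing \(\hat{\boldsymbol{\sigma}} = 2\hat{\mathbf s}\).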

The reader will recognize in the term linear in \(\vec B\) the *Zeeman* interaction. This interaction consists of an orbital and a spin contribution. The first contribution does not vanish for an electron that has a finite component of the angular momentum along the applied field, taken along the \(z\) direction, e.g., an electron occupying an atomic orbital with \(l=1\) and \(m=-1\) (see later). Due to the negative charge of the electrons, the orbital part of the Zeeman interaction favors quantum states for which \(\hat{l}^z\) has the minimal expectation value. In a semiclassical picture one would say that the Zeeman interaction favors antiparallel alignment between the angular momentum \(\vec l\) and \(\vec B\).
As for the spin contribution to the Zeeman energy, we would like to remark that it emerged spontaneously from the Dirac equation upon taking the low-energy limit. Note that the number 2 in front of the spin operator \(\hat{\mathbf s}\) correctly accounts for the gyromagnetic factor of the spin being twice the gyromagnetic factor of the orbital momentum (up to a correction of about 0.1%).

Henceforth, we will express the *Zeeman* interaction of an electron (with electric charge \(q_{\rm e} =-e\)) as

\[\begin{equation}
\tag{1.37}
\mathcal{H}_{{\rm Z}} = \mu_{\rm B} \,\left(\hat{\mathbf l} + 2 \hat{\mathbf s}\right) \cdot \vec B
= -\hat{\boldsymbol \mu}\cdot \vec B \,,
\end{equation}\]
where \(\mu_{\rm B}=e\hbar/(2m_{\rm e})=9.274\dots \times 10^{-24}\) J/T \(= 5.788 \dots \times 10^{-2}\) meV/T is the Bohr magneton and \(\hat{\boldsymbol \mu} = - \mu_{\rm B} \,(\hat{\mathbf l} + 2 \hat{\mathbf s} )\) is the magnetic-moment operator. As anticipated in the first section, this magnetic moment consists of an orbital and an intrinsic (spin) contribution.
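To get a feeling for the energy scale, the sketch below recomputes the Bohr magneton from the SI values of the fundamental constants and evaluates the Zeeman energy of Eq.(1.37) for a state with definite \(m_l\) and \(m_s\) and \(\vec B\) along \(z\). The function name `zeeman_energy_meV` and the chosen quantum numbers are illustrative:

```python
e = 1.602176634e-19      # elementary charge in C (exact SI value)
hbar = 1.054571817e-34   # reduced Planck constant in J s
m_e = 9.1093837015e-31   # electron mass in kg (CODATA 2018)

mu_B = e * hbar / (2.0 * m_e)    # Bohr magneton in J/T
mu_B_meV = mu_B / e * 1.0e3      # same quantity in meV/T

def zeeman_energy_meV(m_l, m_s, B):
    """Eq. (1.37) for a state with quantum numbers m_l, m_s and B (tesla) along z."""
    return mu_B_meV * (m_l + 2.0 * m_s) * B

print(f"mu_B = {mu_B:.4e} J/T = {mu_B_meV:.4e} meV/T")

# Example: m_l = -1, m_s = -1/2 in a field of 1 T gives -2 mu_B B
E_example = zeeman_energy_meV(-1, -0.5, 1.0)
print(f"Zeeman energy: {E_example:.4f} meV")
```

Even in a field of 1 T the Zeeman energy is of the order of 0.1 meV.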

The term in the Pauli equation (1.36) proportional to \(B^2\) is much weaker than the Zeeman interaction and is only observable if the total angular momentum of the electrons in an atom exactly vanishes. This term is responsible for the so-called *diamagnetic* contribution.
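A rough order-of-magnitude comparison of the two terms can be sketched as follows. The estimate assumes an electron at a distance of one Bohr radius from the nucleus with \(\underline{r}\) perpendicular to \(\vec B\), so that \((\vec B\times\underline{r})^2 = B^2 r^2\), and an angular-momentum factor \(|\hat{\mathbf l} + 2\hat{\mathbf s}|\) of order one; both are assumptions made for this example and not statements from the notes:

```python
e = 1.602176634e-19      # elementary charge in C (exact SI value)
hbar = 1.054571817e-34   # reduced Planck constant in J s
m_e = 9.1093837015e-31   # electron mass in kg (CODATA 2018)
a0 = 5.29177210903e-11   # Bohr radius in m (CODATA 2018), assumed orbit size

def zeeman_scale(B):
    """mu_B * B in joules: size of the term linear in B."""
    return e * hbar * B / (2.0 * m_e)

def diamagnetic_scale(B, r=a0):
    """e^2 B^2 r^2 / (8 m_e) in joules: the B^2 term for r perpendicular to B."""
    return (e * B * r) ** 2 / (8.0 * m_e)

B = 1.0  # magnetic field in tesla
ratio = diamagnetic_scale(B) / zeeman_scale(B)
print(f"diamagnetic / Zeeman at {B} T: {ratio:.2e}")
```

Under these assumptions the ratio equals \(e B a_0^2/(4\hbar)\), which is of order \(10^{-6}\) at 1 T and grows linearly with the field.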

The comparison between the Schrödinger equation and the Pauli equation (1.36) highlights that the latter includes the spin contribution explicitly. However, as spatial and spin degrees of freedom (d.o.f.) are not directly coupled, the solutions of the Pauli equation can be written as
\[\begin{equation}
\tag{1.38}
\begin{bmatrix}
\varphi_1 \\
\varphi_2
\end{bmatrix} = \psi_{\rm Schr.}(\underline{r})\,|spin\rangle \,
\end{equation}\]

with \(|spin\rangle\) indicating the spin part of the wave function. The spatial part \(\psi_{\rm Schr.}(\underline{r})\) is a solution of the Schrödinger equation.

In the next section we will see that the *spin-orbit* interaction, instead, directly couples spatial and spin d.o.f. Therefore, factorizing spatial and spin parts in the wave function is not accurate when spin-orbit interaction is taken into account.

### Spin-orbit coupling

The calculation reproduced in the previous section is not the only possible way to take the low-energy limit of the Dirac equation for an electron. Foldy and Wouthuysen^{5} developed a systematic procedure to decouple the positive- and negative-energy solutions by expanding the Hamiltonian in powers of (energy terms)/(\(m_{\rm e}c^2\)) along with the successive application of canonical transformations.
The resulting low-energy Hamiltonian for the positive-energy solutions (electrons) reads:
\[\begin{equation}
\tag{1.39}
\begin{split}
\mathcal{H} &= m_{\rm e} c^2 + \frac{\hat{\mathbf p}^2}{2m_{\rm e}} - \frac{q_{\rm e} \hbar}{2m_{\rm e}} \left(\hat{\mathbf l} + 2 \hat{\mathbf s}\right) \cdot \vec B + \frac{q_{\rm e}^2}{8m_{\rm e}} (\vec B \times \hat{\underline{r}})^2 + q_{\rm e} \phi \\
&-i \frac{q_{\rm e}\hbar^2}{4m_{\rm e}^2 c^2} \, \hat{\mathbf s}\cdot\vec\nabla\times \vec E
-\frac{q_{\rm e}\hbar}{2m_{\rm e}^2 c^2} \, \hat{\mathbf s}\cdot\vec E\times \hat{\mathbf p} \\
&-\frac{\hat{\mathbf p}^4}{8m_{\rm e}^3 c^2}-\frac{q_{\rm e}\hbar^2}{8m_{\rm e}^2 c^2} \vec\nabla\cdot \vec E \,.
\end{split}
\end{equation}\]
The first row of the equation above comprises the terms already encountered in the Pauli equation plus the rest-mass energy. The two terms in the second row correspond to the spin-orbit coupling. For static electromagnetic fields one has \(\vec\nabla\times \vec E =0\) and, accounting for the sign of the electron charge \(q_{\rm e}=-e\), the spin-orbit interaction takes the form
\[\begin{equation}
\tag{1.40}
\mathcal{H}_{\rm so}=\frac{e\hbar}{2m_{\rm e}^2 c^2} \, \hat{\mathbf s}\cdot\vec E\times \hat{\mathbf p} \,.
\end{equation}\]
In the next sections we will specialize the field \(\vec E\) to the electron-nucleus Coulomb interaction, possibly corrected with the screening effect of the other electrons in the atom (see later). However, the validity of Eq.(1.40) goes beyond the atom and it applies, for instance, to the spin of electrons in a semiconductor that experience an electric field generated by an interface or applied externally (Rashba spin-orbit coupling).
One can interpret the spin-orbit Hamiltonian (1.40) as a Zeeman interaction of the spin magnetic moment \(\hat{\boldsymbol \mu}=-2\mu_{\rm B} \hat{\mathbf s}\) with an effective magnetic field \(\vec B = (\vec E\times \vec v)/(2c^2)\) experienced by the electron moving with velocity \(\vec v=\vec p/m_{\rm e}\) in a field \(\vec E\). Classically, this effective magnetic field would be two times larger. The discrepancy between the experimentally observed splitting of energy levels (fine structure) and the result of the classical (non-relativistic) calculation animated a long debate at the beginning of the twentieth century.

When the electric field is minus the gradient of a *spherically symmetric* electrostatic potential \(\phi(\underline{r})\), as for the electron-nucleus Coulomb interaction, it takes the form:
\[\begin{equation}
\tag{1.41}
\vec E = -\nabla\phi(\underline{r})=-\frac{1}{r}\frac{ \partial \phi(\underline{r})}{\partial r}\, \hat{\underline{r}}\,.
\end{equation}\]
Inserting this expression into the Hamiltonian (1.40), the familiar form of the spin-orbit interaction is obtained
\[\begin{equation}
\tag{1.42}
\mathcal{H}_{\rm so}= -\frac{e\hbar}{2\, m_{\rm e}^2\,c^2} \frac{1}{r}\frac{ \partial \phi(\underline{r})}{\partial r}\,\hat{\mathbf s}\cdot (\hat{\underline{r}} \times \hat{\mathbf p})
=-\frac{e}{2}\left(\frac{\hbar}{m_{\rm e} \,c}\right)^2 \frac{1}{r}\frac{ \partial \phi(\underline{r})}{\partial r}\,\hat{\mathbf s}\cdot \hat{\mathbf l} =\xi_{\rm so}\,\hat{\mathbf s}\cdot \hat{\mathbf l}
\end{equation}\]
where we used that \(\hat{\underline{r}} \times \hat{\mathbf p}=\hbar \,\hat{\mathbf l}\).
Since \(\partial \phi(\underline{r})/\partial r\) is negative for the potential generated by the positively charged nucleus, the quantity
\[\begin{equation}
\tag{1.43}
\xi_{\rm so}= -\frac{e}{2}\left(\frac{\hbar}{m_{\rm e} \,c}\right)^2 \frac{1}{r}\frac{ \partial \phi(\underline{r})}{\partial r}\,,
\end{equation}\]
averaged over the spatial wave function, is positive. Therefore, the spin-orbit interaction favors antiparallel alignment between the spin and the orbital angular momenta of an electron.
This can be seen more formally defining the sum of the spin and orbital momentum as \(\hat{\mathbf j}=\hat{\mathbf s}+\hat{\mathbf l}\) and squaring both sides of the equation. The scalar product \(\hat{\mathbf s}\cdot\hat{\mathbf l}\) can then be expressed in terms of the squares of the three angular momenta:
\[\begin{equation}
\tag{1.44}
\hat{\mathbf s}\cdot\hat{\mathbf l}=\frac{1}{2}(\hat{\mathbf j}^2-\hat{\mathbf s}^2-\hat{\mathbf l}^2)\,.
\end{equation}\]
Inserting this result into Eq.(1.42) gives

\[\begin{equation}
\tag{1.45}
\mathcal{H}_{\rm so}= \frac{1}{2} \xi_{\rm so} (\hat{\mathbf j}^2-\hat{\mathbf s}^2-\hat{\mathbf l}^2)\,,
\end{equation}\]
which manifestly favors minimal values of the total angular momentum \(j\).
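As a small worked example of Eq.(1.44), the sketch below evaluates the eigenvalues of \(\hat{\mathbf s}\cdot\hat{\mathbf l}\) for a \(p\) electron (\(l=1\), \(s=1/2\)), using the eigenvalues \(j(j+1)\), \(s(s+1)\), \(l(l+1)\) of the squared angular momenta in units of \(\hbar\). The choice \(l=1\) and the function name `s_dot_l` are illustrative:

```python
from fractions import Fraction

def s_dot_l(j, l, s=Fraction(1, 2)):
    """Eigenvalue of s.l from Eq. (1.44), with angular momenta in units of hbar."""
    return (j * (j + 1) - s * (s + 1) - l * (l + 1)) / 2

l = Fraction(1)
s = Fraction(1, 2)

# The allowed values of j for one electron are l - s and l + s
values = {j: s_dot_l(j, l, s) for j in (l - s, l + s)}
for j, value in values.items():
    print(f"j = {j}: <s.l> = {value}")
```

For positive \(\xi_{\rm so}\) the level with \(j = 1/2\), for which \(\hat{\mathbf s}\cdot\hat{\mathbf l}\) takes the value \(-1\), lies below the level with \(j = 3/2\), for which it takes the value \(+1/2\).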

The terms in the last row of Eq.(1.39) – proportional to \(\hat{\mathbf p}^4\) and to \(\vec\nabla\cdot \vec E\) (the so-called Darwin term) – are very small and will be neglected henceforth.

^{1} *The Theory of Magnetism*, D. C. Mattis – Harper’s physics series – Harper & Row (New York, Evanston, and London, 1965).

^{2} This section has been adapted from Chapter 1 of the book *Relativistic Quantum Mechanics*, by J. D. Bjorken and S. D. Drell (McGraw-Hill College, 1964) (source available in the BookChapters folder).

^{3} The Galilean group encompasses Galilean transformations plus rotations and translations; the Poincaré group encompasses Lorentz transformations (boosts) plus rotations and translations.

^{4} The reader has to interpret this operator as applied to each component of the wave function \(\varphi_r\) (with \(r=1,2\)). In particular, one has \(\vec\nabla \times (\vec A \varphi_r) = \varphi_r (\vec\nabla \times \vec A ) + (\vec\nabla\varphi_r) \times \vec A\) \(=\varphi_r (\vec\nabla \times \vec A ) - \vec A \times(\vec\nabla\varphi_r)\) and therefore \(\left(\vec\nabla \times \vec A +\vec A \times \vec\nabla \right)\varphi_r = \varphi_r(\vec\nabla \times \vec A) - \vec A \times(\vec\nabla\varphi_r)+\vec A\times(\vec\nabla\varphi_r)\) \(=\varphi_r(\vec\nabla \times \vec A)= \varphi_r \vec B\).

^{5} Leslie L. Foldy and Siegfried A. Wouthuysen, Phys. Rev., p. 29 (1950).