Variational Methods for Solving High-Dimensional Quantum Systems
1. Introduction
Many exciting phenomena in quantum many-body systems arise from the interplay of quantum fluctuations and correlations. Celebrated examples are superfluid helium [1] [2], the fractional quantum Hall effect [3] [4], the Haldane phase in quantum spin chains [5] [6], quantum spin liquids [7], and high-temperature superconductivity [8]. The aim of theoretical physics is to understand the emergent properties of such challenging quantum many-body systems. The main difficulty in investigating quantum many-body problems is that the Hilbert space spanned by the possible microstates grows exponentially with the system size. To unravel the physics of microscopic model systems and to study the robustness of quantum phases of matter, large-scale numerical simulations are essential. Exact diagonalization is only possible for small many-body systems. For large systems, efficient quantum Monte Carlo (QMC) methods can be applied. In a large class of quantum many-body systems (e.g., fermionic degrees of freedom, geometric frustration), however, these QMC sampling techniques cannot be used effectively due to the sign problem [9]. In such cases, variational methods have been shown to be a powerful tool for efficiently simulating quantum many-body systems.
The first type of variational method is the density matrix renormalization group (DMRG), which was originally developed to study ground state properties of one-dimensional systems. The success of the DMRG method rests on the area law obeyed by quantum ground states [10], which can therefore be represented efficiently using matrix product states (MPS) [11]-[19]. The algorithm has been generalized to two-dimensional systems [20], Abelian and non-Abelian symmetries [21]-[25], single-site optimization with density matrix perturbation [26] [27], a hybrid real-momentum space representation [28] [29], real-space parallelization [30], and continuous matrix product states (cMPS) [31] [32]. The cMPS can also be used to study the Lieb-Liniger model [33] [34], the Gaudin-Yang model [35], atomtronics with periodic boundary conditions [36], and (1+1)-dimensional relativistic Bose theories [37].
Machine learning has been extensively used in the physical sciences [38]. The second type of variational method is based on machine learning: a neural-network variational ansatz efficiently represents highly correlated quantum states, and its parameters are optimized by means of the variational Monte Carlo (VMC) method [39]; such ansätze have powerful expressibility for quantum states [40]-[43]. Alongside the restricted Boltzmann machine (RBM) [44]-[59], other network structures such as feed-forward networks [60]-[63], recurrent neural networks [64] [65], the autoregressive neural TensorNet [66], and convolutional neural networks [67]-[78] have been adopted. Recurrent neural networks (RNNs) are tremendously powerful tools that have been used in language modeling [79], speech recognition [80], and image generation [81] [82]. RNNs process data streams by maintaining a hidden state, which is updated by applying an identical function to the previous hidden state and the input at the next time step.
At the same time, neural networks using self-attention layers, like the Transformer [83], have had a profound impact on much of machine learning. They have led to breakthroughs in natural language processing [84], language modeling [85], image recognition [86], and protein folding [87]. However, the application of Transformers in this field is still rather limited, with only a few results concerning quantum lattice models [88], open systems [89], quantum state tomography [90], and quantum circuit simulation [91]. Therefore, the full potential of the Transformer architecture has yet to be explored.
The third type of variational method is the variational quantum eigensolver (VQE), which is extensively used in quantum computation. This is a hybrid algorithm: the quantum state is prepared by a quantum circuit composed of many quantum gates, and a classical computer is used to optimize the parameters of that circuit. Compared with the matrix product state and machine learning ansätze, the variational state in VQE is realized by a quantum algorithm that can be implemented on quantum hardware, so an exponential speedup over classical methods becomes possible. Previous quantum algorithms for finding the ground state of a given Hamiltonian were based on adiabatic state preparation and quantum phase estimation subroutines [92] [93], both of which have circuit depth requirements beyond those available in the NISQ era. Here we list both the original VQE architecture [94] and some more advanced methods: orthogonality constrained VQE [95], the subspace expansion method [96], subspace VQE [97] [98], multistate contracted VQE [99], adiabatically assisted VQE [100] [101], and accelerated VQE [102]-[104]. To study the dynamics of many-body quantum systems, conventional quantum Hamiltonian simulation algorithms such as the Trotter-Suzuki product formula [105] have been used. However, the circuit depth of standard Trotterization methods can rapidly exceed the coherence time of noisy quantum computers. This has led to recent proposals for variational approaches to dynamical simulation, including iterative variational algorithms [106] [107], imaginary time evolution [108], general first-order derivative equations with non-Hermitian Hamiltonians [109], adaptive ansätze to reduce the circuit depth [110] [111], and variational fast forwarding [112]-[115].
In this paper, these three variational methods are discussed in detail and are used to compute the ground state energy of the Fermi-Hubbard model and the Schwinger model. The paper is organized as follows. The details of the three variational methods are given in Sections 2, 3, and 4. In Sections 5 and 6, the Fermi-Hubbard model and the Schwinger model are mapped to spin-1/2 systems using the Jordan-Wigner representation. In Section 7, the ground state energies of the Fermi-Hubbard model and the Schwinger model are calculated using these three variational methods. A discussion is given in the final Section 8.
2. Density Matrix Renormalization Group
A general discrete Hilbert space of a many-body quantum system with $N$ sites has the structure

$$\mathcal{H} = \operatorname{span}\{\, |\sigma_1\sigma_2\cdots\sigma_N\rangle : \sigma_i \in S_i \,\}, \qquad (1)$$

where $S_i$ is the set of discrete quantum numbers $\sigma_i$ at site $i$, and $\{|\sigma_i\rangle, \sigma_i \in S_i\}$ is the standard orthonormal basis for the local Hilbert space $\mathcal{H}_i$ corresponding to the $i$th site, $i = 1, \ldots, N$. The whole Hilbert space $\mathcal{H}$ is the tensor product of the individual local Hilbert spaces: $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_N$. The standard orthonormal basis $|\sigma_1\sigma_2\cdots\sigma_N\rangle$ in $\mathcal{H}$ is also called the computational basis. For qubit systems, $\sigma_i \in \{0, 1\}$. For spin-1/2, $\sigma_i \in \{+1, -1\}$, the two eigenvalues of the Pauli matrix $\sigma^z$. For a general spin $S$ with integral or half-integral $S$, $\sigma_i \in \{-S, -S+1, \ldots, S\}$ and $\mathcal{H}_i$ has dimension $2S+1$. Sometimes the dimension of the Hilbert space is reduced under some kind of restriction. For example, for a spin-1/2 system with the restriction that the total magnetization vanishes, $\sum_i \sigma_i = 0$, the dimension 16 of $\mathcal{H}$ is reduced to 6 for $N = 4$.
The state $|\psi\rangle$ can be represented by

$$|\psi\rangle = \sum_{\sigma_1, \ldots, \sigma_N} \psi(\sigma_1, \ldots, \sigma_N)\, |\sigma_1\sigma_2\cdots\sigma_N\rangle, \qquad (2)$$

where the sum runs over all quantum numbers. Since the computational basis is orthonormal, the wave function $\psi(\sigma_1, \ldots, \sigma_N)$ is the inner product of $|\sigma_1\sigma_2\cdots\sigma_N\rangle$ and $|\psi\rangle$.
The Hamiltonian $H$ of the many-body quantum system is a Hermitian operator from the $d^N$-dimensional Hilbert space $\mathcal{H}$ into itself. We assume that the dimensions $d$ of the local spaces $\mathcal{H}_i$ are the same for all $i$. The calculation of the spectrum of $H$ is difficult due to the exponentially large dimension $d^N$. For small $N$, exact diagonalization can be adopted, since $H$ can be represented as a Hermitian matrix in the computational basis $|\sigma_1\sigma_2\cdots\sigma_N\rangle$.
A general operator $O$ in the basis $|\sigma_1\sigma_2\cdots\sigma_N\rangle$ can be written as

$$O = \sum_{\sigma'_1, \ldots, \sigma'_N} \sum_{\sigma_1, \ldots, \sigma_N} O_{\sigma'_1 \cdots \sigma'_N,\, \sigma_1 \cdots \sigma_N}\, |\sigma'_1 \cdots \sigma'_N\rangle \langle \sigma_1 \cdots \sigma_N|. \qquad (3)$$

Since the basis states are orthogonal to each other,

$$\langle \sigma'_1 \cdots \sigma'_N | O | \sigma_1 \cdots \sigma_N \rangle = O_{\sigma'_1 \cdots \sigma'_N,\, \sigma_1 \cdots \sigma_N}, \qquad (4)$$

i.e., the coefficients in (3) are the matrix elements of $O$. Applying this operator $O$ to a general state $|\psi\rangle$ in (2), one has

$$O|\psi\rangle = \sum_{\sigma'_1, \ldots, \sigma'_N} \left( \sum_{\sigma_1, \ldots, \sigma_N} O_{\sigma'_1 \cdots \sigma'_N,\, \sigma_1 \cdots \sigma_N}\, \psi(\sigma_1, \ldots, \sigma_N) \right) |\sigma'_1 \cdots \sigma'_N\rangle. \qquad (5)$$

The inner product between two states $|\phi\rangle$ and $|\psi\rangle$ is

$$\langle \phi | \psi \rangle = \sum_{\sigma_1, \ldots, \sigma_N} \phi^*(\sigma_1, \ldots, \sigma_N)\, \psi(\sigma_1, \ldots, \sigma_N), \qquad (6)$$

where $\phi^*$ denotes the complex conjugate of $\phi$.
The wave function $\psi(\sigma_1, \ldots, \sigma_N)$ can be regarded as a tensor of order $N$, and thus the manipulation of the wave function reduces to that of this tensor. Similarly, $O_{\sigma'_1 \cdots \sigma'_N,\, \sigma_1 \cdots \sigma_N}$ is a tensor of order $2N$. The computational cost of handling the full tensor scales as $d^N$, which is not acceptable. The trick for reducing the computational cost is decomposition of the tensor followed by truncation. A very important operation for the tensor is the decomposition

$$\psi(\sigma_1, \ldots, \sigma_N) = M^{\sigma_1} M^{\sigma_2} \cdots M^{\sigma_N} = \sum_{a_1, \ldots, a_{N-1}} M^{\sigma_1}_{1, a_1} M^{\sigma_2}_{a_1, a_2} \cdots M^{\sigma_N}_{a_{N-1}, 1}, \qquad (7)$$

where $M^{\sigma_i}$ is a $D_{i-1} \times D_i$ matrix for fixed (physical) index $\sigma_i$, $i = 1, \ldots, N$; i.e., $M_i$ is a tensor of order 3. The index $a_i$ is contracted in the multiplication of all $M^{\sigma_i}$, and $D_0 = D_N = 1$ since the contracted result $\psi(\sigma_1, \ldots, \sigma_N)$ is a complex number. Since the indices $a_i$ are not physical, they are called virtual indices.
The decomposition (7) can be realized by $N-1$ steps of singular value decomposition (SVD) of matrices, which is given in detail in the Appendix. Inserting (7) into (2), we get the matrix product state

$$|\psi\rangle = \sum_{\sigma_1, \ldots, \sigma_N} M^{\sigma_1} M^{\sigma_2} \cdots M^{\sigma_N}\, |\sigma_1\sigma_2\cdots\sigma_N\rangle. \qquad (8)$$

If $d_i = d$ for all $i$, then $D_1 \le d$, $D_2 \le d^2$, and in general $D_i \le d^{\min(i, N-i)}$, so the bond dimensions grow exponentially toward the middle of the chain. The exact decomposition (7) therefore leads to an exponential disaster. Fortunately, if the state of interest is not highly entangled, truncation in the SVD will not cause much error in the approximation of this state.
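As a toy illustration of this truncation (the variable names `d`, `N`, and `chi` below are ours, not from the paper), the following sketch splits a random state across the middle bond and keeps only the `chi` largest singular values:

```python
import numpy as np

# Minimal sketch: cut an N-site state at the middle bond and truncate to chi
# singular values. Names (d, N, chi) are illustrative choices.
d, N, chi = 2, 10, 8
psi = np.random.randn(d**N) + 1j * np.random.randn(d**N)
psi /= np.linalg.norm(psi)

# Reshape into a matrix: rows = left half of the chain, columns = right half.
theta = psi.reshape(d**(N // 2), d**(N - N // 2))
U, S, Vh = np.linalg.svd(theta, full_matrices=False)

# Keep the chi largest singular values and renormalize them.
U, S, Vh = U[:, :chi], S[:chi], Vh[:chi, :]
S /= np.linalg.norm(S)

# Overlap of the truncated, renormalized state with the original one.
theta_trunc = U @ np.diag(S) @ Vh
print("overlap with original:", abs(np.vdot(psi, theta_trunc.ravel())))
```

For a weakly entangled state the discarded singular values are tiny and the printed overlap stays close to 1; for the random state used here the overlap is noticeably smaller, which illustrates why the method relies on the area law.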
As shown in the Appendix, if the tensor $\psi$ is normalized,

$$\sum_{\sigma_1, \ldots, \sigma_N} |\psi(\sigma_1, \ldots, \sigma_N)|^2 = 1, \qquad (9)$$

the $N$ tensors $A^{\sigma_i}$ of order 3 produced by the SVD satisfy the left canonical condition

$$\sum_{\sigma_i} (A^{\sigma_i})^\dagger A^{\sigma_i} = I, \qquad (10)$$

where $I$ is the $D_i \times D_i$ identity matrix. For a general tensor $\psi$ of order $N$, we have the left canonical form

$$\psi(\sigma_1, \ldots, \sigma_N) = \lambda\, A^{\sigma_1} A^{\sigma_2} \cdots A^{\sigma_N}, \qquad (11)$$

with the left canonical condition for the $N$ tensors

$$\sum_{\sigma_i} (A^{\sigma_i})^\dagger A^{\sigma_i} = I, \qquad i = 1, \ldots, N. \qquad (12)$$

Here we intentionally introduce a number $\lambda$, which becomes 1 if $\psi$ satisfies (9). In fact, this left canonical form can also be obtained from the general matrix product form (7) whose tensors do not satisfy (10). This can be realized by $N-1$ steps of SVD. First,

$$M^{\sigma_1} = U S V^\dagger, \qquad (13)$$

where $U$ and $V$ have orthonormal columns and $S$ contains the singular values (the matrices $M^{\sigma_1}$ for all values of $\sigma_1$ are stacked into a single matrix before the SVD). Then $A^{\sigma_1} = U^{\sigma_1}$ satisfies the left canonical condition. The remaining part $S V^\dagger$ is combined with $M^{\sigma_2}$ to give $S V^\dagger M^{\sigma_2}$; applying the SVD to this matrix again, we find $A^{\sigma_2}$, which satisfies the left canonical condition. Continuing this process, we obtain the left canonical form (11).
Similarly, if the SVD is implemented from the right-hand side, the general tensor has the right canonical form

$$\psi(\sigma_1, \ldots, \sigma_N) = \lambda\, B^{\sigma_1} B^{\sigma_2} \cdots B^{\sigma_N}, \qquad (14)$$

with the right canonical condition for the tensors $B^{\sigma_i}$ of order 3,

$$\sum_{\sigma_i} B^{\sigma_i} (B^{\sigma_i})^\dagger = I. \qquad (15)$$

If the SVD is implemented from both sides, the two sweeps meet at some bond, e.g.,

$$\psi(\sigma_1, \ldots, \sigma_N) = A^{\sigma_1} A^{\sigma_2}\, \Lambda\, B^{\sigma_3} \cdots B^{\sigma_N}. \qquad (16)$$

We call this the mixed canonical form. Here $\Lambda$ is a $D_2 \times D_2$ diagonal matrix containing the singular values on that bond. A full decomposition is

$$\psi(\sigma_1, \ldots, \sigma_N) = \Gamma^{\sigma_1} \Lambda_1 \Gamma^{\sigma_2} \Lambda_2 \cdots \Lambda_{N-1} \Gamma^{\sigma_N}, \qquad (17)$$

where $\Gamma^{\sigma_i}$ is a $D_{i-1} \times D_i$ matrix and $\Lambda_i$ is a $D_i \times D_i$ diagonal matrix. Obviously, these three canonical forms are related to each other,

$$A^{\sigma_i} = \Lambda_{i-1} \Gamma^{\sigma_i}, \qquad B^{\sigma_i} = \Gamma^{\sigma_i} \Lambda_i, \qquad (18)$$

for $i = 1, \ldots, N$ (with $\Lambda_0 = \Lambda_N = 1$).
Assume that the operator $O$ has the decomposition

$$O_{\sigma'_1 \cdots \sigma'_N,\, \sigma_1 \cdots \sigma_N} = W_1^{\sigma'_1 \sigma_1} W_2^{\sigma'_2 \sigma_2} \cdots W_N^{\sigma'_N \sigma_N}, \qquad (19)$$

where $W_i$ is a tensor of order 4 with two physical indices $\sigma'_i, \sigma_i$ and two virtual indices; for fixed physical indices, $W_i^{\sigma'_i \sigma_i}$ is a $D^W_{i-1} \times D^W_i$ matrix, and $W_1$ and $W_N$ are $1 \times D^W_1$ and $D^W_{N-1} \times 1$ matrices, respectively. For example, the Hamiltonian in (20) can be written in the form (19) with explicit local tensors $W_i$ built from the identity and the Pauli matrices. Here $\sigma^x$, $\sigma^y$ and $\sigma^z$ are the Pauli $x$, $y$ and $z$ matrices, respectively.
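The explicit tensors for the Hamiltonian in (20) are not reproduced here, so as a hedged stand-in the sketch below builds the standard bulk tensor (bond dimension 3) of the form (19) for a transverse-field Ising chain $H = -J\sum_i \sigma^z_i \sigma^z_{i+1} - h \sum_i \sigma^x_i$; the index convention `(left bond, right bond, bra, ket)` is our own choice.

```python
import numpy as np

# Hedged example: operator tensors for H = -J sum sz_i sz_{i+1} - h sum sx_i,
# a stand-in for the (unspecified) Hamiltonian (20). W has shape
# (Dw_left, Dw_right, sigma', sigma) with Dw = 3.
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)
J, h = 1.0, 0.5

W = np.zeros((3, 3, 2, 2))
W[0, 0] = I2
W[0, 1] = sz          # open a sz.sz bond term
W[0, 2] = -h * sx     # on-site field term
W[1, 2] = -J * sz     # close the sz.sz bond term
W[2, 2] = I2

# Boundary tensors select the first row / last column, as in (19).
W_first = W[0:1, :, :, :]   # shape (1, 3, 2, 2)
W_last = W[:, 2:3, :, :]    # shape (3, 1, 2, 2)
```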
From the decompositions in (7) and (19), one has

$$O|\psi\rangle = \sum_{\sigma'_1, \ldots, \sigma'_N} \tilde M^{\sigma'_1} \tilde M^{\sigma'_2} \cdots \tilde M^{\sigma'_N}\, |\sigma'_1 \cdots \sigma'_N\rangle, \qquad \tilde M^{\sigma'_i} = \sum_{\sigma_i} W_i^{\sigma'_i \sigma_i} \otimes M^{\sigma_i}. \qquad (21)$$

The inner product $\langle\psi|O|\psi\rangle$, with the inner product defined in (6), is a scalar,

$$\langle\psi|O|\psi\rangle = \sum_{\{\sigma_i\}, \{\sigma'_i\}} \left( M^{\sigma'_1} \cdots M^{\sigma'_N} \right)^* W_1^{\sigma'_1 \sigma_1} \cdots W_N^{\sigma'_N \sigma_N}\, M^{\sigma_1} \cdots M^{\sigma_N}. \qquad (22)$$

The summation over the shared virtual indices and the two physical indices $\sigma_i$ and $\sigma'_i$ of $M^{\sigma_i}$, $W_i^{\sigma'_i \sigma_i}$ and $(M^{\sigma'_i})^*$, together with the partial contraction of the previous sites, is again a tensor of order 3 carrying one virtual index each from the ket, the operator and the bra layer, so (22) can be evaluated site by site at a cost polynomial in the bond dimensions.

If the operator $O$ is local and $|\psi\rangle$ is in a canonical form, the expression in (22) can be simplified. For example, if $\psi$ has the mixed canonical form (16) and $O$ is an operator acting on local site 2 and site 3, then the canonical conditions (12) and (15) contract all other sites to identities and

$$\langle\psi|O|\psi\rangle = \sum_{\sigma_2 \sigma_3 \sigma'_2 \sigma'_3} O_{\sigma'_2 \sigma'_3,\, \sigma_2 \sigma_3}\, \operatorname{Tr}\!\left[ \left( A^{\sigma'_2} \Lambda B^{\sigma'_3} \right)^\dagger A^{\sigma_2} \Lambda B^{\sigma_3} \right]. \qquad (23)$$
The ground state of the Hamiltonian $H$ can be calculated by the DMRG algorithm [11], which is a variational method minimizing $\langle\psi|H|\psi\rangle$ over the tensor $\psi$ of order $N$. Here we always require that $\langle\psi|\psi\rangle = 1$. Since $|\psi\rangle$ has the structure of a matrix product state, this global minimization can be realized by local minimizations, step by step. For example, if $H$ and $|\psi\rangle$ have the forms (19) and (16), respectively, then $\langle\psi|H|\psi\rangle$ can be written as

$$\langle\psi|H|\psi\rangle = \Theta^\dagger H_{\mathrm{eff}}\, \Theta, \qquad (24)$$

where the two-site tensor $\Theta$, obtained by contracting $A^{\sigma_2}$, $\Lambda$ and $B^{\sigma_3}$, is a tensor of order 4 (flattened here into a vector), and the effective Hamiltonian $H_{\mathrm{eff}}$, which is a tensor of order 8, is obtained by contracting the left environment $L$, the right environment $R$, and the operator tensors $W_2$ and $W_3$. Here the left environment $L$ is obtained from the contraction of $A^{\sigma_1}$, $W_1$ and $(A^{\sigma'_1})^*$. Similarly, $R$ is obtained from the contraction of $B^{\sigma_4}, \ldots, B^{\sigma_N}$, $W_4, \ldots, W_N$ and $(B^{\sigma'_4})^*, \ldots, (B^{\sigma'_N})^*$.
To keep the mixed canonical form of the tensor of order $N$, the SVD of the minimizing two-site tensor is

$$\Theta = U S V^\dagger. \qquad (25)$$

Here the tensor $\Theta$ of order 4 is rewritten as a matrix by grouping the left virtual index with $\sigma_2$ and the right virtual index with $\sigma_3$. The bond dimension after the SVD becomes $d$ times the left or the right bond dimension, i.e., almost $d$ times the bond dimension of the original bond. To keep the bond dimension under control, the SVD in (25) is truncated. In practice, the largest singular values in magnitude are kept, and these singular values are then rescaled so that the truncated state remains normalized. For a ground state that is not highly entangled, the singular values decay very fast in magnitude, so this truncation does not cause much error. Replacing the old two-site tensor by the truncated one, we obtain a better tensor with a smaller value of $\langle\psi|H|\psi\rangle$; this better tensor is also in the mixed canonical form. This is called one step of the two-site DMRG algorithm, since the local tensors at site 2 and site 3 are updated. To update the local tensors at site 3 and site 4 in the next step of DMRG, we prepare the corresponding left environment $L$ and right environment $R$: the new $L$ is the contraction of the previous $L$ with $A^{\sigma_2}$, $W_2$ and $(A^{\sigma'_2})^*$. We repeat such steps to update all sub-tensors from the left to the right-hand side; this is called one sweep of DMRG. The next sweep is implemented from the right to the left-hand side. The algorithm stops when $\langle\psi|H|\psi\rangle$ no longer decreases.
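A minimal dense sketch of one such local step is given below, with our own index conventions; it assumes the environments `L` and `R` and the operator tensors for the two active sites have already been contracted as described above, and a production code would use an iterative eigensolver instead of full diagonalization.

```python
import numpy as np

def two_site_update(L, W2, W3, R, chi_max):
    """One local step of two-site DMRG (dense sketch, illustrative indices).

    L:  left environment,  shape (Dl, Dw, Dl)   (bra bond, MPO bond, ket bond)
    W2: MPO tensor site 2, shape (Dw, Dw, d, d) (left, right, bra, ket)
    W3: MPO tensor site 3, shape (Dw, Dw, d, d)
    R:  right environment, shape (Dr, Dw, Dr)
    Assumes L, W2, W3, R are built so that Heff is Hermitian.
    """
    Dl, d, Dr = L.shape[0], W2.shape[2], R.shape[0]
    # Effective Hamiltonian acting on the two-site tensor (order-8 tensor).
    Heff = np.einsum('aub,uvst,vwxy,cwd->asxcbtyd', L, W2, W3, R)
    Heff = Heff.reshape(Dl * d * d * Dr, Dl * d * d * Dr)
    # Lowest eigenvector = optimal two-site tensor Theta.
    evals, evecs = np.linalg.eigh(Heff)
    theta = evecs[:, 0].reshape(Dl * d, d * Dr)
    # Truncated SVD restores the mixed canonical form, as in (25).
    U, S, Vh = np.linalg.svd(theta, full_matrices=False)
    chi = min(chi_max, len(S))
    U, S, Vh = U[:, :chi], S[:chi] / np.linalg.norm(S[:chi]), Vh[:chi, :]
    A = U.reshape(Dl, d, chi)        # new left-canonical tensor (site 2)
    B = Vh.reshape(chi, d, Dr)       # new right-canonical tensor (site 3)
    return evals[0], A, np.diag(S), B
```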
3. Boltzmann Machine Learning
The quantum expectation value of an operator $O$ on a non-normalized state $|\psi\rangle$ can be written as a classical expectation value over the distribution $p(\sigma)$ using

$$\langle O \rangle = \frac{\langle\psi|O|\psi\rangle}{\langle\psi|\psi\rangle} = \sum_{\sigma} p(\sigma)\, O_{\mathrm{loc}}(\sigma), \qquad (26)$$

where

$$O_{\mathrm{loc}}(\sigma) = \sum_{\sigma'} O_{\sigma\sigma'} \frac{\psi(\sigma')}{\psi(\sigma)}, \qquad p(\sigma) = \frac{|\psi(\sigma)|^2}{\sum_{\sigma'} |\psi(\sigma')|^2}. \qquad (27)$$

If the matrix representation $O_{\sigma\sigma'}$ of the operator $O$ in the basis $|\sigma\rangle$ is sparse, there are only a few basis states $\sigma'$ for each $\sigma$ such that $O_{\sigma\sigma'}$ is nonzero. Equation (26) shows that $\langle O \rangle$ can be calculated by Monte Carlo sampling of configurations $\sigma$ with probability $p(\sigma)$.
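A minimal sketch of this estimator is shown below; the callables `psi` and `connected` are placeholders for a concrete ansatz and a concrete sparse operator, and the sampler itself (e.g., Metropolis updates) is not shown.

```python
import numpy as np

def expectation_from_samples(samples, psi, connected):
    """Monte Carlo estimate of <O> via the local estimator (26)-(27).

    samples:   configurations sigma drawn with probability |psi(sigma)|^2
    psi:       callable sigma -> amplitude psi(sigma)
    connected: callable sigma -> list of (sigma', O_{sigma,sigma'}) nonzero entries
    (All names are illustrative placeholders.)
    """
    o_loc = []
    for sigma in samples:
        amp = psi(sigma)
        o_loc.append(sum(o_elem * psi(sigma_p) / amp
                         for sigma_p, o_elem in connected(sigma)))
    o_loc = np.asarray(o_loc)
    # Mean and naive statistical error bar of the local estimator.
    return o_loc.mean(), o_loc.std() / np.sqrt(len(o_loc))
```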
We use a neural network to describe the wave function $\psi$, which depends on real parameters $\theta = (\theta_1, \ldots, \theta_{N_p})$ and is thus denoted by $\psi_\theta$. Denote by $\delta\theta_k$ a change of the $k$th component $\theta_k$. Then $\psi_{\theta + \delta\theta_k e_k}$ has the expansion

$$\psi_{\theta + \delta\theta_k e_k}(\sigma) = \psi_\theta(\sigma) \left[ 1 + \delta\theta_k\, O_k(\sigma) \right] + O(\delta\theta_k^2). \qquad (28)$$

Define the local operator $O_k$ with matrix elements

$$\langle \sigma | O_k | \sigma' \rangle = \delta_{\sigma\sigma'}\, O_k(\sigma), \qquad (29)$$

where we introduced the log-derivative of $\psi_\theta$,

$$O_k(\sigma) = \frac{\partial \ln \psi_\theta(\sigma)}{\partial \theta_k}. \qquad (30)$$

Here $O_k$ depends on $\theta$. We thus have

$$O_k |\psi_\theta\rangle = \sum_{\sigma} O_k(\sigma)\, \psi_\theta(\sigma)\, |\sigma\rangle. \qquad (31)$$

From (28), one has

$$|\psi_{\theta + \delta\theta_k e_k}\rangle = \left( 1 + \delta\theta_k\, O_k \right) |\psi_\theta\rangle, \qquad (32)$$

where we omitted the higher-order terms. The state $|\psi_\theta\rangle$ is normalized to be

$$|\bar\psi_\theta\rangle = \frac{|\psi_\theta\rangle}{\sqrt{\langle \psi_\theta | \psi_\theta \rangle}}. \qquad (33)$$
Define the centered operator

$$\bar O_k = O_k - \langle O_k \rangle, \qquad (34)$$

where

$$\langle O_k \rangle = \frac{\langle \psi_\theta | O_k | \psi_\theta \rangle}{\langle \psi_\theta | \psi_\theta \rangle}. \qquad (35)$$

Thus $\bar O_k |\psi_\theta\rangle$ and $|\psi_\theta\rangle$ are orthogonal, $\langle \psi_\theta | \bar O_k | \psi_\theta \rangle = 0$, but in general $\langle \psi_\theta | O_k | \psi_\theta \rangle \ne 0$ for a generic ansatz. The norm squared of $|\psi_{\theta + \delta\theta_k e_k}\rangle$ is

$$\langle \psi_{\theta + \delta\theta_k e_k} | \psi_{\theta + \delta\theta_k e_k} \rangle = \langle \psi_\theta | \psi_\theta \rangle \left[ 1 + 2\, \delta\theta_k\, \mathrm{Re}\langle O_k \rangle \right] + O(\delta\theta_k^2), \qquad (36)$$

where $\mathrm{Re}$ denotes the real part. One has from (32) and (36), to first order in $\delta\theta_k$,

$$|\bar\psi_{\theta + \delta\theta_k e_k}\rangle = |\bar\psi_\theta\rangle + \delta\theta_k \left[ \bar O_k + i\, \mathrm{Im}\langle O_k \rangle \right] |\bar\psi_\theta\rangle + O(\delta\theta_k^2), \qquad (37)$$

where $\mathrm{Im}\langle O_k \rangle$ is the imaginary part of $\langle O_k \rangle$.
Let $H$ be the Hamiltonian operator. Using (37), the derivative of the energy $E(\theta) = \langle \bar\psi_\theta | H | \bar\psi_\theta \rangle$ is

$$\frac{\partial E(\theta)}{\partial \theta_k} = 2\, \mathrm{Re} \left[ \langle \bar\psi_\theta | H \bar O_k | \bar\psi_\theta \rangle \right]. \qquad (38)$$

The $k$th parameter $\theta_k$ is updated according to the gradient descent method

$$\theta_k \to \theta_k - \eta\, \frac{\partial E(\theta)}{\partial \theta_k}, \qquad (39)$$

where $\eta > 0$ is the learning rate.
If each component $\theta_k$ has a change $\delta\theta_k$, the wave function $\psi_\theta$ changes to $\psi_{\theta + \delta\theta}$. Let $\delta\theta = (\delta\theta_1, \ldots, \delta\theta_{N_p})$. Equation (37) is generalized to

$$|\bar\psi_{\theta + \delta\theta}\rangle = |\bar\psi_\theta\rangle + \sum_k \delta\theta_k \left[ \bar O_k + i\, \mathrm{Im}\langle O_k \rangle \right] |\bar\psi_\theta\rangle + O(\delta\theta^2), \qquad (40)$$

where the sum runs over all parameters. The distance between $|\bar\psi_{\theta + \delta\theta}\rangle$ and $|\bar\psi_\theta\rangle$ is

$$d^2(\delta\theta) = \min_{\varphi} \left\| e^{i\varphi} |\bar\psi_{\theta + \delta\theta}\rangle - |\bar\psi_\theta\rangle \right\|^2. \qquad (41)$$

When the overall phase $\varphi$ is chosen optimally, the distance is minimized to be

$$d^2(\delta\theta) = \sum_{k, k'} \delta\theta_k\, S_{k k'}\, \delta\theta_{k'}, \qquad (42)$$

with

$$S_{k k'} = \mathrm{Re} \left[ \langle \bar\psi_\theta | \bar O_k^\dagger \bar O_{k'} | \bar\psi_\theta \rangle \right], \qquad (43)$$

where we used

$$\langle \bar\psi_\theta | \bar O_k | \bar\psi_\theta \rangle = 0. \qquad (44)$$
We now look for $\delta\theta$ such that: 1) the energy $E(\theta + \delta\theta)$ is minimized; 2) the distance between $\bar\psi_{\theta + \delta\theta}$ and $\bar\psi_\theta$ is as small as possible. Minimizing

$$E(\theta + \delta\theta) + \frac{1}{2\eta}\, d^2(\delta\theta), \qquad (45)$$

one finds that $\delta\theta$ satisfies the linear system

$$\sum_{k'} S_{k k'}\, \delta\theta_{k'} = -\eta\, \frac{\partial E(\theta)}{\partial \theta_k}, \qquad (46)$$

which can be used to update the parameters $\theta$. This algorithm is called the stochastic reconfiguration method.
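In the standard formulation of stochastic reconfiguration (see [39]), the matrix in the linear system (46) is estimated as the covariance matrix of the sampled log-derivatives $O_k(\sigma)$, and the right-hand side from the covariance of $O_k(\sigma)$ with the local energies; factors of two and real/complex conventions vary between references. A minimal sketch with our own variable names and a small diagonal shift for numerical stability:

```python
import numpy as np

def sr_update(O, E_loc, eta, eps=1e-4):
    """One stochastic reconfiguration step (standard formulation; a sketch).

    O:     (n_samples, n_params) array of log-derivatives O_k(sigma)
    E_loc: (n_samples,) array of local energies E_loc(sigma)
    Returns delta_theta solving S delta_theta = -eta * F for real parameters.
    """
    Oc = O - O.mean(axis=0)                       # centered log-derivatives
    S = np.real(Oc.conj().T @ Oc) / O.shape[0]    # metric matrix S_{kk'}
    F = np.real(Oc.conj().T @ (E_loc - E_loc.mean())) / O.shape[0]  # forces
    S = S + eps * np.eye(S.shape[0])              # diagonal regularization
    return -eta * np.linalg.solve(S, F)
```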
In the Boltzmann machine ansatz, the wave function is chosen to be

$$\psi_\theta(\sigma_1, \ldots, \sigma_N) = \exp\left( \sum_{i=1}^{N} a_i \sigma_i \right) \prod_{j=1}^{M} 2 \cosh\left( b_j + \sum_{i=1}^{N} W_{j i} \sigma_i \right), \qquad (47)$$

where $a_i$, $b_j$ and $W_{j i}$ are the real parameters $\theta$. Here $M$ is a given positive integer, the number of hidden units, and $\alpha = M/N$ is the ratio of $M$ and $N$. Compared to other neural-network ansätze, the log-derivatives $O_k(\sigma)$ and the amplitude ratios $\psi_\theta(\sigma')/\psi_\theta(\sigma)$ for the Boltzmann machine ansatz can be calculated analytically [39].
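A minimal sketch of the RBM amplitude (47) and its log-derivatives for spins $\sigma_i = \pm 1$, assuming real parameters as stated above (the function names are ours):

```python
import numpy as np

def rbm_log_psi(sigma, a, b, W):
    """log psi_theta(sigma) for the RBM ansatz (47) with sigma_i = +/-1."""
    theta_h = b + W @ sigma
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta_h)))

def rbm_log_derivatives(sigma, a, b, W):
    """Log-derivatives O_k(sigma) with respect to (a, b, W), concatenated."""
    theta_h = b + W @ sigma
    t = np.tanh(theta_h)
    return np.concatenate([sigma, t, np.outer(t, sigma).ravel()])
```

Amplitude ratios needed in (27) follow from differences of `rbm_log_psi`, so a Metropolis sampler never has to normalize the wave function.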
4. Variational Quantum Eigensolver
The variational quantum eigensolver aims at finding the ground state energy of a Hamiltonian $H$ by minimizing the cost function $E(\theta) = \langle \psi(\theta) | H | \psi(\theta) \rangle$, where the parameter vector $\theta$ is updated by gradient descent, $\theta \to \theta - \eta \nabla_\theta E(\theta)$, over many iterations. Here $\eta$ is the learning rate. We choose the trial state $|\psi(\theta)\rangle = U(\theta) |\psi_0\rangle$ for some ansatz circuit $U(\theta)$ and initial state $|\psi_0\rangle$. The variational Hamiltonian ansatz also aims to prepare a trial ground state for a given Hamiltonian $H = \sum_j c_j H_j$ (where the $H_j$ are Hermitian operators, usually Pauli strings) by Trotterizing an adiabatic state preparation process [116]. Here, each Trotter step corresponds to one layer of the variational ansatz, so that the unitary is given by a product of exponentials $e^{-i \theta_{l,j} H_j}$ over the layers $l$ and the Hamiltonian terms $j$.
Consider a general Hamiltonian of the form (48), where the number of sites $N$ is an even integer and the coefficients of the individual terms do not depend on the site index; $\sigma^x_{j s}$, etc., denotes the Pauli $x$ matrix acting on site $j$ and spin $s$. Splitting the $N$ sites into even sites and odd sites, this Hamiltonian can be written as the sum (49) of even and odd parts, where the even parts are given in (50) and (51), and the odd parts are defined similarly by replacing the summation over even sites $j$ with odd sites $j$.

Inspired by Refs. [117] [118], the quantum circuit is a product of layers, where each layer (52) is a product of exponentials of the even parts of the Hamiltonian with their own variational angles; the odd factors are defined similarly, with the subscript "even" replaced by "odd". Since the terms within each even (or odd) part commute with each other, the corresponding exponential factorizes exactly into commuting two-qubit gates. The initial state $|\psi_0\rangle$ is chosen to be the ground state of one part of the Hamiltonian, with the corresponding ground state energy, because of the relation (53), which holds for the even-site terms.
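Since the precise form of (48)-(53) is model dependent, the following is only a hedged classical emulation of the idea: a small transverse-field Ising chain stands in for the Hamiltonian, the ansatz is a product of Trotter-like layers $e^{-i\theta H_{zz}} e^{-i\theta' H_x}$, the initial state is the ground state of the field term, and the angles are optimized by plain gradient descent with finite-difference gradients. All parameter names (`n`, `p`, `eta`) are ours.

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def op_on(site_ops, n):
    """Kronecker product of single-site operators over n sites."""
    return reduce(np.kron, [site_ops.get(i, I2) for i in range(n)])

n, p, eta = 4, 3, 0.05
H_zz = sum(op_on({i: sz, i + 1: sz}, n) for i in range(n - 1))
H_x = sum(op_on({i: sx}, n) for i in range(n))
H = -H_zz - 0.5 * H_x        # stand-in Hamiltonian, split into two parts

# Initial state: ground state of the field part alone (all spins along +x).
psi0 = np.ones(2 ** n, dtype=complex) / np.sqrt(2 ** n)

def energy(theta):
    psi = psi0
    for l in range(p):                     # one Trotter-like layer per l
        psi = expm(-1j * theta[2 * l] * H_zz) @ psi
        psi = expm(-1j * theta[2 * l + 1] * H_x) @ psi
    return np.real(np.vdot(psi, H @ psi))

theta = 0.1 * np.random.randn(2 * p)
for it in range(200):                      # plain gradient descent
    grad = np.zeros_like(theta)
    for k in range(len(theta)):            # finite-difference gradient
        dt = np.zeros_like(theta)
        dt[k] = 1e-4
        grad[k] = (energy(theta + dt) - energy(theta - dt)) / 2e-4
    theta -= eta * grad
print("VQE energy:", energy(theta), " exact:", np.linalg.eigvalsh(H)[0])
```

On real hardware the same circuit would be expressed in native gates and the gradient estimated from measured expectation values, but the variational logic is the same.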
5. Fermi-Hubbard Model
The Hamiltonian of the Fermi-Hubbard model is

$$H = -t \sum_{i=1}^{N-1} \sum_{s = \uparrow, \downarrow} \left( c^\dagger_{i s} c_{i+1, s} + c^\dagger_{i+1, s} c_{i s} \right) + U \sum_{i=1}^{N} n_{i \uparrow} n_{i \downarrow} - \mu \sum_{i=1}^{N} \left( n_{i \uparrow} + n_{i \downarrow} \right), \qquad (54)$$

where the open boundary condition is assumed, $n_{i s} = c^\dagger_{i s} c_{i s}$, and the hopping amplitude is set to $t = 1$. The creation operators $c^\dagger_{i s}$ and the annihilation operators $c_{i s}$ satisfy the canonical anticommutation relations

$$\{ c_{i s}, c^\dagger_{j s'} \} = \delta_{i j} \delta_{s s'}, \qquad \{ c_{i s}, c_{j s'} \} = \{ c^\dagger_{i s}, c^\dagger_{j s'} \} = 0. \qquad (55)$$

For the half-filling case, $\mu = U/2$, the Hamiltonian is reduced to

$$H = -t \sum_{i=1}^{N-1} \sum_{s = \uparrow, \downarrow} \left( c^\dagger_{i s} c_{i+1, s} + c^\dagger_{i+1, s} c_{i s} \right) + U \sum_{i=1}^{N} n_{i \uparrow} n_{i \downarrow} - \frac{U}{2} \sum_{i=1}^{N} \left( n_{i \uparrow} + n_{i \downarrow} \right). \qquad (56)$$

Combining the site index $i$ and the spin index $s$ into a single index $j$, we obtain the Hamiltonian in the form (57) on $2N$ fermionic modes, where the interaction couples the two modes belonging to the same site. The Jordan-Wigner representation expresses the fermionic operators in terms of Pauli operators with an attached string of $\sigma^z$ factors,

$$c_j = \left( \prod_{k < j} \sigma^z_k \right) \sigma^-_j, \qquad c^\dagger_j = \left( \prod_{k < j} \sigma^z_k \right) \sigma^+_j.$$

Here $\sigma^{x,y,z}_j$ and $\sigma^\pm_j = (\sigma^x_j \pm i \sigma^y_j)/2$ are Pauli operators acting on mode $j$. In the Jordan-Wigner representation, the Hamiltonian for the half-filling case is written as the spin Hamiltonian (58), which is a special case of the Hamiltonian in (48) up to an additive constant.
If one uses an alternative combined index for the site $i$ and the spin $s$, the Hamiltonian in (56) takes the form (59); under the corresponding Jordan-Wigner representation it becomes the spin Hamiltonian (60).
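A minimal sketch of the Jordan-Wigner construction as dense matrices, useful for checking the anticommutation relations (55) on a few modes (sign and ordering conventions vary between references; this is one common choice):

```python
import numpy as np
from functools import reduce

# Jordan-Wigner map c_j = (prod_{k<j} sz_k) * sigma^-_j on n_modes modes,
# built as dense matrices purely for verification on small systems.
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)   # sigma^- = |1><0|
I2 = np.eye(2, dtype=complex)

def annihilation(j, n_modes):
    ops = [sz] * j + [sm] + [I2] * (n_modes - j - 1)
    return reduce(np.kron, ops)

n_modes = 4
c = [annihilation(j, n_modes) for j in range(n_modes)]

# Check the canonical anticommutation relations (55).
for i in range(n_modes):
    for j in range(n_modes):
        acomm = c[i] @ c[j].conj().T + c[j].conj().T @ c[i]
        assert np.allclose(acomm, np.eye(2 ** n_modes) * (i == j))
```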
6. Schwinger Model
The massive Schwinger model, or QED in two space-time dimensions, is a gauge theory describing the interaction of a single fermionic species with the photon field. In the temporal gauge $A^0 = 0$, the Hamiltonian density is given by (61), where $\gamma^\mu$ are the $2 \times 2$ Dirac matrices and the fermion mass $m$ and the coupling $g$ are the two parameters of the model. The electric field $E = \dot A^1$ satisfies Gauss's law, which relates its spatial derivative to the charge density. Here $\partial_1$ and $\partial_0$ denote the partial derivatives along the one-dimensional space direction $x^1$ and the time direction $x^0$.
The theory is quantized using canonical quantization by imposing anticommutation relations on the fermion fields,

$$\{ \psi_\alpha(x), \psi^\dagger_\beta(y) \} = \delta_{\alpha\beta}\, \delta(x - y), \qquad (62)$$

and by imposing commutation relations on the gauge fields,

$$[ A^1(x), E(y) ] = i\, \delta(x - y). \qquad (63)$$

The model can now be formulated on a "staggered" spatial lattice. Let the lattice spacing be $a$, and label the sites with an integer $n$. Define a single-component fermion field $\phi_n$ at each site $n$, obeying the anticommutation relations

$$\{ \phi_n, \phi^\dagger_{n'} \} = \delta_{n n'}, \qquad \{ \phi_n, \phi_{n'} \} = 0. \qquad (64)$$

The gauge field is defined on the links connecting each pair of neighboring sites, with the link variable $\theta_n$ and the dimensionless electric field $L_n$ given by (65). Then the lattice Hamiltonian equivalent to Equation (61) is given in (66), where the number of lattice sites $N$ is even, and the correspondence between lattice and continuum fields is given in (67): the fields on even and odd sites correspond to the two components of the continuum spinor $\psi$, as expressed in (68). From the commutation relations (63) and the above correspondence, one has the lattice commutator

$$[ \theta_n, L_{n'} ] = i\, \delta_{n n'}. \qquad (69)$$
The dimensionless Hamiltonian is $W = \frac{2}{a g^2} H$, which is also denoted as in (70), with the dimensionless parameters

$$x = \frac{1}{(a g)^2}, \qquad \mu = \frac{2 m}{a g^2}. \qquad (71)$$

Gauss's law is discretized to

$$L_n - L_{n-1} = \phi^\dagger_n \phi_n - \frac{1 - (-1)^n}{2}, \qquad (72)$$

so that the electric field on each link is fixed by the fermion content to its left, up to the value of the field on the leftmost link. The elimination of the fermion fields can be realized by a Jordan-Wigner transformation, in which $\phi_n$ is written as a string of $\sigma^z$ matrices on the sites $k < n$ times a lowering operator on site $n$; here $\sigma^z_j$ ($\sigma^\pm_j$) denotes the corresponding Pauli matrix acting on the $j$th qubit. The Gauss law in (72) is then reduced to

$$L_n - L_{n-1} = \frac{1}{2} \left[ \sigma^z_n + (-1)^n \right], \qquad (73)$$

and the dimensionless Hamiltonian in (70) takes the spin form (74).
The phase factors associated with the link variables have been eliminated by a residual gauge transformation [119], as in (75). Finally, the model (70) can be mapped to a spin-1/2 Hamiltonian [119] [120], given in (76), where $\alpha$ is the boundary electric field (on the leftmost link), which can describe a background field. In the second equality of (76), the Gauss law (73) has been used to express the link electric fields $L_n$ in terms of the spin operators and $\alpha$.
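As a hedged illustration, the sketch below assembles a commonly used spin-1/2 form of the lattice Schwinger Hamiltonian with open boundaries and background field $\alpha$, following the general structure of [119] [120]; the staggering and boundary conventions of Equation (76) in this paper may differ in detail.

```python
import numpy as np
from functools import reduce

# Hedged sketch: a commonly used spin-1/2 lattice Schwinger Hamiltonian
# (hopping, staggered mass, electric energy with background field alpha).
# Conventions of the paper's Eq. (76) may differ.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def op_on(j, op, n):
    return reduce(np.kron, [op if k == j else I2 for k in range(n)])

def schwinger_spin_hamiltonian(n, x, mu, alpha):
    dim = 2 ** n
    H = np.zeros((dim, dim), dtype=complex)
    # hopping: x * (sigma^+_j sigma^-_{j+1} + h.c.) = x/2 * (XX + YY)
    for j in range(n - 1):
        H += x * 0.5 * (op_on(j, sx, n) @ op_on(j + 1, sx, n)
                        + op_on(j, sy, n) @ op_on(j + 1, sy, n))
    # staggered mass term
    for j in range(n):
        H += 0.5 * mu * ((-1) ** j) * op_on(j, sz, n)
    # electric energy with Gauss law solved for the link fields
    for j in range(n - 1):
        L = alpha * np.eye(dim, dtype=complex)
        for k in range(j + 1):
            L += 0.5 * (op_on(k, sz, n) + ((-1) ** k) * np.eye(dim))
        H += L @ L
    return H

H = schwinger_spin_hamiltonian(n=6, x=1.0, mu=0.5, alpha=0.0)
print("ground state energy:", np.linalg.eigvalsh(H)[0])
```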
7. Simulation Results
We investigate the Fermi-Hubbard model at half filling, where the Hamiltonian is given by (56) or (58). Table 1 presents the ground state energy per site obtained by the DMRG algorithm for different values of $U$ and $N$. It is evident that the thermodynamic limit ($N \to \infty$) is approached in each column. We also check these results by exact diagonalization and find that the error between the DMRG and exact ground state energies is less than 1.42 × 10⁻⁹ for all cases in Table 1, demonstrating the high accuracy of the DMRG method. Figure 1 illustrates the energy in Equation (24) as a function of the iteration number for different values of $U$ at fixed $N$. We perform 8 sweeps, with 14 local minimizations in each sweep (multiplied by 2 due to the spin-up and spin-down states). It is noteworthy that the ground state energy is already obtained after a single sweep (14 local minimizations). Furthermore, the energy decreases with each step of local minimization.
Figure 1. The dependence of the energy of the Fermi-Hubbard model in the half-filling case on the iteration number for different $U$, obtained by the DMRG algorithm.
Table 1. The ground state energy per site of the Fermi-Hubbard model in the half-filling case obtained by the DMRG algorithm.

| $N$ | $U = 0$ | $U = 2$ | $U = 4$ | $U = 8$ |
| --- | --- | --- | --- | --- |
| 4 | −1.118033 | −1.718985 | −2.488286 | −4.279293 |
| 6 | −1.164653 | −1.757718 | −2.515427 | −4.294683 |
| 8 | −1.189692 | −1.778204 | −2.529475 | −4.302603 |
We display the ground state energy per site obtained using the restricted Boltzmann machine in Table 2. To calculate the ground state energy, we use the last 10,000 sampling data out of a total of 40,000 data points. Comparing these results with those obtained using the DMRG algorithm, we find that the results from the restricted Boltzmann machine are reliable when $U$ and $N$ are not too large. However, for the largest $U$ and $N$, the value in Table 2 is significantly higher than the corresponding value in Table 1. Figure 2 illustrates the dependence of the energy on the iteration number for the restricted Boltzmann machine. For the largest value of $U$, there are large fluctuations of the energy with each iteration.
Figure 2. The dependence of the energy of the Fermi-Hubbard model in the half-filling case on the iteration number for different $U$, obtained by the restricted Boltzmann machine.
Table 2. The ground state energy per site of the Fermi-Hubbard model in the half-filling case obtained from the last 10,000 sampling data (out of 40,000 in total) using the restricted Boltzmann machine.

| $N$ | $U = 0$ | $U = 2$ | $U = 4$ | $U = 8$ |
| --- | --- | --- | --- | --- |
| 4 | −1.112495 ± 0.00458 | −1.713604 ± 0.00430 | −2.482209 ± 0.00468 | −4.270301 ± 0.00639 |
| 6 | −1.154108 ± 0.00698 | −1.747218 ± 0.00706 | −2.502630 ± 0.00775 | −4.272329 ± 0.01133 |
| 8 | −1.172597 ± 0.00820 | −1.759270 ± 0.00934 | −2.507544 ± 0.01090 | −4.211473 ± 0.01207 |
Table 3 displays the ground state energy per site obtained using the variational quantum eigensolver (VQE). The results from VQE show good agreement with those obtained using the DMRG algorithm. The largest deviation between VQE and DMRG is 0.017611, which occurs for the largest $U$ and $N$. Figure 3 illustrates the dependence of the energy on the iteration number for VQE with different values of $U$ and $N$. When compared with the fluctuations shown in Figure 2 for the restricted Boltzmann machine, the fluctuations observed in Figure 3 are very small.
Figure 3. The dependence of the energy of the Fermi-Hubbard model in the half-filling case on the iteration number, obtained by VQE for different $U$ and $N$.
Table 3. The ground state energy per site of the Fermi-Hubbard model in the half-filling case obtained by VQE after 2000 iterations.

| $N$ | $U = 0$ | $U = 2$ | $U = 4$ | $U = 8$ |
| --- | --- | --- | --- | --- |
| 4 | −1.11802 ± 0.000003 | −1.718857 ± 0.000036 | −2.488273 ± 0.000009 | −4.274382 ± 0.002363 |
| 6 | −1.164649 ± 0.000001 | −1.757449 ± 0.000014 | −2.514657 ± 0.000070 | −4.284413 ± 0.000717 |
| 8 | −1.189618 ± 0.000004 | −1.776534 ± 0.000159 | −2.520144 ± 0.001265 | −4.284992 ± 0.001337 |
The ground state energy per site of the Schwinger model (76) has been calculated using both the density matrix renormalization group (DMRG) algorithm and the variational quantum eigensolver (VQE). However, the restricted Boltzmann machine is not suitable for solving the Schwinger model, as it fails to approach the ground state energy accurately. Table 4 presents the ground state energy obtained by the DMRG algorithm for different parameter values and system sizes $N$. Comparing the results to the exact values, we observe that the error is extremely small for the smaller system sizes, while for the larger sizes it remains small. For fixed values of the other parameters, the energy tends to the thermodynamic limit as $N$ increases. Table 5 displays the ground state energy obtained by VQE for the smaller system sizes, up to $N = 16$, due to computational limitations. It is worth noting that the energies obtained by VQE are slightly higher than those obtained by the DMRG algorithm.
Table 4. The ground state energy of the Schwinger model obtained by the DMRG algorithm.

| $N$ | 0.0 | 2.5 |
| --- | --- | --- |
|  | −55.627908 | −54.403737 |
|  | −59.184968 | −57.966911 |
|  | −61.162808 | −59.954053 |
|  | −62.209047 | −61.012423 |
|  | −62.752515 | −61.564360 |
Table 5. The ground state energy of the Schwinger model obtained by VQE. The last 1000 iterations are used (out of 3000 in total).

| $N$ | 0.0 | 2.5 |
| --- | --- | --- |
|  | −55.513396 ± 0.000435 | −54.263672 ± 0.000295 |
|  | −58.831446 ± 0.000582 | −57.581606 ± 0.000463 |
|  | −60.318314 ± 0.000503 | −59.068237 ± 0.000498 |
8. Discussion
We have used three variational methods to calculate the ground state energy of the Fermi-Hubbard model at half filling and of the Schwinger model. The DMRG algorithm has proven to be successful for both models, while the VQE algorithm, although less accurate, is also effective for calculating ground state energies. However, the restricted Boltzmann machine learning method is not as accurate as the DMRG algorithm for the Fermi-Hubbard model, and unfortunately it does not work for solving the ground state energy of the Schwinger model. We also tried a mean-field ansatz, a Jastrow ansatz, multi-layer dense networks, group convolutional neural networks [75] [121], autoregressive networks [62], etc., but these also do not work. The authors in [122] used the transformer quantum state (TQS) to study the ground state of the 1D transverse-field Ising model and showed the power of TQS in expressing the quantum states of quantum many-body systems. We used TQS to calculate the ground state energy of the Schwinger model for different parameter values in Figure 4. TQS is trained for many randomly chosen parameter values, and the trained TQS model is then extrapolated to other values; it can also be fine-tuned. The calculated results in Figure 4 should be comparable with the ground state energy obtained by the DMRG algorithm.
Figure 4. The ground state energy of the Schwinger model obtained by the transformer quantum state (TQS).
Variational methods can indeed be applied to various other aspects of quantum systems, such as excited states, thermal states, real-time dynamics, dissipative dynamics, and quantum control. It’s beneficial to learn from algorithms and ideas used in one variational method and apply them to other variational methods. This cross-pollination of ideas can lead to further advancements in the field.
Acknowledgements
Daming Li was supported by National Key Research and Development Program of China (No. 2024YFA1014104), and Shanghai Science and Technology Innovation Action Plan Project (No. 22JC1401500).
Appendix
The decomposition in (7)
For notational convenience, we take $d_i = d$ for all $i = 1, \ldots, N$. Combining the last $N-1$ indices $(\sigma_2, \ldots, \sigma_N)$ together, the tensor $\psi$ can be regarded as a $d \times d^{N-1}$ matrix

$$\Psi_{\sigma_1, (\sigma_2 \cdots \sigma_N)} = \psi(\sigma_1, \sigma_2, \ldots, \sigma_N). \qquad (77)$$

The SVD of this $d \times d^{N-1}$ matrix is

$$\Psi = U_1 S_1 V_1^\dagger, \qquad (78)$$

with $r_1 \le d$ the number of nonzero singular values; $S_1$ is the diagonal matrix of singular values, and $U_1$ and $V_1$ are $d \times r_1$ and $d^{N-1} \times r_1$ matrices with orthonormal columns (the column vectors of unit length are orthogonal to each other), respectively. Let $M^{\sigma_1}_{a_1} = (U_1)_{\sigma_1, a_1}$, which is a $1 \times r_1$ matrix for fixed $\sigma_1$. The remaining two parts of the SVD are combined into an $(r_1 d) \times d^{N-2}$ matrix

$$\Psi'_{(a_1 \sigma_2), (\sigma_3 \cdots \sigma_N)} = \left( S_1 V_1^\dagger \right)_{a_1, (\sigma_2 \cdots \sigma_N)}, \qquad (79)$$

where the row index combines $a_1$ and $\sigma_2$, and the column index combines $(\sigma_3, \ldots, \sigma_N)$. So, we have $\psi(\sigma_1, \ldots, \sigma_N) = \sum_{a_1} M^{\sigma_1}_{a_1} \Psi'_{(a_1 \sigma_2), (\sigma_3 \cdots \sigma_N)}$.
The SVD of $\Psi'$ is

$$\Psi' = U_2 S_2 V_2^\dagger. \qquad (80)$$

Let $M^{\sigma_2}_{a_1, a_2} = (U_2)_{(a_1 \sigma_2), a_2}$, which is an $r_1 \times r_2$ matrix for fixed $\sigma_2$. The remaining two parts are combined as the $(r_2 d) \times d^{N-3}$ matrix

$$\Psi''_{(a_2 \sigma_3), (\sigma_4 \cdots \sigma_N)} = \left( S_2 V_2^\dagger \right)_{a_2, (\sigma_3 \cdots \sigma_N)}. \qquad (81)$$

After $N-2$ steps of SVD, we get an $(r_{N-2} d) \times d$ matrix, and use the SVD once more,

$$\Psi^{(N-2)} = U_{N-1} S_{N-1} V_{N-1}^\dagger. \qquad (82)$$

Let $M^{\sigma_{N-1}}_{a_{N-2}, a_{N-1}} = (U_{N-1})_{(a_{N-2} \sigma_{N-1}), a_{N-1}}$, which is an $r_{N-2} \times r_{N-1}$ matrix for fixed $\sigma_{N-1}$, and let $M^{\sigma_N}_{a_{N-1}} = (S_{N-1} V_{N-1}^\dagger)_{a_{N-1}, \sigma_N}$, which is an $r_{N-1} \times 1$ matrix for fixed $\sigma_N$. Thus, after $N-1$ steps of SVD, the decomposition in (7) is obtained.
From the above SVD process, the tensors $M^{\sigma_i}$ for $i = 1, \ldots, N-1$ satisfy the left canonical condition. For example,

$$\sum_{\sigma_1} (M^{\sigma_1})^\dagger M^{\sigma_1} = U_1^\dagger U_1 = I, \qquad (83)$$

due to the orthonormal columns of the matrix $U_1$, where $I$ is the $r_1 \times r_1$ identity matrix. (Here the summation over the trivial left virtual index of dimension 1 can be ignored.) Since the $M^{\sigma_i}$, $i = 1, \ldots, N-1$, satisfy the left canonical condition, one has

$$\sum_{\sigma_1, \ldots, \sigma_N} |\psi(\sigma_1, \ldots, \sigma_N)|^2 = \sum_{\sigma_N} (M^{\sigma_N})^\dagger M^{\sigma_N}. \qquad (84)$$

If $\psi$ is normalized as in (9), the left canonical condition for $M^{\sigma_N}$ is also satisfied.
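A minimal sketch of the full procedure described in this Appendix: the state vector is reshaped and decomposed by $N-1$ successive SVDs into left-canonical tensors (exact, i.e., without truncation; all names are illustrative).

```python
import numpy as np

def state_to_mps(psi, N, d):
    """Decompose a state vector into left-canonical MPS tensors by N-1 SVDs,
    following the Appendix (exact decomposition, no truncation)."""
    tensors = []
    rest = psi.reshape(1, -1)                 # start with a trivial left bond
    for i in range(N - 1):
        Dl = rest.shape[0]
        mat = rest.reshape(Dl * d, -1)        # group (a_{i-1}, sigma_i) as rows
        U, S, Vh = np.linalg.svd(mat, full_matrices=False)
        tensors.append(U.reshape(Dl, d, -1))  # left-canonical M^{sigma_i}
        rest = np.diag(S) @ Vh                # carry S V^dagger to the right
    tensors.append(rest.reshape(rest.shape[0], d, 1))
    return tensors

# Quick check: rebuilding the full tensor reproduces psi exactly.
N, d = 6, 2
psi = np.random.randn(d ** N)
psi /= np.linalg.norm(psi)
mps = state_to_mps(psi, N, d)
full = mps[0]
for A in mps[1:]:
    full = np.tensordot(full, A, axes=1)
print(np.allclose(full.reshape(-1), psi))
```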