Building a Transformer-Based NQS for Frustrated Spin Systems with NetKet

The intersection of many-body physics and deep learning has opened a new frontier: Neural Quantum States (NQS). While traditional methods struggle with high-dimensional, strongly correlated systems, the global attention mechanism of Transformers provides a powerful tool for capturing complex quantum correlations.
In this lesson, we build a research-grade Variational Monte Carlo (VMC) pipeline using NetKet and JAX to solve the J1–J2 frustrated Heisenberg chain. We will:
- Build a custom Transformer-based NQS ansatz.
- Optimize the wave function using Stochastic Reconfiguration (a form of natural-gradient descent).
- Benchmark our results against exact diagonalization and analyze the emerging quantum phases.
By the end of this guide, you will have a robust, physically grounded simulation framework capable of exploring quantum magnetism beyond the reach of classical exact methods.
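The target model is the one-dimensional J1–J2 Heisenberg Hamiltonian with periodic boundary conditions. In the Pauli-matrix convention used by the code below (with J1 = 1 setting the energy scale):

```latex
H = J_1 \sum_{i=1}^{L} \vec{\sigma}_i \cdot \vec{\sigma}_{i+1}
  + J_2 \sum_{i=1}^{L} \vec{\sigma}_i \cdot \vec{\sigma}_{i+2}
```

Frustration arises because the next-nearest-neighbor coupling J2 competes with the nearest-neighbor antiferromagnetic order, which is what makes this model a demanding testbed for variational methods.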
!pip -q install --upgrade pip
!pip -q install "netket" "flax" "optax" "einops" "tqdm"
import os
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
import netket as nk
import jax
import jax.numpy as jnp
import numpy as np
import matplotlib.pyplot as plt
from flax import linen as nn
from tqdm import tqdm
jax.config.update("jax_enable_x64", True)
print("JAX devices:", jax.devices())
def make_j1j2_chain(L, J2, total_sz=0.0):
    J1 = 1.0
    edges = []
    for i in range(L):
        edges.append([i, (i + 1) % L, 1])  # nearest-neighbor bond (color 1)
        edges.append([i, (i + 2) % L, 2])  # next-nearest-neighbor bond (color 2)
    g = nk.graph.Graph(edges=edges)
    hi = nk.hilbert.Spin(s=0.5, N=L, total_sz=total_sz)
    sigmaz = np.array([[1, 0], [0, -1]], dtype=np.float64)
    mszsz = np.kron(sigmaz, sigmaz)
    exchange = np.array(
        [[0, 0, 0, 0],
         [0, 0, 2, 0],
         [0, 2, 0, 0],
         [0, 0, 0, 0]], dtype=np.float64
    )
    bond_ops = [
        (J1 * mszsz).tolist(),
        (J2 * mszsz).tolist(),
        (-J1 * exchange).tolist(),  # Marshall sign rotation on nearest-neighbor bonds
        (J2 * exchange).tolist(),
    ]
    bond_colors = [1, 2, 1, 2]
    H = nk.operator.GraphOperator(hi, g, bond_ops=bond_ops, bond_ops_colors=bond_colors)
    return g, hi, H

We import all necessary libraries and configure JAX for stable double-precision computation. We define the J1–J2 frustrated Heisenberg Hamiltonian through a colored-graph representation, and we build the Hilbert space and a GraphOperator so that NetKet can simulate the interacting spin system efficiently.
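As a quick sanity check, the single-bond operator used above can be verified in isolation with plain NumPy: the sum of the σz⊗σz term and the exchange term is exactly the two-spin Heisenberg coupling, whose spectrum is a threefold-degenerate triplet at +1 and a singlet at -3 (in Pauli-matrix units):

```python
import numpy as np

# Two-spin Heisenberg bond in the Pauli convention:
# sigma_z (x) sigma_z plus the off-diagonal exchange term.
sigmaz = np.array([[1, 0], [0, -1]], dtype=np.float64)
mszsz = np.kron(sigmaz, sigmaz)
exchange = np.array(
    [[0, 0, 0, 0],
     [0, 0, 2, 0],
     [0, 2, 0, 0],
     [0, 0, 0, 0]], dtype=np.float64
)

h_bond = mszsz + exchange  # equals sigma_i . sigma_j on one bond
eigs = np.sort(np.linalg.eigvalsh(h_bond))
print(eigs)  # [-3.  1.  1.  1.]: singlet at -3, triplet at +1
```

The -J1 factor on the nearest-neighbor exchange in the Hamiltonian is the standard Marshall sign rotation, a sublattice transformation that leaves the spectrum unchanged while making the ground-state amplitudes easier for the ansatz to represent.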
class TransformerLogPsi(nn.Module):
    L: int
    d_model: int = 96
    n_heads: int = 4
    n_layers: int = 6
    mlp_mult: int = 4

    @nn.compact
    def __call__(self, sigma):
        x = (sigma > 0).astype(jnp.int32)  # map spins {-1, +1} to token ids {0, 1}
        tok = nn.Embed(num_embeddings=2, features=self.d_model)(x)
        pos = self.param("pos_embedding",
                         nn.initializers.normal(0.02),
                         (1, self.L, self.d_model))
        h = tok + pos
        for _ in range(self.n_layers):
            # Pre-LayerNorm self-attention block with residual connection
            h_norm = nn.LayerNorm()(h)
            attn = nn.SelfAttention(
                num_heads=self.n_heads,
                qkv_features=self.d_model,
                out_features=self.d_model,
            )(h_norm)
            h = h + attn
            # Pre-LayerNorm feed-forward block with residual connection
            h2 = nn.LayerNorm()(h)
            ff = nn.Dense(self.mlp_mult * self.d_model)(h2)
            ff = nn.gelu(ff)
            ff = nn.Dense(self.d_model)(ff)
            h = h + ff
        h = nn.LayerNorm()(h)
        pooled = jnp.mean(h, axis=1)  # average over sites for a global readout
        out = nn.Dense(2)(pooled)
        return out[..., 0] + 1j * out[..., 1]  # complex log-amplitude log psi(sigma)

We implement a Transformer-based neural quantum state in Flax. We embed the spin configuration, process it with a stack of pre-LayerNorm attention blocks, and pool global information by averaging over sites. The network outputs a complex log-amplitude, which allows the model to represent highly expressive many-body wave functions.
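To make the attention mechanism concrete, here is a minimal single-head NumPy sketch of the scaled dot-product attention step applied to per-site embeddings. The weight matrices here are random placeholders, not the trained Flax parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 8, 4                      # chain length, embedding dimension
h = rng.normal(size=(L, d))      # per-site embeddings (one token per spin)

# Random projections standing in for the learned Q/K/V weights.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = h @ Wq, h @ Wk, h @ Wv

# Scaled dot-product attention: every site attends to every other site,
# which is how the ansatz can capture long-range quantum correlations.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over sites
out = weights @ V

print(out.shape)  # (8, 4): one updated vector per site
```

The full model stacks several such blocks (with multiple heads, residual connections, and LayerNorm), but the all-to-all coupling between sites shown here is the essential ingredient.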
def structure_factor(vs, L):
    samples = vs.samples
    spins = samples.reshape(-1, L)
    corr = np.zeros(L)
    for r in range(L):
        corr[r] = np.mean(spins[:, 0] * spins[:, r])  # two-point correlator C(r)
    q = np.arange(L) * 2 * np.pi / L
    Sq = np.abs(np.fft.fft(corr))
    return q, Sq

def exact_energy(L, J2):
    _, hi, H = make_j1j2_chain(L, J2, total_sz=0.0)
    return nk.exact.lanczos_ed(H, k=1, compute_eigenvectors=False)[0]

def run_vmc(L, J2, n_iter=250):
    g, hi, H = make_j1j2_chain(L, J2, total_sz=0.0)
    model = TransformerLogPsi(L=L)
    sampler = nk.sampler.MetropolisExchange(
        hilbert=hi,
        graph=g,
        n_chains_per_rank=64
    )
    vs = nk.vqs.MCState(
        sampler,
        model,
        n_samples=4096,
        n_discard_per_chain=128
    )
    opt = nk.optimizer.Adam(learning_rate=2e-3)
    sr = nk.optimizer.SR(diag_shift=1e-2)
    vmc = nk.driver.VMC(H, opt, variational_state=vs, preconditioner=sr)
    log = nk.logging.RuntimeLog()
    vmc.run(n_iter=n_iter, out=log)
    energy = np.real(np.array(log.data["Energy"]["Mean"]))
    var = np.array(log.data["Energy"]["Variance"])
    return vs, energy, var

We define a structure-factor observable and an exact-diagonalization benchmark for verification. We then run the full VMC training loop using the MetropolisExchange sampler and Stochastic Reconfiguration, and return the energy and variance arrays so that we can analyze convergence and physical accuracy.
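The structure-factor logic can be checked in isolation: for a perfect Néel pattern the two-point correlator alternates sign with distance, so S(q) peaks at the antiferromagnetic wave vector q = π. A self-contained NumPy sketch (no NetKet needed):

```python
import numpy as np

L = 24
spins = np.array([(-1) ** i for i in range(L)], dtype=np.float64)  # Néel pattern

# Two-point correlator C(r) = s_0 * s_r, exact for this single configuration.
corr = np.array([spins[0] * spins[r] for r in range(L)])

q = np.arange(L) * 2 * np.pi / L
Sq = np.abs(np.fft.fft(corr))

peak_q = q[np.argmax(Sq)]
print(peak_q)  # pi: the antiferromagnetic ordering wave vector
```

In the VMC sweep below, a sharp peak at q = π signals Néel-like correlations, while its suppression with growing J2 points to the onset of the dimerized regime.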
L = 24
J2_values = np.linspace(0.0, 0.7, 6)
energies = []
structure_peaks = []
for J2 in tqdm(J2_values):
    vs, e, var = run_vmc(L, J2)
    energies.append(e[-1])
    q, Sq = structure_factor(vs, L)
    structure_peaks.append(np.max(Sq))

We sweep several values of J2 to probe the frustrated phase diagram. For each coupling strength we train a fresh variational state and record the final energy, and we track the peak of the structure factor at each point to look for possible changes in the ordering.
L_ed = 14
J2_test = 0.5
E_ed = exact_energy(L_ed, J2_test)
vs_small, e_small, _ = run_vmc(L_ed, J2_test, n_iter=200)
E_vmc = e_small[-1]
print("ED Energy (L=14):", E_ed)
print("VMC Energy:", E_vmc)
print("Abs gap:", abs(E_vmc - E_ed))
plt.figure(figsize=(12,4))
plt.subplot(1,3,1)
plt.plot(e_small)
plt.title("Energy Convergence")
plt.subplot(1,3,2)
plt.plot(J2_values, energies, 'o-')
plt.title("Energy vs J2")
plt.subplot(1,3,3)
plt.plot(J2_values, structure_peaks, 'o-')
plt.title("Structure Factor Peak")
plt.tight_layout()
plt.show()

We benchmark our model against exact diagonalization at a lattice size small enough for ED to be tractable. We compute the absolute energy gap between VMC and ED to verify accuracy, and we plot the convergence curve, the energy trend across J2, and the structure-factor peak to summarize what the sweep reveals.
In conclusion, we have combined an advanced neural architecture with quantum Monte Carlo methods to probe frustrated magnetism beyond the system sizes that exact methods can reach. We verified our Transformer ansatz against Lanczos diagonalization, analyzed its convergence behavior, and extracted physically meaningful observables such as the structure-factor peak to track the phase structure. We now have a flexible framework that can be extended to higher-dimensional lattices, symmetry-projected states, dynamical observables, and time-dependent quantum simulations.



