Building Transformer-Based NQS for Frustrated Spin Systems with NetKet

The intersection of many-body physics and deep learning has opened a new frontier: Neural Quantum States (NQS). While traditional methods struggle with high-dimensional frustrated systems, the global attention mechanism of Transformers provides a powerful tool for capturing complex quantum correlations.

In this tutorial, we implement a research-grade Variational Monte Carlo (VMC) pipeline using NetKet and JAX to solve the frustrated J1–J2 Heisenberg spin chain. We will:

Build a custom Transformer-based NQS architecture.
Optimize the wavefunction using Stochastic Reconfiguration (natural gradient descent).
Benchmark our results against exact diagonalization and analyze emergent quantum phases.

By the end of this guide, you will have a scalable, physically grounded simulation framework capable of exploring quantum magnetism beyond the reach of classical exact methods.

!pip -q install --upgrade pip
!pip -q install "netket" "flax" "optax" "einops" "tqdm"


import os
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"


import netket as nk
import jax
import jax.numpy as jnp
import numpy as np
import matplotlib.pyplot as plt
from flax import linen as nn
from tqdm import tqdm


# Enable 64-bit floats; JAX defaults to 32-bit precision.
jax.config.update("jax_enable_x64", True)
print("JAX devices:", jax.devices())


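def make_j1j2_chain(L, J2, total_sz=0.0):
    J1 = 1.0
    # Periodic chain: color 1 marks nearest-neighbor bonds, color 2 next-nearest.
    edges = []
    for i in range(L):
        edges.append([i, (i+1)%L, 1])
        edges.append([i, (i+2)%L, 2])
    g = nk.graph.Graph(edges=edges)
    # Spin-1/2 Hilbert space restricted to a fixed total magnetization.
    hi = nk.hilbert.Spin(s=0.5, N=L, total_sz=total_sz)
    sigmaz = np.array([[1,0],[0,-1]], dtype=np.float64)
    mszsz = np.kron(sigmaz, sigmaz)
    # Two-site spin-flip term (sigma^x sigma^x + sigma^y sigma^y).
    exchange = np.array(
        [[0,0,0,0],
         [0,0,2,0],
         [0,2,0,0],
         [0,0,0,0]], dtype=np.float64
    )
    # The minus sign on the J1 exchange term is the Marshall sign rule:
    # a sublattice basis rotation that leaves the spectrum unchanged for even L.
    bond_ops = [
        (J1*mszsz).tolist(),
        (J2*mszsz).tolist(),
        (-J1*exchange).tolist(),
        (J2*exchange).tolist(),
    ]
    # Each bond operator acts on the edges of the matching color.
    bond_colors = [1,2,1,2]
    H = nk.operator.GraphOperator(hi, g, bond_ops=bond_ops, bond_ops_colors=bond_colors)
    return g, hi, H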

We install all required libraries and configure JAX for stable high-precision computation. We define the frustrated J1–J2 Heisenberg Hamiltonian using a custom colored graph representation. We construct the Hilbert space and the GraphOperator to efficiently simulate interacting spin systems in NetKet.
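The bond operators above flip the sign of the nearest-neighbor exchange term (the -J1*exchange entry), which appears to follow the Marshall sign rule. As a sanity check on that convention, the sketch below rebuilds the same Hamiltonian as a dense NumPy matrix for a tiny chain and compares the spectrum with and without the sign flip. The helper names `site_op` and `j1j2_dense` and the `marshall` flag are hypothetical and are not part of NetKet; the matrix lives in the full 2^L space rather than the fixed-magnetization sector.

```python
from functools import reduce
import numpy as np

SZ = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli sigma^z
SP = np.array([[0.0, 1.0], [0.0, 0.0]])    # raising operator sigma^+
SM = SP.T                                  # lowering operator sigma^-

def site_op(op, i, L):
    """Embed a single-site operator at site i of an L-site chain."""
    mats = [np.eye(2)] * L
    mats[i] = op
    return reduce(np.kron, mats)

def j1j2_dense(L, J2, J1=1.0, marshall=True):
    """Dense J1-J2 Hamiltonian in Pauli-matrix units on a periodic chain."""
    H = np.zeros((2**L, 2**L))
    for i in range(L):
        for dist, J in ((1, J1), (2, J2)):
            j = (i + dist) % L
            zz = site_op(SZ, i, L) @ site_op(SZ, j, L)
            # 2 (s+ s- + s- s+) equals sigma^x sigma^x + sigma^y sigma^y
            ex = 2.0 * (site_op(SP, i, L) @ site_op(SM, j, L)
                        + site_op(SM, i, L) @ site_op(SP, j, L))
            # Marshall rule: flip the exchange sign on nearest-neighbor bonds only
            sign = -1.0 if (marshall and dist == 1) else 1.0
            H += J * (zz + sign * ex)
    return H

L_small, J2_small = 6, 0.4
spec_rotated = np.linalg.eigvalsh(j1j2_dense(L_small, J2_small, marshall=True))
spec_plain = np.linalg.eigvalsh(j1j2_dense(L_small, J2_small, marshall=False))
print("ground state (rotated):", spec_rotated[0])
print("ground state (plain):  ", spec_plain[0])
print("spectra agree:", np.allclose(spec_rotated, spec_plain))
```

Because the two spectra coincide for even L, energies obtained with the sign-flipped operator can be compared directly with the standard Heisenberg convention. Note that the operators are built from Pauli matrices, so energies are four times larger than in spin-1/2 (S = sigma/2) units.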

class TransformerLogPsi(nn.Module):
    L: int
    d_model: int = 96
    n_heads: int = 4
    n_layers: int = 6
    mlp_mult: int = 4


    @nn.compact
    def __call__(self, sigma):
        # Map spins {-1, +1} to token ids {0, 1}.
        x = (sigma > 0).astype(jnp.int32)
        tok = nn.Embed(num_embeddings=2, features=self.d_model)(x)
        # Learned positional embedding, broadcast over the batch axis.
        pos = self.param("pos_embedding",
                         nn.initializers.normal(0.02),
                         (1, self.L, self.d_model))
        h = tok + pos
        for _ in range(self.n_layers):
            # Pre-LayerNorm self-attention with a residual connection.
            h_norm = nn.LayerNorm()(h)
            attn = nn.SelfAttention(
                num_heads=self.n_heads,
                qkv_features=self.d_model,
                out_features=self.d_model,
            )(h_norm)
            h = h + attn
            # Position-wise feed-forward block with a residual connection.
            h2 = nn.LayerNorm()(h)
            ff = nn.Dense(self.mlp_mult*self.d_model)(h2)
            ff = nn.gelu(ff)
            ff = nn.Dense(self.d_model)(ff)
            h = h + ff
        h = nn.LayerNorm()(h)
        # Average over sites, then read out real and imaginary parts of log(psi).
        pooled = jnp.mean(h, axis=1)
        out = nn.Dense(2)(pooled)
        return out[...,0] + 1j*out[...,1]

We implement a Transformer-based neural quantum state using Flax. We encode spin configurations into embeddings, apply multi-layer self-attention blocks, and aggregate global information through mean pooling. We output a complex log-amplitude, allowing our model to represent highly expressive many-body wavefunctions.
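To make the data flow concrete, here is a minimal NumPy sketch of one attention layer followed by pooling and a two-component readout. It is a deliberately reduced illustration, not the Flax model above: it uses a single head, omits the feed-forward block and LayerNorm parameters, and draws random weights. The names `toy_log_psi`, `make_params`, and the `use_pos` flag are hypothetical.

```python
import numpy as np

def layer_norm(h, eps=1e-6):
    """Normalize the feature axis to zero mean and unit variance."""
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def softmax(a):
    """Numerically stable softmax over the last axis."""
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def make_params(L, d, seed=0):
    """Random weights for the toy model (illustrative only)."""
    rng = np.random.default_rng(seed)
    shapes = {"embed": (2, d), "pos": (1, L, d),
              "Wq": (d, d), "Wk": (d, d), "Wv": (d, d), "Wout": (d, 2)}
    return {name: 0.1 * rng.standard_normal(shape) for name, shape in shapes.items()}

def toy_log_psi(sigma, params, use_pos=True):
    """sigma: (batch, L) array of +/-1 spins -> (batch,) complex log-amplitudes."""
    tokens = (sigma > 0).astype(int)            # spins {-1, +1} -> token ids {0, 1}
    h = params["embed"][tokens]                 # (batch, L, d)
    if use_pos:
        h = h + params["pos"]                   # broadcast over the batch axis
    x = layer_norm(h)
    q, k, v = x @ params["Wq"], x @ params["Wk"], x @ params["Wv"]
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1]))  # (batch, L, L)
    h = h + weights @ v                         # residual connection
    pooled = layer_norm(h).mean(axis=1)         # average over sites -> (batch, d)
    out = pooled @ params["Wout"]               # (batch, 2)
    return out[:, 0] + 1j * out[:, 1]           # Re(log psi) + i Im(log psi)

L, d = 6, 8
params = make_params(L, d)
sigma = np.array([[1, -1, 1, -1, 1, -1],
                  [1, 1, -1, -1, 1, -1]])
log_psi = toy_log_psi(sigma, params)
print(log_psi.shape, log_psi.dtype)
```

The real part of the output sets the sampling probability, since |psi|^2 is proportional to exp(2 Re log psi), while the imaginary part carries the phase. With `use_pos=False` the toy model returns the same value for any reordering of the sites, which is one way to see why the positional embedding is needed to distinguish spatial patterns.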

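def structure_factor(vs, L):
    # Flatten all Markov chains into one (n_samples, L) array of +/-1 spins.
    samples = np.asarray(vs.samples)
    spins = samples.reshape(-1, L)
    corr = np.zeros(L)
    for r in range(L):
        # Spin-spin correlation between site 0 and site r.
        corr[r] = np.mean(spins[:,0] * spins[:,r])
    q = np.arange(L) * 2*np.pi/L
    Sq = np.abs(np.fft.fft(corr))
    return q, Sq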


def exact_energy(L, J2):
    _, hi, H = make_j1j2_chain(L, J2, total_sz=0.0)
    # Lowest eigenvalue from Lanczos exact diagonalization.
    return nk.exact.lanczos_ed(H, k=1, compute_eigenvectors=False)[0]


def run_vmc(L, J2, n_iter=250):
    g, hi, H = make_j1j2_chain(L, J2, total_sz=0.0)
    model = TransformerLogPsi(L=L)
    # Exchange moves swap spins along graph edges and conserve total Sz.
    sampler = nk.sampler.MetropolisExchange(
        hilbert=hi,
        graph=g,
        n_chains_per_rank=64
    )
    vs = nk.vqs.MCState(
        sampler,
        model,
        n_samples=4096,
        n_discard_per_chain=128
    )
    opt = nk.optimizer.Adam(learning_rate=2e-3)
    sr = nk.optimizer.SR(diag_shift=1e-2)
    vmc = nk.driver.VMC(H, opt, variational_state=vs, preconditioner=sr)
    # Record the optimization history in memory; run() writes only to the
    # logger passed through `out`.
    log = nk.logging.RuntimeLog()
    vmc.run(n_iter=n_iter, out=log)
    # The energy estimate is complex for a complex ansatz; keep the real part.
    energy = np.real(np.array(log.data["Energy"]["Mean"]))
    var = np.array(log.data["Energy"]["Variance"])
    return vs, energy, var

We define the structure factor observable and the exact diagonalization benchmark for validation. We implement the full VMC training routine using MetropolisExchange sampling and Stochastic Reconfiguration. We return energy and variance arrays so that we can analyze convergence and physical accuracy.
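For readers unfamiliar with Stochastic Reconfiguration, the sketch below shows the core linear-algebra step on synthetic data: build the covariance matrix S of the log-derivatives, build the energy gradient F, and solve the regularized system (S + shift * I) delta = F. This is a textbook-style illustration under the assumption of complex (holomorphic) parameters; it is not how `nk.optimizer.SR` is implemented internally, and the function name `sr_natural_gradient` is hypothetical.

```python
import numpy as np

def sr_natural_gradient(O, e_loc, diag_shift=1e-2):
    """Solve (S + diag_shift * I) delta = F from Monte Carlo samples.

    O:     (n_samples, n_params) log-derivatives d log(psi) / d theta
    e_loc: (n_samples,) local energies
    """
    n_samples = O.shape[0]
    O_c = O - O.mean(axis=0)            # centered log-derivatives
    e_c = e_loc - e_loc.mean()          # centered local energies
    S = O_c.conj().T @ O_c / n_samples  # quantum geometric tensor estimate
    F = O_c.conj().T @ e_c / n_samples  # energy gradient estimate
    return np.linalg.solve(S + diag_shift * np.eye(S.shape[0]), F)

# Synthetic samples standing in for real VMC data.
rng = np.random.default_rng(0)
O = rng.standard_normal((500, 4)) + 1j * rng.standard_normal((500, 4))
e_loc = rng.standard_normal(500) + 1j * rng.standard_normal(500)

delta = sr_natural_gradient(O, e_loc)
print("update direction:", delta)

# If every sample has the same local energy (an exact eigenstate),
# the gradient vanishes and so does the update.
print("eigenstate update:", sr_natural_gradient(O, np.full(500, -1.5)))
```

The `diag_shift` term plays the same regularizing role as the `diag_shift=1e-2` argument in the training routine: it keeps the linear system well conditioned when S is nearly singular. In the pipeline above the preconditioned gradient is then passed to Adam rather than applied as a plain gradient step.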

L = 24
J2_values = np.linspace(0.0, 0.7, 6)


energies = []
structure_peaks = []


for J2 in tqdm(J2_values):
   vs, e, var = run_vmc(L, J2)
   energies.append(e[-1])
   q, Sq = structure_factor(vs, L)
   structure_peaks.append(np.max(Sq))

We sweep across multiple J2 values to explore the frustrated phase diagram. We train a separate variational state for each coupling strength and record the final energy. We compute the structure factor peak for each point to detect possible ordering transitions.
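To see what the structure factor peak measures, the following sketch applies the same correlation-plus-FFT recipe to a hand-built, perfectly antiferromagnetic (Néel) pattern instead of Monte Carlo samples. The function name `structure_factor_from_spins` is hypothetical; it mirrors the logic of `structure_factor` but takes a plain array so it runs without NetKet.

```python
import numpy as np

def structure_factor_from_spins(spins):
    """spins: (n_samples, L) array of +/-1 values -> (q, S(q))."""
    L = spins.shape[1]
    # Correlation between site 0 and site r, averaged over samples.
    corr = np.array([np.mean(spins[:, 0] * spins[:, r]) for r in range(L)])
    q = 2 * np.pi * np.arange(L) / L
    return q, np.abs(np.fft.fft(corr))

L = 8
neel = np.array([(-1) ** i for i in range(L)], dtype=float)
spins = np.stack([neel, -neel])      # both Neel orientations

q, Sq = structure_factor_from_spins(spins)
peak = int(np.argmax(Sq))
print("peak at q =", q[peak], "with height", Sq[peak])
```

For this idealized pattern the correlation alternates as (-1)^r, so all the weight sits at q = pi with height L. The sweep above stores only the peak height via `np.max(Sq)`; recording `np.argmax(Sq)` as well would additionally show whether the dominant wavevector moves as J2 grows.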

L_ed = 14
J2_test = 0.5
E_ed = exact_energy(L_ed, J2_test)


vs_small, e_small, _ = run_vmc(L_ed, J2_test, n_iter=200)
E_vmc = e_small[-1]


print("ED Energy (L=14):", E_ed)
print("VMC Energy:", E_vmc)
print("Abs gap:", abs(E_vmc - E_ed))


plt.figure(figsize=(12,4))


plt.subplot(1,3,1)
plt.plot(e_small)
plt.title("Energy Convergence")


plt.subplot(1,3,2)
plt.plot(J2_values, energies, 'o-')
plt.title("Energy vs J2")


plt.subplot(1,3,3)
plt.plot(J2_values, structure_peaks, 'o-')
plt.title("Structure Factor Peak")


plt.tight_layout()
plt.show()

We benchmark our model against exact diagonalization on a smaller lattice size. We compute the absolute energy gap between VMC and ED to evaluate accuracy. We visualize convergence behavior, energy trends across J2, and structure-factor responses to summarize the physical insights we obtain.
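One caveat when reading the comparison: the last entry of the energy history is a single noisy Monte Carlo estimate. A common alternative, sketched here on synthetic data, is to average the tail of the optimization curve and report a relative error against the reference. The helper names `tail_average` and `relative_error` and the choice of 50 iterations are assumptions for illustration.

```python
import numpy as np

def tail_average(energies, n_tail=50):
    """Mean and naive standard error over the last n_tail iterations."""
    tail = np.real(np.asarray(energies))[-n_tail:]
    mean = tail.mean()
    # Consecutive iterations are correlated, so this likely
    # underestimates the true uncertainty.
    sem = tail.std(ddof=1) / np.sqrt(len(tail))
    return mean, sem

def relative_error(e_vmc, e_ref):
    """Relative deviation of a variational energy from a reference value."""
    return abs(e_vmc - e_ref) / abs(e_ref)

# Synthetic history: a ramp down to a plateau at -10.
history = np.concatenate([np.linspace(0.0, -10.0, 100), np.full(50, -10.0)])
mean, sem = tail_average(history)
print("tail mean:", mean, "+/-", sem)
print("relative error vs -10.5:", relative_error(mean, -10.5))
```

With the real pipeline one would call `tail_average(e_small)` and compare the result with `E_ed` in place of the synthetic numbers.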

In conclusion, we integrated advanced neural architectures with quantum Monte Carlo methods to explore frustrated magnetism beyond the reach of small-system exact methods. We validated our Transformer ansatz against Lanczos diagonalization, analyzed convergence behavior, and extracted physically meaningful observables such as structure factor peaks to detect phase transitions. We also established a flexible foundation that we can extend toward higher-dimensional lattices, symmetry-projected states, entanglement diagnostics, and time-dependent quantum simulations.


The post Building Transformer-Based NQS for Frustrated Spin Systems with NetKet appeared first on MarkTechPost.
