For a few years now, a remarkable empirical rule has governed the advance of artificial intelligence: the more compute you pour into a neural network, the smaller its error, and that improvement follows a clean power law. This rule, the scaling law, has become so reliable that it serves as a crude fortune-teller for the impending performance of the next generation of language models. But its grip on the machine‑learning world has inspired an audacious leap. What if the same kind of law governs our ability to approximate the most stubbornly complicated objects in physics — the ground states of correlated quantum many‑body systems? A preprint posted on arXiv (arXiv:2606.02794) by a team from the Flatiron Institute in New York and EPFL in Lausanne now suggests that it does. And the exponent of that power law, it turns out, is a surprisingly sharp measure of a system’s quantum frustration.

To appreciate what this means, it helps to remember where scaling laws first took root. In early 2020, Kaplan and collaborators observed that the test loss of large language models decayed as a power of the training compute, model size, and data, each contributing with its own exponent. That discovery reframed the quest for better AI: suddenly you could draw a curve on a log‑log plot and read off, with some confidence, how much more compute was needed to halve the error. The idea spread rapidly to other modes of machine learning — images, video, even protein folding. Yet until now, no one had asked whether the same lens could be pointed at a problem of a different ontological kind: not predicting the next word in a sentence, but guessing the ground‑state wavefunction of a quantum magnet with hundreds of entangled spins. That is precisely the terrain that Riccardo Rende and his collaborators have begun to map.

The V‑Score and the Promise of a Power Law

The specific challenge the team took on is the antiferromagnetic Heisenberg model on a two‑dimensional lattice, with nearest‑ and next‑nearest‑neighbour couplings — a classic testbed for quantum magnetism and, crucially, for the competition between order and disorder that physicists call frustration. The model’s ground state is not known exactly except in a few special limits, so researchers have to approximate it. In recent years, neural‑network quantum states (NQS) have emerged as a powerful variational strategy: you represent the wavefunction as the output of a neural network, adjusting its weights to minimise the quantum‑mechanical energy. The accuracy of the resulting state is captured by the V‑score, a quantity based on the energy variance that equals zero for the exact ground state and grows with the error.

Into this arena Rende and his colleagues threw a transformer — an architecture originally designed for natural language. But instead of processing sequences of words, it processes “patches” of local spin configurations, akin to a loom that weaves short‑range spin textures into a globally consistent quantum tapestry. The team trained transformers of increasing depth and for increasing numbers of optimisation steps on square and triangular lattices with up to 20 × 20 sites, systematically measuring the V‑score as a function of the total floating‑point operations (FLOPs) consumed.

What they found was a quiet surprise: the data, when plotted on a log‑log scale, traced a clean straight line. In other words, the V‑score followed a power law, V-score ∝ f⁻alpha, where f is the training compute and alpha is an exponent characteristic of the Hamiltonian. You might imagine that a bigger system, with more degrees of freedom, would break this simple rule. But the team noticed something deeper. By rescaling the compute with a factor involving the number of spin sites N — specifically, plotting V‑score against f N^{-ÎČ/alpha} with fitted exponents alpha and beta — the curves for system sizes L = 12, 16, 18, and 20 collapsed onto a single master curve. This is strongly reminiscent of the way thermodynamic data near a phase transition, when expressed in terms of rescaled variables, collapse onto a universal scaling function. Physics had come full circle: a tool imported from machine learning was now revealing the sort of scaling that physicists have long cherished in critical phenomena.

Frustration, Measured by a Single Number

The most striking result, however, lies in how the exponent alpha varies across different spin Hamiltonians. For the unfrustrated square‑lattice Heisenberg model, where spins on neighbouring sites prefer to anti‑align without inner conflict, the team extracted alpha ≈ 1.29. When next‑nearest‑neighbour interactions were turned on, introducing frustration, the exponent dropped to alpha ≈ 0.86. For the triangular lattice, where the geometry itself is geometrically frustrated even without next‑nearest‑neighbour couplings, the unfrustrated version gave alpha ≈ 0.54, while the fully frustrated triangular J₁‑J₂ model plummeted to alpha ≈ 0.40. Smaller alpha means that more compute is needed to cut the V‑score by a fixed factor — the neural network has to work much harder to capture the ground state.

This monotonic decrease with frustration suggests that the exponent is not just a descriptive number but a genuine measure of the representational difficulty of the quantum state. Solving a highly frustrated magnet feels like navigating an escape room where each extra particle adds a puzzle whose clues are contradictory — you cannot satisfy all competing interactions simultaneously, so the ground state becomes a delicate quantum superposition of exponentially many local arrangements. The transformer’s power‑law exponent faithfully registers that increase in complexity, grading the frustration with a single figure.

It’s fitting, in a way, that the upstart quantum many‑body community is now camped just outside the citadel of machine learning, adopting its most powerful empirical laws and plotting how to use them. The scaling law promises a universal benchmark: no matter which variational ansatz one invents, its V‑score versus compute data can be fitted, and the exponent compared. An architecture with a larger alpha is, by this measure, more efficient at learning that particular Hamiltonian.

But any edifice so new carries cracks worth inspecting. An important question, one that echoes the careful distinctions drawn in the original language‑model scaling analyses, is whether the exponent alpha genuinely represents the intrinsic difficulty of the Hamiltonian or instead reflects a specific choice of scaling trajectory. The team varied only the number of transformer layers and the number of training steps; increasing the embedding dimension, the number of attention heads, or the patch size might shift the allocated compute in a different way, potentially altering the observed alpha. The collapse across system sizes argues for a robust relation, but a true clinching test would require varying model size and training steps independently, as Kaplan et al. emphasised.

Then there is the question of universality. The current study tests only one architecture — the transformer — leaving open whether a recurrent neural network or a different NQS design would exhibit the same scaling exponent for the same Hamiltonian. The transformer’s attention mechanism, which proved so effective in language, may be particularly well‑suited to capturing long‑range correlations in quantum spin systems. But if a distinct architecture produces a different exponent, the claim that alpha is a Hamiltonian‑specific “difficulty fingerprint” would need qualification.

The predictive power of the scaling law also awaits scrutiny. The team performed a scaling collapse using data up to L = 20, but the ultimate promise of such a law is to extrapolate to much larger system sizes — say L = 24 or 30 — where exact diagonalisation is impossible and variational benchmarks are scarce. The paper asserts that the transformer ansatz is size‑consistent, yet no concrete prediction for an unseen larger lattice is provided. Demonstrating that the power law holds for L = 24 or 30 with comparable exponents would transform an elegant observation into a practical tool.

The Road Ahead

These caveats do not diminish the significance of the discovery; they sketch the work that lies ahead. As NQS methods continue to evolve, the scaling‑law framework may become the standard yardstick for measuring progress — a kind of “difficulty score” that cuts through the noise of hype and architecture details. Perhaps one day the exponent alpha will be listed alongside the Hamiltonian itself, a fingerprint of how hard it is to represent its ground state. The first step has been taken, and the power law is on the table. The next challenge is to see how far it stretches before it breaks.

References