A neural network learns division
by growing a Fourier Lissajous structure

mod-97 division · seed — · grokking step ≈ —
The 96 invertible remainders self-organize around two learned Fourier modes. As the modes crystallize, the classes leave a cluster at the origin and trace a 3D Lissajous curve; every axis is a real-valued projection of W_U.
96 classes (invertible remainders)
dlog-order trace
idealized (K₁, K₂) Lissajous reference

Fourier Lissajous embedding

z_{K_i}(c) = ⟨W_U[c], f_{K_i}⟩,   θ_i = arg z_{K_i},   r_i = |z_{K_i}|
(x, y, z) = (r_1·cos θ_1,  r_2·cos θ_2,  r_1·sin θ_1)
Pre-grok: r_i ≈ 0 ⇒ cluster at the origin. Post-grok ⇒ (K₁, K₂) Lissajous.
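The embedding formulas above can be sketched in NumPy. This is an illustration under stated assumptions, not the page's actual implementation: it assumes W_U is stored as a (96, d) array with row c − 1 holding class c, that classes are re-sorted by discrete log with respect to the smallest primitive root mod 97, and that f_K is the normalized K-th Fourier coefficient of W_U along that ordering. The names `dlog_order` and `lissajous_coords` are hypothetical.

```python
import numpy as np

P = 97  # modulus from the page title; classes are the remainders 1..96


def dlog_order(p=P):
    """Return the classes in discrete-log order: order[t] = g**t mod p."""
    for g in range(2, p):
        order = [pow(g, t, p) for t in range(p - 1)]
        if len(set(order)) == p - 1:  # g generates every nonzero remainder
            return np.array(order)
    raise ValueError("no primitive root found")


def lissajous_coords(W_U, k1, k2, p=P):
    """Map each class to (x, y, z) following the formulas above.

    Assumed layout (hypothetical): W_U has shape (p - 1, d) and row c - 1
    holds the unembedding vector of class c.
    """
    W = W_U[dlog_order(p) - 1]        # re-sort rows so index t is the dlog
    spectrum = np.fft.fft(W, axis=0)  # Fourier coefficients along t
    z = []
    for k in (k1, k2):
        # Assumption: f_K is the unit-norm K-th Fourier coefficient.
        f = spectrum[k] / np.linalg.norm(spectrum[k])
        z.append(W @ np.conj(f))      # z_K(t) = <W_U[t], f_K>, one complex value per class
    r1, th1 = np.abs(z[0]), np.angle(z[0])
    r2, th2 = np.abs(z[1]), np.angle(z[1])
    return np.stack([r1 * np.cos(th1), r2 * np.cos(th2), r1 * np.sin(th1)], axis=1)
```

With a W_U whose rows are all close to zero, every r_i is small and the points collapse near the origin, matching the pre-grok description. The sketch does not guard against an exactly zero Fourier coefficient, which would make the normalization undefined.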
W_U — unembedding (learned)
(cos 25t, sin 11t)
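An idealized reference curve can be sketched by setting r_1 = r_2 = 1 and θ_i = K_i·t in the (x, y, z) formula. This is an assumption about how the reference is drawn; the page does not spell it out. The function name `ideal_lissajous` is hypothetical, and the frequencies 25 and 11 are borrowed from the caption above only as sample values.

```python
import numpy as np


def ideal_lissajous(k1, k2, n=96):
    """Sample (cos k1*t, cos k2*t, sin k1*t) at n evenly spaced phases t."""
    t = 2 * np.pi * np.arange(n) / n  # one phase per class, in dlog order
    return np.stack([np.cos(k1 * t), np.cos(k2 * t), np.sin(k1 * t)], axis=1)


curve = ideal_lissajous(25, 11)  # sample frequencies, not confirmed values
```

Because x and z share the same frequency, every sampled point lies on the unit circle in the x-z plane, and the y coordinate oscillates at the second frequency.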

Spectral heartbeat · Fourier power, %

k = —
k = —
k = —
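One plausible way to compute a per-frequency Fourier power percentage is sketched below. It assumes the rows of the weight matrix are already in dlog order and that power is summed over the model dimension, with each frequency k merged with its mirror n − k. The function name `fourier_power_percent` is hypothetical, and the page may normalize differently.

```python
import numpy as np


def fourier_power_percent(W):
    """Share of Fourier power per frequency k = 0..n//2, in percent.

    W: (n, d) real array with rows in dlog order (assumed layout).
    """
    n = W.shape[0]
    power = (np.abs(np.fft.rfft(W, axis=0)) ** 2).sum(axis=1)
    power[1:] *= 2.0        # k and n - k carry equal power for real input
    if n % 2 == 0:
        power[-1] /= 2.0    # the Nyquist frequency has no mirror partner
    return 100.0 * power / power.sum()
```

A flat result would correspond to the "no structure" stage, while a few dominant entries would correspond to the learned modes. An all-zero matrix is not handled, since the total power would be zero.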

Accuracy · test set

W_E — embedding (control)

Same basis, no Fourier structure.
Training progress
1. Before grokking — noise
step —
No structure. Points scatter; spectrum is flat.
2. Spectral ignition
step —
Modes k = , , begin to grow. Torus emerges.
3. Algorithm locks in
step —
Rapid organization. The (K₁, K₂) knot crystallizes.
4. Inside the Lissajous
post-roll
The algorithm is stable. The camera flies inside the (K₁, K₂) Lissajous curve.