Neural operator vs. numerical solver. Race.
Two algorithms solve the same problem: the 1D viscous Burgers equation. The finite-difference solver crawls timestep by timestep, the way numerical PDE codes have for decades. The operator surrogate jumps directly from initial condition to final state in one shot. Same answer (to ~2% relative error), 5–50× faster depending on the regime — and the gap only grows with grid size and viscosity. This is the story neural operators tell — FNO, DeepONet, GNO — and the reason they're the hottest direction in scientific ML right now.
What you're actually watching
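Both sides solve the same equation, the 1D viscous Burgers equation, the canonical benchmark for nonlinear PDEs combining advection and diffusion:

u_t + u u_x = ν u_xx

Here u(t, x) is the velocity field and ν is the viscosity. The u u_x term is the nonlinear advection; the ν u_xx term is the diffusion.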
The numerical solver (left)
Classical explicit finite difference: central differences for spatial derivatives, forward Euler in time. The stability limits of an explicit scheme (the advective CFL condition plus the diffusive bound Δt ≲ Δx²/(2ν)) force tiny timesteps: for ν = 0.01 and Nx = 256, that's ~1600 steps to reach t = 0.5. Each step is fast on its own (~30 µs), but they're sequential by construction: you cannot evaluate step n until step n−1 finishes. The animation deliberately renders every intermediate frame so you can see the time marching.
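The time-marching loop described above can be sketched in a few lines of NumPy. This is a minimal sketch, not the demo's actual code: the function name `burgers_fd`, the periodic unit-length domain, the sine initial condition, and the safety factor of 0.2 on the timestep limits are all assumptions, chosen because they land near the ~1600 steps quoted above.

```python
import numpy as np

def burgers_fd(u0, nu, t_end, length=1.0, safety=0.2):
    """Explicit march for u_t + u u_x = nu u_xx on a periodic grid.

    Hypothetical helper: forward Euler in time, central differences in space.
    """
    u = np.asarray(u0, dtype=float).copy()
    dx = length / u.size
    # Timestep bounded by a diffusive limit (~dx^2/nu) and an advective limit (~dx/|u|).
    u_max = max(np.max(np.abs(u)), 1e-12)
    dt = min(safety * dx**2 / nu, safety * dx / u_max)
    n_steps = int(np.ceil(t_end / dt))
    dt = t_end / n_steps  # shrink slightly so the last step lands exactly on t_end
    for _ in range(n_steps):  # inherently sequential: step n needs step n-1
        up = np.roll(u, -1)   # u[i+1] with periodic wrap
        um = np.roll(u, 1)    # u[i-1] with periodic wrap
        advection = u * (up - um) / (2 * dx)
        diffusion = nu * (up - 2 * u + um) / dx**2
        u = u + dt * (diffusion - advection)
    return u, n_steps

nx, nu = 256, 0.01
x = np.arange(nx) / nx
u_final, n_steps = burgers_fd(np.sin(2 * np.pi * x), nu, t_end=0.5)
print(n_steps)
```

Because the diffusive limit scales as Δx²/ν, doubling Nx roughly quadruples the step count in this sketch, which is why the explicit solver falls behind as the grid is refined.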
The operator surrogate (right)
Burgers admits the Cole-Hopf transformation: if u = −2ν φ_x/φ, then φ satisfies the linear heat equation φ_t = ν φ_xx. Heat-equation solutions are diagonal in Fourier modes: each mode just decays exponentially in time, at rate νk² for wavenumber k. So we compute φ(0, x) once, transform it to frequency space, and store the coefficients. Evaluating u at any time t then becomes a small O(Nx · K) operation, where K is the number of modes we keep.
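A minimal sketch of this surrogate follows, assuming a periodic unit-length domain and a zero-mean initial condition (so that φ is itself periodic). The names `fit_surrogate` and `evaluate` are hypothetical, not taken from the demo. Inverting the transformation gives φ(0, x) = exp(−(1/(2ν)) ∫ u(0, s) ds), and the sketch computes that integral spectrally.

```python
import numpy as np

def fit_surrogate(u0, nu, length=1.0, n_modes=32):
    """One-time setup: Fourier coefficients of phi(0, x). Assumes n_modes < nx / 2."""
    u0 = np.asarray(u0, dtype=float)
    nx = u0.size
    k_all = 2 * np.pi * np.fft.rfftfreq(nx, d=length / nx)  # angular wavenumbers
    u_hat = np.fft.rfft(u0)
    # Antiderivative of u0: divide each mode by i*k (mode 0 skipped: zero-mean assumption).
    U_hat = np.zeros_like(u_hat)
    U_hat[1:] = u_hat[1:] / (1j * k_all[1:])
    U = np.fft.irfft(U_hat, n=nx)
    # Shifting U by a constant only rescales phi, which leaves u = -2 nu phi_x / phi unchanged.
    phi0 = np.exp(-(U - U.min()) / (2 * nu))
    coeffs = np.fft.rfft(phi0)[: n_modes + 1] / nx
    return k_all[: n_modes + 1], coeffs

def evaluate(k, coeffs, nu, t, x):
    """Jump straight to time t with a direct O(Nx * K) sum over the kept modes."""
    c = coeffs * np.exp(-nu * k**2 * t)    # each heat-equation mode decays independently
    weights = np.where(k == 0, 1.0, 2.0)   # real signal: fold in the negative frequencies
    basis = np.exp(1j * np.outer(k, x))    # shape (K + 1, Nx)
    phi = np.real((weights * c) @ basis)
    phi_x = np.real((weights * 1j * k * c) @ basis)
    return -2 * nu * phi_x / phi

nx, nu = 256, 0.01
x = np.arange(nx) / nx
u0 = np.sin(2 * np.pi * x)
k, coeffs = fit_surrogate(u0, nu, n_modes=32)
u_half = evaluate(k, coeffs, nu, 0.5, x)  # no intermediate timesteps needed
```

The cost of `evaluate` does not depend on t, which is the property that lets the surrogate skip the time marching entirely.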
Why this is the "neural operator story" in miniature
A neural operator like FNO (Li et al. 2021) does conceptually the same thing: it learns a mapping from initial conditions to solution trajectories, using layers that act on a truncated set of Fourier modes. The big difference: FNO learns this mapping from data for problems where no analytical transform exists (Navier-Stokes, Darcy flow, weather, etc.). It is trained once on solver outputs, then deployed for thousands of inference-time evaluations at constant cost. The speedup ratio you see here (~200×) is on the low end of what FNO achieves in published benchmarks (often 1000×+).
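To make the parallel concrete, here is a minimal NumPy sketch of the spectral convolution at the heart of an FNO layer: transform to frequency space, keep the lowest modes, multiply them by weights, transform back. It is an illustration under assumptions, not the reference implementation. The name `spectral_conv_1d` and the weight layout are hypothetical, the weights are random rather than trained, and a full FNO layer also adds a pointwise linear path and a nonlinearity.

```python
import numpy as np

def spectral_conv_1d(v, weights):
    """Apply per-mode complex weights to the lowest Fourier modes of v.

    v:       real array, shape (channels_in, nx)
    weights: complex array, shape (channels_in, channels_out, n_modes)
    """
    _, nx = v.shape
    _, c_out, n_modes = weights.shape
    v_hat = np.fft.rfft(v, axis=-1)
    out_hat = np.zeros((c_out, nx // 2 + 1), dtype=complex)
    # Mix channels independently for each kept mode; higher modes stay zero.
    out_hat[:, :n_modes] = np.einsum("ik,iok->ok", v_hat[:, :n_modes], weights)
    return np.fft.irfft(out_hat, n=nx, axis=-1)

rng = np.random.default_rng(0)
weights = rng.normal(size=(1, 2, 8)) + 1j * rng.normal(size=(1, 2, 8))
x = np.arange(128) / 128
out = spectral_conv_1d(np.sin(2 * np.pi * x)[None, :], weights)  # shape (2, 128)
```

In the Cole-Hopf surrogate the per-mode multipliers are known in closed form (exponential decay); in FNO the analogous weights are learned.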
Try this
- Crank ν up to 0.05 and Nx to 384 — the FD solver gets crushed (~6500 steps), the operator barely moves.
- Drop modes K to 6 — see how the surrogate loses accuracy at sharp shocks but stays fast.
- Try the smoothed step initial condition — Burgers turns this into a propagating shock front.
- The lowest ν = 0.003 is near the edge of stability for the mode-truncated operator — push modes to 40+ to recover accuracy.
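The link between low viscosity and the number of modes can be checked with a short sketch. It assumes a sine initial condition u(0, x) = sin(2πx) on a periodic unit domain, for which φ(0, x) = exp(−(1 − cos 2πx)/(4πν)) up to a constant factor. It then counts how many Fourier modes of φ(0, x) exceed a relative tolerance. The function name `modes_needed` and the tolerance of 1e-6 are assumptions for illustration only.

```python
import numpy as np

def modes_needed(nu, nx=1024, tol=1e-6):
    """Highest Fourier mode of phi(0, x) whose magnitude exceeds tol times mode 0."""
    x = np.arange(nx) / nx
    # Cole-Hopf image of u0 = sin(2 pi x); the exponent is <= 0, so no overflow.
    phi0 = np.exp(-(1 - np.cos(2 * np.pi * x)) / (4 * np.pi * nu))
    mag = np.abs(np.fft.rfft(phi0)) / nx
    significant = np.nonzero(mag > tol * mag[0])[0]
    return int(significant.max())

for nu in (0.05, 0.01, 0.003):
    print(nu, modes_needed(nu))
```

Smaller ν makes φ(0, x) more sharply peaked, so its spectrum is wider and a fixed K truncates more of it.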