

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product ift.tt/mCS5hTE
Neil Mallinar

@nmallinar
PhD student @ UCSD, prior to that: Research Intern @ Google Research & MSR NE, Research Engineer at Pryon Inc & IBM Watson.




Grokking on modular arithmetic is widely studied as an emergent ability seemingly unique to neural networks. We find instead that iteratively solving a kernel machine and estimating the Average Gradient Outer Product (AGOP) recovers the same phenomenon.
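
For readers who want a feel for the loop described above (solve a kernel machine, estimate the AGOP of the fitted predictor, repeat), here is a rough NumPy sketch. It is not the authors' code. The Gaussian kernel with a Mahalanobis distance, the one-hot encoding of the pair (a, b), the ridge term, the bandwidth, the trace rescaling of M, and all function names are assumptions made for illustration.

```python
import numpy as np

def modular_addition_data(p):
    """All pairs (a, b) with label (a + b) mod p, one-hot encoded (assumed encoding)."""
    a, b = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    a, b = a.ravel(), b.ravel()
    eye = np.eye(p)
    X = np.hstack([eye[a], eye[b]])   # shape (p*p, 2p)
    Y = eye[(a + b) % p]              # shape (p*p, p)
    return X, Y

def gaussian_kernel(X, Z, M, bandwidth):
    """K(x, z) = exp(-(x - z)^T M (x - z) / (2 * bandwidth^2)), M symmetric PSD."""
    XM = X @ M
    sq = (XM * X).sum(1)[:, None] - 2.0 * XM @ Z.T + ((Z @ M) * Z).sum(1)[None, :]
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def agop(X, alpha, M, bandwidth):
    """Average over training points of J(x) J(x)^T, where J is the
    input-output Jacobian of f(x) = sum_j K(x, x_j) alpha_j."""
    K = gaussian_kernel(X, X, M, bandwidth)
    n, d = X.shape
    G = np.zeros((d, d))
    for i in range(n):
        diff = X[i] - X                          # rows are x_i - x_j
        W = (K[i][:, None] * diff).T @ alpha     # sum_j K_ij (x_i - x_j) alpha_j^T
        J = -(M @ W) / bandwidth**2              # Jacobian at x_i, shape (d, classes)
        G += J @ J.T
    return G / n

def rfm(X, Y, iterations=3, ridge=1e-3, bandwidth=1.0):
    """Alternate kernel ridge regression and AGOP estimation."""
    n, d = X.shape
    M = np.eye(d)
    for _ in range(iterations):
        K = gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + ridge * np.eye(n), Y)
        G = agop(X, alpha, M, bandwidth)
        M = d * G / np.trace(G)                  # rescaling is an assumption, for stability
    # Refit the coefficients with the final feature matrix.
    K = gaussian_kernel(X, X, M, bandwidth)
    alpha = np.linalg.solve(K + ridge * np.eye(n), Y)
    return M, alpha

X, Y = modular_addition_data(5)
M, alpha = rfm(X, Y, iterations=2)
```

This sketch fits all p*p pairs and only returns the learned feature matrix M. To look for grokking-like behavior, one would hold out a fraction of the pairs and track test accuracy across iterations.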



