Thang Bui

@thdbui
Senior Lecturer (US equiv Assoc Prof) in Machine Learning @ANUComputing. Previously @Sydney_Uni, @Uber AI and @CambridgeMLG




Colleagues in Europe are running this poll about #NeurIPS2025 participation. If you are in Europe, participation is highly recommended.





Grokking modular arithmetic is widely studied for the seemingly unique emergent abilities of neural networks. Instead, we find that iteratively solving a kernel machine and estimating the Average Gradient Outer Product (AGOP) recovers this phenomenon identically:


🚨Paper alert! Grokking Beyond Neural Networks has just been published at TMLR. openreview.net/forum?id=ux9Br…

Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

We have recently written a new paper on grokking: arxiv.org/abs/2310.17247. We show that the phenomenon is not limited to neural networks, which motivates a new hypothesis to explain it. See the thread below for more details. @thdbui and @oneill_c.





📄 Fresh paper out, led by an awesome undergrad 🌟! We dove into a strategy using classifier guidance & negative prompting to spread word tokens, moving them from the 'mundane' zone 🔄. Result? More creativity in hypothesis without losing clarity! 💡 arxiv.org/abs/2308.07645



