Brandon Hudgens

28 posts

@agentengineer

Owner/CEO of Agentic Solutions | ML/AI Engineer | I like to build things | 2006 Time Magazine's Person of the Year

Joined June 2025
91 Following, 3 Followers
Brandon Hudgens retweeted
Mark Gurman @markgurman
Yet still no Instagram for iPad 🤣
Brandon Hudgens retweeted
Greg Kamradt @GregKamradt
We got a call from @xai 24 hours ago

“We want to test Grok 4 on ARC-AGI”

We heard the rumors. We knew it would be good. We didn’t know it would become the #1 public model on ARC-AGI

Here’s the testing story and what the results mean:

Yesterday, we chatted with Jimmy from the xAI team, who wanted us to validate their Grok 4 score. They did their own testing on the ARC-AGI-1 & 2 public evaluation set

To validate their score (and measure possible overfitting), we self-tested the new model on our semi-private evaluation set

We walked them through our testing policy:
* No data retention
* Model checkpoint must be intended for public use
* Temporary increase in rate limits for burst testing

They were on board, so we got started

Initially, we ran into timeout errors with normal requests, so we switched to streaming. That resolved the issue

So, what do these results mean?

First, the facts: Grok 4 is now the top-performing publicly available model on ARC-AGI. This even outperforms purpose-built solutions submitted on Kaggle.

Second, ARC-AGI-2 is hard for current AI models. To score well, models have to learn a mini-skill from a series of training examples, then demonstrate that skill at test time.

The previous top score was ~8% (by Opus 4). Below 10% is noisy

Getting 15.9% breaks through that noise barrier, Grok 4 is showing non-zero levels of fluid intelligence

But the mission isn’t over. We need new ideas to solve ARC-AGI-2. Scale alone won’t get us there

Come work on ARC-AGI with us
ARC Prize @arcprize

Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%

This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA

Brandon Hudgens retweeted
Theo - t3.gg @theo
Surprise! Grok 4 is not dropping on the API today. I'm sure it will happen in a few months...
Brandon Hudgens @agentengineer
For anyone interested in AI, I can't recommend @natebjones YT channel enough. A refreshing voice in a forest of ill-informed channels
Brandon Hudgens retweeted
Mark Gurman @markgurman
It is incredible that Apple design decisions developed over multiple years can be influenced by a week of Twitter and YouTube commentary.
Brandon Hudgens retweeted
Theo - t3.gg @theo
I've used Pocket for managing my "read later" list for over 15 years. It shuts down today. RIP to a real one.
Brandon Hudgens retweeted
allen institute @AllenInstitute
The fireworks in your mind. 🧠✨ This sparkling video shows the neurotransmitter glutamate being released into synapses, made possible by an indicator developed by @abhi_aggarwal1, @PodgorskiLab, and team. #HappyNewYear #NYE
Brandon Hudgens retweeted
Simon Willison @simonw
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
Brandon Hudgens retweeted
Chris @Chrisgpt
Now the only question is did they achieve these with cons@n or no..
Brandon Hudgens retweeted
nil @dhvanil
would you rather fight 1000 4o-mini sized agents, or 1 o3-pro sized agent?