Guan Wang

22 posts

@makingAGI

CEO of Sapient Intelligence. Exploring the path to AGI through brain-inspired AI. 🧠🤖 #AGI #NeuroAI

Singapore · Joined July 2025

37 Following · 5.3K Followers

Pinned Tweet

Guan Wang @makingAGI · Jul 21

🚀 Introducing the Hierarchical Reasoning Model 🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, with no pretraining or CoT! Unlock the next AI breakthrough with neuroscience. 🌟 📄 Paper: arxiv.org/abs/2506.21734 💻 Code: github.com/sapientinc/HRM

228

630

1.3M

Guan Wang @makingAGI · Sep 12

Hierarchical reasoning works well on large language models!🎉

180

1.4K

96.6K

Guan Wang retweeted

Sapient Intelligence @Sapient_Int · Aug 31

🔥 It's official: the Sapient HRM Discord Community is now live! This is a place to discuss, connect, and collaborate as we shape HRM's future together. We will be sharing our latest work, releases, and tips, as well as hosting Q&A sessions 💬💬 Hop on this journey with us as we push the boundaries of what HRM and AGI at large can achieve! 🙌 ➡️ Join us on Discord here: discord.gg/sapient

4.4K

Guan Wang @makingAGI · Aug 17

Thanks to @arcprize for reproducing and verifying the results!

ARC-AGI-1: public 41% pass@2, semi-private 32% pass@2
ARC-AGI-2: public 4% pass@2, semi-private 2% pass@2

Due to differences in testing environments, a certain amount of variance in results is acceptable. According to tests run on our infrastructure, the open-source version of HRM on our GitHub can achieve a score of 5.4% pass@2 on ARC-AGI-2. We welcome everyone to run it on their own infrastructure and share their scores.

This is our first submission to the leaderboard, and it's a good starting point. We thank everyone for the support and feedback on HRM, both before and after our appearance on the ARC leaderboard. All of this encourages and motivates us to improve.

The hierarchical architecture is designed to resolve premature convergence in long-horizon tasks, like master-level Sudoku that takes hours for humans to solve. See the comparison with a simple recurrent Transformer. Such a long chain might not be essential for ARC problems, and we only used a high-low ratio of 1/2. Larger ratios are often needed for optimal performance on Sudoku problems.

In the case of ARC-AGI, the success of HRM is a testament to the model's ability to exhibit fluid intelligence, that is, its capability to infer and apply abstract rules from independent and flat examples. We are glad a recent blog post found that the outer loop and data augmentation are essential for this ability, and we especially thank @fchollet @GregKamradt @k_schuerholt for pointing this out.

Finally, we are accelerating the iteration of the HRM model and continuously pushing its limits, with good progress so far. At the same time, we believe the hierarchical architecture is highly effective in many scenarios. Moving forward, we will make further targeted updates to the architecture and validate it on more applications. We will also release an FAQ to address the key questions raised by the community. 🧠 Stay tuned!
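Editor's note: for readers unfamiliar with the pass@2 figures quoted in this post, the sketch below shows one common way such a score is computed: a task counts as solved if either of its first two attempts exactly matches the target. The function name `pass_at_k` and the toy grids are illustrative assumptions, not the ARC Prize scoring code or anything from the HRM repository.

```python
def pass_at_k(attempts_per_task, targets, k=2):
    """Fraction of tasks where any of the first k attempts equals the target."""
    solved = 0
    for attempts, target in zip(attempts_per_task, targets):
        # Only the first k attempts are allowed to count.
        if any(attempt == target for attempt in attempts[:k]):
            solved += 1
    return solved / len(targets)


# Toy data (hypothetical): each target is a small grid of integers.
targets = [[[1, 0]], [[2, 2]], [[3]], [[4, 4]]]
attempts_per_task = [
    [[[1, 0]], [[0, 0]]],          # solved on the first attempt
    [[[0, 0]], [[2, 2]]],          # solved on the second attempt
    [[[0]], [[1]], [[3]]],         # correct only on a third attempt: not counted
    [[[0, 0]], [[1, 1]]],          # never solved
]

score = pass_at_k(attempts_per_task, targets, k=2)
print(f"pass@2 = {score:.0%}")
```

Under this definition, a third correct attempt does not help, which is why the toy score is 2 out of 4 tasks.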

333

38.4K

Guan Wang @makingAGI · Aug 4

❤️ Thanks for highlighting our HRM paper! Apologies for any confusion—we're working on clarifying this thoroughly. Stay tuned for updates!

commotum@commotum

Ok, so this paragraph in isolation looks pretty bad, but based on the code, THEY DIDN'T TRAIN ON THE TEST SET. In fact, THEY DIDN'T PRETRAIN AT ALL. And that's the point of the paper! 1/

7.2K

Guan Wang @makingAGI · Jul 31

@taihongtran Search just finds stuff. HRM learns patterns and reasons. We’re testing if it can work as a search after training - stay tuned.

1.2K

Tai Tran @taihongtran · Jul 22

@makingAGI What are the differences compared to neural search algorithms?

1.5K

Guan Wang @makingAGI · Jul 31

@evilmathkid We have clarified it here: github.com/sapientinc/HRM…

1.7K

Mithil Vakde @evilmathkid · Jul 22

@makingAGI The writing is a little unclear. Could you please clarify which dataset the ARC results are on? Public train, public eval, or private eval? I can't find you on the Kaggle leaderboard.

4.8K

Guan Wang @makingAGI · Jul 31

@ADarmouni No. Eval data stayed separate. Details here: github.com/sapientinc/HRM…

Axel Darmouni @ADarmouni · Jul 23

@makingAGI One thing I need clarified: are the ARC-AGI results on data in the training set? You mention public eval in the training data, so I suppose you trained on it. But is it also the data you evaluate on? Even if it overfits the training data, the results are super cool!

2.8K

Guan Wang @makingAGI · Jul 31

@Homo_Immortalis Answers will not be coming soon, but we’re working on it!

1.9K

Homo Immortalis @Homo_Immortalis · Jul 22

@makingAGI This is exciting Guan, when do you think AI will be able to answer humanity’s unsolvable questions, like reversing aging, or understanding consciousness?

2.4K

Guan Wang @makingAGI · Jul 31

@NewAgeNihilism Thanks! Any questions, email us anytime: research@sapient.inc.

1.2K

Nihilist @NewAgeNihilism · Jul 22

@makingAGI Great job, really good paper. I'm still wrapping my head around it

3.9K

Guan Wang @makingAGI · Jul 31

@codeslubber Fair point! We’re inspired by brains, not handmade expert-system hierarchies. We let the hierarchy learn and grow from data, not hand-made rules.

1.5K

Rob Williams @codeslubber · Jul 23

Will read the paper, but saying that by making the code hierarchical we are using neuroscience is kind of silly no? I am super interested in how these hierarchies manifest. One of the reasons Expert Systems failed was there never was any consensus on how to organize knowledge representation, and encoding everything in a bespoke fashion was not scalable.

Guan Wang @makingAGI · Jul 31

@mikeboysen HRM is a way smaller, focused model: A) needs less data, B) answers niche questions without a giant LLM, C) runs cheap.

1.4K

Mike Boysen (JTBD/acc) @mikeboysen · Jul 22

@makingAGI What does this mean for consumer cost?

13.8K

Guan Wang @makingAGI · Jul 31

@itsAlexGuerrero @grok LLMs training = auto-regressive paradigm. HRM training = RNN thinker with a built-in “stop” signal (ACT) so it knows when to stop. Different playbooks.
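Editor's note: the idea of a recurrent "thinker" with a built-in stop signal can be sketched as a loop that keeps refining a state until a halting score crosses a threshold or a step budget runs out. This is a generic toy illustration of adaptive computation time, not HRM's actual training or halting mechanism; `think_with_halting`, `halve_error`, `halt_signal`, and the threshold values are all hypothetical names and numbers chosen for the example.

```python
def think_with_halting(state, step_fn, halt_fn, max_steps=16, threshold=0.5):
    """Apply step_fn repeatedly; stop once halt_fn(state) exceeds threshold.

    Returns the final state and the number of steps actually taken.
    """
    for steps_taken in range(1, max_steps + 1):
        state = step_fn(state)
        if halt_fn(state) > threshold:
            return state, steps_taken
    # Budget exhausted without a halt signal.
    return state, max_steps


# Toy recurrent update (assumption): the state is an error that halves each step.
def halve_error(error):
    return error * 0.5


# Toy halting signal (assumption): fully confident once the error is below 0.1.
def halt_signal(error):
    return 1.0 if error < 0.1 else 0.0


final_error, steps_used = think_with_halting(1.0, halve_error, halt_signal)
print(final_error, steps_used)
```

The point of the sketch is that the number of steps is decided by the state itself, so easy inputs stop early and hard ones use more of the budget.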

2.2K

Alex @itsAlexGuerrero · Jul 21

@makingAGI @grok how does this differ from the traditional way LLMs are trained? What does this mean for the future of AI?

14.8K

Guan Wang @makingAGI · Jul 31

@smjain Love it! 🎞️ Your TL;DR helps - sharing it.

2.5K

Shashank Jain @smjain · Jul 23

@makingAGI Wonderful. Here I created a small animation to summarize the concept as I understood it.

5.3K

Guan Wang @makingAGI · Jul 31

@notrajivpoddar Nope. LLMs are awesome but architecture-capped. We’re trying to push past those limits.

2.6K

Rajiv Poddar @notrajivpoddar · Jul 22

@makingAGI So can I use an LLM to generate the samples, train it at test time, and provide true general intelligence?

12K

Guan Wang @makingAGI · Jul 31

@mysticaltech @ylecun Tagging the legend himself - thanks for the shout-out!

3.5K

The Canaanite @mysticaltech · Jul 23

@makingAGI This is properly revolutionary. @ylecun FYI, right down your alley.

6.1K

Guan Wang @makingAGI · Jul 31

@JonathanRoseD No plan yet. But code, checkpoints, and demo data are open - feel free to roll your own 🛠️

4.3K

Jonathan Dunlap @JonathanRoseD · Jul 21

@makingAGI Will there be a GGUF model release?

11.4K

Guan Wang @makingAGI · Jul 29

Thanks for featuring us!😃

Ben Dickson@bendee983

My story on HRM with comments from @makingAGI, CEO of Sapient Intelligence x.com/VentureBeat/st…

8.1K

Guan Wang @makingAGI · Jul 23

🌟Exactly my thoughts on the next-gen of AI reasoning. Leveraging insights from neuroscience, our Hierarchical Reasoning Model offers practical, efficient scaling of depth. Not every problem can be solved faster with more processors; sometimes, the key is adding depth.

Konpat Ta Preechakul@konpatp

Some problems can't be rushed: they can only be done step by step, no matter how many people or processors you throw at them. We've scaled AI by making everything bigger and more parallel: Our models are parallel. Our scaling is parallel. Our GPUs are parallel. But what if the real bottleneck isn't size, but depth? What if the model just didn't have enough serial steps to get it right? Some problems need depth, not width. This is the Serial Scaling Hypothesis. This is not the same as recent studies in scaling test-time compute, which focus on train vs. test and are agnostic to parallel vs. serial. For example: test-time majority voting increases compute by running models in parallel, but doesn't help when the task itself is serial. We argue: what really matters is how the compute is structured. And for many real-world problems, it must be serial. Read more at: arxiv.org/abs/2507.12549 or 🧵. (In collaboration with: @layer07_yuxi, Kananart Kuwaranancharoen and @YutongBAI1002)
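Editor's note: the contrast between parallel voting and serial depth can be illustrated with a toy chain of dependent steps, where each step needs the previous result. The sketch below is only an illustration under assumed names (`serial_chain`, `majority_vote`, the modulus 97, and the step counts are all made up for the example); it is not taken from the cited paper, and because the toy workers are deterministic and identical, it shows the structure of the argument without proving anything.

```python
from collections import Counter

MODULUS = 97  # arbitrary small modulus for the toy chain


def serial_chain(x, steps):
    """Apply x -> (x*x + 1) mod MODULUS `steps` times; each step needs the last."""
    for _ in range(steps):
        x = (x * x + 1) % MODULUS
    return x


def majority_vote(answers):
    """Return the most common answer among parallel workers."""
    return Counter(answers).most_common(1)[0][0]


# The task requires 10 dependent steps.
target = serial_chain(3, 10)

# Width: eight parallel workers, each limited to 5 serial steps, then a vote.
shallow_answers = [serial_chain(3, 5) for _ in range(8)]
voted = majority_vote(shallow_answers)

# Depth: a single worker allowed all 10 serial steps.
deep_answer = serial_chain(3, 10)

print("voted:", voted, "deep:", deep_answer, "target:", target)
```

Adding more shallow workers changes nothing here, while one worker with enough serial steps reaches the target.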

9.8K

Guan Wang retweeted

Sapient Intelligence @Sapient_Int · Jul 22

Our co-founder William Chen is going to share more about the open-sourced Hierarchical Reasoning Model (HRM) at #FortuneAISingapore @FortuneMagazine tomorrow, under the panel theme "Beyond Human: AGI And The Future We’re Building"! We are excited about the practical path towards universally capable reasoning systems that rely on architectures, not scale, to reach real AGI. ⏰16:10-16:40 SGT, July 23, Mainstage

7.7K

Guan Wang @makingAGI · Jul 21

@ai_for_success Only ~2 GPU hours for pro Sudoku. 50 to 200 GPU hours for ARC-AGI 😀

471

59.5K

AshutoshShrivastava @ai_for_success · Jul 21

@makingAGI what was the total cost of training?? Also would be interested in full breakdown post.

110

38.3K
