Sapient Intelligence

19 posts


@Sapient_Int

We are building self-evolving Machine Intelligence to solve the world's most challenging problems.

Palo Alto, CA · Joined July 2024
28 Following · 1.6K Followers

Sapient Intelligence @Sapient_Int
We were honored to support the global AI community as a Gold Sponsor of the #AAAI26 Conference on Artificial Intelligence. It was truly inspiring to connect with so many brilliant minds across the industry. The future of AGI isn’t just being imagined, it is being built.

Sapient Intelligence @Sapient_Int
Proud to share that TRM, derived from our HRM model, is highlighted in Nature! 🎉🎉🎉 This marks an important step forward for HRM-based reasoning systems, demonstrating the strength of small, structured models on complex reasoning tasks. 💡

Sapient Intelligence @Sapient_Int
🔥 It’s official: the Sapient HRM Discord community is now live! This is a place to discuss, connect, and collaborate as we shape HRM’s future together. We will be sharing our latest work, releases, and tips, as well as hosting Q&A sessions 💬💬 Hop on this journey with us as we push the boundaries of what HRM, and AGI at large, can achieve! 🙌 ➡️ Join us on Discord: discord.gg/sapient

Sapient Intelligence @Sapient_Int
Google is expanding AI Mode to 180 countries, offering users a personalized restaurant-reservation service. Given specific details such as date, time, and party size, the AI can precisely filter and recommend restaurants that meet the criteria, greatly improving the efficiency and personalization of search. This shows the potential of AI in daily life, integrating it ever more deeply with user needs. However, as AI's personalized recommendation and predictive capabilities grow, balancing data privacy, user autonomy, and technological advancement remains an issue the industry must continuously address. theverge.com/news/763367/go…

Sapient Intelligence @Sapient_Int
We are on the @arcprize leaderboard now - a good starting point! Meanwhile, we are accelerating the iteration and application of the HRM model; stay tuned!
Guan Wang @makingAGI

Thanks to @arcprize for reproducing and verifying the results!

ARC-AGI-1: public 41% pass@2, semi-private 32% pass@2
ARC-AGI-2: public 4% pass@2, semi-private 2% pass@2

Due to differences in testing environments, a certain amount of variance in results is acceptable. In tests run on our infrastructure, the open-source version of HRM on our GitHub can reach 5.4% pass@2 on ARC-AGI-2. We welcome everyone to run it on your own infra and share your scores~

This is our first submission to the leaderboard, and it's a good starting point. We appreciate everyone's support and feedback on HRM, both before and after our appearance on the ARC leaderboard. All of it encourages and motivates us to improve.

The hierarchical architecture is designed to resolve premature convergence in long-horizon tasks, like master-level Sudoku puzzles that take humans hours to solve. See the comparison with a simple recurrent Transformer. Such a long chain might not be essential for ARC problems, and we only used a high-low ratio of 1/2; larger ratios are often needed for optimal performance on Sudoku.

In the case of ARC-AGI, the success of HRM is a testament to the model's fluid intelligence, that is, its capability to infer and apply abstract rules from independent, flat examples. We are glad a recent blog post discovered that the outer loop and data augmentation are essential for this ability, and we especially thank @fchollet @GregKamradt @k_schuerholt for pointing this out.

Finally, we are accelerating iteration on the HRM model and continuously pushing its limits, with good progress so far. We also believe the hierarchical architecture is highly effective in many scenarios; moving forward, we will make further targeted updates to the architecture and validate it on more applications. We will also release an FAQ addressing the key questions raised by the community. 🧠 Stay tuned!
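
A minimal sketch of the two-timescale recurrence described above, using GRU cells as stand-ins (module names, sizes, and wiring are illustrative assumptions, not the actual HRM implementation). The low-level module updates every step, while the high-level module updates once every `ratio` steps, matching one reading of the "high-low ratio" mentioned in the thread:

```python
# Hypothetical two-level recurrent loop (not the actual HRM code).
import torch
import torch.nn as nn

class TwoLevelRecurrent(nn.Module):
    def __init__(self, dim: int = 128, ratio: int = 2):
        super().__init__()
        self.ratio = ratio                    # low-level steps per high-level step
        self.low = nn.GRUCell(dim * 2, dim)   # fast module: sees input + high state
        self.high = nn.GRUCell(dim, dim)      # slow module: sees low-level state

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: (steps, batch, dim)
        steps, batch, dim = x_seq.shape
        z_low = x_seq.new_zeros(batch, dim)
        z_high = x_seq.new_zeros(batch, dim)
        for t in range(steps):
            # fast update at every step, conditioned on the slow state
            z_low = self.low(torch.cat([x_seq[t], z_high], dim=-1), z_low)
            if (t + 1) % self.ratio == 0:
                # slow update only every `ratio` steps
                z_high = self.high(z_low, z_high)
        return z_high

model = TwoLevelRecurrent(dim=128, ratio=2)
out = model(torch.randn(8, 4, 128))   # 8 steps, batch of 4
print(out.shape)                      # torch.Size([4, 128])
```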

Sapient Intelligence @Sapient_Int
Bigger ≠ Better. The GPT-5 rollout reminded everyone that raw scale isn’t a strategy. Real value now lives in agent reliability, not leaderboard one-shots. Our stance: optimize for closed-loop task success (plans → tools → checks → handoff), not just next-token accuracy. We benchmark Sapient HRM against process metrics: tool-call precision, recovery after tool error, and end-to-end SLA success.
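
A hypothetical sketch of how those three process metrics could be computed from agent-run traces (the Trace schema and field names are assumptions for illustration, not the actual benchmark harness):

```python
# Illustrative process-metric computation over agent run traces.
from dataclasses import dataclass

@dataclass
class Trace:
    tool_calls: int     # tool calls issued during the run
    correct_calls: int  # calls with the right tool and arguments
    tool_errors: int    # calls that returned an error
    recovered: int      # errors the agent recovered from
    met_sla: bool       # end-to-end task finished within the SLA

def process_metrics(traces: list[Trace]) -> dict[str, float]:
    calls = sum(t.tool_calls for t in traces)
    errors = sum(t.tool_errors for t in traces)
    return {
        "tool_call_precision": sum(t.correct_calls for t in traces) / max(calls, 1),
        "error_recovery_rate": sum(t.recovered for t in traces) / max(errors, 1),
        "sla_success_rate": sum(t.met_sla for t in traces) / len(traces),
    }

print(process_metrics([Trace(10, 9, 2, 1, True), Trace(5, 5, 0, 0, False)]))
```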

Sapient Intelligence @Sapient_Int
At the “Beyond Human: AGI And The Future We’re Building” Town Hall at Fortune Brainstorm AI 2025 in Singapore, William outlined our vision for AGI: “We’re exploring new architectures to push the boundaries - to make AI think like a human, not just model probabilities. True AGI will not only advance the AI frontier but also help with everyday tasks and generate real‑world revenue.” #AGI #FortuneAISingapore Want to see the full discussion? 📺 Watch here: fortune.com/videos/watch/t…

Sapient Intelligence @Sapient_Int
Our co-founder William Chen is going to share more about the open-sourced Hierarchical Reasoning Model (HRM) at #FortuneAISingapore @FortuneMagazine tomorrow, under the panel theme "Beyond Human: AGI And The Future We’re Building"! We are excited about the practical path towards universally capable reasoning systems that rely on architectures, not scale, to reach real AGI. ⏰16:10-16:40 SGT, July 23, Mainstage

Sapient Intelligence @Sapient_Int
Hierarchical Recurrent Models towards AGI

Excited to have Sapient Intelligence’s Yue Wu share insights on our Sapient-H Architecture alongside @nvidia in Wuhan, China. Stay tuned - something interesting is brewing!

Sapient Intelligence @Sapient_Int
To add on to point #1: for DeepSeek's fine-grained MoE architecture, the first advantage of FP8 is that it reduces the communication volume of MoE dispatch and combine by 50% compared to BF16; per DeepSeek's tech report, the communication-to-computation ratio is roughly 1:1 with FP8. The second advantage is that FP8 GEMMs are faster than BF16 GEMMs on Hopper GPUs (the hardware spec is 2× throughput, though the practical speedup is lower, and DeepSeek adopted an online block-wise/tile-wise quantization strategy that adds overhead). The third advantage is memory savings, which can be translated into training efficiency, e.g., by increasing the number of micro-batches in pipeline parallelism (PP).
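
A back-of-the-envelope check of the 50% figure above (the token count and hidden size below are illustrative, not DeepSeek's actual dispatch sizes):

```python
# MoE dispatch/combine traffic scales with bytes per element,
# so FP8 (1 B/element) halves BF16 (2 B/element).
tokens, hidden = 4096, 7168          # illustrative: tokens routed per step, hidden dim
bf16_bytes = tokens * hidden * 2     # BF16: 2 bytes per element
fp8_bytes = tokens * hidden * 1      # FP8: 1 byte per element
print(f"BF16: {bf16_bytes / 2**20:.0f} MiB, FP8: {fp8_bytes / 2**20:.0f} MiB, "
      f"saving: {1 - fp8_bytes / bf16_bytes:.0%}")   # saving: 50%
```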

Sapient Intelligence @Sapient_Int
Respectfully disagree:
1. Most SoTA models are trained in BF16 (some operations are mixed-precision, but the main activations and GEMMs are in BF16), so it's not an FP32→FP8 leap. Also, memory savings won't directly translate into training efficiency.
2. DeepSeek wouldn't do this compression during pretraining. However, the low-rank structure of the Q/K/V projection can keep computation cost low while increasing the number of attention heads (DeepSeek-R1 has significantly more attention heads than Qwen/Llama), which increases the capacity of the model. Of course, this optimization can help RL rollouts, but DeepSeek didn't disclose its RL training efficiency.
3. Inference speed can only help RL rollouts, and again, DeepSeek didn't disclose its RL training efficiency. MTP won't make pretraining faster, but it will make pretraining better -- effectively making it more efficient.
4. DeepSeek didn't train their model on consumer-grade GPUs.
Jared Friedman @snowmaker

Lots of hot takes on whether it's possible that DeepSeek made training 45x more efficient, but @doodlestein wrote a very clear explanation of how they did it. Once someone breaks it down, it's not hard to understand. Rough summary:
* Use 8-bit instead of 32-bit floating point numbers, which gives massive memory savings
* Compress the key-value indices which eat up much of the VRAM; they get 93% compression ratios
* Do multi-token prediction instead of single-token prediction, which effectively doubles inference speed
* Mixture of Experts model decomposes a big model into small models that can run on consumer-grade GPUs
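
For point 2 in the reply above, a minimal sketch of a low-rank Q/K/V projection (dimensions and names are illustrative assumptions; this is not DeepSeek's actual MLA implementation). Projecting through a small latent rank r keeps the projection cost near d·r instead of d·d, leaving budget for more attention heads:

```python
# Illustrative low-rank (factored) QKV projection.
import torch
import torch.nn as nn

d_model, rank, n_heads, d_head = 1024, 128, 32, 64   # assumed sizes

class LowRankQKV(nn.Module):
    def __init__(self):
        super().__init__()
        self.down = nn.Linear(d_model, rank, bias=False)                  # d -> r
        self.up_qkv = nn.Linear(rank, 3 * n_heads * d_head, bias=False)   # r -> Q,K,V

    def forward(self, x: torch.Tensor):
        q, k, v = self.up_qkv(self.down(x)).chunk(3, dim=-1)
        return q, k, v

x = torch.randn(2, 16, d_model)        # (batch, seq, d_model)
q, k, v = LowRankQKV()(x)
dense = d_model * 3 * n_heads * d_head                      # unfactored QKV params
factored = d_model * rank + rank * 3 * n_heads * d_head     # low-rank QKV params
print(q.shape, f"{dense / factored:.1f}x fewer projection params")  # ~6.9x here
```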

Sapient Intelligence @Sapient_Int
Greetings from Sapient Intelligence! Happy New Year! LFG AGI 2025!