Binhang Yuan

59 posts

@Hades317

Asst. Prof.@HKUST, Postdoc.@ETH, Ph.D.@RiceUniversity, ML Systems, fan of @LFC.

Hong Kong · Joined February 2014

403 Following · 227 Followers

Binhang Yuan retweeted

Yi Wu@jxwuyi·4 Mar

AReaL v1.0 released: Effortless #RL to make your #OpenClaw self-evolve 🚀: •🛠️ One-click agentic RL for any existing agent •📈 Open-source SOTA on tau2-bench •💎 A new PyTorch-native 5D-Parallel Engine Archon •🤖A full #opencode recipe GitHub: github.com/inclusionAI/AR…

145

60.2K

Binhang Yuan@Hades317·27 Nov

@BeidiChen 👍

245

Binhang Yuan retweeted

Beidi Chen@BeidiChen·26 Nov

📘 Holiday read! From Software Engineer to AI Environment Architect 🚀 Tldr of our blog: We see an exciting future where engineers 👩‍💻 won’t stop coding, but the highest leverage shifts to designing the environments 🛝 where AI can think, build, and evolve. 🎬 Demo: Inspired by opinions from @karpathy @RichardSSutton, our newly built framework Vortex shows this in concrete action: by architecting the right environment in LLM serving systems, an agent from @OpenHandsDev can generate and implement new Sparse Attention algorithms on @sgl_project in a single run and deliver up to 4× ⏩ gains, work that normally takes an ML-systems engineer weeks. - In the short term, these environments let AI agents contribute meaningfully to real engineering work today. - In the long term, they become the playgrounds where future agents learn to surpass today’s limitations 💖. Got too excited by the demo and wrote a blog post before the holidays with my great students @chenzhuoming911 @IronSteveZhou who built it: infini-ai-lab.github.io/ai-environment… Vortex code: github.com/Infini-AI-Lab/… Vortex doc: infini-ai-lab.github.io/vortex_torch #MLOps #AIAgents #SystemsEngineering #AIInfrastructure #OpenSourceAI #AIOps #AIFrameworks #SparseAttention #AIResearch

155

40.6K

Binhang Yuan retweeted

Yi Wu@jxwuyi·13 Aug

Sorry for a typo in my previous post. The correct arxiv paper link is: arxiv.org/pdf/2508.07976…

Yi Wu@jxwuyi

🔍We introduce ASearcher, a search agent trained by end2end RL. Large-scale (up to 128 turns) RL with AReaL unlocks Long-Horizon Agentic Search (+20.8/+46.7% on GAIA/xBench). 💻Data, Code&Model: github.com/inclusionAI/AS… 📄Paper: arxiv.org/abs/2508.07976v #Agent #OpenSource #LLM #AGI

4.3K

Binhang Yuan retweeted

Yi Wu@jxwuyi·1 Aug

Tired of intricate system code for RL training? 🤯 We release AReaL-lite – A lightweight AReaL version for AI researchers! 🚀#opensource ✨ Algorithm-first design & APIs🎉 ✨ 80% less code with 90% of AReaL's full efficiency 🎉 ✨ Customizable agentic RL🎉 🔗 github.com/inclusionAI/AR…

9.1K

Binhang Yuan retweeted

Beidi Chen@BeidiChen·24 Jul

🥳

Infini-AI-Lab@InfiniAILab

Huge thanks to @tinytitans_icml for an amazing workshop — see you next year! Honored to receive a Best Paper Award 🏆 Let’s unlock the potential of sparsity! Next up: scaling to hundreds/thousands of rollouts? Or making powerful R1/K2-level LLMs (not just 8B 4-bit models) run on edge devices? Big kudos to @RJ_Sadhukhan, @chenzhuoming911, @haizhong_zheng, @IronSteveZhou, collaborator Emma Strubell, and our advisor @BeidiChen!

145

17.6K

Binhang Yuan retweeted

Infini-AI-Lab@InfiniAILab·17 Jun

🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n

222

120.9K

Binhang Yuan retweeted

Yi Wu@jxwuyi·4 Jun

We release fully async RL system AReaL-boba² for LLM & SOTA code RL w. Qwen3-14B! @Alibaba_Qwen #opensource 🚀system&algorithm co-design → 2.77x faster ✅ 69.1 on LiveCodeBench 🔥 multi-turn RL ready 🔗 Project: github.com/inclusionAI/AR… 📄 Paper: arxiv.org/pdf/2505.24298 1/3👇

154

131.6K

Binhang Yuan@Hades317·23 Nov

🤗

Hong Kong 🇭🇰

172

Binhang Yuan retweeted

Ying Sheng@ying11231·1 Nov

Another angle: What I always encourage people to do is believe there is no barrier to whatever you want to work on. Whether training or inference, nothing is really hard if you spend time and focus. They are all paper tigers. The secret is to just ignore people who tell you it’s too late/hard and only listen to people who’d like to help you achieve your goal.

Stas Bekman@StasBekman

Future ML specialization: Inference or Training? Very soon training LLMs will become the domain of a few companies and there will be very little need for experts in LLM training, especially once LLMs reach the level of CV cats-vs-dogs quality. Inference expertise, on the other hand, has a tiny barrier to entry, so it'll become commoditized in no time. It'd be very difficult to compete with others and differentiate oneself doing inference work. I suppose finetuning/RAG will still need experts for some years to come, until foundation models provide all that out of the box. And then what? Where do you feel one could still be an ML expert 5-10 years from now? I worked through the dot-com bubble, and this ML bubble feels all too similar. Hence the question. Thank you for your insights.

498

65.2K

Binhang Yuan retweeted

HKUST Computer Science and Engineering@HKUSTCSE·25 Oct

We are recruiting! Applications including 1) a cover letter, 2) a full curriculum vitae, 3) names and contact information of at least three referees, 4) a research statement, and 5) a teaching statement should be submitted via facrecruit.hkust.edu.hk.

5.3K

Binhang Yuan retweeted

VITA Group@VITAGroupUT·18 Oct

1/ 🌟 Excited to announce #Model-#GLUE (#neurips2024 D&B), a new framework designed by an extensive team from UNC, UMD, UT Austin, HKUST, Google, and CMU to #scale pre-trained LLMs efficiently! 🚀 Tackling the challenge of #aggregating disparate pre-trained LLMs, we introduce holistic guidelines and benchmarking for when you have a large, diverse model zoo "in the wild"! #LLM #AIresearch

7.9K

Binhang Yuan retweeted

Together AI@togethercompute·25 Sep

🚀 Big news! We’re thrilled to announce the launch of Llama 3.2 Vision Models & Llama Stack on Together AI. 🎉 Free access to Llama 3.2 Vision Model for developers to build and innovate with open source AI. api.together.ai/playground/cha… ➡️ Learn more in the blog together.ai/blog/llama-3-2…

252

60.3K

Binhang Yuan retweeted

Together AI@togethercompute·5 Sep

🚀 NVIDIA H200 and the Together Kernel Collection (TKC) are coming to Together GPU Clusters: delivering accelerated performance, efficiency, and scalability for AI training, fine-tuning, and inference workloads. ⚡ 🔗 Read the blog post together.ai/blog/nvidia-h2…

36.5K

Binhang Yuan retweeted

Tianqi Chen@tqchenml·4 Sep

#MLSys2025 call for papers is out! The conference will be led by the general chair @matei_zaharia , PC chairs @CelineLinatGT, and Gauri Joshi. Consider submitting and bringing your latest works in AI and systems—more details at mlsys.org.

23.4K

Binhang Yuan retweeted

Max Tegmark@tegmark·3 May

I'm excited that people are so interested in our latest paper!

Carlos E. Perez@IntuitMachine

1/n Math Meets AI: Kolmogorov-Arnold Networks Unleash the Power of Composition Imagine a world where deep learning models, the enigmatic engines driving the AI revolution, are no longer shrouded in mystery. What if we could peer into their inner workings, understand their reasoning, and even collaborate with them to uncover the secrets of the universe? This is the promise of Kolmogorov-Arnold Networks (KANs), a revolutionary new architecture poised to transform the landscape of artificial intelligence. Step aside, Multi-Layer Perceptrons (MLPs), the workhorses of deep learning. While your contributions are undeniable, your limitations are becoming increasingly apparent. Your black-box nature hinders interpretability, your inefficiency restricts your potential, and your struggle with high-dimensional data leaves vast realms of knowledge unexplored. The time has come for a new breed of neural networks, one that combines the power of deep learning with the elegance of mathematics and the transparency of human understanding. The core issue with MLPs lies in their structure. While their universal approximation capabilities are well established, their fixed activation functions on nodes and reliance on linear transformations limit their ability to efficiently represent complex functions, especially those with compositional structures. This inefficiency leads to larger models with increased computational costs and hinders interpretability, as understanding the reasoning behind their predictions becomes challenging. Additionally, MLPs often struggle with the curse of dimensionality, where their performance deteriorates as the input data dimensionality increases. KANs address these pain points by drawing inspiration from the Kolmogorov-Arnold representation theorem, which states that any continuous multivariate function can be decomposed into a composition of univariate functions and addition. 
Instead of fixed activation functions on nodes, KANs employ learnable activation functions on edges, represented by splines. This key difference allows KANs to efficiently learn both the compositional structure of a function and the individual functions within that composition. As a result, KANs achieve superior accuracy compared to MLPs, particularly when dealing with high-dimensional data and complex functions. Furthermore, KANs offer significant advantages in terms of interpretability. Their structure allows for intuitive visualization of the learned functions, providing insights into the model's decision-making process. Additionally, the paper introduces techniques for simplifying KANs without sacrificing accuracy, further enhancing their transparency. This interpretability is crucial for scientific applications where understanding the underlying mechanisms and reasoning behind predictions is essential. The paper demonstrates the capabilities of KANs through various experiments. In data fitting tasks, KANs outperform MLPs in approximating high-dimensional functions and exhibit better scaling laws, meaning their performance degrades less with increasing data dimensionality. In PDE solving, KANs achieve remarkable accuracy with significantly fewer parameters compared to MLPs. Moreover, KANs showcase their potential for scientific discovery by rediscovering known mathematical laws and identifying complex physical phenomena. Prior research has explored the Kolmogorov-Arnold representation theorem in the context of neural networks, but these efforts were limited by restrictions on network depth and width, lack of modern training techniques, and insufficient empirical validation. KANs overcome these limitations by allowing for arbitrary depths and widths, utilizing backpropagation for efficient training, and providing extensive empirical evidence of their superior performance and interpretability. 
In conclusion, KANs represent a significant advancement in deep learning, offering a promising alternative to MLPs with improved accuracy, efficiency, and interpretability. Their ability to effectively handle compositional structures, high-dimensional data, and complex functions makes them particularly well-suited for scientific applications. As research and development in this area continue, KANs have the potential to revolutionize deep learning and accelerate scientific discovery across various domains.
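
For readers who want the statement behind the thread above, the Kolmogorov-Arnold representation theorem can be written roughly as follows. This is a sketch in the notation commonly used for KANs; the symbols (n, \Phi_q, \phi_{q,p}, and the layer indices l, i, j) are assumptions introduced here and do not appear in the thread itself.

```latex
% Kolmogorov-Arnold representation: a continuous f on [0,1]^n
% decomposes into univariate continuous functions and addition.
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad \phi_{q,p} : [0,1] \to \mathbb{R}, \quad \Phi_q : \mathbb{R} \to \mathbb{R}.

% A KAN layer, as described in the thread, places a learnable univariate
% function (e.g. a spline) on every edge and sums at each node:
x_{l+1,\,j} \;=\; \sum_{i=1}^{n_l} \phi_{l,j,i}\!\left( x_{l,\,i} \right),
\qquad j = 1, \dots, n_{l+1}.
```

Read this way, the contrast with an MLP drawn in the thread is that the nonlinearity sits on the edges (each \phi_{l,j,i} is learned), while the nodes only add.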

261

37.3K

Binhang Yuan retweeted

Together AI@togethercompute·13 Mar

Today we are thrilled to share that we’ve raised $106M in a new round led by @SalesforceVC with participation from @coatuemgmt and our existing investors. Our vision is to rapidly bring innovations from research to production and to ultimately build the best platform we can for developers, startups, and enterprises to run generative AI applications built on open-source models at production scale. together.ai/blog/series-a2

431

170.7K

Binhang Yuan retweeted

Beidi Chen@BeidiChen·13 Mar

📢 Announcing our new speculative decoding framework Sequoia ❗️❗️❗️ It can now serve Llama2-70B on one RTX4090 with half-second/token latency (exact❗️no approximation) 🤔Sounds slow as a sloth 🦥🦥🦥??? Fun fact😛: DeepSpeed -> 5.3s / token; 8 x A100: 25ms / token (costs 8 x $18,000 = $140,000+ but an RTX4090 is $1000+😉) You can serve with your 2080Ti too! Curious how? Check it out 👇 Website: infini-ai-lab.github.io/Sequoia-Page Paper: arxiv.org/abs/2402.12374 Code: github.com/Infini-AI-Lab/…

119

682

104.1K

Binhang Yuan retweeted

Christopher De Sa@chrismdesa·28 Feb

We are excited to announce this year’s keynote speakers for #MLSys2024: Jeff Dean @JeffDean, Zico Kolter @zicokolter, and Yejin Choi @YejinChoinka! MLSys this year will be held in Santa Clara on May 13–16. More details at mlsys.org.

70.6K
