Pinned Tweet
Bing He
27 posts

Bing He
@binghe2727
Applied Scientist II@Amazon | LLM | CS PhD@Georgia Tech. Research areas: ML/AI, NLP, LLM
Joined March 2020
386 Following 97 Followers

Check out our article on why the harness matters and the key findings on self-evolving agents
Henry Lu@HenryL_AI

Our work on LLM Reasoning "A Reward-Guided Dual-Phase Framework for Adaptive Inference-Time Reasoning" (arxiv.org/pdf/2509.25420) has been accepted at ACL 2026 Findings! Key insight: LLM reasoning has a natural planning-execution structure, but existing tree-search methods ignore

@HenryL_AI @amazon Huge milestone for self-improving agents. A-Evolve makes it possible to turn a base agent into a continuously improving system with almost no manual overhead.

Launch Post🧬 A-Evolve: The PyTorch Moment for Self-evolving AI
Today we at @amazon launch the universal infrastructure that turns any agent into a self-improving SOTA agent — zero human intervention.
You give it a base agent → it returns a continuously evolving Top-10 agent.
3 lines of code. 0 hours of manual harness engineering:
🟢 MCP-Atlas → 79.4% (#1) +3.4pp
🔵 SWE-bench Verified → 76.8% (~#5) +2.6pp
🟣 Terminal-Bench 2.0 → 76.5% (~#7) +13.0pp
🟡 SkillsBench → 34.9% (#2) +15.2pp
Thanks @binghe2727 @YisiSang @sammyershi @linminhua16 for the contribution!
#AgenticAI #AEvolve #SelfImprovingAgents


Take a look at our self-evolving agent! The self-evolution era.
Henry Lu@HenryL_AI

📌Blog link: binghe2727.github.io/Revisit-DeepSe…
Authors: Bing He, Zhan Shi, Rui Sun, Hanqing Lu
#LLM #DeepSeek #MoE #ReasoningModels #PostTraining #RLHF #AIResearch #GenAI

@KaiShu0327 Big congrats, Kai. Looking forward to seeing what comes out of your research group.

We conducted a large-scale, data-driven analysis and found that polite, evidence-backed corrections work well. More details are in our paper and the shared code and data.
arXiv: arxiv.org/abs/2403.04852
Code and Data: github.com/claws-lab/resp…

@UsmanNaseem87 Big congrats, Usman. This is amazing. Looking forward to the research output.🎉

Excited to announce that my application to OpenAI's API Researcher Access Program has been accepted! Grateful for this opportunity to delve deeper into cutting-edge AI research #OpenAI #AIResearch 🚀🔍


🌟 Proud Advisor Moment 🌟
Glad to share that my Ph.D. student Yushun Dong @Yushun_Dong will join the CS Department at Florida State University as a tenure-track Assistant Professor this fall! He will be the second Ph.D. student in my group to join academia after graduation.

📣 We have two meme papers accepted for #Web4Good #TheWebConf2024! 📷 "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation" and "Modularized Networks for Few-shot Hateful Meme Detection". We will put up the preprints soon! Congrats to the team! 🥳