
We just released ContextBench 🎉

A benchmark built to answer a question many repo-level evaluations still miss: do coding agents truly retrieve and use the right context, or do they just get lucky? 👀✨

📊 Highlights
🧩 1,136 real-world issues across 66 repos and 8 languages
🧠 Expert-verified gold contexts at file, block, and line levels
👣 Full trajectory tracking of what the agent actually reads and explores
📈 Metrics covering Recall, Precision, F1, Efficiency, and Usage Drop (a rough sketch of these metrics below 👇)

🔍 Key Findings
1️⃣ Complex agentic scaffolds do not improve context retrieval quality 😅 In many cases they introduce over-engineering, echoing "The Bitter Lesson" in AI research
2️⃣ Many SOTA LLMs favor high recall over precision 📉 They retrieve more context, but also much more noise
3️⃣ Retrieved does not mean utilized ❗ Agents often inspect the right code but fail to incorporate it into the final patch
4️⃣ More balanced retrieval strategies tend to achieve stronger Pass@1 while keeping compute cost reasonable ⚖️✨

🌐 Homepage 👉 contextbench.github.io
📄 Paper 👉 arxiv.org/abs/2602.05892
💻 Code 👉 github.com/EuniAI/Context…
🗂️ Dataset 👉 huggingface.co/datasets/Conte…
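For the curious, here is a minimal sketch of what file-level Precision/Recall/F1 could look like, plus one plausible reading of Usage Drop as the gap between gold context an agent retrieved and gold context its patch actually used. The function names and the Usage Drop formula here are my assumptions for illustration, not necessarily ContextBench's exact implementation; see the paper for the real definitions.

```python
# Sketch of file-level context metrics over sets of file paths
# (assumed definitions, not ContextBench's exact implementation).

def precision_recall_f1(retrieved: set[str], gold: set[str]) -> tuple[float, float, float]:
    """Standard set-overlap metrics over file paths."""
    if not retrieved or not gold:
        return 0.0, 0.0, 0.0
    hits = len(retrieved & gold)       # gold files the agent actually found
    precision = hits / len(retrieved)  # how much of what it read was relevant
    recall = hits / len(gold)          # how much of the gold context it covered
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def usage_drop(retrieved: set[str], used_in_patch: set[str], gold: set[str]) -> float:
    """Assumed reading of 'Usage Drop': gold context the agent inspected
    but never incorporated into the final patch (retrieved != utilized)."""
    if not gold:
        return 0.0
    retrieved_recall = len(retrieved & gold) / len(gold)
    used_recall = len(used_in_patch & gold) / len(gold)
    return retrieved_recall - used_recall

# Toy example: the agent reads all the right files but patches only one.
gold = {"src/parser.py", "src/lexer.py"}
retrieved = {"src/parser.py", "src/lexer.py", "README.md"}  # high recall, noisy
used = {"src/parser.py"}

p, r, f1 = precision_recall_f1(retrieved, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")        # 0.67 / 1.00 / 0.80
print(f"usage_drop={usage_drop(retrieved, used, gold):.2f}")  # 0.50
```

The toy example mirrors Findings 2 and 3: recall is perfect but precision suffers from noise, and half of the gold context that was retrieved never makes it into the patch.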













