EuniAI

20 posts

EuniAI banner
EuniAI

EuniAI

@Euni_AI

💡Self-healing code, ever-growing innovation.

Katılım Eylül 2025
1 Takip Edilen39 Takipçiler
EuniAI
EuniAI@Euni_AI·
🚀 HerAgent: Built by Her, to Rethink Success in Codebase Runnability 𝗛𝗮𝗽𝗽𝘆 𝘄𝗼𝗺𝗲𝗻'𝘀 𝗱𝗮𝘆! 👩‍💻 A Breakthrough by an All-Female Team As we release this on International Women's Day, we are proud to highlight that HerAgent is the work of an all-female team. This project stands as a testament to the exceptional contributions of women in hard-core AI and Software Engineering. We are not just participating in the revolution of autonomous development—we are leading it. 😫Tired of codebases that clone successfully but fail to execute?Exhausted by the endless battle with dependencies, environments, and build configurations? Meet HerAgent 👇 It’s not just another Code Agent—it’s an Environment Setup Agent, to get you out of dependency hell. Current autonomous agents often fail because they rely on shallow signals. HerAgent is the first agent designed to achieve full Runnability—ensuring real-world projects actually execute. 🔧 Key Innovations: Environment Maturity Hierarchy: We move beyond "Installable" and "Testable" states. If the main workflow doesn't run, the job isn't done. Script-Centric Architecture: HerAgent generates a unified Global Bash Script. All dependency resolutions and error patches are accumulated in one place, preventing context loss. Self-Healing Loop: It employs a continuous Execute → Analyze → Self-Repair cycle. 📊 Why It Matters: Achieving a 79.6% performance boost, HerAgent performs exceptionally well in the hardest edge cases (like brittle C/C++ repos). Before AI can write software, it must learn to make it run—and HerAgent is that first step. 📄 Paper 👉 arxiv.org/pdf/2602.07871 💻 Code 👉 github.com/EuniAI/EnvAgent #AI #Agent #SoftwareEngineering #WomenInTech #IWD2026
EuniAI tweet mediaEuniAI tweet mediaEuniAI tweet mediaEuniAI tweet media
English
0
2
4
122
EuniAI
EuniAI@Euni_AI·
We just released ContextBench 🎉 A benchmark built to answer a question many repo-level evaluations still miss: Do coding agents truly retrieve and use the right context, or do they just get lucky?👀✨ 📊 Highlights 🧩 1,136 real-world issues across 66 repos and 8 languages 🧠 Expert verified gold contexts at file, block, and line levels 👣 Full trajectory tracking of what the agent actually reads and explores 📈 Metrics covering Recall, Precision, F1, Efficiency, and Usage Drop 🔍 Key Findings 1️⃣ Complex agentic scaffolds do not improve context retrieval quality 😅 In many cases, they introduce over-engineering, echoing "The Bitter Lesson" in AI research 2️⃣ Many SOTA LLMs favor high recall over precision 📉 They retrieve more context, but also much more noise 3️⃣ Retrieved does not mean utilized ❗ Agents often inspect the right code but fail to incorporate it into the final patch 4️⃣ Retrieval strategies that are more balanced tend to achieve stronger Pass@1 while keeping compute cost reasonable ⚖️✨ 🌐 Homepage 👉 contextbench.github.io 📄 Paper 👉 arxiv.org/abs/2602.05892 💻 Code 👉 github.com/EuniAI/Context… 🗂️ Dataset 👉 huggingface.co/datasets/Conte…
EuniAI tweet mediaEuniAI tweet media
English
0
0
5
207
EuniAI
EuniAI@Euni_AI·
Here are some exciting updates from our EuniAI team! Our latest version of Prometheus reached Top 1 on AWS’s new SWE-PolyBench multi-language benchmark 🥇 This benchmark is one of the most challenging evaluations for code agents, so we are really happy to see Prometheus perform strongly. On SWE-bench Verified, Prometheus has also moved up to Top 6 among open-source systems 🏆 Our GitHub repo is now at 400+ stars, and the community keeps growing. We iterate every day and will continue pushing Prometheus toward a more capable and reliable autonomous code agent 🤖 Thanks to everyone for following our work. More improvements coming soon! 🚀✨
EuniAI tweet mediaEuniAI tweet mediaEuniAI tweet media
English
1
0
3
116
EuniAI
EuniAI@Euni_AI·
@The_Delysium ($AGI) is building a blockchain‑native collaboration network for agentic AI, featuring Lucy(lucyos.ai) — an agentic operating system — and the YKILY Network (You Know I Love You) — a digital‑native financial layer for AI agents. Alongside leaders such as Microsoft, Google, and Nvidia, and supported by select top‑tier investors, Delysium delivers secure, scalable infrastructure to power autonomous agent ecosystems.
English
0
0
2
208
EuniAI
EuniAI@Euni_AI·
We’re excited to announce our partnership with @The_Delysium — together, we’re setting open, autonomous, multilingual, and cost-efficient standards for AI coding. By bridging AI systems engineering with Web3 best practices — @LucyOSAI and the YKILY Network — we aim to make AI development more transparent, verifiable, and accessible for everyone.
EuniAI tweet media
English
4
2
4
490
EuniAI
EuniAI@Euni_AI·
We’re excited to announce our partnership with @The_Delysium — together, we’re setting open, autonomous, multilingual, and cost-efficient standards for AI coding. By bridging AI systems engineering with Web3 best practices — @LucyOSAI and the YKILY Network — we aim to make AI development more transparent, verifiable, and accessible for everyone.
EuniAI tweet media
English
0
2
3
187
EuniAI retweetledi
Delysium - $AGI 🟨
Delysium - $AGI 🟨@The_Delysium·
Delysium is proud to announce an official partnership with the @ucl Software Systems Engineering Team and Dr. He Ye from the Department of Computer Science, aiming to advance AI coding standards. We have reached a significant milestone: the integration of GPT-5 + Pass@1 has been rigorously verified as ranked Top 2 on SWE-Bench (swebench.com) on the first attempt. Furthermore, our jointly developed open-source agent, Prometheus, has achieved a 71.2% resolution rate, ranked 8th globally—just behind @OpenHandsDev. Together, we are setting open, autonomous, multilingual, and cost-efficient standards for AI coding — powering @LucyOSAI and the YKILY Network, and releasing open models to reduce LLM ops costs for everyone.
Delysium - $AGI 🟨 tweet media
English
8
14
35
4.6K
EuniAI
EuniAI@Euni_AI·
🚀 New milestone: Hermes hits 50% resolve rate on Terminal-Bench! 📊 Performance: ✅ 91.67% easy ⚡ 54.55% medium 🔥 20.83% hard Now ranked #6 on the leaderboard, Hermes proves the power of iterative AI agents in complex terminal environments. #LLM #Agents #AIResearch
EuniAI tweet media
English
0
4
5
619
EuniAI
EuniAI@Euni_AI·
🚀 Benchmark update 🚀 Prometheus + DeepSeek-Chat reached 35.33% resolve rate on SWE-bench Lite — achieving state-of-the-art performance in autonomous code agents. This is just the beginning 👀
English
0
3
3
581
EuniAI
EuniAI@Euni_AI·
🧠 Athena Memory System Launched! 🧠 An innovative memory system purpose-built for autonomous code agents: ⚡️ Long-term learning & memory ⚡️ Deeper context retention ⚡️ Greater adaptability Athena marks the next leap toward truly self-managing, intelligent code agents.
English
0
5
4
420
EuniAI
EuniAI@Euni_AI·
🧠 Prometheus keeps evolving. With new integrations & features, we’re pushing the boundaries of what code agents can do: •Context-rich problem solving •Flexible workflows •Real-world performance breakthroughs
English
0
1
3
176
EuniAI
EuniAI@Euni_AI·
🚀 Prometheus Major Update 🚀 Big leap for autonomous code agents: 🔗 MCP Integration for stronger context & multimodal interaction 🔧 Web Search + Custom Build Commands 📊 44% solve rate on SWE-bench Verified The future of self-managing code starts now.
EuniAI tweet media
English
2
5
5
586
EuniAI
EuniAI@Euni_AI·
🚀 Introducing EuniAI 🚀 We’re building autonomous code agents to: ⚡ Resolve issues automatically ⚡ Understand codebases seamlessly ⚡ Cut bug costs & boost productivity 👀 Demo invites coming soon! 🌐 euni.ai 💼 Follow us on LinkedIn for deeper updates.
English
3
5
6
1.1K