Segan

5 posts

@SeganBoss

Hyderabad, India · Joined November 2017
137 Following · 1 Follower
Segan retweeted
Sai Rajeswar (@RajeswarSai)
🧵 Introducing 𝐄𝐧𝐭𝐞𝐫𝐩𝐫𝐢𝐬𝐞𝐎𝐩𝐬-𝐆𝐲𝐦🚀 : a rigorous new benchmark for stateful agentic planning and tool use in real enterprise environments. 1,150 expert-curated tasks · 512 tools · 164 DB tables · 8 domains. Every task verified by hand-written SQL, checking goal completion, state integrity and policy compliance🔥 𝐓𝐡𝐞 𝐡𝐞𝐚𝐝𝐥𝐢𝐧𝐞: Claude Opus 4.5 — our best-performing model succeeds on just 37.4% of tasks. With oracle tool access. No tool discovery required. 📄 arxiv.org/abs/2603.13594 (trending #4 on daily-papers) 🌐 enterpriseops-gym.github.io 🤗 huggingface.co/datasets/Servi… 💻 github.com/ServiceNow/Ent…
2 replies · 26 reposts · 54 likes · 5.7K views
Segan retweeted
ServiceNow AI Research (@ServiceNowRSRCH)
Apriel-1.5-15B-Thinker sits in the “most attractive quadrant” of the Intelligence vs. Parameters tradeoff — alongside much larger proprietary models. And among 4B–40B models, it ranks at the top. The secret? 🤔
1 reply · 2 reposts · 24 likes · 1.6K views
Segan retweeted
ServiceNow AI Research (@ServiceNowRSRCH)
SLAM Labs presents Apriel-1.5-15B-Thinker 🚀 An open-weights multimodal reasoning model that hits frontier-level performance with just a fraction of the compute.
14 replies · 77 reposts · 334 likes · 59.6K views