Nicholas Edwards

13 posts

@nedwards99

Joined October 2023
46 Following · 12 Followers
Nicholas Edwards (@nedwards99)
RExBench is now available in Terminal Bench (@harborframework)! 🎉 We integrate 2 tasks (cogs, othello) along with a local testing framework so you can test if your agents can autonomously implement novel AI research extensions.
Nicholas Edwards (@nedwards99)
🧵 Do coding agents know when to ask for help? Real-world coding tasks are rarely fully specified, yet most agents are optimized to execute autonomously rather than clarify.
Nicholas Edwards reposted
Sarah Breckner (@hieristSarah)
Diffusion LLMs can think EoS-by-EoS! The higher the generation length, the better the performance of Masked Diffusion LLMs, even though they generate the same number of words and only augment them with more and more EoS tokens 👀