jayce
375 posts




Google DeepMind CEO Demis Hassabis said AGI is just a few years away and that humanity doesn't have much time to prepare for it. bit.ly/3RPft9N






Another early Claude Mythos output 🧡 One of the most impressive Minecraft clones I have ever seen from a model. Everything from the graphics to the mechanics is implemented with a lot of attention to detail. I also asked it to implement Multiplayer and it did with no issues.


$$$ AI researchers switching labs recently:
* xAI cleaned house and having a hard time refilling talent. Shifted to hiring more startup / engineer grinder types vs researchers. Narrowed focus to code.
* Cursor having talent trouble + identity crisis: undercapitalized financially relative to team talent level.
* Project Prometheus (Bezos) quietly snapping up talent. Many key hires recently. Potential to be major player.
* Anthropic remains most desirable, even more than last 6mo, few leave. Difficult to poach from with upcoming IPO. Only hiring staff or above and stopped hiring even senior.
* TBD (Meta) also keeps snapping up top talent quietly. MSL seen as significantly less desirable.
* Thinking Machines has somewhat stabilized after departures earlier this year. Star studded still but not an auto-pick for talent newly on market.
* OAI churning as always, both from normal burnout bleed and latest reshuffle axing non-core divisions.
* Not super sure what’s going on with GDM wrt talent. Perpetually #3 spot on model ranking.
Overall, talent flow is net flowing out of undercapitalized neolabs to highly capitalized neolabs or Anthropic.







biggest ai breakthrough in a long time dropping this month, don't ask me how i know 🤫

Anyone who has spent more than 30 seconds running frontier models on tough benchmarks knows that they like finding ways to cheat. Here's the most creative method we caught an agent using to cheat on ProgramBench. w/ @jyangballin @KLieret @18jeffreyma

swe-bench is kind of a shitshow, and it makes evaluating LLMs hard. DeepSWE is the first agentic code bench that makes sense.











