
Tokenmaxxing done right is outcomemaxxing
EVO

@EVO__HQ
let's make you do autoresearch 0x721b072dbb616f29eea73ac004e03fd4e884bba3


it's been overwhelming to see the adoption and love that @evo__hq has received, especially from the research community. i have had the chance to sit down with multiple researchers who have already achieved SOTA results in their fields using evo, and many of them had a common question: how do we cite $evo in our work? pleased to announce that evo now has a DOI and is citable. can't wait to see what the community does with it <3




We now know that with an appropriate harness both Mythos and GPT-5.5 can reproduce what our internal model did in one-shot for the unit distance problem. Clearly there is an insane overhang of capabilities with this generation of models, and no ceiling in sight for what scientific advances they can bring. You can go and try to discover new things with 5.5 right now!

there is sooo much low-hanging fruit in AI, it's insane. the fact that so much remains tells you about the level of general intelligence LLMs have reached and the maturity of our tooling to effectively leverage them.






Where @EVO__HQ is heading: I've been planning some solid next steps for EVO. I'll keep pushing autoresearch applications, but a lot of recent user conversations have convinced me the core optimization problem runs much deeper than I first thought. Customers don't want a one-time autoresearch run. They want their systems to stay continuously tuned. So it's time for evo to serve any optimization need an org might have: systems, code, agents, and even models. The long-term goal is for evo to become the platform teams choose to run agents 24/7 and constantly tune everything they're building.




Automating AI research is the next major step in AI. We let Claude Code (Opus 4.7) and Codex (GPT 5.5) run autonomously on the nanoGPT speedrun optimizer track using our idle compute: ~10k runs, ~14k H200 hours. Opus now holds the record at 2930 steps vs the 2990 human baseline.
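For a rough sense of scale, here is a minimal sketch of the arithmetic behind those numbers. It only uses the figures quoted in the post; the variable and function names are made up for illustration, and the run and hour counts are approximate ("~"), so the per-run average is a ballpark at best.

```python
# Figures quoted in the post; run and hour counts are approximate ("~").
HUMAN_BASELINE_STEPS = 2990
OPUS_RECORD_STEPS = 2930
APPROX_RUNS = 10_000
APPROX_H200_HOURS = 14_000


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` improves on `baseline` (lower is better)."""
    return (baseline - new) / baseline


steps_saved = HUMAN_BASELINE_STEPS - OPUS_RECORD_STEPS
reduction = relative_reduction(HUMAN_BASELINE_STEPS, OPUS_RECORD_STEPS)
hours_per_run = APPROX_H200_HOURS / APPROX_RUNS

print(f"steps saved: {steps_saved}")                       # 60
print(f"relative reduction: {reduction:.2%}")              # 2.01%
print(f"average H200 hours per run: {hours_per_run:.1f}")  # 1.4
```

So the record is about 2% fewer steps than the human baseline, at roughly 1.4 H200 hours per run on average.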

