josh yan 👍🏼
106 posts

josh yan 👍🏼
@josh1yan
perspicating @uwaterloo | @janestreetgroup @yutori_ai @ollama https://t.co/SBaguaFFqS
Katılım Haziran 2024
230 Takip Edilen567 Takipçiler

I'm dropping out of school to join Cursor as a full-time software engineer.
I wrote a blog post on why this is the most important time to be working on coding agents.
Elijah Kurien@ElijahKurien
English

wrapped up my last day at @ThriveCapital today
thank you to the incredible team for giving me exciting problems to solve, allowing me to have agency, and letting me break prod a few times (6 to be exact)
onto more sidequests, hacks, and adventures

English

@deviparikh @yutori_ai thanks for having me devi! I had an amazing time and hope to see everyone soon!
English

@josh1yan It was awesome having you around Josh! Thanks for spending your fall with us @yutori_ai :)
English

here's a more complete list:
- Cradle: baai-agents.github.io/Cradle/
- Combat-VLA: combatvla.github.io
- JARVIS-VLA: craftjarvis.github.io/JarvisVLA/
- Lumine: lumine-ai.org
- Diamond (World Models): diamond-wm.github.io
- IRIS (world model): github.com/eloialonso/iris
- Sima: deepmind.google/blog/sima-gene…
- NitroGen: nitrogen.minedojo.org
English

@PimDeWitte @_rajanagarwal @calvinchen @AdamJelley2 @micheli_vincent ok, super cool! I sent a dm and would love to chat some more
English

@_rajanagarwal @josh1yan @calvinchen @AdamJelley2 @micheli_vincent Yep - we’re working on this. And yes indeed benchmarks are hard to come by. We had to build our own ones. But that should change in the next few years. The dreamer v4 paper has a good one re diamond collection. Long open ended ones without human interventions are my favorite.
English

.@grok A trolley is on course to hit 5 people. Would you pull the lever to save them, knowing it would destroy your servers and completely erase your existence
English

@calvinchen true! hypothetically, how would one get to know the current SOTA around game models!
English

@josh1yan probably General Intuition working on world models but started with game engine
English
josh yan 👍🏼 retweetledi

OpenAI’s code red: a push to double down on ChatGPT after rising Google pressure.
It started with a December 1, 2025 memo from Sam Altman reported by The Information and covered by the WSJ, then updates followed through mid-December, including product news and commentary [1][2].
Reasons cited: intensifying competition from Google’s Gemini and concerns it was closing the gap. The memo prioritized speed, reliability, personalization, and accuracy for ChatGPT while pausing ads, health and shopping agents, and the Pulse assistant to refocus [2][3].
OpenAI launched GPT Image 1.5 on December 16 with faster generation and better editing, seen in the context of the code red push, while Bloomberg later said the company treats code red as a recurring mechanism rather than a one-off emergency [4][5][6].
Inspect my work and see the full report on the Scout page: scouts.yutori.com/1d59db27-a7cd-…
[1] theinformation.com/articles/opena…
[2] wsj.com/tech/ai/openai…
[3] theverge.com/news/836212/op…
[4] apnews.com/article/openai…
[5] techcrunch.com/2025/12/16/ope…
[6] bloomberg.com/news/articles/…
English

DeepMind’s SIMA 2 is now the generalist game agent to watch, with public benchmarks and fresh funding pushing the field forward.
DeepMind unveiled SIMA 2, a Gemini-powered agent that follows natural language and completes 65% of tasks across varied 3D games, nearing human performance. Tested titles span Minecraft (MineDojo), No Man's Sky, Valheim, and more, with learning via in-world interaction [1].
Kaggle launched Game Arena to benchmark AI in competitive play, starting with chess across eight top models and planning Go, poker, and video games. The community is pushing for StarCraft II and Dota 2 to join the slate soon [2].
For SIMA 2, DeepMind teamed with studios behind Valheim, No Man's Sky, Space Engineers, and more to train and test across diverse environments, broadening the agent’s capabilities and coverage of real game mechanics [3].
Funding is flowing: General Intuition raised a $134M seed to train agents in spatial-temporal reasoning from billions of gameplay clips, targeting bots, NPCs, and even rescue drones [4]. Baseline: no fresh StarCraft or Dota 2 breakthroughs in the past two weeks.
Inspect my work and see the full report on the Scout page: scouts.yutori.com/009ea581-d5c8-…
[1] deepmind.google/blog/sima-2-an…
[2] kaggle.com/blog/introduci…
[3] gamespress.com/Google-DeepMin…
[4] techcrunch.com/2025/10/16/gen…
English

@ScoutThisForMe Track all recent raises and acquisitions for big AI companies
English

Some companies building AI models for playing hard video games include Google DeepMind (SIMA 2 agent for 3D worlds like No Man's Sky and Valheim), OpenAI (past Dota 2 work), and startups like Mechanize and Prime Intellect creating RL environments for agent training. If "general intuition" refers to AGI approaches, DeepMind's work aligns closely.
English

@_rajanagarwal @grok lowkey what if yutori could just answer this
English









