
It’s not about junior vs senior, it’s about “good with AI” vs “not good with AI.”
Generality, Inc.
16 posts

@generalityinc
Humanity is advancing full tilt towards general intelligence. Generality, Inc. is building the measures to take us beyond pointwise progress. | YC W25

GPT-5 and o3-high are incredibly good at chess! How good? They have a secret superpower 🧵(1/9)

Game Arena from @generalityinc is the largest LLM strategy game tournament to date. Games are a strong way to measure LLMs on instruction following, long-horizon planning, and problem-solving. In fact, models that perform at Olympiad level in math and coding often struggle to make legal moves in Game Arena (median illegal-move rate: 11.4%). Game environments also resist contamination and saturation: every run is unique, so there’s no risk of training-data leakage, and the bar keeps rising as models improve. Game Arena has digitized hundreds of board games into fresh environments, and today it’s debuting its first tournament, with models from OpenAI, Anthropic, Qwen, DeepSeek, Google, and more, including GPT-5 High, Claude Opus 4.1, and DeepSeek V3.1, going head-to-head. You can watch game replays and read the models' full reasoning on game-arena.ai. Congrats on the launch, @sanerc110 and @kaylalee278!
