Basis

78 posts

@BasisOrg

Joined July 2022
0 Following · 883 Followers
Pinned Tweet
Basis @BasisOrg
New paper from Basis' Project MARA team and collabs. The ability to learn and use world models is a key aspect of human intelligence, but evaluating this ability remains elusive. In this work we propose WorldTest, a representation-agnostic, behavior-based agent eval framework.
1
10
20
3.3K
Basis @BasisOrg
We're hiring research scientists in PL and other areas. Join us! basis.ai/join-us/#caree
0
0
2
268
Basis @BasisOrg
We're attending and sponsoring #POPL2026 in Rennes, France 🇫🇷. If you're around, stop by our sponsor booth to chat about research and open opportunities at Basis. We'll be there Wednesday 10:00-19:30 and Friday 10:00-18:00.
2
0
6
331
Basis retweeted
Yichao Liang @yichao_liang
New preprint on learning abstract world models for robotics planning. Paper + code below. 🤖🌐 Must an agent plan by simulating pixels frame by frame, or can it think in abstractions? Consider planning an international flight: we can reason about buying tickets, changing airplanes, and crossing borders without committing to the color of the airplane or the milliseconds before takeoff. Absent abstraction, planning over long time horizons would be intractable, because every minute detail of the world would need to be simulated. [1/7]
2
10
20
2.8K
Basis retweeted
Alex Prompter @alex_prompter
🚨 MIT and Basis Research just dropped a new way to measure if AI actually understands the world and the results are brutal.
It’s called "WorldTest", and it doesn’t just check how well an AI predicts the next frame or maximizes reward. It checks whether the model can build an internal model of reality and use it to handle new situations.
They built 'AutumnBench', a suite of 43 interactive worlds and 129 tasks where AIs must:
• Predict hidden parts of the world (masked-frame prediction)
• Plan sequences of actions to reach a goal
• Detect when the environment’s rules suddenly change
Then they tested 517 humans vs. top AI models Claude, Gemini 2.5 Pro, and o3. Humans crushed every model. Even massive compute scaling barely helped.
The takeaway is wild... current AIs don’t understand environments; they pattern-match inside them. They don’t explore strategically, revise beliefs, or run experiments like humans do.
WorldTest might be the first benchmark that actually measures understanding, not memorization. The gap it reveals isn’t small it’s the next grand challenge in AI cognition.
Paper: Benchmarking World-Model Learning (arxiv.org/abs/2510.19788)
54
217
932
109.5K
Basis @BasisOrg
We'll also be at NeurIPS; come talk to us! Visit our booth or register for our social: luma.com/ivw952te
0
0
0
344
Basis @BasisOrg
And we're hosting a social at NeurIPS. If you want to come chat with us, RSVP: luma.com/ivw952te
0
0
3
310