Jonathan Scholz

152 posts

@JonathanScholz2

Joined September 2018
314 Following · 259 Followers
Jonathan Scholz@JonathanScholz2·
@MaxLobovsky Yep this is really nice to see. Big e2e models get all the attention, but work like this shows how far you can get with modern vision and a little controls elbow-grease. Ship it!
0 replies · 0 reposts · 0 likes · 155 views
Maxim Lobovsky@MaxLobovsky·
This is the most promising direction in robotics today. Problem: industrial robot arms are too expensive to install. Solution: low-cost arms + vision + targeted machine learning techniques to make programming easier and the system more robust to variation, without lots of hardware engineering.
Yunho Kim@awesomericky99

We present a "hybrid system" that supplements conventional automation with "learning" for task- and safety-level adaptiveness. Deployed in a factory for motor cable soldering (< 0.6 mm tolerance), resulting in 108 motors at a 99.4% success rate with < 20 min of data per task. Paper: arxiv.org/abs/2604.22235

5 replies · 19 reposts · 210 likes · 19.8K views
Pulkit Agrawal@pulkitology·
Eka means unity -- “one” in Sanskrit and “first” in Finnish. We’re building intelligence for the physical world in its native language: forces.
Until now, robotics faced a tradeoff — generality or speed. The real world requires both. Robotics also faced a data problem.
Our Vision–Force–Action (VFA) model — the first of its kind — breaks the generality-speed tradeoff and the data barrier. It's a new foundation uniting performance, generality, and safety for putting capable robots in everyone's hands.
Today, I am excited to share our journey of pushing robots beyond human limits. Today, dexterity becomes scalable. Today, I welcome you to the Era of Eka.
Co-founded with @haarnoja, and so thrilled and grateful to be working with a dream team at @EkaRobotics. Learn more: ekarobotics.com
63 replies · 221 reposts · 1.9K likes · 301.5K views
Jonathan Scholz@JonathanScholz2·
*this* is what people should be excited about in robotics. Not skill demos - people using the platform for things the designers didn't anticipate.
Ben Pekarek@ben_pekarek

Today marks the end of my first full week @GeneralistAI Last Monday, I was given a challenge: use our GEN-1 model to teach a robot a task of my choosing, using the same no-code platform our customers use. I picked the ball-and-vase magic trick. It was one of my favorites as a kid, and it felt like the right mix of fun and surprisingly hard. A few days later, GEN-1 pulled it off. I left Friday having watched the robot nail it 14 times in a row. What’s wild is that even 4 months ago, if you told me you could go from idea to on-robot skill in a couple of days, I probably wouldn’t have believed you. Really excited to be building with an incredible team. Can’t wait to see what week two brings 🤖

0 replies · 0 reposts · 1 like · 61 views
Jonathan Scholz@JonathanScholz2·
This is great news for the (already great) London AI scene
Seb Johnson@SebJohnsonUK

Anthropic has announced that it is massively expanding its London presence. It’s just secured a new office for 800 people - a huge jump from its 200 current employees. OpenAI announced its first permanent office in London this week and now @AnthropicAI is doubling down. Meta, OpenAI, DeepMind, wayve and so many others have huge offices in London. It’s becoming the leading AI hub outside of the US. LETS GO

0 replies · 0 reposts · 2 likes · 155 views
Jonathan Scholz@JonathanScholz2·
Really nice results, really nice analysis. Steerability is *the* missing thing needed to get robotics outside the lab. Even more important than generalisation (though not unrelated)
Lucy Shi@lucy_x_shi

1/ We just released π0.7 — a steerable generalist robot model with emergent capabilities. I want to share a bit of the backstory, because π0.7 taught me something surprising about where robot learning is heading. A thread on bittersweet lessons 🧵

0 replies · 0 reposts · 2 likes · 181 views
Jonathan Scholz reposted
Lucy Shi@lucy_x_shi·
1/ We just released π0.7 — a steerable generalist robot model with emergent capabilities. I want to share a bit of the backstory, because π0.7 taught me something surprising about where robot learning is heading. A thread on bittersweet lessons 🧵
31 replies · 102 reposts · 849 likes · 82.2K views
Jonathan Scholz@JonathanScholz2·
@DrJimFan Nice work @DrJimFan, another banger! Can you give a sense of how well your core agent can hide the brittleness of robot perception and control primitives? Like, does it genuinely compose without expert intervention, or does pass-through to a VLA still feel like the path?
0 replies · 0 reposts · 0 likes · 64 views
Jim Fan@DrJimFan·
The power of the Claw, in the palm of a robot hand. Agentic robotics is here! Today, we open-source CaP-X: vibe agents, alive in the physical world. They incarnate as robot arms and humanoids with a rich set of perception APIs, actuation APIs, and auto-synthesize skill libraries as they go. CaP-X is a strict superset of our old stack, because policies like VLAs are “just” API calls as well. It solves many tasks zero-shot that a learned policy would struggle with. And we are doing much more than vibing. CaP-X is our most systematic, scientific study on agentic robotics so far:
- We build a comprehensive agentic toolkit: perception (SAM3 segmentation, Molmo pointing, depth, point cloud), control (IK solvers, grasp planner, navigation), and visualization (EEF, mask overlays) that work across different robots.
- CaP-Gym: LLM’s first Physical Exam! 187 manipulation tasks across RoboSuite, LIBERO-PRO, and BEHAVIOR. Tabletop, bimanual, mobile manipulation. Sim and real. Can’t wait to see the gradients flow from CaP-Gym to the next wave of frontier LLM releases.
- CaP-Bench: we benchmark 12 frontier LLMs/VLMs (Gemini, GPT, Opus, Qwen, DeepSeek, Kimi, and more) across 8 evaluation tiers. We systematically vary API abstraction level, agentic harness, and visual grounding methods. Lots of insights in our paper.
- CaP-Agent0: a training-free agentic harness that matches or exceeds human expert code on 4 out of 7 tasks without task-specific tuning.
- CaP-RL: if you get a gym, you get RL ;). A 7B OSS model jumps from 20% to 72% success after only 50 training iterations. The synthesized programs transfer to real robots with minimal sim-to-real gap.
3 years ago, our team created Voyager, one of the earliest agentic AIs that plays and learns in Minecraft continuously. Its key ideas — skill libraries, self-reflection loops, and in-context planning — have since influenced many modern agentic designs. Today, the agent graduates from Minecraft and gets a real job.
It’s April Fool’s, but this Claw is getting its hands dirty for real! Link in thread:
100 replies · 114 reposts · 722 likes · 70.9K views
Jonathan Scholz@JonathanScholz2·
Nice to finally see a worthy counterpoint to VLA in the language-conditioned robotics space! I'm personally quite bullish on CLAW approaches, which lean on more sophisticated agentic cores than even the best VLAs. This'll spark lots of ideas on the executive->control interface
Jim Fan@DrJimFan


0 replies · 1 repost · 2 likes · 178 views
Jonathan Scholz@JonathanScholz2·
Nice to see someone taking a theoretical position on intelligence that diverges from "RL is the definition and solution for everything." I love me some RL, but human intelligence feels much more mimetic to me. It lives in the culture
Pedro A. Ortega@AdaptiveAgents

Agency is usually formalized as utility maximization. But must it be? LLMs suggest a different foundation: intelligence as acquiring behavioral schemas from interaction structure. My new paper: "Universal AI as Imitation" investigates the limit-case of LLM-style models.

0 replies · 0 reposts · 2 likes · 124 views
Jonathan Scholz@JonathanScholz2·
This is a narrative-breaking observation that should have people sitting up in their chairs. The moral isn't "give up on pretraining", it's "find a way to train without losing too much rank" (the language people got this right - step 1 shouldn't be "train on task data")
Tony Zhao@tonyzzhao

Maybe the bitter lesson for robotics that goes against all the current narratives is that a small amount of in-domain demos is really hard to beat, especially when it comes to e.g. industrial use cases, where variability is limited.

1 reply · 0 reposts · 5 likes · 260 views
Jonathan Scholz@JonathanScholz2·
@tonyzzhao yep, you sacrifice too much steerability to get generality. you had the right idea with ACT :)
0 replies · 0 reposts · 1 like · 105 views
Tony Zhao@tonyzzhao·
Maybe the bitter lesson for robotics that goes against all the current narratives is that a small amount of in-domain demos is really hard to beat, especially when it comes to e.g. industrial use cases, where variability is limited.
Alper Canberk@alpercanbe

i was visiting a hackathon where 80+ participants were training pi0/0.5, gr00t, smolvla, ACT, DP, etc. on lerobot arms the best and most sample efficient policies were trained *from scratch* we still do not have an open source x-embodied GPT-2, but i'm hopeful for this year

15 replies · 22 reposts · 408 likes · 53.6K views
Jonathan Scholz@JonathanScholz2·
i realised i really like kid shows 🤣 For intellectual stuff i appreciate nuance, but for food and art and media i need to be hit over the head. share.google/ZZBOqeHYheEv2d… The themes in this show are super exaggerated, like the high fructose corn syrup of life lessons, and i love it
0 replies · 1 repost · 1 like · 87 views
Jonathan Scholz@JonathanScholz2·
That’s the only way i know to make a team greater than the sum of its parts. thank you for coming to my TED talk
0 replies · 0 reposts · 2 likes · 87 views
Jonathan Scholz@JonathanScholz2·
disagree? great! take your gloves off, so i can take mine off too, and we’ll get to the bone faster. but let’s not try to win the fight, let’s try to win the peace. The atoms of debate should be ideas, not people.
1 reply · 0 reposts · 1 like · 96 views
Jonathan Scholz@JonathanScholz2·
I finally figured out why I think “disagree and commit” is bullshit: It makes teams mode-seeking, not mean-seeking. it’s the last bastion of the fixed-minded. my style is more like “disagree and converge”
1 reply · 0 reposts · 2 likes · 113 views
AgileX Robotics@AgilexRobotics·
NERO 7-DoF: Flexibility That Moves Like a Human
Redefine dexterity with NERO – perfect for lab setups and humanoid R&D.
✅ ±0.1mm Repeatability
✅ Multi-angle Mounting (upright/inverted/side)
✅ 4.8kg Ultra-lightweight Mobility
✅ 3.0kg Payload
✅ 580mm Working Radius
5 replies · 18 reposts · 149 likes · 7.3K views