James Maclaurin

793 posts

@jamesmaclaurin

Philosopher of science (University of Otago). Director of the Centre for AI and Public Policy. Working on philosophical implications of Artificial Intelligence.

Dunedin, New Zealand. Joined April 2009
179 Following · 518 Followers
James Maclaurin @jamesmaclaurin
There has been a huge influx of companies building humanoid robots. Unsure what to make of the steady stream of demos of robots running, jumping and moving stuff about? This analysis from @FredaDuan is an excellent place to start.
Freda Duan @FredaDuan

Robotics is moving fast (lots of exciting demos lately), but there’s still so much confusion and a lack of clear benchmarks. Without a shared framework, it’s hard to evaluate real progress.

I genuinely want to see this industry thrive, so in the spirit of open-sourcing knowledge and pushing things forward, this (long) thread is my attempt to break down what’s impressive vs. what’s not; how to evaluate robots & demos; and what frameworks can help.

I welcome all feedback and input. A better-informed public means fewer missteps, less wasted effort, and faster progress for everyone. Let’s push the field forward.

🌟TL;DR: What matters & where we are:
- Hardware: All about consistency + scaled production. The supply chain is maturing. Chinese players (and $TSLA) arguably have an edge in mass production.
- Locomotion: More or less a solved problem via RL.
- Manipulation: Still Day 1. Cutting-edge research (e.g., VLA) is here, but better simulation ($NVDA) is needed for a real step change.
- Big picture: Robotics has made significant progress, but for manipulation, the industry is still in the demo phase. That said, even demos deserve recognition: it all starts with getting one task right once before scaling to generalizable, consistent skills.

This doesn’t mean we’re “always years away” from commercialization. In fact, I believe things can move fast: from one final impressive demo to commercialization could take months, not years.

🌟How to evaluate a robot/demo:
Start with the end goal (smooth/human-like, generalizable, and consistent) and work backward:
1/ Generalization: Should work across diverse objects with varying color, reflectivity, and softness. Test with slight disruptions (lighting changes, interference, object positioning).
2/ Consistency: Right now, most demos are cherry-picked. Over time, we should see robots executing 100s of tasks, thousands of times, with high accuracy.

🧠Data & Robot Training 101
Robotics has a well-known data problem. Two primary sources:
> Imitation learning (tele-op, real-world data)
> Reinforcement learning (mostly simulation-based)

Rule of thumb: If you have real-world data (e.g., $TSLA FSD), imitation learning can take you far. If you don’t, RL is the only option. RL is far more data-efficient.

Over time, RL has gone mainstream. No single breakthrough, but improvements in:
✅ Language models (OpenAI, etc.)
✅ $NVDA's tools (Isaac Gym/Sim/Lab) → reducing the sim-to-real gap.
✅ Moving from single-task RL → expanding to a broader set of useful tasks.

Chips? Unlike training LLMs, robot models are much smaller, so even Chinese players aren’t significantly chip-constrained.

2025 (and beyond) = The Year of RL for Robotics!

▶️Hardware
"Consistent quality + scaled production."
Precision is far harder to maintain for mobile robots than for cars: every moving part must perform identical actions across an entire fleet, regardless of complex joint movements.
Progress: The robotics supply chain (esp. in Asia) is evolving fast; even dexterous hands are being tackled.
Cost: ~$100k(?) per humanoid in the U.S., less than half for Chinese players.
Open questions: Will robotics hardware follow the EV industry shakeout, where Chinese players + a few OEMs (like $TSLA) dominate? Many EV makers burned billions and either gave up or are still stuck in production hell, even with a mature supply chain. Why wouldn’t the same happen here?

▶️Locomotion (e.g. walking, running, backflip)
Locomotion is more or less a solved problem today.
Old way: Rule-based MPC (think Boston Dynamics).
New way: RL, which is far more scalable and leads to better balance & control.
Similar to language models, there's debate on whether RL has true generalization across different tasks and verticals. As of today, robots can hike various terrains without additional training, but distinctly different actions (like hiking vs. a backflip) still require separate RL models.
Great breakdown from Jim Fan: x.com/DrJimFan/statu…

▶️Manipulation (e.g. sorting, wiping, general tasks)
RL, which works like magic for locomotion, struggles with manipulation. Why? Objects vary in shape, rigidity, material, making the sim-to-real gap much larger. Tasks like cooking, washing dishes, and opening bottles are highly diverse and hard to generalize.
VLA (Vision-Language-Action) is the hot buzzword, essentially FSD for robotics. It’s an end-to-end model, heavily trained with RL and simulation. Companies like Figure AI, Google DeepMind, and China’s Galbot are actively working on this.
Reality check: Whether it's 1X or Tesla Optimus, even tele-op-assisted manipulation is still impressive given the current state of the field.
Open questions: How much does real-world data improve manipulation training (i.e., why do mass-scale production today)? If simulation is the bottleneck, what does $NVDA need to build to leapfrog the industry?

----

I’m excited about where robotics is headed, but we need to cut through the noise and focus on real progress. Breakthroughs will come faster than expected, if we focus on what truly moves the industry forward.

Replies: 0 · Reposts: 0 · Likes: 1 · Views: 32
James Maclaurin @jamesmaclaurin
If you’re interested in all the DeepSeek panic / hype, this is excellent analysis from Freda Duan.
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 153
James Maclaurin reposted
Amanda @Pandamoanimum
Amanda tweet media
Replies: 250 · Reposts: 37.9K · Likes: 295.1K · Views: 9.9M
James Maclaurin @jamesmaclaurin
The robots are coming…
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 41
James Maclaurin @jamesmaclaurin
Yet again, we seem to be being tripped up by our benchmarks. The scramble for superintelligence has seen companies developing models designed to excel on every benchmark we can dream up. Really useful models might look quite different…
Andrej Karpathy @karpathy

LLM model size competition is intensifying… backwards! My bet is that we'll see models that "think" very well and reliably that are very very small. There is most likely a setting even of GPT-2 parameters for which most people will consider GPT-2 "smart". The reason current models are so large is because we're still being very wasteful during training - we're asking them to memorize the internet and, remarkably, they do and can e.g. recite SHA hashes of common numbers, or recall really esoteric facts. (Actually LLMs are really good at memorization, qualitatively a lot better than humans, sometimes needing just a single update to remember a lot of detail for a long time). But imagine if you were going to be tested, closed book, on reciting arbitrary passages of the internet given the first few words. This is the standard (pre)training objective for models today. The reason doing better is hard is because demonstrations of thinking are "entangled" with knowledge, in the training data. Therefore, the models have to first get larger before they can get smaller, because we need their (automated) help to refactor and mold the training data into ideal, synthetic formats. It's a staircase of improvement - of one model helping to generate the training data for the next, until we're left with a "perfect training set". When you train GPT-2 on it, it will be a really strong / smart model by today's standards. Maybe the MMLU will be a bit lower because it won't remember all of its chemistry perfectly. Maybe it needs to look something up once in a while to make sure.

Replies: 0 · Reposts: 0 · Likes: 1 · Views: 97
James Maclaurin @jamesmaclaurin
You’ve heard the story about the two guys being chased by a lion? In short, superintelligence just needs to be much smarter than us. Its performance can still vary from domain to domain. Yes, we won’t build omniscience any time soon but we might still hit superintelligence.
BURKOV @burkov

What surprises me about talented people in tech is their capacity to be brilliant in one area and entirely delusional in another, even within their domain. For instance, those who invented modern machine learning, fully aware of its mechanics and inherent weaknesses, might still claim, "Superintelligence is near; we need to prepare." I want to scream, "How can you, knowing exactly how it works, believe it will soon consume us?" Yet, they somehow do. A language model, however good, is still a machine learning model—a simple mathematical formula whose parameters are only as good as the dataset used to estimate them. Any dataset, no matter how large, has high-density regions, like news articles or fiction books, and low-density regions, like cutting-edge scientific and medical articles. Thus, the model excels in high-density areas and becomes unreliable in low-density ones. Anyone capable of training an LLM knows this; it's common machine learning knowledge. Yet, they persist in saying "superintelligence is near." So, I wonder if it's just human nature to excel in one area and be utterly mistaken in another, or if they are cold-blooded liars causing panic to profit from scared people?

Replies: 0 · Reposts: 0 · Likes: 0 · Views: 62
James Maclaurin @jamesmaclaurin
Aurora Australis over Dunedin tonight. Amazing.
James Maclaurin tweet media
Replies: 0 · Reposts: 2 · Likes: 7 · Views: 686
John Zerilli @JohnZerilli
@cecilymwhiteley Not sure if it’s relevant, but a controversial figure who’d have something to say on this is Szasz.
Replies: 2 · Reposts: 0 · Likes: 3 · Views: 390
Cecily Whiteley @cecilymwhiteley
I'm becoming increasingly interested in the phenomenon of the rapidly growing number of diagnoses of mental disorders, and what, if anything, this tells us about the validity of psychiatric categories. Any reading recommendations on this?
Replies: 21 · Reposts: 11 · Likes: 83 · Views: 19.2K
Colin Gavaghan @ColinGavaghan
Merry Cthulhu-Krampus-Christmas.
Colin Gavaghan tweet media
Replies: 1 · Reposts: 0 · Likes: 5 · Views: 206
James Maclaurin @jamesmaclaurin
Thanks to all who helped. It was a fascinating project to be part of and I hope it will mark the beginning of a new AI-enhanced phase of healthcare in Aotearoa New Zealand. Have a good Christmas @ChiefSciAdvisor
Replies: 0 · Reposts: 1 · Likes: 7 · Views: 1.6K
James Maclaurin reposted
Joe Campbell @PhilosopherJoeC
@Philip_Goff I could never get a bad science book published but scientists get to write bad philosophy all the time.
Replies: 14 · Reposts: 17 · Likes: 196 · Views: 116.1K
James Maclaurin @jamesmaclaurin
@ColinGavaghan @Jesse_Norman Don’t suppose you’ve got a handy summary in your pocket? I’d love to see something balanced, risk-responsive and refreshingly sensible.
Replies: 1 · Reposts: 0 · Likes: 2 · Views: 29
Colin Gavaghan @ColinGavaghan
@Jesse_Norman With all the "Brexit freedoms/unleash innovation!" ballyhoo around AI policy, the balanced and risk-responsive framing of the Automated Vehicles Bill is refreshingly sensible. So credit where it's due.👏
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 537
Jesse Norman @Jesse_Norman
Very grateful to the Prime Minister for accepting my resignation. Having laid the ZEV mandate and framed the Automated Vehicles Bill, this is the right time to step down. Looking forward to more freedom to campaign on the River #Wye and other crucial local and national issues!
Jesse Norman tweet media
Replies: 75 · Reposts: 70 · Likes: 139 · Views: 200.8K