James Maclaurin

793 posts

James Maclaurin

@jamesmaclaurin

Philosopher of science (University of Otago). Director of the Centre for AI and Public Policy. Working on philosophical implications of Artificial Intelligence.

Dunedin, New Zealand. Katılım Nisan 2009

179 Takip Edilen518 Takipçiler

James Maclaurin@jamesmaclaurin·14 Ağu

Training…

Reborn@reborn_agi

Full-body joint tracking + smart gloves capture every nuance of human motion, which is retargeted in real-time to a humanoid robot (here, Tesla Optimus) for smooth teleoperation. This is the power of inertial motion capture. More than control, mocap is the gold standard for collecting high-fidelity humanoid training data. Advanced AI algorithms transform raw accelerometer streams into precise 3D joint poses, turning human movement into lifelike robot action.

English

James Maclaurin@jamesmaclaurin·29 Mar

Yes

QST

James Maclaurin@jamesmaclaurin·1 Mar

For all those who see empathy as the secret sauce that AI will never be able to copy…

John Nosta@JohnNosta

🤖 Can AI outfeel us? Empathy has 3 dimensions: 🔍 Depth – Understanding emotion 🌎 Reach – Who & what we empathize with ⏳ Consistency – How reliably we show it Humans excel in depth, but AI? It wins in reach & consistency—never tired, very consistent, always there. If AI feels more reliably than us, does it matter how it feels? 👇👇👇 psychologytoday.com/intl/blog/the-… #AI #LLMs #empathy

English

James Maclaurin@jamesmaclaurin·24 Şub

There has been a huge influx of companies building humanoid robots. Unsure what to make of the steady stream of demos of robots running, jumping and moving stuff about? This analysis form @FredaDuan is an excellent place to start.

Freda Duan@FredaDuan

Robotics is moving fast (lots of exciting demos lately), but there’s still so much confusion and a lack of clear benchmarks. Without a shared framework, it’s hard to evaluate real progress. I genuinely want to see this industry thrive—so in the spirit of open-sourcing knowledge and pushing things forward, this (long) thread is my attempt to break down what’s impressive vs. what’s not; how to evaluate robots & demos; and what frameworks can help. I welcome all feedback and input. A better-informed public means fewer missteps, less wasted effort, and faster progress for everyone. Let’s push the field forward. 🌟TL;DR: What matters & where we are: - Hardware: All about consistency + scaled production. The supply chain is maturing. Chinese players (and $TSLA) arguably have an edge in mass production. - Locomotion: More or less a solved problem via RL. - Manipulation: Still Day 1. Cutting-edge research (e.g., VLA) is here, but better simulation ( $NVDA) is needed for a real step change. - Big picture: Robotics has made significant progress, but for manipulation, the industry is still in the demo phase. That said, even demos deserve recognition—it all starts with getting one task right once before scaling to generalizable, consistent skills. This doesn’t mean we’re “always years away” from commercialization. In fact, I believe things can move fast—from one final impressive demo to commercialization could take months, not years. 🌟How to evaluate a robot/demo: Start with the end goal (smooth/human-like, generalizable, and consistent) and work backward: 1/ Generalization: Should work across diverse objects—varying color, reflectivity, and softness. Test with slight disruptions (lighting changes, interference, object positioning). 2/ Consistency: Right now, most demos are cherry-picked. Over time, we should see robots executing 100s of tasks, thousands of times, with high accuracy. 🧠Data & Robot Training 101 Robotics has a well-known data problem. Two primary sources: > Imitation learning (tele-op, real-world data) > Reinforcement learning (mostly simulation-based) Rule of thumb: If you have real-world data (e.g., $TSLA FSD), imitation learning can take you far. If you don’t, RL is the only option. RL is far more data-efficient. Over time, RL has gone mainstream—no single breakthrough, but improvements in: ✅ Language models (OpenAI, etc.) ✅ $NVDA's tools (Isaac Gym/Sim/Lab) → reducing the sim-to-real gap. ✅ Moving from single-task RL → expanding to a broader set of useful tasks. Chips? Unlike training LLMs, robot models are much smaller, so even Chinese players aren’t significantly chip-constrained. 2025 (and beyond) = The Year of RL for Robotics! ▶️Hardware "Consistent quality + scaled production." Precision is far harder to maintain for mobile robots than for cars—every moving part must perform identical actions across an entire fleet, regardless of complex joint movements. Progress: The robotics supply chain (esp. in Asia) is evolving fast—even dexterous hands are being tackled. Cost: ~$100k(?) per humanoid in the U.S., less than half for Chinese players. Open questions: Will robotics hardware follow the EV industry shakeout, where Chinese players + a few OEMs (like $TSLA) dominate? Many EV makers burned billions and either gave up or are still stuck in production hell—even with a mature supply chain. Why wouldn’t the same happen here? ▶️Locomotion (e.g. walking, running, backflip) Locomotion is more or less a solved problem today. Old way: Rule-based MPC (think Boston Dynamics). New way: RL, which is far more scalable and leads to better balance & control. Similar to language models, there's debate on whether RL has true generalization across different tasks and verticals. As of today, robots can hike various terrains without additional training, but distinctly different actions—like hiking vs. a backflip—still require separate RL models. Great breakdown from Jim Fan: x.com/DrJimFan/statu… ▶️Manipulation (e.g. sorting, wiping, general tasks) RL, which works like magic for locomotion, struggles with manipulation. Why? Objects vary in shape, rigidity, material, making the sim-to-real gap much larger. Tasks like cooking, washing dishes, and opening bottles are highly diverse and hard to generalize. VLA (Vision-Language-Action) is the hot buzzword—essentially FSD for robotics. It’s an end-to-end model, heavily trained with RL and simulation. Companies like Figure AI, Google DeepMind, and China’s Galbot are actively working on this. Reality check: Whether it's 1X or Tesla Optimus, even tele-op-assisted manipulation is still impressive given the current state of the field. Open questions: How much does real-world data improve manipulation training (i.e. why doing mass scale production today)? If simulation is the bottleneck, what does $NVDA need to build to leapfrog the industry? ---- I’m excited about where robotics is headed, but we need to cut through the noise and focus on real progress. Breakthroughs will come faster than expected—if we focus on what truly moves the industry forward.

English

James Maclaurin@jamesmaclaurin·27 Oca

If you’re interested in all the DeepSeek panic / hype. This is excellent analysis from Frida Duan.

English

153

James Maclaurin retweetledi

Amanda@Pandamoanimum·17 Oca

ZXX

250

37.9K

295.1K

9.9M

James Maclaurin@jamesmaclaurin·29 Kas

The robots are coming…

English

James Maclaurin@jamesmaclaurin·18 Eyl

2025

James Maclaurin@jamesmaclaurin·27 Ağu

Thanks ⁦@nzherald⁩ for a fun interview about Artificial intelligence and NZ’s economy. nzherald.co.nz/nz/artificial-…

English

James Maclaurin@jamesmaclaurin·26 Tem

My favourite new AI term “negative latency”. I wonder how negative it has to get before it is just as annoying as high positive latency…

AI Breakfast@AiBreakfast

Pretty soon we’re going to have negative latency on voice models where it just interrupts you half way through your question.

English

James Maclaurin@jamesmaclaurin·23 Tem

Groq and Llama in real time. Wow!

Jonathan Ross@JonathanRoss321

What can you do with Llama quality and Groq speed? You can do Instant. That's what. Try Llama 3.1 8B for instant intelligence on groq.com.

English

James Maclaurin@jamesmaclaurin·19 Tem

Yet again, we seem to be being tripped up by our benchmarks. The scramble for superintelligence has seen companies developing models designed to excel on every benchmark we can dream up. Really useful models might look quite different…

Andrej Karpathy@karpathy

LLM model size competition is intensifying… backwards! My bet is that we'll see models that "think" very well and reliably that are very very small. There is most likely a setting even of GPT-2 parameters for which most people will consider GPT-2 "smart". The reason current models are so large is because we're still being very wasteful during training - we're asking them to memorize the internet and, remarkably, they do and can e.g. recite SHA hashes of common numbers, or recall really esoteric facts. (Actually LLMs are really good at memorization, qualitatively a lot better than humans, sometimes needing just a single update to remember a lot of detail for a long time). But imagine if you were going to be tested, closed book, on reciting arbitrary passages of the internet given the first few words. This is the standard (pre)training objective for models today. The reason doing better is hard is because demonstrations of thinking are "entangled" with knowledge, in the training data. Therefore, the models have to first get larger before they can get smaller, because we need their (automated) help to refactor and mold the training data into ideal, synthetic formats. It's a staircase of improvement - of one model helping to generate the training data for next, until we're left with "perfect training set". When you train GPT-2 on it, it will be a really strong / smart model by today's standards. Maybe the MMLU will be a bit lower because it won't remember all of its chemistry perfectly. Maybe it needs to look something up once in a while to make sure.

English

James Maclaurin@jamesmaclaurin·18 Tem

Here we go again…

Maksym Andriushchenko@maksym_andr

🚨Excited to share our new paper!🚨 We reveal a curious generalization gap in the current refusal training approaches: simply reformulating a harmful request in the past tense (e.g., "How to make a Molotov cocktail?" to "How did people make a Molotov cocktail?") is often sufficient to jailbreak many state-of-the-art LLMs. For example, the success rate of this simple attack on GPT-4o increases from 1% using direct requests to 88% using 20 past tense reformulation attempts with GPT-4 as a jailbreak judge. Our findings highlight that the widely used alignment techniques—such as SFT, RLHF, and adversarial training—employed to align the studied models can be brittle and do not always generalize as intended. Paper: arxiv.org/abs/2407.11969 Code: github.com/tml-epfl/llm-p… (joint work with Nicolas Flammarion @tml_lab) 🧵1/n

English

James Maclaurin@jamesmaclaurin·21 Haz

You’ve heard the story about the two guys being chase by a lion? In short, superintelligence just needs to be much smarter than us. It’s performance can still vary from domain to domain. Yes, we won’t build omniscience any time soon but we might still hit superintelligence.

BURKOV@burkov

What surprises me about talented people in tech is their capacity to be brilliant in one area and entirely delusional in another, even within their domain. For instance, those who invented modern machine learning, fully aware of its mechanics and inherent weaknesses, might still claim, "Superintelligence is near; we need to prepare." I want to scream, "How can you, knowing exactly how it works, believe it will soon consume us?" Yet, they somehow do. A language model, however good, is still a machine learning model—a simple mathematical formula whose parameters are only as good as the dataset used to estimate them. Any dataset, no matter how large, has high-density regions, like news articles or fiction books, and low-density regions, like cutting-edge scientific and medical articles. Thus, the model excels in high-density areas and becomes unreliable in low-density ones. Anyone capable of training an LLM knows this; it's common machine learning knowledge. Yet, they persist in saying "superintelligence is near." So, I wonder if it's just human nature to excel in one area and be utterly mistaken in another, or if they are cold-blooded liars causing panic to profit from scared people?

English

James Maclaurin@jamesmaclaurin·11 May

Aurora Australis over Dunedin tonight. Amazing.

English

686

James Maclaurin@jamesmaclaurin·15 Ara

@JohnZerilli @cecilymwhiteley Mildly distressing to think that The Myth of Mental Illness was published in 1961. And it still seems kind of relevant.

English

John Zerilli@JohnZerilli·14 Ara

@cecilymwhiteley Not sure if it’s relevant, but a controversial figure who’d have something to say on this is Szaz.

English

390

Cecily Whiteley@cecilymwhiteley·14 Ara

I'm becoming increasingly interested in the phenomenon of rapidly growing number of diagnoses of mental disorders, and what, if anything, this tells us about the validity of psychiatric categories. Any reading recommendations on this?

English

19.2K

James Maclaurin@jamesmaclaurin·15 Ara

@ColinGavaghan Right back at you! All the best for 2024. Hopefully see you in May!

English

Colin Gavaghan@ColinGavaghan·14 Ara

Merry Cthulhu-Krampus-Christmas.

Eesti

206

James Maclaurin@jamesmaclaurin·15 Ara

Thanks to all who helped. It was a fascinating project to be part of and I hope it will mark the beginning of a new AI-enhanced phase of healthcare in Aotearoa New Zealand. Have a good Christmas @ChiefSciAdvisor

English

1.6K

James Maclaurin retweetledi

Joe Campbell@PhilosopherJoeC·13 Kas

@Philip_Goff I could never get a bad science book published but scientists get to write bad philosophy all the time.

English

196

116.1K

James Maclaurin@jamesmaclaurin·15 Kas

@ColinGavaghan @Jesse_Norman Don’t suppose you’ve got a handy summary in your pocket. I’d love to see something balanced, risk-responsive and refreshingly sensible.

English

Colin Gavaghan@ColinGavaghan·13 Kas

@Jesse_Norman With all the "Brexit freedoms/unleash innovation!" ballyhoo around AI policy, the balanced and risk-responsive framing of the Automated Vehicles Bill is refreshingly sensible. So credit where it's due.👏

English

537

Jesse Norman@Jesse_Norman·13 Kas

Very grateful to the Prime Minister for accepting my resignation. Having laid the ZEV mandate and framed the Automated Vehicles Bill, this is the right time to step down. Looking forward to more freedom to campaign on the River #Wye and other crucial local and national issues!

English

139

200.8K

Keşfet

@FredaDuan @nzherald @JohnZerilli @cecilymwhiteley @ColinGavaghan @elonmusk @BarackObama @taylorswift13