Joe Clinton
@JoeClinton02
45 posts

Developing better long-horizon behaviour models for robots.

Joined January 2018
98 Following · 26 Followers
Joe Clinton @JoeClinton02 ·
@PgChiyo If you've kept the same motors, your arms now have a rated payload of 0 g and a peak payload of 250 g. You need to at least replace the first few motors with STS3250s.
Joe Clinton @JoeClinton02 ·
Recently I've been working with image-to-video generation models a lot more, so I put together this graph to help determine the best video model for any price point. Seedance-v1.5-pro stands out the most to me as the optimal choice to balance quality and cost.
[image]
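The price-point comparison described above boils down to a Pareto-frontier filter: keep only the models for which no other model is both cheaper and better. A minimal sketch — all model names, prices, and quality scores below are illustrative placeholders, not figures from the graph:

```python
# Pareto-frontier filter over (name, cost, quality) tuples.
# A model is dominated if some other model is at least as cheap AND at least
# as good, and strictly better on one axis.
def pareto_frontier(models):
    """models: list of (name, cost_per_video, quality_score)."""
    frontier = []
    for name, cost, quality in models:
        dominated = any(
            c <= cost and q >= quality and (c < cost or q > quality)
            for _, c, q in models
        )
        if not dominated:
            frontier.append(name)
    return frontier

# Made-up example data for illustration only.
models = [
    ("model-a", 0.10, 55),  # cheap, low quality
    ("model-b", 0.25, 70),  # balanced
    ("model-c", 0.30, 65),  # dominated by model-b (costs more, scores less)
    ("model-d", 0.80, 85),  # expensive, high quality
]
print(pareto_frontier(models))  # ['model-a', 'model-b', 'model-d']
```

Any model off the frontier (here `model-c`) is never the right pick at its price point, which is exactly what such a graph makes visible.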
Joe Clinton @JoeClinton02 ·
@ihorbeaver Perhaps you could speed up the model and then learn a residual network to adapt the action decoder to the wobble with online RL?
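The residual-correction idea suggested here can be sketched as a frozen base action decoder plus a small learned residual whose weights online RL would adapt. Everything below (shapes, the stand-in base policy, the tiny linear residual) is a hypothetical illustration, not code from any project mentioned in the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

def base_policy(obs):
    # Stand-in for the frozen VLA action decoder (7-DoF action).
    return np.tanh(obs[:7])

class ResidualNet:
    """Tiny linear residual; its weights are what online RL would update
    to compensate for arm wobble at higher control rates."""
    def __init__(self, obs_dim, act_dim, scale=0.1):
        self.W = np.zeros((act_dim, obs_dim))  # zero-initialised: no correction yet
        self.scale = scale                     # keeps corrections small and safe
    def __call__(self, obs):
        return self.scale * np.tanh(self.W @ obs)

obs = rng.normal(size=16)
residual = ResidualNet(obs_dim=16, act_dim=7)
action = base_policy(obs) + residual(obs)  # corrected action sent to the robot
print(action.shape)  # (7,)
```

Zero-initialising the residual means the combined policy starts out identical to the base model, so online RL only has to learn the wobble correction, not the whole task.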
Igor Kulakov @ihorbeaver ·
The advantage of arms with industrial internals is that they don't wobble, so the AI model can control them faster simply by multiplying frames per second. Here we multiplied the FPS by 3× compared to teleoperation (180 instead of 60).
Joe Clinton reposted
1X @1x_tech ·
NEO’s Starting to Learn on Its Own
Joe Clinton @JoeClinton02 ·
@thealexbanks This doesn't account for developers moving to untrackable local agents like Claude Code, Codex, Cursor, and Copilot in the same timeframe. Claude Code is far ahead of Codex, which is in turn ahead of Gemini.
Alex Banks @thealexbanks ·
OpenAI lost 22% market share in 12 months. Gemini is eating their lunch. The first Global AI Tracker of 2026 just dropped. Here's what caught my attention.
Market share as of Jan 2nd:
→ ChatGPT: 64.5%
→ Gemini: 21.5%
→ DeepSeek: 3.7%
→ Grok: 3.4%
→ Perplexity: 2.0%
→ Claude: 2.0%
→ Copilot: 1.1%
12-month transformation:
→ ChatGPT dropped from 86.7% to 64.5%
→ Gemini exploded from 5.7% to 21.5%
→ Grok didn't exist and is now approaching DeepSeek
12-week data change:
→ OpenAI: -22% (their worst period on record)
→ Gemini: +49% (relentless momentum)
→ Claude: -14% (stable but dipping)
→ Grok: +52% (fastest growing)
[image]
Joe Clinton @JoeClinton02 ·
@Ciszek @chris_j_paxton The VAE has 16×16×4 compression. The model begins with a 480×640 input, so 4 frames are compressed to 600 tokens. The input is 5 context frames + 4 noisy latent frames. The DiT generates this in a single step, then passes it to the action head. This is not a problem.
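The token arithmetic in this reply can be checked directly. A 16×16 spatial, 4× temporal compression of 480×640 over 4 frames gives 30 × 40 × 1 = 1200 latent patches; the further 2× patchification below is my own assumption to reconcile that with the quoted 600 tokens, not something stated in the thread:

```python
# Latent token count for a 480x640 input through a VAE with
# 16x (height) x 16x (width) x 4x (time) compression.
H, W = 480, 640
spatial_down, temporal_down = 16, 4

latent_h, latent_w = H // spatial_down, W // spatial_down  # 30 x 40
frames = 4
latent_frames = frames // temporal_down                    # 4 frames -> 1 latent frame
patches = latent_h * latent_w * latent_frames              # 1200 latent patches

# Assumption: an extra 2x patchify in the DiT halves this to the quoted 600.
tokens = patches // 2
print(patches, tokens)  # 1200 600
```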
Chris Paxton @chris_j_paxton ·
It seems very clear to me that video models will replace image models for robotics over the next year. Video models make so much more sense for robotics tasks, which usually involve some occlusion and partial observability. (Video from the mimic-video paper, ETH.)
Joe Clinton @JoeClinton02 ·
@chris_j_paxton VLAs with a video model backbone are my PhD topic. I wholeheartedly believe they are the way forward and will share some exciting progress on this front later in the year.
Y Combinator @ycombinator ·
In just ~3 months, as a solo founder with no prior robotics experience, @gentrajectory trained a foundation model for dexterous manipulation that lets humanoid robots pick up unseen objects and perform real-world work. It generalizes to novel objects and scenes, including cases where prior SoTA models achieve 0% success. Congrats on the launch @joshuabelofsky! ycombinator.com/launches/P6q-g…
Joe Clinton @JoeClinton02 ·
@joshuabelofsky Doesn't look accurate enough to be useful, unfortunately. I think the data collected would be low quality, and that would impact the resulting policy.
Joshua @joshuabelofsky ·
Open-sourcing my internal dexterous teleop stack! github.com/GeneralTraject… It uses vision + Vive wrist trackers instead of data gloves → about $500 in hardware vs. ~$5,000.
Joe Clinton @JoeClinton02 ·
@mo_danesh @k7agar You can't guess anything from such a tiny amount of information. Why even bother trying to help? There are hundreds of possible reasons a VLA model might underperform.
Mohamad H. Danesh @mo_danesh ·
@k7agar I guess the LR is too small to force the model to learn the semantics.
atharva ☆ @k7agar ·
unfortunately loss curves mean nothing in robotics </3
[image]
Joe Clinton @JoeClinton02 ·
@k7agar Loss curves are useful for comparisons between models that share the same dataset and the same loss function.
Joe Clinton @JoeClinton02 ·
@chatgpt21 A 45 s Christmas ad for McDonald's with no speaking roles, 18 locations, 45 actors, 90 extras, and 3 CGI shots would require a budget of >$1 million. It's likely they spent about 10× less on this ad, and even negative attention is still attention.
Chris @chatgpt21 ·
McDonald's just dropped a new AI ad, and it's beautiful, and I am genuinely tired of people pretending this is not the future of media. If this played on your TV during a normal commercial break, you would be disingenuous to say "it's slop" or "I could easily tell it is AI." It is a fantastic ad on its own merits, and it is obvious that AI video will eventually be one-to-one with reality, where you truly cannot tell the difference, if you're of average intelligence and can extrapolate, of course. When we get there, then what? Is it still "slop," or does "slop" persist as a label for anything made with AI, even when you cannot tell at all?
Culture Crave 🍿 @CultureCrave
McDonald's has released an AI-generated Christmas ad. The studio behind it says they 'hardly slept' for several weeks while writing AI prompts and refining the shots: 'AI didn't make this film. We did.' Comments have been turned off on YouTube.
Joe Clinton @JoeClinton02 ·
@lukas_m_ziegler I think this could have been done way cheaper by just waiting for the heated bed to cool down, then repeatedly ramming the part with the flat side of the extruder head until it unsticks, and pushing it off the ledge onto a cushion below.
Lukas Ziegler @lukas_m_ziegler ·
A robot for the 3D printing farm! 🖨️
3D printing is often tied to a repetitive cycle: wait for the print to finish, remove it, clean up, start the next one, and repeat. But what if there were a solution that changed all of that?
This robot powers an entire 3D printer farm! With this system, printing can run non-stop, as long as there's filament to feed it. The robot handles the rest: collecting finished prints and placing them neatly on the rack, ready for the next job.
Great engineering by DHR Engineering! 🦾
♻️ Join the weekly robotics newsletter, and never miss any news → ziegler.substack.com
Joe Clinton @JoeClinton02 ·
@KLieret Hi, when will you update with GLM 4.6, Kimi K2 Thinking, and MiniMax M2? Would love to know how they compare.
Kilian Lieret @KLieret ·
Opus 4.5 reclaims the top of the official SWE-bench leaderboard with 74.4%, narrowly ahead of Gemini 3. Cheaper than Opus 4, but more expensive than Gemini. Takes fewer steps than Sonnet 4.5, but still runs for >100 steps for optimal performance. Details in 🧵
[image]
Joe Clinton @JoeClinton02 ·
@liyitengx @RemiCadene Hi, first off, this is amazing! Secondly, I wanted to ask two questions: 1. Why didn't you go for an off-the-shelf telescopic lift solution? 2. What is the payload of the LeKiwi base, and do you think it's overloaded?
Li Yiteng @liyitengx ·
AlohaMini wouldn’t exist without LeRobot. Thank you @remicadene for building such an inspiring open-source robotics framework. Using LeRobot, I built a dual-arm, 3D-printed robot with a lift — and now I’m open-sourcing everything too. GitHub: github.com/liyiteng/Aloha…
Joe Clinton @JoeClinton02 ·
@vbingliu Could you test models with the agent each vendor recommends (Claude with Claude Code, GPT-5 with Codex, Gemini with Gemini CLI, Qwen with Qwen Code)? The right agent pairing should significantly boost performance.
Bing Liu @vbingliu ·
🚀 Introducing SWE-Bench Pro — a new benchmark to evaluate LLM coding agents on real, enterprise-grade software engineering tasks. This is the next step beyond SWE-Bench: harder, contamination-resistant, and closer to real-world repos.
Joe Clinton @JoeClinton02 ·
gpt-oss is now on @ArtificialAnlys, and is absolutely dominating the Pareto frontier of intelligence vs cost!
[image]
Joe Clinton @JoeClinton02 ·
GPT-5 significantly underperforms expectations on SWE-bench Verified with a score of 74.9%. This suggests AI progress on SWE is slowing down. We have yet to see GPT-5 + Codex scores, which should be higher; I'm hoping for a 78% score by the end of the month.
[image]
Joe Clinton @JoeClinton02 ·
The new Qwen3 fills a much-needed gap in intelligence vs cost. I'd recommend that GPT-wrapper startups currently using 4o switch to Qwen3, for a significant boost in intelligence while actually REDUCING cost.
[image]
Mark Kretschmann @mark_k ·
Grok 3.5 generated this, wanna bet? 👀👀
[image]