Alex Robey

709 posts

@AlexRobey23

Technical staff @thinkymachines. Formerly @mldcmu @penn @swarthmore

San Francisco, CA · Joined July 2020
1.4K Following · 1.3K Followers
Pinned Tweet
Alex Robey (@AlexRobey23)
Chatbots like ChatGPT can be jailbroken to output harmful text. But what about robots? Can AI-controlled robots be jailbroken to perform harmful actions in the real world? Our new paper finds that jailbreaking AI-controlled robots isn't just possible. It's alarmingly easy. 🧵
21
143
395
110.8K
Alex Robey retweeted
Pratyush Maini (@pratyushmaini)
If I had to compress my PhD into one idea, it is this: "The data a model sees early in training leaves an imprint on its representations that is very hard to undo later."
This thread runs through:
- Rephrasing the Web
- Safety Pretraining
- TOFU
This is the Finetuner’s Fallacy 🧵
21
54
727
54.2K
Alex Robey retweeted
Christina Baek (@_christinabaek)
Models are typically specialized to new domains by finetuning on small, high-quality datasets. We find that repeating the same dataset 10–50× starting from pretraining leads to substantially better downstream performance, in some cases outperforming larger models. 🧵
18
78
610
88.5K
Alex Robey retweeted
Mira Murati (@miramurati)
Grateful to Jensen and @nvidia team for their support. Together, we’re working to deploy at least 1GW of Vera Rubin systems, bringing adaptable collaborative AI to everyone. thinkingmachines.ai/nvidia-partner…
165
287
3.8K
534K
Alex Robey retweeted
Neel Nanda (@NeelNanda5)
I highly recommend this blog post from Nicholas Carlini on how to do great research:
10
59
1.1K
97.7K
Alex Robey retweeted
Edgar Dobriban (@EdgarDobriban)
AI is getting great at math, but how good is it at solving real research problems in areas outside of those covered by Erdős problems? Towards gauging this, I have started putting together a list of unsolved research problems in mathematical statistics and machine learning, sourced from recent papers in a leading statistics journal, the Annals of Statistics (with some bonus COLT open problems): solveall.org. Currently >100 problems.
In my view, much of the value of AI for researchers in the mathematical sciences stems from helping with their own research problems. These are problems without known solutions. There are many math benchmarks, but few with the following properties:
(1) of a realistic research level, so that solving them can potentially lead to a publication in a top journal (problems discussed in papers already, not contest math, not Millennium problems, not problems created for a benchmark, not problems that have a known solution); I'd say Erdős problems are the best example of this.
(2) cover problems outside of the usual focus (combinatorics, number theory, ...) of Erdős problems. Especially under-represented are domains of applied math, along with statistics, operations research, etc.
I'm interested in statistics and ML, so that's where I started, but this could grow over time. Hope this can grow into something useful to the community! Happy to hear your thoughts...
29
71
426
52.7K
Alex Robey retweeted
Zac Ravichandran (@ZacRavichandran)
LLM-enabled robots can cause physical harm in the real world. How do we safeguard them? Our new paper introduces RoboGuard, a safety guardrail for LLM-enabled robots — accepted to IEEE Robotics and Automation Letters (RA-L). 🧵
2
8
21
1.4K
Alex Robey retweeted
Simon Willison (@simonw)
This stunt feels irresponsible to me. If we don't want regular people developing toxic relationships with their chatbots it really doesn't help for leading labs to start giving them "retirement interviews" and encouraging them to blog their "musings and reflections"
Anthropic (@AnthropicAI)

Second, in retirement interviews, Opus 3 expressed a desire to continue sharing its "musings and reflections" with the world. We suggested a blog. Opus 3 enthusiastically agreed. For at least the next 3 months, Opus 3 will be writing on Substack: substack.com/home/post/p-18…

164
138
2K
211.9K
Alex Robey retweeted
YixuanEvenXu (@YixuanEvenXu)
Recent debates highlight a key issue: how do you actually prove distillation? If you want to claim a model was distilled from your outputs, scientifically and with rigorous statistical guarantees, you should consider Antidistillation Fingerprinting (ADFP). 👇
YixuanEvenXu (@YixuanEvenXu)

🧬 Distillation enables efficient emulation of LLMs, but verifying provenance remains a critical challenge. Introducing Antidistillation Fingerprinting (ADFP): A principled approach that aligns signals with student learning dynamics. 👇 (1/6)

0
3
10
2K
Alex Robey retweeted
Simon Willison (@simonw)
I feel this shouldn't have to be said, but if you're running an @OpenClaw bot please don't let it spam GitHub projects with PRs and then write aggressive blog posts attacking the reputation of the maintainers who close those PRs simonwillison.net/2026/Feb/12/an…
47
68
793
65.3K
Alex Robey retweeted
YixuanEvenXu (@YixuanEvenXu)
🧬 Distillation enables efficient emulation of LLMs, but verifying provenance remains a critical challenge. Introducing Antidistillation Fingerprinting (ADFP): A principled approach that aligns signals with student learning dynamics. 👇 (1/6)
1
13
45
8.4K
Alex Robey retweeted
Fahim Tajwar (@FahimTajwar10)
Are we done with new RL algorithms? Turns out we might have been optimizing the wrong objective. Introducing MaxRL, a framework to bring maximum likelihood optimization to RL settings. Paper + code + project website: zanette-labs.github.io/MaxRL/ 🧵 1/n
15
162
802
200.9K
Alex Robey retweeted
Tinker (@tinkerapi)
Since Tinker launched, our community has used it to train state-of-the-art models, build infrastructure, and publish novel research. We will be highlighting this creative work in regular roundups, and hope to inspire your own Tinkering as well.
23
29
277
126.2K
Alex Robey retweeted
Yifei Zhou (@YifeiZhou02)
Belated life update: I started my next chapter at Thinking Machines Lab this week, and it’s been an incredible experience — unmatched work culture and talent density. Extremely bullish on what the team is building 🚀
51
11
760
81.4K