Andrew Silva
1.3K posts

Andrew Silva
@andrewsilva9
Research Scientist @ | Previously @ Toyota Research Institute and Google | PhD from Georgia Tech. @andrewsilva9.bsky.social
Joined September 2010
261 Following · 472 Followers
Andrew Silva retweeted

Neural nets don’t just forget. Sometimes, after long training, they lose the ability to learn at all.
In our #ICLR2026 poster, we model Loss of Plasticity as gradient dynamics trapped in invariant manifolds: 🔴 frozen units, 🔵 cloned units.
The video makes the traps visible.
Andrew Silva retweeted

1/6 The "Self-Improvement" Paradox
Can an LLM get smarter using only its own raw, unverified outputs?
No verifiers. No teachers. No RL.
We found the answer is an emphatic YES.
Introducing SimpleSD: Embarrassingly Simple Self-Distillation. By simply sampling solutions from a model with specific temperature and truncation settings and then fine-tuning the model on those exact samples, Qwen3-30B jumped from 42.4% to 55.3% (a 30% relative improvement) on LiveCodeBench v6 just by training on its own samples! 🚀
The gain is universal across different model sizes (4B, 8B, 30B) and model families (Llama, Qwen). The harder the problem is, the larger the gain. 📈
Kudos to my amazing colleagues @onloglogn, @richard_baihe, @UnderGroundJeg, Navdeep Jaitly, @trebolloc. Check out the paper and code below! 👇
paper: arxiv.org/abs/2604.01193
code: github.com/apple/ml-ssd
HF models: huggingface.co/collections/ap…

Andrew Silva retweeted

New paper 🥳 RL relies a lot on an agent’s capability to explore. Our strategy-guided exploration makes the agent find new solutions more efficiently. It learns faster, and in some environments its Pass@1 surpasses the base model’s Pass@128. 🧵1/6
📄 arxiv.org/abs/2603.02045

Andrew Silva retweeted

Autoregressive models dominate, but what if we treat multimodal generation as discrete, order-agnostic iterative refinement? Excited to share our systematic study on the design space of Tri-Modal Masked Diffusion Models (MDMs). We pre-trained the first Tri-Modal MDM from scratch on (text,), (image, text), and (audio, text). The same model can do ASR, TTS, T2I, captioning and native text generation.
What I'm most proud of in this work is the scientific rigor. Over 3,500 training runs. Principled hyperparameter transfer. Honest results. Carefully controlled ablations across multiple axes of entanglement.
A thread on our empirical findings (arXiv: arxiv.org/abs/2602.21472)

Andrew Silva retweeted

SSMs promised efficient language modeling for long context, but so far seem to underperform compared to Transformers in many settings. Our new work suggests that this is not a problem with SSMs, but with how we are currently using them.
arXiv: arxiv.org/pdf/2510.14826
🧵

Andrew Silva retweeted

New preprint & open-source! 🚨 “SimpleFold: Folding Proteins is Simpler than You Think” (arxiv.org/abs/2509.18480). We ask: Do protein folding models really need expensive and domain-specific modules like pair representation? We build SimpleFold, a scalable 3B folding model built solely on general-purpose transformers + flow matching and trained on 9M structures. SimpleFold supports easy deployment and efficient inference on consumer-level hardware with PyTorch/MLX (try it on your MacBook!) (1/n)


@yoavgo I recently had one there for a month and a half maybe? I just waited and it cleared eventually…
Andrew Silva retweeted

We’re now accepting ARR commitments for PALS 2025! If your ARR-reviewed paper fits with our themes on LLM personalization, submit by September 4: openreview.net/group?id=EMNLP…
See the full call for papers and topic details on our website: pals-nlp-workshop.github.io
#EMNLP25 @emnlpmeeting
Andrew Silva retweeted

3 days remaining for direct submissions to PALS 2025! Share your findings or works in progress on LLM personalization here: openreview.net/group?id=EMNLP…
See our website for the call for papers and information about relevant topics: pals-nlp-workshop.github.io
#EMNLP25 @emnlpmeeting
Andrew Silva retweeted

📣 We are excited to present our work on inferring user preferences from writing samples at @icmlconf Poster Session 3 (Wed. 11:00AM - 1:30PM)!
Come by to ✋ chat with us, 📄 learn about our method, and 💻 hear about our new interactive benchmark (🔗s below)!

Enjoyed this paper and the central idea, so I wrote a quick summary with some thoughts on future work: andrew-silva.github.io/posts/deepmind…
Thanks for the great work @yanming_wan @jiaxing_jxwu @marwaabdulhai @LiorShan @natashajaques
Yanming Wan @yanming_wan
Personalization methods for LLMs often rely on extensive user history. We introduce Curiosity-driven User-modeling Reward as Intrinsic Objective (CURIO) to encourage actively learning about the user within multi-turn dialogs. 📜 arxiv.org/abs/2504.03206 🌎 sites.google.com/cs.washington.…
Andrew Silva retweeted

Our submission site is now live! Direct submissions for PALS 2025 can be made here: openreview.net/group?id=EMNLP…
See our website for the call for papers and information about relevant topics: pals-nlp-workshop.github.io
#EMNLP25 @emnlpmeeting

@Amireskndri @pals_nlp_wrkshp @emnlpmeeting Hi Amir, thanks for catching this! The deadline is August 1; we have just fixed the website.

@pals_nlp_wrkshp @emnlpmeeting Hi, I noticed that the deadline is listed as July 18 on the website (pals-nlp-workshop.github.io/#important-dat…), but this post mentions August 1. Could you please clarify which one is the correct deadline?
Andrew Silva retweeted

Join us at @emnlpmeeting for:
"Tailoring AI: Exploring Active and Passive LLM Personalization" 🎯🧠
To answer: when should LLMs personalize? What role do users play in LLM personalization?
📅 Deadline Aug. 1
📝 Details in thread 🧵👇
#EMNLP2025 #LLM #AI #personalization
1/5

@priontific Haha yes that is me! I have MLX_LM + PPO here github.com/andrew-silva/m…, but unfortunately I did not document it _super_ well (much of the documentation on how to run stuff is in the docstrings at the top of each file!). I haven't tried to reimplement GRPO yet though!

@andrewsilva9 - any chance you’re the andrewsilva who made the MLX PPO cartpole example? I’m determined to get GRPO (or even just PPO) working in MLX_lm and I’ll be giving it a crack all week, but it’s already clear I’ll be needing help 😅

Turns out that to get this to work, I'm gonna have to reimplement @huggingface's GRPO trainer from the trl library in MLX... which I don't think I'll be able to do, even with Sonnet's help 😅
But I've put @cursor_ai into agentic mode and I'm gonna see what it can do lol

Andrew Silva retweeted

@lulumeservey "Faithless is he that says farewell when the road darkens"
- Gimli

@yoavgo I’ve noticed it also turns me into an integration engineer, spending a ton of time writing the bits of connector code between different AI generated modules. Which is also very unfun.