Paria Rashidinejad (@paria_rd)
70 posts
Assistant Professor @USC; Research Scientist @AIatMeta FAIR; PhD @berkeley_ai @CHAI_Berkeley
California, USA · Joined March 2023
517 Following · 808 Followers
Pinned Tweet
Paria Rashidinejad (@paria_rd)
LLMs go stale daily: facts shift, discoveries land, hallucinations are uncovered. How do you continually keep up with knowledge drift without retraining?

Our new work, CrispEdit, lets you apply 𝘁𝗵𝗼𝘂𝘀𝗮𝗻𝗱𝘀 𝗼𝗳 𝗲𝗱𝗶𝘁𝘀 to billion-parameter LLMs in 𝗷𝘂𝘀𝘁 𝗮 𝗳𝗲𝘄 𝗺𝗶𝗻𝘂𝘁𝗲𝘀 𝗼𝗻 𝗮 𝘀𝗶𝗻𝗴𝗹𝗲 𝗚𝗣𝗨, while keeping the model’s existing capabilities intact. That’s >𝟭𝟬𝟬𝘅 𝗳𝗮𝘀𝘁𝗲𝗿 than popular editors like AlphaEdit and MEMIT.

💡 𝗖𝗼𝗿𝗲 𝗶𝗱𝗲𝗮: The loss landscape of existing capabilities is sharp in a few directions and flat in many others, so we apply edits only in the low-curvature subspace, where updates are “safe”. ✅ This avoids paying for full retraining and mitigates the capability degradation and forgetting seen in existing editors.

𝗥𝗲𝘀𝘂𝗹𝘁𝘀:
• 𝗛𝗶𝗴𝗵 𝗲𝗱𝗶𝘁 𝘀𝘂𝗰𝗰𝗲𝘀𝘀: +10% over the best baselines under real 𝘢𝘶𝘵𝘰𝘳𝘦𝘨𝘳𝘦𝘴𝘴𝘪𝘷𝘦 𝘨𝘦𝘯𝘦𝘳𝘢𝘵𝘪𝘰𝘯 (WILD), not just teacher-forced evaluation.
• 𝗖𝗮𝗽𝗮𝗯𝗶𝗹𝗶𝘁𝗶𝗲𝘀 𝗶𝗻𝘁𝗮𝗰𝘁: <1% drop on average.
• 𝗙𝗮𝘀𝘁: 3,000 edits on Llama-3-8B in <5 minutes on a single NVIDIA A40.
• 𝗦𝗲𝗾𝘂𝗲𝗻𝘁𝗶𝗮𝗹 𝘂𝗽𝗱𝗮𝘁𝗲𝘀: Sequential CrispEdit effectively maintains both the capabilities and previous edits.

📝 arxiv.org/pdf/2602.15823
[image attached]
2 replies · 5 reposts · 19 likes · 1.8K views
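The "edit only in flat directions" idea above can be sketched with a toy quadratic capability loss. This is a hedged illustration, not CrispEdit's implementation: the dimensions, eigenvalue split, and curvature threshold are all made up for the example.

```python
import numpy as np

# Toy quadratic illustration of editing in the low-curvature ("flat") subspace.
# Dimensions, eigenvalues, and the threshold below are made up for the sketch.
rng = np.random.default_rng(0)

d = 16
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
eigvals = np.concatenate([np.full(3, 50.0), np.full(d - 3, 0.01)])
H = Q @ np.diag(eigvals) @ Q.T   # capability loss: sharp in 3 dirs, flat in 13

delta = rng.normal(size=d)       # desired raw parameter edit

# Project the edit onto eigenvectors of H with small eigenvalues.
w, V = np.linalg.eigh(H)         # eigenvalues in ascending order
flat = V[:, w < 1.0]             # basis for the flat (low-curvature) subspace
delta_safe = flat @ (flat.T @ delta)

# Damage to the quadratic capability loss 0.5 * x^T H x:
raw_damage = 0.5 * delta @ H @ delta
safe_damage = 0.5 * delta_safe @ H @ delta_safe
print(safe_damage < raw_damage)  # True: the projected edit is far gentler
```

The point of the sketch: the projected edit keeps its flat-direction components (where it can still move the parameters) but drops the sharp-direction components that dominate the damage to existing capabilities.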
Kiana Ehsani (@ehsanik)
This is a long post, mainly because I have a lot to say, but in case you are too busy, TL;DR: @Vercept_ai is joining @AnthropicAI! We shared a mission, so we joined forces to accelerate it into reality. Couldn't be more excited!

Why Vercept was started
In 2024, AI coding tools were already becoming magical for developers, but other industries were ages behind. It felt insane that when my mom had IT issues, I still had to hop on a call and walk her through it step by step. Insane that sending a simple email took so many clicks. That's why we started @Vercept_ai: build something that acts for users instead of telling them how to do it. Two goals: 1) help people do tasks they didn't know how to do, and 2) handle the zero-brainpower tasks so people spend more time on creative work. As simple as scheduling meetings, as complex as reconciling messy financials before tax season. The ultimate goal was to have people spend less time behind screens and more time walking in nature. (Very Pacific Northwest mission 😁)

The ride
The journey of building an AI-native company in this day and age was wild. Going from researcher to founder meant trading "reviewer number 2" for business partners and users, but surprisingly a lot of the same paradigms applied: come up with a hypothesis, design an experiment, analyze user behavior, change the model and product based on the findings, wash, rinse, and repeat. There are some differences, though: the pace and the adrenaline. Lows are low, highs are high. We were constantly being challenged and learned at a pace we had never learned before. NEVER! If you are an adrenaline junkie like we are, it's a blast. The joy of the startup adrenaline rush is truly underrated.

Why Anthropic
We raised more than $50M, had a comfortable runway and a successful product, were building full steam with a small team, and were truly enjoying every minute of it. But that's when the opportunity came to join forces with Anthropic. We already knew how great Anthropic was at building models and we admired their mission, but then we learned more about the vision. We went on hours of walks, had long conversations, talked to members across different orgs, and learned more about Anthropic's vision and commitment to core beliefs that were very similar to ours. The more we talked, the more we realized we had been working on the same mission, but from complementary perspectives. Joining forces meant we could build something much, much bigger together. And beyond the mission, I am now a big believer that Anthropic's real moat isn't its best model: it's the people. Incredibly talented folks who genuinely care about mission and real impact over hype. A zero-ego culture obsessed with building something meaningful. The choice was clear: build independently and work toward two separate versions of the same vision, or join forces with an incredible team and accelerate that vision into reality. It was an easy choice.

What's next for our mission
The mission continues; it just got a bigger stage and an expanded team. The goal is still to expand AI beyond just a chatbot, so that non-technical users can leverage it just as much as technical ones. We're just getting started.

It takes a village
This journey wouldn't have happened without the people who made it what it was. First and foremost, my cofounders @LucaWeihs and @inkynumbers: the best people I could've wished for as cofounders. We never once got into an argument, always had communicative discussions, and, as a cherry on top, shared the same sense of humor! I feel blessed and grateful to have these two in my life. Thankful to our team for trusting the three of us and showing up day and night. Grateful for @sethbannon, our board member, lead investor, great mentor, and the person whose energy is so infectious that whenever we were having a down moment we would say "channel your inner @fiftyyears energy!" And to our wonderful investors and supporters: @chrija and @PointNineCap, Yifan and Jacob and @ai2incubator, and @mattmcilwain and Ted Kummert from @MadronaVentures. Couldn't have done this without you. Onward 🐜
[image attached]
52 replies · 17 reposts · 361 likes · 58.8K views
Erdem Bıyık (@ebiyik_)
I have received the 2026 ONR Young Investigator Program (YIP) Award. This honor is truly shared with my students in LiraLab, and the support from my colleagues at @USCViterbi and the broader robotics community has been immensely helpful. onr.navy.mil/2026-young-inv…
[image attached]
8 replies · 1 repost · 68 likes · 5.9K views
Paria Rashidinejad retweeted
Mahdi Soltanolkotabi (@mahdisoltanol)
Popular “representation-preserving” editing methods are implicitly just moving in low-curvature subspaces. CrispEdit makes that explicit: we edit in low-curvature directions, and the tradeoffs improve. My inner optimizer is happy: 2nd-order ideas are making a comeback!
Quoted tweet from Zarif Ikram (@TheZarifIkram):

Teaching an LLM a new fact 𝐬𝐡𝐨𝐮𝐥𝐝𝐧'𝐭 𝐛𝐫𝐞𝐚𝐤 𝐞𝐯𝐞𝐫𝐲𝐭𝐡𝐢𝐧𝐠 𝐢𝐭 𝐚𝐥𝐫𝐞𝐚𝐝𝐲 𝐤𝐧𝐨𝐰𝐬.

Model editing often feels like a zero-sum game: every time you inject new facts, you risk degrading the model’s core capabilities or erasing prior data. It's the primary barrier to efficient continual learning. We’ve found a principled, scalable fix: 𝐂𝐫𝐢𝐬𝐩𝐄𝐝𝐢𝐭. arxiv.org/abs/2602.15823

Our approach treats capability preservation as a formal constraint. Instead of "hopeful" updates, we project edits into the low-curvature subspaces of the model's loss landscape, essentially hiding updates where the model is least sensitive.

The results:
✅ Preservation-edit efficacy: <1% capability degradation with high edit success (+10% on average over the best baseline).
✅ Massive efficiency: up to 100x reduction in compute compared to AlphaEdit/MEMIT.
✅ Scalable: matrix-free projections that work for billion-parameter models.

Our results suggest a promising path toward more reliable 𝐜𝐨𝐧𝐭𝐢𝐧𝐮𝐚𝐥 𝐥𝐞𝐚𝐫𝐧𝐢𝐧𝐠 and mitigating 𝐟𝐨𝐫𝐠𝐞𝐭𝐭𝐢𝐧𝐠, and we’re eager to explore it further. 🧵👇
0 replies · 2 reposts · 19 likes · 2.4K views
Paria Rashidinejad (@paria_rd)
𝗥𝗲𝘀𝘂𝗹𝘁: Sequential CrispEdit effectively maintains both the base capabilities and prior edits, mitigating catastrophic forgetting.
[image attached]
1 reply · 0 reposts · 4 likes · 146 views
Paria Rashidinejad retweeted
Guilherme Favaron (@guifav)
One of the hard problems in LLM editing: you fix one behavior but quietly break general capabilities. CrispEdit by Zarif Ikram, Arad Firouzkouhi, @stephenltu, @mahdisoltanol, and @paria_rd from @USCViterbi tackles this with a second-order constrained-optimization approach.

The key insight: project edit updates onto the low-curvature subspace of the capability loss landscape using Kronecker-factored approximate curvature (K-FAC). This keeps edits surgical while preserving what the model already knows.

Results across standard editing benchmarks: high edit success with capability degradation below 1% on average. That margin matters when you need to patch factual errors or safety issues in production LLMs without full retraining.

The math is clean: using a Bregman divergence as the capability constraint yields the Gauss-Newton Hessian exactly, even when the base model was not trained to convergence. A matrix-free projector exploits the Kronecker structure to stay efficient at scale.
[image attached]
1 reply · 2 reposts · 2 likes · 335 views
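The matrix-free Kronecker trick mentioned above can be sketched on a toy layer. A hedged illustration, not the paper's exact projector: `spiky_cov` and the per-factor eigenvalue threshold are inventions for this example (the exact low-curvature set would rank the products of the two factors' eigenvalues), and the layer sizes are tiny so the sanity check can materialize the full Kronecker product.

```python
import numpy as np

rng = np.random.default_rng(1)

def spiky_cov(k, sharp=2):
    """Hypothetical covariance factor: a few sharp eigenvalues, many flat ones."""
    Q, _ = np.linalg.qr(rng.normal(size=(k, k)))
    w = np.concatenate([np.full(sharp, 10.0), np.full(k - sharp, 1e-3)])
    return Q @ np.diag(w) @ Q.T

m, n = 8, 6                  # layer weight W is m x n
A = spiky_cov(n)             # input-activation second moment  (n x n)
G = spiky_cov(m)             # output-gradient second moment   (m x m)

# Diagonalize the small factors instead of the (m*n) x (m*n) curvature A ⊗ G.
wA, UA = np.linalg.eigh(A)
wG, UG = np.linalg.eigh(G)

# Projectors onto each factor's low-curvature eigenvectors.
PA = UA[:, wA < 1.0] @ UA[:, wA < 1.0].T
PG = UG[:, wG < 1.0] @ UG[:, wG < 1.0].T

dW = rng.normal(size=(m, n)) # candidate edit to W
dW_safe = PG @ dW @ PA       # matrix-free: never materializes A ⊗ G

# Sanity check on this tiny example: (PA ⊗ PG) vec(dW) == vec(PG dW PA),
# using column-major flattening as the vec(.) convention.
lhs = np.kron(PA, PG) @ dW.flatten(order="F")
rhs = dW_safe.flatten(order="F")
print(np.allclose(lhs, rhs))  # True
```

The design point: for a weight matrix of size m x n, the full curvature is (mn) x (mn), but the Kronecker identity (A ⊗ B) vec(X) = vec(B X Aᵀ) lets the projection run with two small matrix multiplies, which is what makes billion-parameter models reachable.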
Paria Rashidinejad retweeted
Sang Michael Xie (@sangmichaelxie)
@zijianwang30 @andrew_e_cohen @paria_rd @setlur_amrith Check out @setlur_amrith’s post about PrefixRL for his take on reusing the treasure trove of previously spent compute:
Quoted tweet from Amrith Setlur (@setlur_amrith):

Start “thinking” from scratch: that’s how RL samples long rollouts, even on problems it has already seen, burning 🔥 tons of FLOPs on exploration from scratch. In reality, we have an ever-growing treasure of good inference FLOPs spent on the base LLM or prior RL runs from it: rare correct traces on hard problems (very low pass@n).

♻️ How can we *reuse* these very, very off-policy, stale but correct traces to get the most out of the costly inference FLOPs already spent? Some obvious ways:
⚡ SFT on the off-policy traces: model entropy collapses.
⚡ On-policy distillation or off-policy RL: optimization destabilizes, as the traces are way too off-policy.

We introduce the simplest of ways to reuse FLOPs, PrefixRL: *condition* on a portion of the off-policy trace and get online RL to complete it. 🧵⬇️
0 replies · 2 reposts · 4 likes · 755 views
Paria Rashidinejad retweeted
Surya Ganguli (@SuryaGanguli)
Our new paper, "Deriving neural scaling laws from the statistics of natural language" (arxiv.org/abs/2602.07488), led by @Fraccagnetta & @AllanRaventos w/ Matthieu Wyart, makes a breakthrough! We can predict data-limited neural scaling law exponents from first principles, using the structure of natural language itself, for the very first time!

If you give us two properties of your natural language dataset:
1) How the conditional entropy of the next token decays with conditioning length.
2) How pairwise token correlations decay with time separation.
Then we can give you the exponent of the neural scaling law (loss versus data amount) through a simple formula!

The key idea: as you increase the amount of training data, models can look further back into the past to predict, and as long as they do this well, the conditional entropy of the next token, conditioned on all tokens up to this data-dependent prediction horizon, completely governs the loss. This gives us our simple formula for the neural scaling law!
[image attached]
20 replies · 117 reposts · 576 likes · 60.1K views
Paria Rashidinejad retweeted
Ziming Liu (@ZimingLiu11)
🚨Transformers don't learn Newton's laws? They learn Kepler's laws! Like us, transformers don't predict a flying ball via a differential equation, but by fitting a curve. Moreover, reducing context length steers a transformer from Keplerian to Newtonian. Compression in play.
[image attached]
25 replies · 204 reposts · 1.2K likes · 116.2K views
Paria Rashidinejad (@paria_rd)
Practical takeaway for pure self-improvement (even without logged data): Our work suggests that to use your compute budget more effectively, it may be better to spend more compute on parallel offline search (e.g., rejection sampling to obtain a few correct traces), then recycle those traces safely with PrefixRL. No human-written solutions. No stronger teachers. No off-policy instability. PrefixRL is theoretically sound (same optima as standard RL + finite-sample guarantees) and delivers large reasoning gains with less compute.
1 reply · 0 reposts · 4 likes · 212 views
Paria Rashidinejad (@paria_rd)
How do you turn yesterday’s reasoning rollouts into today’s RL progress, on problems so hard that on-policy RL just stalls?

In our new work, PrefixRL, we show that prefixing lets us recycle off-policy traces effectively, with purely online updates: condition on an off-policy partial trace (prefix), then do on-policy RL only on the completion, with the prefix loss masked. This sidesteps the entropy collapse of SFT and avoids off-policy instability despite a big distribution shift.

🔑 Key mechanism, back-generalization: on-policy RL on partial completions generalizes backwards. At test time, the model solves the no-prefix task, often with better strategies than the prefix!

Results: 2x compute efficiency in pure self-improvement + a big plateau jump over SFT→RL. arxiv.org/abs/2601.18795
[image attached]
1 reply · 12 reposts · 81 likes · 4.5K views
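The prefix-masking mechanic described in the tweet can be sketched as a minimal REINFORCE-style objective. A hedged illustration only: `prefix_rl_loss` is a hypothetical helper invented for this example, and the per-token log-probs and advantages are dummy values, not anything from the paper.

```python
import numpy as np

# Sketch of PrefixRL-style loss masking: the off-policy prefix is conditioning
# context only; the policy-gradient loss applies to the on-policy completion.
# `prefix_rl_loss` is a hypothetical helper, not the paper's code.
def prefix_rl_loss(logprobs, advantages, prefix_len):
    mask = np.arange(len(logprobs)) >= prefix_len  # 0 on prefix, 1 on completion
    denom = max(mask.sum(), 1)
    return -(mask * advantages * logprobs).sum() / denom

logp = np.log(np.array([0.9, 0.8, 0.5, 0.6, 0.7]))  # dummy per-token log-probs
adv = np.ones(5)                                    # dummy per-token advantages

loss_all = prefix_rl_loss(logp, adv, prefix_len=0)  # ordinary on-policy RL
loss_pfx = prefix_rl_loss(logp, adv, prefix_len=2)  # tokens 1-2 are the prefix
print(loss_pfx > 0)  # True: only the 3 completion tokens contribute
```

The design choice the mask encodes: gradients never flow through the stale prefix tokens, so the update stays on-policy with respect to the completion distribution, which is what avoids the off-policy instability the tweet describes.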