Mahtab Sarvmaili
@MahtabSarvmaili

love AI 🤖

🇦🇶 Joined June 2020
721 Following · 74 Followers
770 posts
Mahtab Sarvmaili retweeted
Lakshya A Agrawal @LakshyAAAgrawal
Excited to share that my ICLR 2026 Oral Talk for GEPA is available on YouTube. I go deeper into why GEPA works better than prior optimization techniques, and touch on many other aspects of GEPA! youtu.be/HbGah-uP1fI
Quoting Lakshya A Agrawal @LakshyAAAgrawal:
Thrilled to present GEPA as an Oral Talk and Poster at ICLR 2026 this Friday in Rio! 🇧🇷
Apr 24, Oral Session 3A (Agents), 10:30 AM BRT, Amphitheater
Poster Session 4, 3:15 PM, Pavilion 3
x.com/LakshyAAAgrawa…
Let's recap what's happened since we released GEPA last year 🧵
9 replies · 46 reposts · 241 likes · 29.4K views
Mahtab Sarvmaili retweeted
Dwarkesh Patel @dwarkesh_sp
Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served. It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.

It's a bit technical, but I encourage you to hang in there; it's really worth it. There are fewer than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner. It was a real delight to learn from him. Recommend watching this one on YouTube so you can see the chalkboard.

0:00:00 – How batch size affects token cost and speed
0:31:59 – How MoE models are laid out across GPU racks
0:47:02 – How pipeline parallelism spreads model layers across racks
1:03:27 – Why Ilya said, "As we now know, pipelining is not wise."
1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:32:52 – Deducing long context memory costs from API pricing
2:03:52 – Convergent evolution between neural nets and cryptography
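The first chapter's claim, that batch size drives token cost, is easy to make concrete. Below is a hedged back-of-the-envelope sketch of the standard bandwidth-bound decode argument; it is my own illustration rather than material from the lecture, and every constant (model size, bandwidth, KV-cache traffic) is an assumption.

```python
# Bandwidth-bound decode: each step streams all weights from HBM once,
# amortized over the batch, plus one KV-cache read per sequence.
# All constants are assumptions, not figures from the lecture.

WEIGHT_BYTES = 2 * 70e9   # assumed: 70B dense params in bf16
BW_BYTES_S = 3.35e12      # assumed: ~3.35 TB/s aggregate HBM bandwidth
KV_BYTES = 1e9            # assumed: KV-cache bytes read per sequence per step

for batch in (1, 8, 64, 256):
    step_s = (WEIGHT_BYTES + batch * KV_BYTES) / BW_BYTES_S  # one decode step
    throughput = batch / step_s                              # tokens/s across batch
    # Bigger batches amortize the weight read, so per-token cost falls until
    # KV reads dominate, while per-step latency only creeps up.
    print(f"batch={batch:4d}  step={step_s * 1e3:6.2f} ms  {throughput:9.0f} tok/s")
```

Arithmetic like this, combined with public API prices, is the kind of thing that lets you reverse-engineer how labs serve their models.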
146 replies · 595 reposts · 6.5K likes · 1.2M views
Mahtab Sarvmaili retweeted
Utkarsh @utk7arsh
I thought robotics was for PhDs and billion-dollar labs. Then I found this repo where NVIDIA open-sourced the entire stack for physical AI. Brain. Body. Physics. Simulation. Free. I wrote the full breakdown, plus the projects you can start building today to get ahead.
Quoting Utkarsh @utk7arsh:
x.com/i/article/2048…
14 replies · 77 reposts · 617 likes · 65.5K views
Mahtab Sarvmaili retweeted
Zhijing Jin @ZhijingJin
What happens when you put #LLM agents in a room and ask them to cooperate? They collapse. They free-ride. They form social networks. We spent 2+ years building a full research series on Multi-Agent LLM Safety. Here's a 50-min talk covering all of it: 🔗 youtube.com/watch?v=1MxpYJ…
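For readers unfamiliar with "free-riding": the textbook public-goods game below shows why defection is individually rational even when universal cooperation is collectively better. This is my own illustration of the concept, not code or numbers from the talk.

```python
# Classic public-goods game: each agent contributes 0 or 1; contributions
# are multiplied by r < n and split evenly among all n agents.

N_AGENTS = 4
MULTIPLIER = 1.6  # assumed: r < n, the regime where free-riding pays

def payoff(my_contrib: int, others_contribs: list[int]) -> float:
    pot = MULTIPLIER * (my_contrib + sum(others_contribs))
    return pot / N_AGENTS + (1 - my_contrib)  # my share of the pot + what I kept

others = [1, 1, 1]  # everyone else cooperates
print("cooperate:", payoff(1, others))  # 1.6
print("defect:   ", payoff(0, others))  # 2.2 -- defecting strictly wins
```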
4 replies · 12 reposts · 60 likes · 6.3K views
Mahtab Sarvmaili retweeted
Rishabh Agarwal @agarwl_
I gave a talk at ICLR 2026 about how we are scaling RL on frontier LLMs with 1T+ parameters, on experimental data from our physical lab at Periodic! Here's a rough recording of the talk:
13 replies · 171 reposts · 1.8K likes · 203.6K views
Mahtab Sarvmaili retweeted
Hanqi Yan @yan_hanqi
🧠 Mechanistic interpretability is obsessed with features. But what if gradients tell you more?
📐 Introducing GRADE — using gradient subspace dynamics to measure how far an LLM is from the correct answer, probing knowledge gaps at their root. 🔍
📄 Paper: Probing Knowledge Gaps in LLMs through Gradient Subspace Dynamics
🔗 arxiv.org/pdf/2604.02830
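Going from the abstract alone, here is a hedged toy of what "gradient subspace" reasoning can look like; this is my guess at the flavor of the idea, not the paper's actual algorithm, with a tiny linear layer standing in for an LLM.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(16, 8)        # stand-in for one LLM layer
loss_fn = torch.nn.CrossEntropyLoss()

def flat_grad(x: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Gradient of the loss w.r.t. the layer's weights, flattened to a vector.
    model.zero_grad()
    loss_fn(model(x), target).backward()
    return model.weight.grad.flatten().clone()

# Build a "gradient subspace" from gradients on reference examples.
refs = torch.stack([flat_grad(torch.randn(1, 16), torch.tensor([i % 8]))
                    for i in range(32)])
_, _, Vt = torch.linalg.svd(refs, full_matrices=False)
basis = Vt[:4]  # assumed: keep the top-4 gradient directions

def subspace_alignment(g: torch.Tensor) -> float:
    # Fraction of the gradient's norm captured by the subspace; in this toy
    # picture, low alignment would flag a potential knowledge gap.
    return ((basis @ g).norm() / g.norm()).item()

g = flat_grad(torch.randn(1, 16), torch.tensor([3]))
print(f"alignment with reference gradient subspace: {subspace_alignment(g):.3f}")
```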
1 reply · 13 reposts · 197 likes · 11.2K views
Mahtab Sarvmaili retweeted
MIT CSAIL @MIT_CSAIL
Today, MIT & the IMO released MathNet, the world’s largest dataset of International Math Olympiad problems & solutions 🌍 MathNet is 5x larger than previous datasets & is sourced from over 40 countries across 4 decades: bit.ly/4u1bhBC
15 replies · 543 reposts · 2.1K likes · 193.6K views
Mahtab Sarvmaili retweeted
Nav Toor @heynavtoor
Your "hallucination-free" RAG system trusts its retrieval layer. Researchers just proved that 5 documents, planted in a database of 2.6 million, can hijack the LLM's answer 97% of the time. The attacker never touches your model. They never see your retriever. They just write a document. This is PoisonedRAG. 🧵
24 replies · 127 reposts · 577 likes · 44.5K views
Mahtab Sarvmaili retweeted
Nathan Lambert @natolambert
Excited to launch the accompanying free RLHF Course for my book. To kick it off, I've released:
- Welcome video
- Lecture 1: Overview of RLHF & Post-training
- Lecture 2: IFT, Reward Models, Rejection Sampling
- Lecture 3: RL Math
- Lecture 4: RL Implementation
I'm going to add question & answer videos throughout the lectures to go deeper on topics that need it, and potentially cover some topics that are too recent and in flux to go in print. I expect 10-15 videos in total over the next few months. At the same time, development around the code for the book is picking up. It's a great time to build the foundation for post-training methods. YT playlist and course landing page below.
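Since Lecture 2 covers rejection sampling, here is a hedged minimal sketch of that technique as it is typically used in post-training; this is my own illustration, not course code, and `generate` and `reward_model` are hypothetical stand-ins.

```python
import random

random.seed(0)

def generate(prompt: str, n: int) -> list[str]:
    # Stand-in for sampling n completions from the current policy model.
    return [f"{prompt} -> completion #{i}" for i in range(n)]

def reward_model(completion: str) -> float:
    # Stand-in for a learned reward model's scalar score.
    return random.random()

def rejection_sample(prompts: list[str], n: int = 8) -> list[tuple[str, str]]:
    # For each prompt, keep only the highest-reward completion; the resulting
    # (prompt, best completion) pairs become supervised fine-tuning data.
    return [(p, max(generate(p, n), key=reward_model)) for p in prompts]

print(rejection_sample(["explain RLHF briefly"], n=4))
```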
50 replies · 236 reposts · 1.7K likes · 184.5K views
Mahtab Sarvmaili retweeted
Vivo @vivoplt
Research papers you must read for AI Engineer interviews:
1. Attention Is All You Need (Transformers)
2. LoRA (Low-Rank Adaptation)
3. PEFT (Parameter-Efficient Fine-Tuning)
4. ViT (Vision Transformers)
5. VAE (Variational Autoencoder)
6. GANs (Generative Adversarial Networks)
7. BERT (Bidirectional Encoder Representations from Transformers)
8. Diffusion Models (Stable Diffusion)
9. RAG (Retrieval-Augmented Generation)
10. GPT (Generative Pre-trained Transformers)
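As a taste of item 2, a hedged minimal sketch of LoRA's core trick: freeze the pretrained weight and learn a low-rank update on top of it. This is my own illustration, not the paper's code; the dimensions, rank, and scaling are assumed.

```python
import torch

d_in, d_out, r = 64, 64, 4
W = torch.randn(d_out, d_in)                        # frozen pretrained weight
A = (torch.randn(r, d_in) * 0.01).requires_grad_()  # trainable down-projection
B = torch.zeros(d_out, r).requires_grad_()          # trainable, zero-init

x = torch.randn(8, d_in)
scale = 1.0 / r                          # assumed: the alpha/r scaling factor
y = x @ W.T + scale * (x @ A.T) @ B.T    # frozen path + low-rank update
# Only r * (d_in + d_out) = 512 parameters train, vs 4096 in W itself.
print(y.shape)  # torch.Size([8, 64])
```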
53 replies · 292 reposts · 2.5K likes · 109.6K views
Mahtab Sarvmaili retweeted
Didier Lopes @didier_lopes
This was a really good read. h/t @guohao_li
3 replies · 32 reposts · 393 likes · 45.5K views
Mahtab Sarvmaili retweeted
𒐪 @SHL0MS
Introducing Autoreason, a reasoning method inspired by @karpathy's AutoResearch that extends the strategy to subjective domains. The paper was co-written with Hermes Agent by @NousResearch, using a research-paper-writing skill developed while writing it. Paper + results below.
47 replies · 156 reposts · 1.4K likes · 305.6K views
Mahtab Sarvmaili retweeted
elvis @omarsar0
NEW paper from Meta. (bookmark this one)

What if the model wasn't just using the computer, but became the computer? New research from Meta AI and KAUST makes a serious case for Neural Computers (NCs).

The paper proposes NCs as learned runtimes where computation, memory, and I/O live inside a single latent state. Their first prototypes use video models to roll out terminal and GUI interfaces from prompts, pixels, and user actions.

Why does it matter? Today's agents still depend on external computers to store state, execute actions, and enforce system contracts. Neural Computers point to a different machine form: one where interface dynamics, working memory, and execution are learned together.

The early results are promising but grounded. CLI rendering improves, GUI cursor control reaches 98.7% with explicit visual supervision, and reprompting boosts arithmetic-probe accuracy from 4% to 83%. But symbolic reliability, stable reuse, and runtime governance remain open.

This is less "agents got better" and more "what comes after agents as a computing substrate?"

Paper: arxiv.org/abs/2604.06425
Learn to build effective AI agents in our academy: academy.dair.ai
15 replies · 92 reposts · 505 likes · 61K views
Mahtab Sarvmaili retweeted
Cas (Stephen Casper) @StephenLCasper
🧵🧵🧵 A provocation to the mechanistic interpretability researchers of the world...
6 replies · 6 reposts · 154 likes · 17.7K views
Mahtab Sarvmaili retweeted
Brian Roemmele @BrianRoemmele
We at The Zero-Human Company have been testing MemPalace by the amazing @bensig and Milla Jovovich and are absolutely blown away! It is a freaking masterpiece and we have deployed it to 79 employees at the company. Each worker will be testing and expanding on MemPalace. I will have a lot to say about how we are using it and how you should too.
Quoting Ben Sigman @bensig:
My friend Milla Jovovich and I spent months creating an AI memory system with Claude. It just posted a perfect score on the standard benchmark, beating every product in the space, free or paid.

It's called MemPalace, and it works nothing like anything else out there. Instead of sending your data to a background agent in the cloud, it mines your conversations locally and organizes them into a palace: a structured architecture with wings, halls, and rooms that mirrors how human memory actually works.

Here is what that gets you:
→ Your AI knows who you are before you type a single word: family, projects, preferences, loaded in ~120 tokens
→ Palace architecture organizes memories by domain and type: not a flat list of facts, a navigable structure
→ Semantic search across months of conversations finds the answer in position 1 or 2
→ AAAK compression fits your entire life context into 120 tokens: 30x lossless compression any LLM reads natively
→ Contradiction detection catches wrong names, wrong pronouns, wrong ages before you ever see them

The benchmarks: 100% recall on LongMemEval — first perfect score ever recorded, 500/500 questions, every question type at 100%. 92.9% on ConvoMem — more than 2x Mem0's score. 100% on LoCoMo — every multi-hop reasoning category, including temporal inference, which stumps most systems.

No API key. No cloud. No subscription. One dependency. Runs on your machine. Your memories never leave. MIT License. 100% Open Source.

github.com/milla-jovovich…
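The wings/halls/rooms structure plus local search is easy to picture with a hedged toy; this is my own sketch of the concept as described above, not MemPalace's actual code, and token overlap stands in for real semantic search.

```python
from collections import defaultdict

# palace[wing][hall] -> list of remembered facts
palace: dict = defaultdict(lambda: defaultdict(list))

def remember(wing: str, hall: str, fact: str) -> None:
    palace[wing][hall].append(fact)

def search(query: str) -> list[tuple[str, str, str]]:
    # Naive token-overlap scoring standing in for real semantic search.
    q = set(query.lower().split())
    hits = []
    for wing, halls in palace.items():
        for hall, facts in halls.items():
            for fact in facts:
                overlap = len(q & set(fact.lower().split()))
                if overlap:
                    hits.append((overlap, wing, hall, fact))
    return [(w, h, f) for _, w, h, f in sorted(hits, reverse=True)]

remember("family", "people", "my sister Ada lives in Lisbon")
remember("projects", "work", "shipping the eval harness in March")
print(search("where does my sister live"))
```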
69 replies · 96 reposts · 1.2K likes · 186.9K views
Mahtab Sarvmaili retweeted
Yacine Mahdid @yacinelearning
For those interested in distributed reinforcement learning: I just finished a ~1h tutorial on the echo2 framework by @Gradient_HQ. We cover:
- how to do async RL
- the infra split between rollout workers and a centralized learner (a minimal sketch of this pattern follows below)
- an interview with Gradient cofounder Eric Yang himself!
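Here is a hedged minimal sketch of that rollout-worker/centralized-learner split; it shows the general async-RL shape, not echo2's actual code, with a queue and a version counter standing in for real trajectories and gradient updates.

```python
import queue
import random
import threading

traj_q: queue.Queue = queue.Queue(maxsize=64)  # trajectories in flight
weights = {"version": 0}                       # stand-in for shared policy weights
stop = threading.Event()

def rollout_worker() -> None:
    while not stop.is_set():
        snapshot = weights["version"]          # act with a possibly stale policy
        rewards = [random.random() for _ in range(8)]
        try:
            traj_q.put((snapshot, rewards), timeout=0.1)
        except queue.Full:
            continue                           # learner is behind; retry

def learner(steps: int) -> None:
    for _ in range(steps):
        batch = [traj_q.get() for _ in range(4)]  # gather 4 trajectories
        weights["version"] += 1                   # stand-in for a gradient step
    stop.set()

workers = [threading.Thread(target=rollout_worker) for _ in range(4)]
for w in workers:
    w.start()
learner(steps=25)
for w in workers:
    w.join()
print("learner finished at weight version", weights["version"])
```

The key property: workers never wait for the learner's latest weights, so rollouts and updates overlap; the price is that trajectories may come from slightly stale policies.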
14 replies · 49 reposts · 402 likes · 40.8K views
Mahtab Sarvmaili retweeted
AVB @neural_avb
People interested in model interpretability, check out this gold: the "Circuits" thread, a series of exploratory research by Chris Olah himself and team when he was with OpenAI around 2020-2021. Circuits are sub-graphs of the network, consisting of a set of linked features and the weights between them. These articles try to "reverse engineer" neural nets by finding these subgraphs. Shoutout to @exploding_grad for unearthing this and sharing. This is what I'll be passively consuming this week, I guess... Link in attached tweet.
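The basic "circuits" move is concrete enough to sketch: read off the weights that connect one feature (channel) in an early layer to a feature in the next layer, since those weights are the edges of the circuit sub-graph. This is my own toy illustration, not code from the articles.

```python
import torch

conv1 = torch.nn.Conv2d(3, 8, kernel_size=5)   # early layer: 8 features
conv2 = torch.nn.Conv2d(8, 16, kernel_size=5)  # next layer: 16 features

# conv2.weight has shape (out_channels, in_channels, kH, kW); slicing one
# (out, in) pair gives the 5x5 kernel linking conv1 feature 2 -> conv2 feature 7.
edge = conv2.weight[7, 2].detach()
print(f"edge kernel shape={tuple(edge.shape)}, strength={edge.norm().item():.3f}")
# Circuits-style analysis ranks and visualizes these kernels to see how
# earlier features (e.g., curve detectors) excite or inhibit later ones.
```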
Quoting kendrick @exploding_grad:
@neural_avb If you want to go deep down the rabbit hole, start with the Distill circuits thread. distill.pub/2020/circuits/
9 replies · 32 reposts · 329 likes · 25.9K views