Rahul Madhavan

4.6K posts

@imrahulmaddy

Building self-improving agents Research @ GoogleDeepMind

Joined November 2020

1.3K Following · 1.2K Followers

Rahul Madhavan retweeted

fly51fly@fly51fly·12h

[LG] Efficient RL Training for LLMs with Experience Replay C Arnal, V Cabannes, T Cohen, J Kempe… [FAIR at Meta] (2026) arxiv.org/abs/2604.08706

Rahul Madhavan retweeted

Chris Hayduk@ChrisHayduk·20h

In July 2024, DeepMind unveiled AlphaProof — an AlphaZero-inspired agent that constructs mathematical arguments in Lean, a programming language for proofs. It broke new ground in mathematical performance, achieving a silver medal in the 2024 International Math Olympiad. One year later, in July 2025, OpenAI announced that they had achieved a gold medal in the 2025 International Math Olympiad using a raw LLM — no reinforcement learning in Lean space, no translation between natural language and formal proof languages. In the span of a few weeks, this same model would go on to add a gold medal at the International Olympiad in Informatics and a 2nd place finish at the AtCoder World Tour Finals to its achievements. Since July 2025, I have kept coming back to this puzzle: Why would a general-purpose language model, one just as comfortable answering questions about lasagna recipes in ChatGPT as it is answering mathematical questions, end up looking stronger at Olympiad math than a much more math-specific theorem-proving system? In my new blog post, I use legendary mathematician Jacques Hadamard's analysis of the phenomenology of mathematical discovery to attempt to answer this question and to probe where LLMs are headed next. Link in the replies below.

Rahul Madhavan@imrahulmaddy·9h

30% random noise does not reduce performance? 🤔 If this is reproducible across more models and datasets, one has to wonder why.

Anish Athalye@anishathalye

Does an imperfect verifier break reinforcement learning with verifiable rewards (RLVR)? Turns out it doesn’t! Why does this matter? As the world moves into reinforcement learning in semi-verifiable domains, perfect verifiers don’t exist. We added controlled and LLM-based noise to RLVR reward signals and found that up to 30% noise barely hurts training; performance stays within 4pp of the clean baseline. This research has already impacted how we build reinforcement learning environments at @joinHandshake. For a major benchmark we are launching tomorrow, we hill-climbed the verifier to 88% accuracy—above the 85% human inter-rater agreement—knowing from this research that this is good enough. With @andreas_plesner @guzmanhe

Rahul Madhavan retweeted

Tomohiro Ohigashi / 大東智洋@tom_ohigashi·1d

The Illusion of Learning from Observational Data: An Empirical Bayes Perspective Bohan Wu, Sebastian Salazar, Donald P. Green, David M. Blei arxiv.org/abs/2604.08853

Rahul Madhavan retweeted

Peyman Milanfar@docmilanfar·2d

Outstanding researchers excel at the art of finding problems right at the edge of our understanding — that can actually be solved.

Rahul Madhavan retweeted

한준호@with_hanjunho·3d

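The President's remarks were simply common sense: that we should not turn away from anyone's suffering. Israel's reaction against that common sense only raises more questions. Sharing in others' pain and refusing to stay silent is the path we have kept to. The Republic of Korea must move forward with greater confidence as a nation that speaks for human rights and democracy.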

이재명@Jaemyung_Lee

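<It is disappointing: one would expect them to reflect, at least once, on the criticism from people around the world who are suffering and struggling because of their ceaseless actions against human rights and international law. When I am in pain, others feel just as much pain. If someone suffers because of my needs, it is only human to feel sorry. It is deeply uncomfortable to watch the enormous suffering and national hardship that our people, who have done nothing wrong, are enduring out of the blue. I will look harder for what can be done for universal human rights and the national interest of the Republic of Korea.> Israel says President Lee's 'wartime killing = massacre of Jews' remark is "unacceptable" v.daum.net/v/202604110641…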

Rahul Madhavan retweeted

alex zhang@a1zhang·3d

x.com/i/article/2041…

Rahul Madhavan@imrahulmaddy·2d

never lose sight of universal human rights as the foundation of human civilization

박홍근 (Minister of Planning and Budget / Member of the National Assembly)@maumgil

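I express deep regret toward the Israeli government, which responded "unacceptable" to President Lee Jae-myung's remarks emphasizing universal human rights. Our people, who over a history of five thousand years have endured countless foreign invasions and even the pain of losing national sovereignty, fully empathize with and understand the horrific suffering the people of Israel went through in the last century. But no reason whatsoever can justify inhumane acts that go beyond the bounds of what is right. We can never stand by while such acts continue and their repercussions reach even our own people. I urge Israel to break free as soon as possible from the chain of hatred in which the memory of victimhood leads to further harm. In addition, moves within Korea that obscure the essence of the situation or side with one party for partisan ends should also be restrained. In designing the future of the Republic of Korea, the Ministry of Planning and Budget will take people's livelihoods and the national interest as its highest priorities, while never losing sight of universal human rights, the foundation of human civilization. n.news.naver.com/mnews/article/…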

Rahul Madhavan retweeted

이재명@Jaemyung_Lee·3d

Rahul Madhavan retweeted

Chris Hayduk@ChrisHayduk·3d

Paper link: arxiv.org/abs/2510.25741

Rahul Madhavan retweeted

Kanjun 🐙@kanjun·4d

Twitter’s algorithm is optimized for addiction, not for us. We deserve better. We’re releasing Bouncer today so you can take back control of your feed. Describe what you don't want, and Bouncer removes it. It’s free, doesn’t collect your data, and will be open source soon.

Rahul Madhavan retweeted

Rémi Lodh@LodhSpringer·3d

This year, the world is marking the 200th anniversary of the birth of mathematician Bernhard Riemann. Renowned historian David Rowe has undertaken a deep study of Riemann's life and work, completing what should be his definitive biography. It will be published in June; stay tuned!

Rahul Madhavan@imrahulmaddy·4d

Can the machine watch itself? Can it have a sense of self? Can it watch the universe, and the effects of that self's actions on the universe? Then is the universe observing itself? If it is, then it must be conscious.

Rahul Madhavan@imrahulmaddy·4d

Maybe in the future a machine could also create an equivalent world, say on a distant planet. But taking action by itself does not create consciousness. The only question, maybe, is whether the Universe is observing itself through that machine.

Rahul Madhavan@imrahulmaddy·4d

Are we doomed to never knowing whether any entity other than us is conscious? What a profound question!

Eric Schwitzgebel@eschwitz

Last week I submitted my latest book manuscript to Cambridge (for their "Element" series of books about 100 pages long): AI and Consciousness: A Skeptical Overview -- because you haven't heard nearly enough about AI and consciousness recently, of course! ;-) Maybe you'll 1/3

Rahul Madhavan retweeted

Patrick Shafto@patrickshafto·5d

The hypergraph of math arxiv.org/abs/2604.06107
