Inderjit Dhillon

533 posts

@inderjit_ml

Google Fellow (VP) at Google, Professor at UT Austin, Machine Learning Researcher, Ex-VP/Distinguished Scientist at Amazon.

Joined March 2020
195 Following · 1K Followers
Inderjit Dhillon retweeted
Vivek Natarajan @vivnat
Out of all the announcements at @Google I/O today, this is the one closest to my heart - our foundational research on Co-Scientist was published in @Nature and we announced its broad availability via @GeminiApp for Science. When you are suffering from a disease, time is everything. As our collaborator and @StanfordMed Professor Dr. Gary Peltz reminds us, there are thousands of diseases out there with zero treatments. There is simply so much left to solve. Our goal with Co-Scientist has been to give scientists superpowers and help them get to these answers faster - compressing the scientific process from months and years down to hours and days. Much like Galileo's telescope helped us look into the stars, Co-Scientist is designed to help us make sense of the vast complexity of biological and scientific data. It is among the first examples of a truly general-purpose multi-agent system for scientific discovery. The core research question behind it was: How can an AI system engage in the rigorous, structured thinking that’s the hallmark of science and scientists? To tackle this, Co-Scientist builds on the principles of self-play and self-improvement underpinning @GoogleDeepMind breakthroughs like AlphaGo, generalizing them to scientific reasoning through self-debates. Since our preprint last year, we have further improved its capabilities and have been validating it in collaborations with scientists across over 100 institutions globally, spanning both academia and industry. And we are thrilled to see the emergence of a new form of AI-human scientist collaboration that's already leading to important new insights, discoveries and peer reviewed publications - from understanding antimicrobial resistance (published in @CellCellPress) to decoding plant immunity, to identifying new treatments for liver fibrosis (Advanced Science), cancer, neurodegenerative diseases like ALS and the grand challenge of aging. 
I have always believed AI's greatest promise is accelerating scientific discovery and advancing human health. My genuine hope for the future is that AI tools like Co-Scientist help democratize science, giving anyone, anywhere the means to pursue their child-like curiosity and change the world. This work was done with stellar teammates spanning @GoogleDeepMind, @GoogleResearch, @googlecloud and @GoogleLabs, especially Juro Gottweis (@Mysiak), who is the heart and soul of this effort. Special thanks also to all our wonderful collaborators: Gary Peltz, @CostaT_Lab, @jrpenades, @_e_d_v_, @iambyronic, @OpsBug, @jgooten, @omarabudayyeh, Ritu Raman, Ryan Flynn, Filippo Menolascina, Velia Siciliano, Clare Bryant, Matt Onsum, Katherine Labbé and more. Nature paper link - lnkd.in/e8qBEJFv Google DeepMind blog - lnkd.in/etYeahMy Gemini for Science - labs.google/science.
19
112
521
79.8K
Inderjit Dhillon retweeted
Logan Kilpatrick @OfficialLoganK
Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost, putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real-world use cases. It's available everywhere now!
467
745
7.4K
656.7K
Inderjit Dhillon retweeted
Jeff Dean @JeffDean
2/ Check out how Gemini 3.5 Flash instantly digests dense academic papers and autonomously codes a fully interactive, visual website explaining the intricacies of the research. It's an incredible stress test that seamlessly merges massive long context, deep reasoning, complex coding, and ultra-low latency. It really helps you distill papers down to their essence and aids your understanding!
6
26
270
86.1K
Inderjit Dhillon retweeted
Jeff Dean @JeffDean
1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action. We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows. Gemini 3.5 Flash is our strongest model for coding and agents yet. It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models. Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale. Some highlights we’re excited about 🔽
83
194
1.5K
128.5K
Inderjit Dhillon retweeted
Rishabh Agarwal @agarwl_
Training LLMs is synonymous with updating their weights. However, LLMs can also learn in-context using *frozen* weights. There is no good reason for restricting learning to being in-context or in-weights. So a natural idea is "Learning, Fast and Slow" (FST). In FST, slow learning is LLM weights trained with RL while fast learning is context / prompt (fast weights) optimized with GEPA. Compared to RL, FST performs better while being more data efficient, adaptable (plasticity), and forgetting less (stays closer to base models). I think this idea of learning both fast-slow weights would be a good foundation for continual learning. PS: Geoff Hinton (the OG) described the idea of fast weights and slow weights several years ago, and back then I remember thinking it's a very cool idea. See more details here: gepa-ai.github.io/gepa/blog/2026…
18
73
566
69.4K
Inderjit Dhillon retweeted
Devvrit @Devvrit_Khatri
ICL lets models adapt rapidly to changing tasks (✅), but the weights stay frozen - leaving performance gains on the table (⚠️). Fine-tuning (like SFT, RL) reaches a higher perf ceiling (✅), but is slow, can hurt OOD performance, and often reduces plasticity (⚠️). Why not combine the strengths (✅) of both? We introduce Fast-Slow Training (FST): fast weights (prompts) quickly capture task-specific nuances, while slow weights (model parameters) internalize the more general, task-agnostic reasoning patterns that should persist across tasks. FST reaches a higher perf asymptote while being more efficient. Since prompts absorb more of the task-specific information, the parameters do not need to move as much. As a result, the model stays closer to the base model, and preserves more plasticity for learning new tasks!
Rishabh Agarwal @agarwl_

Training LLMs is synonymous with updating their weights. However, LLMs can also learn in-context using *frozen* weights. There is no good reason for restricting learning to being in-context or in-weights. So a natural idea is "Learning, Fast and Slow" (FST). In FST, slow learning is LLM weights trained with RL while fast learning is context / prompt (fast weights) optimized with GEPA. Compared to RL, FST performs better while being more data efficient, adaptable (plasticity), and forgetting less (stays closer to base models). I think this idea of learning both fast-slow weights would be a good foundation for continual learning. PS: Geoff Hinton (the OG) described the idea of fast weights and slow weights several years ago, and back then I remember thinking it's a very cool idea. See more details here: gepa-ai.github.io/gepa/blog/2026…

1
14
51
12.6K
Inderjit Dhillon retweeted
Sundar Pichai @sundarpichai
Q1 earnings are in: 2026 is off to a terrific start. Our AI investments and full stack approach are lighting up every part of the business: Search queries are at an all-time high with AI continuing to drive usage. Google Cloud revenue grew 63%, Gemini models have incredible momentum, and it was our strongest quarter ever for consumer AI subs, driven by @GeminiApp. Thanks to our partners + employees around the world. Much more to share on our earnings call in 20 minutes… and at Google I/O in 20 days!
383
949
9.8K
1M
Inderjit Dhillon retweeted
Sundar Pichai @sundarpichai
TPU 8t, optimized for training and TPU 8i, optimized for inference. Looking good!
453
931
15.1K
1.3M
rohan anil @_arohan_
Ok, I did leave Anthropic a few weeks ago. It was one of the best places to work for a researcher. Jerry Tworek nerdsniped me into starting this with him and others. The pretraining team at ant is one of the most well-functioning research teams in the industry, and Anthropic has a great culture - I miss the fun times and Claude! Thank you for everything!
82
27
1.7K
172.2K
Inderjit Dhillon retweeted
Nilesh Gupta @nileshgupta2797
Very cool long-context scaling work! Glad to see BlockRank (arxiv.org/pdf/2510.05396) ideas in the paper - (a) contrastive L_aux; (b) document-wise block sparse attention; (c) per-document position encodings. The scale of MSA is def a step above and has more well executed ideas for scaling (topk attention routing, better training recipe)!
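艾略特 @elliotchen100

The paper is here. It's called MSA, Memory Sparse Attention. In one sentence: it gives large models native ultra-long memory. Not bolted-on retrieval, not brute-force window expansion; instead, "memory" is grown directly into the attention mechanism and trained end to end. Why did earlier approaches fall short? RAG is essentially an "open-book exam". The model remembers nothing itself and relies entirely on flipping through notes on the spot. How accurate the lookup is depends on retrieval quality; how fast it is depends on data volume. Once the information is scattered across dozens of documents and needs cross-document reasoning, it is lost. Linear attention and KV caches are essentially "compressed memory". Things do get remembered, but the more you compress, the blurrier it gets, and long contexts get dropped. MSA takes a completely different approach: → No compression, no add-ons; the model learns to "look at the key points". The core is a scalable sparse attention architecture with linear complexity. When the amount of memory grows 10x, compute cost does not explode exponentially. → The model knows "where this memory came from and when". It uses a positional encoding called document-wise RoPE, so the model natively understands document boundaries and temporal order. → Fragmented information can be chained together for reasoning. The Memory Interleaving mechanism lets the model do multi-hop reasoning across scattered memory fragments. It does not just find one relevant record; it links the clues into a chain. The results? · Scaling from 16K to 100 million tokens, accuracy degrades by less than 9% · A 4B-parameter MSA model beats top-tier 235B-class RAG systems on long-context benchmarks · Two A800s are enough to run inference over 100 million tokens. This is not lab-only; it is a cost a startup can afford. Put plainly, earlier large models were extremely smart geniuses with the memory of a goldfish. What MSA wants to do is make them truly "remember". We've put it on GitHub. The algorithm folks worked hard, so please drop a star to show support. 🌟👀🙏 github.com/EverMind-AI/MSA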

0
1
5
1.2K
Inderjit Dhillon @inderjit_ml
@dem_fier @hugo_larochelle Nice! The tool would be even more helpful if it summarized how the paper is relevant and cited it properly. I find that 90% of paper citations are made for very superficial reasons.
1
0
3
775
Gaurav Sahu @dem_fier
ever been here? open overleaf → write a paragraph → "hmm...this needs a citation" → open 15 different tabs → skim 8 abstracts → find the 1 actually relevant paper → format bibtex → paste it back on overleaf if so, i built a plugin just for you. meet openleaf: → reads your paper paragraph by paragraph → searches major academic databases → filters out irrelevant papers using ai → one click to add BibTeX to your .bib you'll also find the 🤝 friendly and 🔥 fire reviewers there. i don't think i need to tell you what they do :) free. open source. no account. no data collection. works with ollama, openrouter, openai api and more. github.com/demfier/openle… dear algorithm, please show this to my fellow researchers in need 🙏 #overleaf #latex #opensource #academictwitter
27
106
816
1.1M
Inderjit Dhillon retweeted
Prabhakar Raghavan @WittedNote
Excited for this promising start in extremal graph theory, building on our earlier work using AlphaEvolve for complexity theory for the TSP and other problems arxiv.org/abs/2509.18057
Pushmeet Kohli @pushmeet

Happy to share new progress in AI for Maths @GoogleDeepMind. In extremal combinatorics, AlphaEvolve has helped establish new lower bounds for FIVE classical Ramsey numbers - a problem so challenging that even Erdős commented on its difficulty. Historically, computationally deriving these bounds required bespoke, human-designed search algorithms. For many of these bounds, the best previous results are at least a decade old. AlphaEvolve changes this by acting as a single meta-algorithm that automatically discovers the search procedures needed to find these new bounds.

1
1
16
1.8K
Inderjit Dhillon retweeted
Pushmeet Kohli @pushmeet
Happy to share new progress in AI for Maths @GoogleDeepMind. In extremal combinatorics, AlphaEvolve has helped establish new lower bounds for FIVE classical Ramsey numbers - a problem so challenging that even Erdős commented on its difficulty. Historically, computationally deriving these bounds required bespoke, human-designed search algorithms. For many of these bounds, the best previous results are at least a decade old. AlphaEvolve changes this by acting as a single meta-algorithm that automatically discovers the search procedures needed to find these new bounds.
60
323
3K
466.7K
Inderjit Dhillon retweeted
Davis Blalock @davisblalock
🚀 Today we’re releasing FlashOptim: better implementations of Adam, SGD, etc., that compute the same updates but save tons of memory. You can use it right now via `pip install flashoptim`. 🚀 arxiv.org/abs/2602.23349 A bunch of cool ideas make this possible: [1/n]
31
228
1.6K
218.4K
Inderjit Dhillon retweeted
Tekedra N Mawakana @TechTekedra
A new era for Waymo. We’ve raised $16B to accelerate our mission, valuing the company at $126B. This capital is an investment in a future where more cities get a safer, more reliable way to move. Let's go. 🚀 waymo.com/blog/2026/02/w…
47
117
1.3K
525.2K
Inderjit Dhillon retweeted
Sundar Pichai @sundarpichai
Now Google can help with your Googly :)
Google Gemini @GeminiApp

We partnered with @ICC to show how Gemini 3 Pro can analyze video content. By uploading a segment of the Cricket World Cup, Gemini can seamlessly process visual and audio data to identify key players, explain techniques, and highlight crucial turning points. 🏏

231
304
5.1K
515.9K