David Zhang (@dzhang03)

54 posts

@Yale @GoogleDeepMind | prev research @david_van_dijk @calico

New York, NY · Joined February 2024
292 Following · 68 Followers
David Zhang retweeted
Demis Hassabis (@demishassabis)
Thrilled to launch Project Genie, an experimental prototype of the world's most advanced world model. Create entire playable worlds to explore in real-time just from a simple text prompt - kind of mindblowing really! Available to Ultra subs in the US for now - have fun exploring!
381 replies · 950 reposts · 7.9K likes · 964.1K views
David Zhang retweeted
Jeff Dean (@JeffDean)
We’ve pushed out the Pareto frontier of efficiency vs. intelligence again. With Gemini 3 Flash ⚡️, we are seeing reasoning capabilities previously reserved for our largest models, now running at Flash-level latency. This opens up entirely new categories of near real-time applications that require complex thought. It’s available in the API, and rolling out today as the default model in AI Mode in Search and the Gemini app globally. Read more on the blog: bit.ly/4pTo5YU
More in thread ⬇️
[image]
53 replies · 194 reposts · 1.8K likes · 159K views
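Since the post mentions API availability, here is a minimal sketch of calling a Gemini model through the google-genai Python SDK. The model id below is an assumption for illustration, not taken from the post; check the docs for the released name.

```python
# pip install google-genai
from google import genai

client = genai.Client()  # reads the API key from the GOOGLE_API_KEY env var

response = client.models.generate_content(
    model="gemini-3-flash",  # hypothetical id, not confirmed by the post
    contents="Explain the latency/quality tradeoff of small reasoning models.",
)
print(response.text)
```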
David Zhang retweeted
ARC Prize (@arcprize)
Gemini 3 models from @Google @GoogleDeepMind have made a significant 2X SOTA jump on ARC-AGI-2 (Semi-Private Eval):
Gemini 3 Pro: 31.11%, $0.81/task
Gemini 3 Deep Think (Preview): 45.14%, $77.16/task
[image]
190 replies · 605 reposts · 4.1K likes · 2.2M views
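A quick back-of-the-envelope reading of those numbers (my own arithmetic, not from the ARC Prize team): the Deep Think preview buys roughly 14 extra points for about 95x the per-task cost.

```python
# Cost per ARC-AGI-2 percentage point, from the figures quoted above.
pro = 0.81 / 31.11            # ~$0.026 per point (Gemini 3 Pro)
deep_think = 77.16 / 45.14    # ~$1.71 per point (Deep Think Preview)
print(f"Deep Think pays ~{deep_think / pro:.0f}x more per point of accuracy")  # ~66x
```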
David Zhang retweeted
Demis Hassabis (@demishassabis)
We’ve been intensely cooking Gemini 3 for a while now, and we’re so excited and proud to share the results with you all. Of course it tops the leaderboards, including @arena, HLE, GPQA etc, but beyond the benchmarks it’s been by far my favourite model to use for its style and depth, and what it can do to help with everyday tasks.
[image]
218 replies · 485 reposts · 5.7K likes · 589.6K views
David Zhang retweeted
Google DeepMind (@GoogleDeepMind)
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
213 replies · 1.1K reposts · 6.5K likes · 1.7M views
David Zhang retweeted
Demis Hassabis (@demishassabis)
It's nearly 3 here, my favourite part of the night shift… locked in... 💪🚀
312 replies · 319 reposts · 6.8K likes · 1M views
David Zhang retweeted
David van Dijk (@david_van_dijk)
C2S is now open for everyone. The biological LLM that learns the language of cells. Free for academic and commercial use: c2s.bio
Join the growing community building with C2S. 🌱
11 replies · 44 reposts · 192 likes · 27.6K views
David Zhang retweeted
Sundar Pichai (@sundarpichai)
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests, this discovery may reveal a promising new pathway for developing therapies to fight cancer.
543 replies · 3.2K reposts · 21.8K likes · 6.9M views
David Zhang retweeted
David van Dijk (@david_van_dijk)
🚨 Thrilled to announce our paper “Non-Markovian Discrete Diffusion with Causal Language Models” was accepted at #NeurIPS2025! 🎉 @YaleCSDept @YaleMed @yaledatascience We introduce CaDDi, a new framework that unifies discrete diffusion and causal LMs. A quick explainer 🧵👇
[image]
2 replies · 11 reposts · 65 likes · 15.6K views
David Zhang retweeted
Yiping Lu (@2prime_PKU)
Anyone knows adam?
[image]
265 replies · 441 reposts · 4.8K likes · 634.4K views
David Zhang retweeted
Jun Cheng (@s6juncheng)
Excited to share #AlphaGenome, the start of our journey to decipher the regulatory genome! The model matches or exceeds top-performing external models on 24 out of 26 variant evaluations, across a wide range of biological modalities. 1/6
[image]
14 replies · 208 reposts · 909 likes · 87.2K views
David Zhang retweeted
Richard Socher (@RichardSocher)
If you studied algorithms, I'm sure you've heard of Dijkstra’s algorithm to find the shortest paths between nodes in a weighted graph. Super useful in scenarios such as road networks, where it can determine the shortest route from a starting point to various destinations. It had been the fastest known algorithm since 1956! Until now. The O(E + V log V) complexity just went down to O(E log^(2/3) V) for sparse graphs. It would be amazing if this kind of breakthrough came from AI that can code, but I guess we're not there yet...
[image]
23 replies · 124 reposts · 1.2K likes · 140.4K views
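For reference, here is the textbook baseline being beaten: a standard binary-heap Dijkstra in Python. Note that the O(E + V log V) bound quoted above requires a Fibonacci heap; the heapq version below runs in O((E + V) log V). This is the classical algorithm, not the new O(E log^(2/3) V) result.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source`.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with non-negative weights. Binary-heap version: O((E + V) log V).
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Example: dijkstra({"a": [("b", 2), ("c", 5)], "b": [("c", 1)], "c": []}, "a")
# -> {"a": 0, "b": 2, "c": 3}
```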
David Zhang retweeted
Theta (@trytheta)
Introducing CUB: Humanity's Last Exam for Computer and Browser Use Agents
[image]
32 replies · 39 reposts · 251 likes · 114K views
David Zhang retweeted
Google DeepMind (@GoogleDeepMind)
Introducing AlphaEvolve: a Gemini-powered coding agent for algorithm discovery. It’s able to:
🔘 Design faster matrix multiplication algorithms
🔘 Find new solutions to open math problems
🔘 Make data centers, chip design and AI training more efficient across @Google. 🧵
[GIF]
175 replies · 1.3K reposts · 6.9K likes · 2.6M views
David Zhang retweeted
Gurvir Singh (@_gurvir_)
we've been misled to believe that manual prompt hacking is the solution to teaching LLMs how to approach complex problems. why write a "magic prompt" to pattern match for every type of problem you might care about, when LLMs have already shown extraordinary ability to self-review and self-correct given the right feedback loops

@karpathy alludes to it here, but what's missing is a memory layer so that LLMs can learn from their previous mistakes. they suffer from amnesia because they lack a mechanism to record and build upon problem solving strategies. a memory layer allows for this "system prompt learning" instead of relying on explicit human feedback

there's a lot of engineering challenges in getting this to work effectively. how do you measure which insights are effective, and how do you refine them from feedback? building a "scratchpad" of notes that can be maintained over thousands of runs and indexed efficiently to get the right notes is a non-trivial problem, and it's exactly what we're tackling at @trytheta

Quoted: Andrej Karpathy (@karpathy)

We're missing (at least one) major paradigm for LLM learning. Not sure what to call it, possibly it has a name - system prompt learning?

Pretraining is for knowledge. Finetuning (SL/RL) is for habitual behavior. Both of these involve a change in parameters but a lot of human learning feels more like a change in system prompt. You encounter a problem, figure something out, then "remember" something in fairly explicit terms for the next time. E.g. "It seems when I encounter this and that kind of a problem, I should try this and that kind of an approach/solution". It feels more like taking notes for yourself, i.e. something like the "Memory" feature but not to store per-user random facts, but general/global problem solving knowledge and strategies.

LLMs are quite literally like the guy in Memento, except we haven't given them their scratchpad yet. Note that this paradigm is also significantly more powerful and data efficient because a knowledge-guided "review" stage is a significantly higher dimensional feedback channel than a reward scalar.

I was prompted to jot down this shower of thoughts after reading through Claude's system prompt, which currently seems to be around 17,000 words, specifying not just basic behavior style/preferences (e.g. refuse various requests related to song lyrics) but also a large amount of general problem solving strategies, e.g.: "If Claude is asked to count words, letters, and characters, it thinks step by step before answering the person. It explicitly counts the words, letters, or characters by assigning a number to each. It only answers the person once it has performed this explicit counting step." This is to help Claude solve 'r' in strawberry etc.

Imo this is not the kind of problem solving knowledge that should be baked into weights via Reinforcement Learning, or at least not immediately/exclusively. And it certainly shouldn't come from human engineers writing system prompts by hand. It should come from System Prompt learning, which resembles RL in the setup, with the exception of the learning algorithm (edits vs gradient descent). A large section of the LLM system prompt could be written via system prompt learning; it would look a bit like the LLM writing a book for itself on how to solve problems.

If this works it would be a new/powerful learning paradigm, with a lot of details left to figure out (how do the edits work? can/should you learn the edit system? how do you gradually move knowledge from the explicit system text to habitual weights, as humans seem to do? etc.).

4 replies · 4 reposts · 29 likes · 3.2K views
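To make the "scratchpad" idea above concrete, here is a toy sketch of such a memory layer: record a short note after each run, then retrieve the most relevant notes by keyword overlap and prepend them to the next system prompt. Everything here (the Scratchpad class, its method names, the retrieval scheme) is illustrative guesswork, not Theta's actual system or Karpathy's proposal in any concrete form.

```python
from dataclasses import dataclass, field

@dataclass
class Scratchpad:
    """Toy memory layer for 'system prompt learning' (illustrative only).

    A real system would need learned retrieval, deduplication, and a way
    to score which notes actually improved downstream task performance.
    """
    notes: list[str] = field(default_factory=list)

    def record(self, note: str) -> None:
        """Save a problem-solving lesson after a run."""
        self.notes.append(note)

    def retrieve(self, task: str, k: int = 3) -> list[str]:
        """Return the k notes sharing the most words with the task."""
        words = set(task.lower().split())
        ranked = sorted(
            self.notes,
            key=lambda n: len(words & set(n.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def as_system_prompt(self, task: str) -> str:
        """Build a system prompt that carries lessons forward (the 'edit' step)."""
        lessons = "\n".join(f"- {n}" for n in self.retrieve(task))
        return f"Lessons from previous runs:\n{lessons}\n\nTask: {task}"

pad = Scratchpad()
pad.record("when counting letters in a word, number each character explicitly")
print(pad.as_system_prompt("count the letters in strawberry"))
```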
David Zhang retweeted
Brian Roemmele (@BrianRoemmele)
Google did a very good job with this commercial.
53 replies · 331 reposts · 4.2K likes · 381.2K views
David Zhang retweeted
Physical Intelligence (@physical_int)
We got a robot to clean up homes that were never seen in its training data! Our new model, π-0.5, aims to tackle open-world generalization. We took our robot into homes that were not in the training data and asked it to clean kitchens and bedrooms. More below⤵️
53 replies · 260 reposts · 1.6K likes · 488K views