Dileep George

5.9K posts

@dileeplearning

Head of AI @AsteraInstitute Prev: AGI @DeepMind, cofounder @vicariousai (acqd by Alphabet), cofounder @Numenta. IIT-Bombay, MS&PhD Stanford. https://t.co/IlsczdBtZo

San Francisco, CA · Joined June 2017
1.4K Following · 15.9K Followers
Dileep George reposted
D. Scott Phoenix @fuelfive:
This is Progress. Conversations about life at the hinge of history with the people building it.
Dileep George reposted
Subbarao Kambhampati (కంభంపాటి సుబ్బారావు):
World Models: The old, the new and the wishful #SundayHarangue

There is a lot of chatter about world models of late--even more than can be explained by Yann betting his entire new enterprise on it. I was going to comment on this clamor in my class this week, and thought I would preview it here first.. 😋

World models are of course by no means new--whether learned or provided, they have been the backbone of decision-making problems--be it control theory or #AI--for nearly a century. Russell & Norvig's intro AI textbook *starts* with the world model as an integral part of an agent architecture (see below). A fortuitous by-product of the focus on world models is the crash course post-#alexnet #ML young'uns may be getting in core #AI concepts: how hierarchical models of the world, and mental simulation at differing abstractions, help with long-range planning.

Because the current world-model craze has generally been ahistoric, it confounds multiple things, IMHO.

Resolution vs. Abstraction: Perhaps the most important confusion concerns their intended purpose. Are they meant to "construct" believable synthetic worlds--thus requiring CGI-level high fidelity? Or are they meant to help the agent efficiently mentally simulate the evolution of its world--conditioned on its own and other agents' actions--to support long-range planning and decision making? A large part of the current work on world models--especially that based purely on video and sensory data--seems to conflate the two. While it may seem that a high-fidelity world-building model should also help in long-range decision making, the computational tradeoffs between high resolution and abstraction likely make it of questionable use for long-range planning. Faster rollout (mental simulation) and higher resolution are quite often at loggerheads.
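The rollout-speed vs. resolution tradeoff can be put in back-of-envelope form. A minimal sketch with made-up cost numbers (all figures illustrative, not from the thread):

```python
# Illustrative cost model: total planning cost = rollouts * steps * cost per step.
# An abstract model covers the same wall-clock horizon in fewer, cheaper steps
# than a pixel-level model. All numbers below are invented for illustration.

def rollout_cost(n_rollouts, horizon_steps, cost_per_step):
    """Total compute units for a batch of mental simulations."""
    return n_rollouts * horizon_steps * cost_per_step

# High-fidelity pixel model: tiny timestep, expensive frames.
hires = rollout_cost(n_rollouts=100, horizon_steps=10_000, cost_per_step=50.0)

# Abstract model: one step per subgoal, cheap transitions.
abstract = rollout_cost(n_rollouts=100, horizon_steps=20, cost_per_step=1.0)

print(f"hi-res: {hires:.0e} units, abstract: {abstract:.0e} units, "
      f"ratio: {hires / abstract:.0f}x")
```

Whatever the actual constants, the product structure is why high resolution and long-range mental simulation end up at loggerheads.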
Disjunction and Abstraction: Having mentioned "hierarchy" and "abstraction" multiple times, I feel it is worth pointing out that at its core abstraction is a form of disjunction. An agent reasoning with abstract models is basically reasoning over a disjunction of many distinct concrete futures--all roughly equivalent from the point of view of the agent's goals. The connection between disjunction and abstraction is a powerful one that is not often acknowledged. An abstract action is a disjunction over concrete courses of action--thus leading to a disjunction of world states. A learned latent variable has similar disjunction semantics: in a transformer-like architecture, for example, a latent variable can be seen as a distribution over concrete tokens.

Role of language and symbolic abstractions: While in theory it is possible to learn world models with hierarchical abstraction (e.g. with latent variable models), ignoring linguistic data--which is after all the cornerstone of human civilization--fails to leverage the abstractions we humans have developed over the millennia. Planning, of the kind I am fond of, is possible because the models are at a significantly higher level of abstraction than pixels, or than any learned latent variables can provide in the near future. While the planning models of yore were written by humans, there is a way of avoiding that bottleneck: our linguistic data already roughly captures humanity's abstractions over video data--or what I like to call "space time signal tubes" (c.f. x.com/rao2z/status/1… & x.com/rao2z/status/1…).
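The "abstraction is disjunction" reading can be made concrete in a few lines. A toy grid-world sketch (all names hypothetical):

```python
# Sketch of "abstraction as disjunction": an abstract action stands for a
# set (disjunction) of concrete action sequences, and its abstract successor
# is the set of concrete end states. Toy grid world, invented names.

CONCRETE_REFINEMENTS = {
    # abstract action -> disjunction of concrete courses of action
    "go-to-door": [["N", "N", "E"], ["E", "N", "N"], ["N", "E", "N"]],
}

MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def apply_sequence(state, seq):
    """Run one concrete course of action from a (x, y) state."""
    x, y = state
    for a in seq:
        dx, dy = MOVES[a]
        x, y = x + dx, y + dy
    return (x, y)

def abstract_successors(state, abstract_action):
    """All concrete end states the abstract action disjoins over."""
    return {apply_sequence(state, seq)
            for seq in CONCRETE_REFINEMENTS[abstract_action]}

# All three refinements commute to the same cell, so the disjunction
# collapses to one abstract state -- which is what makes the
# abstraction useful for planning.
print(abstract_successors((0, 0), "go-to-door"))  # {(1, 2)}
```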
So, as much as I agree with the argument that language may not by itself lead to effective world models, I equally believe that getting to the right level of abstraction from pixel-stream data--while theoretically possible (in that humanity and evolution seem to have done it)--is going to be awfully slow, especially when human abstractions, however imperfect, are readily available in language data. A powerful approach, it seems to me, is to combine these symbolic and pixel-level world models. The tradeoff is "important parts only, but can do long-range prediction" vs. "full resolution, but not long enough range". Humans seem to use language and visual priors for these two respectively, which argues for an approach that uses both types of data in learning world models.

Internal Abstractions and the Alignment Problem: Even if efficiency is not an issue, another critical concern about learning purely from sensory data is aligning the agents using those models with humans. There is no a priori reason that the abstractions an agent learns internally from sensory data would have any natural correspondence to those that humans use. To the extent we want artificial agents with learned world models to be easily aligned with us humans, taking on the inductive biases present in linguistic data seems like a smarter move (c.f. x.com/rao2z/status/1…).

LLMs and Symbolic World Models: While there is a lot of evidence that LLMs may not directly encode (symbolic) world models, it has also been known that we can learn such symbolic models from LLMs. Indeed, one of our earliest works on the role of LLMs in planning was to extract symbolic planning models from them (c.f. arxiv.org/abs/2305.14909). There has been significant additional work since then--some of it trying to combine sensory and linguistic data in learning world models.
Verifiers and Simulators are related to World Models: A lot of the improvement in LLM reasoning models has come from a post-training phase that uses LLMs as generators of plausible solutions, checking their correctness with externally available verifiers or simulators (c.f. x.com/rao2z/status/2…). The critical importance of such verifiers/simulators for LLM post-training has become so clear that there is a clamor for the so-called "RL environments"--which are basically RL engines coupled to verifiers or simulators standing in for the "environment." Acknowledging this connection would make "world model learning" a general version of "verifier/simulator learning".

Learning from your experience vs. others' experience: One important distinction in world-model learning is whether you learn by doing things in the world yourself and observing/feeling the consequences (which is pretty much what kids do), or by learning from other people's collected experience (which is what most of the current post-LLM research on world models does). The big difference tends to be causality: when you generate your own experiences, you have the ability to do arbitrary causal intervention experiments, something that is hard when you are only learning from others' experience. The difficulty of gaining your own experience, of course, is that (a) it is time consuming and (b) possibly unsafe. Not surprisingly, notwithstanding Sutton's OAK proposal, most ongoing work on world models is based on the agent learning from others' experience.

On the irony of learning world models for synthetic worlds: A lot of the work on world models seems to be quixotically based on virtual worlds--such as video games. This seems quite ironic. Since these are made by us, learning world models for them amounts to "reverse engineering" what we (humanity) already know.
In this era of LLMs, where everything that humanity knows is already fodder for training, what is the deeper reason why learning virtual worlds (rather than just taking the program running the virtual world) is a legitimate long-term research direction? I am fine with virtual worlds as training wheels for the "real world" that we didn't engineer, but am a little mystified by video games as the be-all and end-all. Come to think of it, this irony was also present in the original Atari game suite that pushed a lot of deep RL research: the game engine converts a compact RAM state to a video frame so that humans can "play", and the DRL algorithms try to reverse engineer the logic from that video frame. Since the time of the Atari benchmark, any illusory need for such reverse engineering has largely disappeared, IMHO.
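The generate-and-verify post-training loop described under "Verifiers and Simulators" above can be sketched as follows (a minimal stand-in with a toy arithmetic verifier; none of these names come from a real RL-environment API):

```python
# Minimal generate-and-verify loop of the kind used in LLM post-training:
# a generator proposes candidate solutions, an external verifier checks
# them, and only verified candidates are kept as training signal.
# All functions are illustrative stand-ins.

import random

def generator(problem, n_samples=8):
    """Stand-in for an LLM sampling candidate answers to 'a + b'."""
    a, b = problem
    return [a + b + random.choice([-1, 0, 0, 0, 1]) for _ in range(n_samples)]

def verifier(problem, candidate):
    """External checker: exact arithmetic here; in practice a unit
    test, proof checker, or simulator."""
    a, b = problem
    return candidate == a + b

def collect_verified(problems):
    """Keep only candidates the verifier accepts."""
    kept = []
    for p in problems:
        for cand in generator(p):
            if verifier(p, cand):
                kept.append((p, cand))
    return kept

random.seed(0)
batch = collect_verified([(2, 3), (10, 7)])
print(len(batch), "verified samples kept for post-training")
```

Replacing the hand-written `verifier` with a learned model is exactly the "verifier/simulator learning" that the thread argues is a special case of world-model learning.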
[image attached]
Dileep George reposted
Elias Bareinboim @eliasbareinboim:
Action-conditioned models estimate transition kernels. But the same transition statistics can arise from different causal structures that imply different consequences under new policies. This means policy evaluation becomes a causal identification problem, not merely a prediction problem. This observation is closely related to the distinction emphasized by @yudapearl between interventions and the underlying mechanisms that generate them.

I discussed this in an ICML tutorial a few years ago: crl.causalai.net. Part III of my forthcoming book Causal Artificial Intelligence discusses this in detail, including examples where identical MDP transitions correspond to different causal environments (see Fig. 8.4, p. 544): causalai-book.net

Reinforcement learning and representation learning are important pieces of the puzzle, but causal structure must also be treated as a first-class component of intelligent systems.
Yann LeCun @ylecun:

Our world models are action conditioned, and hence causal. The concept of world model for planning goes back to the 1950s in optimal control (before I was born). I didn't just discover it. But training action-conditioned world models from sensory inputs (like video) requires new techniques.
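Bareinboim's point in this exchange--that identical transition statistics can arise from different causal structures with different consequences under new policies--can be illustrated with a toy simulation (contrived probabilities, hypothetical code):

```python
# Two toy environments with the SAME action-conditioned statistics under the
# logging policy, but DIFFERENT outcomes under the intervention do(a=1).
# Probabilities are contrived for illustration.

import random

def env_markov(action, rng):
    """Environment where the action is the cause: P(win | do(a=1)) = 0.8."""
    return rng.random() < (0.8 if action == 1 else 0.2)

def env_confounded(rng, do_action=None):
    """Environment where a hidden variable u drives both the logged
    action and the outcome; the action itself has no causal effect."""
    u = rng.random() < 0.5
    action = do_action if do_action is not None else (1 if u else 0)
    win = rng.random() < (0.8 if u else 0.2)
    return action, win

rng = random.Random(0)
N = 200_000

# Under the logging policy both environments show P(win | a=1) ~ 0.8 ...
obs = [env_confounded(rng) for _ in range(N)]
n1 = sum(1 for a, _ in obs if a == 1)
p_obs = sum(1 for a, w in obs if a == 1 and w) / n1

# ... but under the new policy do(a=1) they diverge: ~0.8 vs ~0.5.
p_do_markov = sum(env_markov(1, rng) for _ in range(N)) / N
p_do_conf = sum(w for _, w in
                (env_confounded(rng, do_action=1) for _ in range(N))) / N

print(f"observed   P(win | a=1):     {p_obs:.2f}")
print(f"markov     P(win | do(a=1)): {p_do_markov:.2f}")
print(f"confounded P(win | do(a=1)): {p_do_conf:.2f}")
```

A world model fit only to the observed kernel cannot tell these two environments apart, which is why policy evaluation becomes an identification problem.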

Dileep George @dileeplearning:
@athomasq "Sometimes, the tooth hurts". I think you made up this story just for this joke. 😛
Abraham Thomas @athomasq:
Observation (n=5 and counting): Dentists are the worst angels. If you made your money in finance, tech, law, real estate, you probably have some commercial savvy; some sense of how money flows, how the sausage is made. Dentists don't have that. What they do have is cashflow, free time, and a day job where nobody ever questions them. This combination leads, invariably, to absolutely horrendous investing instincts. What can I say? Sometimes, the tooth hurts.
Dileep George reposted
Elias Bareinboim @eliasbareinboim:
If you are a student interested in causality and the future of AI, consider applying and spending two weeks with us in NYC!
Columbia Engineering @CUSEAS:

Applications are open for the Machine Learning Summer School, a two-week intensive program at Columbia University designed for PhD students serious about advancing ML research. @ColumbiaCompSci @DataSciColumbia @eliasbareinboim Apply before March 22: bit.ly/40nG66R

Naveen Rao @NaveenGRao:
@dileeplearning Cosyne! Awesome...haven't heard of people going to that for 10+ years. Such a cool conference
Daniel Jeffries @Dan_Jeffries1:
I agree with Dr. Li and LeCun. Models need something more. This is a piece of the puzzle. Three of the biggest problems with today's models are obvious to anyone working with them regularly to do real work who isn't promising magic ex nihilo:

- World models
- Continual learning
- Long-term memory

Dr. Li and LeCun are working on problem one. Others are working on the other two. There are more problems, but a consistent and clear understanding of the world that is stable is essential for robotics and for models that consistently make intelligent decisions from sound foundations. Too often today the same model gives a different answer on a different day to the same question. You do not do that, unless you are a politician or a sociopath or both (often the case). Consistency means coming from a consistent and cohesive understanding of the world that changes slowly. Emphasis on slowly. This has major implications for real-world physics, games, movies, TV, GUI navigation and more.

The second issue is updateable weights and learning from mistakes. It can't be that models only learn during training runs and never update in real time based on their experience. They must get this capability, or we can't teach them in the real world, and the model will continue to check into Git and run the full test suite on you no matter how many times you tell it not to, and it will still not understand how many R's are in strawberry when you misspelled it.

The last one is memory. Without it a model can't keep long-term task horizons in play. Forget those fake benchmarks about models doing tasks over a long time. It's a parlor trick. It's not real. Today's models are really terrible at true long-term understanding even if they can make notes in a markdown file. Context is too short AND it does not dynamically pull from long-term, updatable context. That is the real key to memory: constant automatic background search.

You are always running multiple automatic, autonomic background memory searches when you read or talk to someone or do a project or do anything really. Your brain is constantly saying: hey, I retrieved this and it might be relevant to what you are doing, do you want it in your short-term memory? A subconscious yes/no is happening there as you pull from associative past experience, wisdom, and similar domains that might help you understand what you are currently facing.

For me AGI is a project manager. That is when I will know it's really real. When it can do the job of the best project manager. Not the tasks of a project manager, like filling out tickets or creating a sheet or a report. I mean the job, the critical thinking of the project manager. A project manager has to keep long-term goals in mind, constantly updating objectives, feedback from stakeholders, shifting asset locations and states of readiness, politics, who is lazy, who can be trusted, why, when, with what, and so much more.

To do these kinds of things we need real change. Do not believe in the jobs apocalypse or the idea that current progress is anything but an S curve. A marvelous one. A beautiful one. A useful one. But an S curve nonetheless. And all the fools like Sanders and everyone else who thinks AI will be able to do every job next week are just that, fools. Don't believe them even if they work in AI and are really smart. Even if they're geniuses in other parts of their lives and in other domains, they are absolute and total fools when it comes to short-term predictions. Do not believe them. Only bold new research will get us to true AGI and ASI. But in the meantime, these LLMs are something wonderful, so enjoy them as well.
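The "constant automatic background search" described above can be sketched as a retrieval gate (a minimal sketch; the token-overlap similarity is a crude stand-in for embedding search, and all names are hypothetical):

```python
# Sketch of a background-retrieval loop: as input arrives, an associative
# store is queried and memories are admitted to working memory only if
# relevant enough -- the "subconscious yes/no" gate described above.

def similarity(a, b):
    """Crude Jaccard overlap of tokens, standing in for embedding search."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(1, len(sa | sb))

LONG_TERM_MEMORY = [
    "the staging server password rotates every friday",
    "the project deadline moved to the end of march",
    "coffee machine on floor two is broken",
]

def background_retrieve(current_input, store, threshold=0.15):
    """Surface only memories above a relevance threshold, best first."""
    hits = [(similarity(current_input, m), m) for m in store]
    return [m for score, m in sorted(hits, reverse=True) if score >= threshold]

working_memory = background_retrieve(
    "when is the march project deadline again?", LONG_TERM_MEMORY)
print(working_memory)
```

A real system would run this continuously against every new observation and feed the hits into the model's context, which is the "constant automatic background search" being argued for.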
Aakash Gupta @aakashgupta:

Two Turing-class AI researchers just raised $2B in three weeks to bet against every LLM company on the planet. Fei-Fei Li closed $1B for World Labs on February 18. LeCun closed $1.03B for AMI Labs today. Both building world models. Both arguing that the entire generative AI paradigm is a statistical parlor trick. And the investor overlap tells you this is coordinated conviction, not coincidence. Nvidia backed both. So did Sea and Temasek.

The math on AMI is absurd. $3.5B pre-money valuation. Four months old. Zero product. Zero revenue. The CEO said on the record that AMI won't ship a product in three months, won't have revenue in six, won't hit $10M ARR in twelve. He described it as a long-term scientific endeavor. Investors gave him a billion dollars anyway.

This tells you everything about how the smart money is actually modeling AI's future. They're not pricing AMI on a revenue multiple. They're pricing it on the probability that LLMs hit a ceiling. And if you look at the investor list--Nvidia, Samsung, Toyota Ventures, Dassault, Sea--these are companies that need AI to understand physics, geometry, and force dynamics. A language model that can write poetry is worthless to a robotics company trying to predict what happens when a mechanical arm applies 12 newtons at a 30-degree angle to a flexible surface.

LeCun raided his own lab to build this. Mike Rabbat, Meta's former research science director. Saining Xie from Google DeepMind. Pascale Fung, senior director of AI research at Meta. He walked into Zuckerberg's office in November, told him he was leaving, and four months later half of FAIR works for him. Meta is reportedly partnering with AMI anyway, which means Zuckerberg thinks LeCun might be right even while Meta keeps scaling Llama.

AMI's first partner is Nabla, a medical AI company, building toward FDA-certifiable agentic AI. That's the use case that makes world models existential. LLMs hallucinate. In healthcare, hallucinations kill people. You can't prompt-engineer your way out of a model that generates statistically plausible text when you need a system that actually understands how a human body works.

Two billion dollars in three weeks. Two of the most credentialed researchers alive. And a thesis that says the $100B+ already poured into scaling LLMs is optimizing the wrong architecture entirely. If they're wrong, investors lose money. If they're right, every company building on top of GPT and Claude for physical-world applications just bought the wrong foundation.

Dileep George @dileeplearning:
@pfau unfortunate naming though....why not "universal AI as interaction"?
Kording Lab 🦖 @KordingLab:
@dileeplearning Oh, my point was rather that I disagreed with your "precisely". It need not be identical, just a subset.
Kording Lab 🦖 @KordingLab:
I think the key pushback on this logic - as much as I love it coming from a normative angle - is that it will only work if the algorithm implemented in the brain is "simple", if it is human understandable. Imaging-to-simulation neither requires understandability nor promises it.
Doris Tsao @doristsao:

My thoughts on connectomics and upload:
1) There is zero question connectomes are invaluable, and we need to get them for mouse, monkey, and human.
2) The human, or even monkey, connectome seems a long ways off given costs (roughly $1/neuron). The projectome (map of all the axons) seems eminently reachable and should be a top priority imho.
3) But even having the full connectome would only tell you numbers of synapses, not actual synaptic weights, and the two can be hugely divergent (e.g. only 5% of synapses onto V1 layer 4 neurons come from thalamus, even though this is the major driving input).
4) Given #2 & #3, I think we can get to upload in the sense of building a functionally equivalent organism much faster through understanding the algorithms of the primate brain than through blind copying.
5) In putting together something as complex as the human brain we would definitely want to check that the various pieces work as we go, which we can only do if we understand these pieces.
6) I don't think upload in the sense of blindly creating a digital copy is the path to the abundant transhumanist future--actual understanding of brain structures, so we can intelligently interface with them and emulate their function in code without copying all the details, is.

All to say, we need functional understanding to go hand in hand with anatomical mapping!
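Tsao's ~$1/neuron cost figure can be turned into a back-of-envelope estimate using commonly cited approximate neuron counts (rough values; the arithmetic, not the precision, is the point):

```python
# Back-of-envelope on the quoted ~$1/neuron connectome cost,
# using commonly cited approximate neuron counts (rough values).
NEURONS = {
    "mouse":   7.1e7,    # ~71 million
    "macaque": 6.4e9,    # ~6.4 billion
    "human":   8.6e10,   # ~86 billion
}

COST_PER_NEURON = 1.0  # dollars, per the quoted estimate

for species, n in NEURONS.items():
    cost = n * COST_PER_NEURON
    print(f"{species:8s} ~${cost:,.0f}")
```

At these scales a mouse connectome lands in the tens of millions of dollars while a human one lands in the tens of billions, which is why the thread calls the human connectome "a long ways off".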

Dileep George @dileeplearning:
@KordingLab what's the difference between "identifying function" and just "function"?
Kording Lab 🦖 @KordingLab:
@dileeplearning No. It assumes that the details we fail to capture are not necessary for identifying function. It is ok if the ones we fail to capture are a subset of those that we do not need for identifying function.
Dileep George @dileeplearning:
@aran_nayebi I think they mentioned in the thread that no training is done, just LIF neurons. My hypothesis is that the connectome encodes a CPG that is coupled to all the legs and just dialing in a time constant might be enough to produce the behavior. @ChongxiLai ...thoughts?
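The "dialing in a time constant" hypothesis can be illustrated with a single leaky integrate-and-fire neuron, where the membrane time constant alone sets the rhythm (a toy sketch under simple assumptions, not the simulation discussed in the thread):

```python
# Minimal leaky integrate-and-fire (LIF) neuron under tonic drive:
# with the wiring and drive fixed, the membrane time constant tau
# alone sets the firing rhythm -- the kind of single "dial" that
# could tune a connectome-specified CPG. Toy model, invented parameters.

import math

def lif_spike_period(tau, drive=2.0, v_th=1.0, dt=1e-4, t_max=1.0):
    """Forward-Euler simulation of dV/dt = (-V + drive) / tau with
    threshold v_th and reset to 0; returns the interspike interval."""
    v, spikes, t = 0.0, [], 0.0
    while t < t_max:
        v += dt * (-v + drive) / tau
        if v >= v_th:
            spikes.append(t)
            v = 0.0
        t += dt
    return spikes[1] - spikes[0] if len(spikes) > 1 else None

for tau in (0.01, 0.05, 0.1):
    period = lif_spike_period(tau)
    # analytic period: tau * ln(drive / (drive - v_th)) = tau * ln(2) here
    print(f"tau={tau:.2f}s -> period ~{period * 1000:.1f} ms "
          f"(analytic {tau * math.log(2) * 1000:.1f} ms)")
```

Coupling several such units with fixed (connectome-specified) inhibition would give alternating bursts; the point of the sketch is only that one shared time constant rescales the whole rhythm.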
Dileep George @dileeplearning:
This
Doris Tsao @doristsao:

[Doris Tsao's connectomics thread, quoted in full above]
