David McAllester
@McAllesterDavid
Singularity or bust.
32 posts · Joined November 2017 · 145 Following · 603 Followers
David McAllester retweeted
Joshua Levy@ojoshe·
It may seem like AI is a brave and different world, but it seems to me the same forces we've seen in business, software, and open source still apply. Dominant monopolies with technical strength have no incentive to open source (think of Oracle or Microsoft years ago). But open source is often the most powerful way to compete with the leader (think of Linux/Red Hat or Postgres). The third phase is when the underdog wins the technical battle and the leader reorganizes with open source more pervasive internally (think of Microsoft buying GitHub and using Linux in its cloud, or Oracle buying Java). Broadly this is all a good thing: it is better for developers, which (if we set aside AGI risk for the moment) is better in the longer term for consumers and the economy, because it fundamentally simplifies the development of complex software. In this case OpenAI is the leader and (ironically) Meta and Google are now the underdogs, so it is eminently sensible for Meta and Google to publish and open source more. Of course none of this is certain, but by this logic it seems quite likely they'll win the technical war toward open models, and OpenAI/Microsoft will later absorb the open source and models.
Yann LeCun@ylecun·
@dschan02 Yes, we can predict their desires & drives because we will be the ones designing them.
David McAllester retweeted
Yann LeCun@ylecun·
Once AI systems become more intelligent than humans, humans will *still* be the "apex species." Equating intelligence with dominance is the main fallacy of the whole debate about AI existential risk. It's just wrong. Even *within* the human species it's wrong: it's *not* the smartest among us who dominate the others. More importantly, it's not the smartest among us who *want* to dominate others and who set the agenda. We are subservient to our drives, built into us by evolution. Because evolution made us a social species with a hierarchical social structure, some of us have a drive to dominate, and others not so much. But that drive has absolutely nothing to do with intelligence: chimpanzees, baboons, and wolves have similar drives. Orangutans do not, because they are not a social species. And they are pretty darn smart. AI systems will become more intelligent than humans, but they will still be subservient to us, the same way the staff members of politicians or business leaders are often smarter than their leader. But the leader still calls the shots, and most staff members have no desire to take their place. We will design AI to be like the supersmart-but-non-dominating staff member. The "apex species" is not the smartest but the one that sets the overall agenda. That will be us.
David McAllester retweeted
Sebastien Bubeck@SebastienBubeck·
New LLM in town: ***phi-1 achieves 51% on HumanEval w. only 1.3B parameters & 7B tokens training dataset*** Any other >50% HumanEval model is >1000x bigger (e.g., WizardCoder from last week is 10x in model size and 100x in dataset size). How? ***Textbooks Are All You Need***
David McAllester@McAllesterDavid·
@ChrSzegedy I would expect that dropout is coupled to learning rate --- the optimal learning rate varies with dropout and dropout changes the effective learning rate. Was this carefully controlled?
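The coupling McAllester describes can be illustrated with a toy experiment (a hypothetical sketch, not the experiment being discussed): with inverted dropout, the keep probability changes the gradient statistics and the effective loss surface, so a controlled comparison tunes the learning rate separately for each dropout rate rather than fixing one learning rate across all of them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0]  # ground truth: w = 3

def train(lr, drop_rate, steps=200):
    """Train a one-weight linear model with inverted dropout on its input;
    return loss on the clean (no-dropout) data."""
    w = 0.0
    keep = 1.0 - drop_rate
    for _ in range(steps):
        # Inverted dropout: zero out inputs with prob. drop_rate, rescale by 1/keep.
        mask = (rng.random(X.shape[0]) < keep) / keep
        xd = X[:, 0] * mask
        grad = np.mean((w * xd - y) * xd)
        w -= lr * grad
    return float(np.mean((w * X[:, 0] - y) ** 2))

def best_lr(drop_rate, lrs=(0.01, 0.03, 0.1, 0.3)):
    """The 'careful control' in question: sweep learning rates per dropout rate."""
    return min(lrs, key=lambda lr: train(lr, drop_rate))

for dr in (0.0, 0.5):
    print(f"dropout={dr}: best lr={best_lr(dr)}")
```

Even in this toy setting the dropout mask inflates second-moment statistics of the input by a factor of 1/keep, shifting both the curvature and the minimizer, which is exactly why an uncontrolled learning rate confounds dropout comparisons.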
David McAllester@McAllesterDavid·
@ylecun @nlpnoah Just one more observation. In a model with an internal "chain of thought", as described in the blog post, the internal state then depends on the stochastically generated thoughts.
Yann LeCun@ylecun·
@McAllesterDavid @nlpnoah LLMs in their current form are stateless. Their "state" is entirely determined by the prompt, hence immaterial.
David McAllester@McAllesterDavid·
@ylecun @nlpnoah When we ask what a language model believes we are not asking a question about the context. We are asking a question about how the model might reply in response to a certain question.
David McAllester@McAllesterDavid·
@ylecun @nlpnoah It is true that computation is determined by the context. But it seems philosophically irrelevant to wonder whether humans are a deterministic function of their inputs. We are still interested in their internal computation and state.
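The point of disagreement can be made concrete with a toy stand-in for an LLM (hypothetical; `next_token_probs` is an invented function, not any real model's API). The next-token distribution is a pure function of the context, so the model is stateless in LeCun's sense; yet once intermediate "thoughts" are sampled at nonzero temperature, the context that conditions all later computation depends on the stochastic draws, which is McAllester's point.

```python
import hashlib
import random

VOCAB = ["so", "then", "thus", "answer"]

def next_token_probs(context):
    # Deterministic pseudo-distribution derived from a hash of the context:
    # the "model" itself carries no state between calls.
    h = hashlib.sha256(context.encode()).digest()
    weights = [1 + h[i] for i in range(len(VOCAB))]
    total = sum(weights)
    return [w / total for w in weights]

def generate(prompt, steps, rng=None):
    context = prompt
    for _ in range(steps):
        probs = next_token_probs(context)
        if rng is None:
            # Greedy decoding: the trajectory is a pure function of the prompt.
            tok = VOCAB[max(range(len(VOCAB)), key=lambda i: probs[i])]
        else:
            # Sampled chain of thought: the realized context is stochastic,
            # and every later step is conditioned on those sampled thoughts.
            tok = rng.choices(VOCAB, weights=probs)[0]
        context += " " + tok
    return context
```

Greedy decoding always yields the same trajectory for a given prompt, while sampled runs generally diverge after the first stochastic draw, so the "internal state" in the chain-of-thought sense is not determined by the prompt alone.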
David McAllester@McAllesterDavid·
arxiv.org/abs/2301.11108 A paper on the mathematics of diffusion models that explains the diffusion SDEs --- both forward and backward --- assuming only familiarity with Gaussians. It also gives some original non-variational likelihood formulas.
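For orientation, the forward and backward SDEs the tweet refers to are, in the standard variance-preserving form used in the SDE literature (the paper's own notation may differ):

```latex
% Forward (noising) SDE:
\mathrm{d}x \;=\; -\tfrac{1}{2}\,\beta(t)\,x\,\mathrm{d}t \;+\; \sqrt{\beta(t)}\,\mathrm{d}W_t
% Reverse (denoising) SDE, run backward in time:
\mathrm{d}x \;=\; \Big[-\tfrac{1}{2}\,\beta(t)\,x \;-\; \beta(t)\,\nabla_x \log p_t(x)\Big]\mathrm{d}t \;+\; \sqrt{\beta(t)}\,\mathrm{d}\bar{W}_t
```

Here $\beta(t)$ is the noise schedule and $\nabla_x \log p_t(x)$ is the score of the marginal at time $t$; learning an approximation to the score is what makes the reverse process usable as a generative model.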
David McAllester retweeted
Percy Liang@percyliang·
RL from human feedback seems to be the main tool for alignment. Given reward hacking and the fallibility of humans, this strategy seems bound to produce agents that merely appear to be aligned, but are bad/wrong in subtle, inconspicuous ways. Is anyone else worried about this?
David McAllester@McAllesterDavid·
When I was in high school I read a book on information theory. It was obvious to me at that time that strong modeling of the distribution of language requires uncovering meaning. I continue to be frustrated that this seemingly obvious observation gets so little traction.
David McAllester@McAllesterDavid·
@chrmanning When I was in high school I read a book on information theory. It was obvious to me at that time that strong modeling of the distribution of language requires uncovering meaning. I continue to be frustrated that this seemingly obvious observation gets so little traction.
Christopher Manning@chrmanning·
Interestingly, ChatGPT presents a much more balanced perspective on this issue than you do! 8/8
Christopher Manning@chrmanning·
Dear @emilymbender—and @Abebab—you need to keep “reminding” people of your viewpoint because it is not an argument that is convincing to all or a self-evident truth. It is a particular academic position, which lots of people support but a good number of others disagree with. 1/8
@emilymbender
Yes, exactly this. I wish we didn't need to keep reminding people, and @Abebab is commendable for being gentle about it! For the long form of this argument, see Bender & @alkoller 2020: aclanthology.org/2020.acl-main.…
David McAllester retweeted
Dan Roy@roydanroy·
I've discovered the secret of general Artificial Intelligence. It just so happens to be answered by my own field decades ago, but it just needed to be synthesized. I see further than everyone else. Follow me and I'll tweet out tidbits of wisdom / trivia at regular intervals.
David McAllester@McAllesterDavid·
@GaryMarcus @ylecun What makes you think that your list of four requirements is more than 20 years away? Maybe it will all happen in just the next architectural advance. Many many people are trying all sorts of things ...
Gary Marcus@GaryMarcus·
.@ylecun asks me, basically, if you are so smart, why don't you solve AGI? My answer is that it is too far outside our grasp at the present time, because it requires significant progress in multiple areas, more or less in lockstep, before we will get to radical increases in performance:
Gary Marcus@GaryMarcus

@ylecun @rao2z @guyvdb @MITCoCoSci I don't think that AGI is achievable in the short term; as outlined in The Next Decade in AI, I think progress requires a coalition of:
- advances in neurosymbolic integration
- richer knowledge bases
- better reasoning from incomplete data
- ways of inducing complex cognitive models
