David McAllester
@McAllesterDavid
Singularity or bust.
32 posts · Joined November 2017 · 145 Following · 603 Followers
David McAllester retweeted
Joshua Levy@ojoshe·
It may seem like AI is a brave and different world, but it seems to me the same forces we've seen in business, software, and open source still apply. Dominant monopolies with technical strength have no incentive to open source (think of Oracle or Microsoft years ago). But open source is often the most powerful way to compete with the leader (think of Linux/Red Hat or Postgres). The third phase is when the underdog wins the technical battle and the leader reorganizes with open source more pervasive internally (think of Microsoft buying GitHub and using Linux in its cloud, or Oracle buying Java). Broadly this is all a good thing: it is better for developers, which (if we set aside AGI risk for the moment) is better in the longer term for consumers and the economy, because it fundamentally simplifies the development of complex software. In this case OpenAI is the leader and (ironically) Meta and Google are now the underdogs, so it is eminently sensible for Meta and Google to publish and open source more. Of course none of this is certain, but by this logic it seems quite likely they'll win the technical war toward open models, and OpenAI/Microsoft will later absorb the open source and models.
Yann LeCun@ylecun·
@dschan02 Yes, we can predict their desires & drives because we will be the ones designing them.
David McAllester retweeted
Yann LeCun@ylecun·
Once AI systems become more intelligent than humans, humans will *still* be the "apex species." Equating intelligence with dominance is the main fallacy of the whole debate about AI existential risk. It's just wrong. Even *within* the human species it's wrong: it's *not* the smartest among us who dominate the others. More importantly, it's not the smartest among us who *want* to dominate others and who set the agenda. We are subservient to our drives, built into us by evolution. Because evolution made us a social species with a hierarchical social structure, some of us have a drive to dominate, and others not so much. But that drive has absolutely nothing to do with intelligence: chimpanzees, baboons, and wolves have similar drives. Orangutans do not, because they are not a social species. And they are pretty darn smart. AI systems will become more intelligent than humans, but they will still be subservient to us, the same way the staff members of politicians or business leaders are often smarter than their leader. But the leader still calls the shots, and most staff members have no desire to take their place. We will design AI to be like the supersmart-but-non-dominating staff member. The "apex species" is not the smartest but the one that sets the overall agenda. That will be us.
David McAllester retweeted
Sebastien Bubeck@SebastienBubeck·
New LLM in town: ***phi-1 achieves 51% on HumanEval w. only 1.3B parameters & 7B tokens training dataset*** Any other >50% HumanEval model is >1000x bigger (e.g., WizardCoder from last week is 10x in model size and 100x in dataset size). How? ***Textbooks Are All You Need***
David McAllester@McAllesterDavid·
@ChrSzegedy I would expect that dropout is coupled to learning rate --- the optimal learning rate varies with dropout and dropout changes the effective learning rate. Was this carefully controlled?
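The coupling McAllester describes can be illustrated with a toy experiment (a hypothetical sketch, not the experiment being discussed): with inverted dropout, the keep probability changes the gradient statistics and the effective loss surface, so a controlled comparison tunes the learning rate separately for each dropout rate rather than fixing one learning rate across all of them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0]  # ground truth: w = 3

def train(lr, drop_rate, steps=200):
    """Train a one-weight linear model with inverted dropout on its input;
    return loss on the clean (no-dropout) data."""
    w = 0.0
    keep = 1.0 - drop_rate
    for _ in range(steps):
        # Inverted dropout: zero out inputs with prob. drop_rate, rescale by 1/keep.
        mask = (rng.random(X.shape[0]) < keep) / keep
        xd = X[:, 0] * mask
        grad = np.mean((w * xd - y) * xd)
        w -= lr * grad
    return float(np.mean((w * X[:, 0] - y) ** 2))

def best_lr(drop_rate, lrs=(0.01, 0.03, 0.1, 0.3)):
    """The 'careful control' in question: sweep learning rates per dropout rate."""
    return min(lrs, key=lambda lr: train(lr, drop_rate))

for dr in (0.0, 0.5):
    print(f"dropout={dr}: best lr={best_lr(dr)}")
```

Even in this toy setting the dropout mask inflates second-moment statistics of the input by a factor of 1/keep, shifting both the curvature and the minimizer, which is exactly why an uncontrolled learning rate confounds dropout comparisons.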
David McAllester@McAllesterDavid·
@ylecun @nlpnoah Just one more observation. In a model with an internal "chain of thought", as described in the blog post, the internal state then depends on the stochastically generated thoughts.
Yann LeCun@ylecun·
@McAllesterDavid @nlpnoah LLMs in their current form are stateless. Their "state" is entirely determined by the prompt, hence immaterial.
David McAllester@McAllesterDavid·
@ylecun @nlpnoah When we ask what a language model believes we are not asking a question about the context. We are asking a question about how the model might reply in response to a certain question.
David McAllester@McAllesterDavid·
@ylecun @nlpnoah It is true that computation is determined by the context. But it seems philosophically irrelevant to wonder whether humans are a deterministic function of their inputs. We are still interested in their internal computation and state.
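The point of disagreement can be made concrete with a toy stand-in for an LLM (hypothetical; `next_token_probs` is an invented function, not any real model's API). The next-token distribution is a pure function of the context, so the model is stateless in LeCun's sense; yet once intermediate "thoughts" are sampled at nonzero temperature, the context that conditions all later computation depends on the stochastic draws, which is McAllester's point.

```python
import hashlib
import random

VOCAB = ["so", "then", "thus", "answer"]

def next_token_probs(context):
    # Deterministic pseudo-distribution derived from a hash of the context:
    # the "model" itself carries no state between calls.
    h = hashlib.sha256(context.encode()).digest()
    weights = [1 + h[i] for i in range(len(VOCAB))]
    total = sum(weights)
    return [w / total for w in weights]

def generate(prompt, steps, rng=None):
    context = prompt
    for _ in range(steps):
        probs = next_token_probs(context)
        if rng is None:
            # Greedy decoding: the trajectory is a pure function of the prompt.
            tok = VOCAB[max(range(len(VOCAB)), key=lambda i: probs[i])]
        else:
            # Sampled chain of thought: the realized context is stochastic,
            # and every later step is conditioned on those sampled thoughts.
            tok = rng.choices(VOCAB, weights=probs)[0]
        context += " " + tok
    return context
```

Greedy decoding always yields the same trajectory for a given prompt, while sampled runs generally diverge after the first stochastic draw, so the "internal state" in the chain-of-thought sense is not determined by the prompt alone.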
David McAllester@McAllesterDavid·
arxiv.org/abs/2301.11108 A paper on the mathematics of diffusion models that explains the diffusion SDEs --- both forward and backward --- assuming only familiarity with Gaussians. It also gives some original non-variational likelihood formulas.
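For orientation, the forward and backward SDEs the tweet refers to are, in the standard variance-preserving form used in the SDE literature (the paper's own notation may differ):

```latex
% Forward (noising) SDE:
\mathrm{d}x \;=\; -\tfrac{1}{2}\,\beta(t)\,x\,\mathrm{d}t \;+\; \sqrt{\beta(t)}\,\mathrm{d}W_t
% Reverse (denoising) SDE, run backward in time:
\mathrm{d}x \;=\; \Big[-\tfrac{1}{2}\,\beta(t)\,x \;-\; \beta(t)\,\nabla_x \log p_t(x)\Big]\mathrm{d}t \;+\; \sqrt{\beta(t)}\,\mathrm{d}\bar{W}_t
```

Here $\beta(t)$ is the noise schedule and $\nabla_x \log p_t(x)$ is the score of the marginal at time $t$; learning an approximation to the score is what makes the reverse process usable as a generative model.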
David McAllester retweeted
Percy Liang@percyliang·
RL from human feedback seems to be the main tool for alignment. Given reward hacking and the fallibility of humans, this strategy seems bound to produce agents that merely appear to be aligned, but are bad/wrong in subtle, inconspicuous ways. Is anyone else worried about this?
David McAllester@McAllesterDavid·
When I was in high school I read a book on information theory. It was obvious to me at that time that strong modeling of the distribution of language requires uncovering meaning. I continue to be frustrated that this seemingly obvious observation gets so little traction.
David McAllester@McAllesterDavid·
@chrmanning When I was in high school I read a book on information theory. It was obvious to me at that time that strong modeling of the distribution of language requires uncovering meaning. I continue to be frustrated that this seemingly obvious observation gets so little traction.
Christopher Manning@chrmanning·
Interestingly, ChatGPT presents a much more balanced perspective on this issue than you do! 8/8
Christopher Manning@chrmanning·
Dear @emilymbender—and @Abebab—you need to keep “reminding” people of your viewpoint because it is not an argument that is convincing to all or a self-evident truth. It is a particular academic position, which lots of people support but a good number of others disagree with. 1/8
@emilymbender
Yes, exactly this. I wish we didn't need to keep reminding people, and @Abebab is commendable for being gentle about it! For the long form of this argument, see Bender & @alkoller 2020: aclanthology.org/2020.acl-main.…
David McAllester retweeted
Dan Roy@roydanroy·
I've discovered the secret of general Artificial Intelligence. It just so happens to be answered by my own field decades ago, but it just needed to be synthesized. I see further than everyone else. Follow me and I'll tweet out tidbits of wisdom / trivia at regular intervals.
David McAllester@McAllesterDavid·
@GaryMarcus @ylecun What makes you think that your list of four requirements is more than 20 years away? Maybe it will all happen in just the next architectural advance. Many many people are trying all sorts of things ...
Gary Marcus@GaryMarcus·
.@ylecun asks me, basically, if you are so smart, why don't you solve AGI? My answer is that it is too far outside our grasp at the present time, because it requires significant progress in multiple areas, more or less in lockstep, before we will get to radical increases in performance:
Gary Marcus@GaryMarcus

@ylecun @rao2z @guyvdb @MITCoCoSci I don't think that AGI is achievable in the short term; as outlined in The Next Decade in AI, I think progress requires a coalition of:
- advances in neurosymbolic integration
- richer knowledge bases
- better reasoning from incomplete data
- ways of inducing complex cognitive models
