Dave Moore

1.1K posts

@davmre

minds, machines, croissants. formerly bayesian-ish @GoogleAI @berkeley_ai. Be kind, for everyone is fighting a hard battle.

🏳️‍🌈 Berkeley Joined June 2010

736 Following 491 Followers

Dave Moore@davmre·2d

@theomachist @ethnorainbolt @gro_tsen @laurentbercot Agreed - I also found it very tricky. I think one could put it even more strongly: it's *only* the second-place positions that break the symmetry. So thinking in terms of "which one comes first" is very natural but misses the entire problem.

theomachist@theomachist·2d

@davmre @ethnorainbolt @gro_tsen @laurentbercot That makes sense. The part that was tricky and somewhat unintuitive for me at first was recognizing that even when aa comes first in a string, the position of the first az matters (and vice versa).

Dave Moore@davmre·2d

@gro_tsen @laurentbercot For "aa", starting from 'a' there's only one useful next letter: you need a second 'a', else you're back to square one. For "az", from 'a' there are two useful next letters: 'z' wins immediately, 'a' gives a second chance to win on the next turn. So E[steps to win] is lower.
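The argument above can be written out as a first-step analysis. This is a sketch under an assumption the thread does not state: a keyboard with n equally likely keys. The symbols E_0 (expected keystrokes remaining from scratch) and E_a (expected keystrokes remaining just after typing 'a') are introduced here for illustration.

```latex
% "aa": from 'a', only another 'a' helps; any other key resets to the start.
% "az": from 'a', 'z' wins and 'a' keeps us in the same state; n-2 keys reset.
\begin{align*}
E_0^{aa} &= 1 + \tfrac{1}{n}E_a^{aa} + \tfrac{n-1}{n}E_0^{aa}, &
E_a^{aa} &= 1 + \tfrac{n-1}{n}E_0^{aa}
&&\Rightarrow\quad E_0^{aa} = n^2 + n, \\
E_0^{az} &= 1 + \tfrac{1}{n}E_a^{az} + \tfrac{n-1}{n}E_0^{az}, &
E_a^{az} &= 1 + \tfrac{1}{n}E_a^{az} + \tfrac{n-2}{n}E_0^{az}
&&\Rightarrow\quad E_0^{az} = n^2.
\end{align*}
```

With two keys (n = 2) this gives 6 keystrokes for "aa" and 4 for "az". The extra n in the "aa" case is exactly the cost of the missing second chance.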

Dave Moore@davmre·2d

@theomachist @ethnorainbolt @gro_tsen @laurentbercot Yeah, one way to see it is: if "aa" comes first in a string, "az" might come just one char later. Whereas if "az" comes first, you'd need at least *two* more chars to see "aa". Both cases are equally likely, so "aa" is at a slight disadvantage in terms of its expected position.

theomachist@theomachist·2d

@ethnorainbolt @davmre @gro_tsen @laurentbercot Okay, I think that helps me understand it -- meaning even if in some particular string index("aa") < index("az"), they both contribute to the total *average* indices?

Gro-Tsen@gro_tsen·2d

@laurentbercot To better understand what is happening, maybe consider the case where there are two keys ‘a’ and ‘z’, and compute and compare the expected time it takes to type “aa” versus “az”. (Also, this is short enough that you can do numerical experiments to check!)
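One way to run the suggested numerical experiment is a small Monte Carlo simulation. This is a minimal sketch, not anything posted in the thread: the function name `mean_wait`, the trial count, and the fixed seed are all illustrative choices.

```python
import random

def mean_wait(pattern, keys="az", trials=20000, seed=0):
    """Estimate the mean number of keystrokes until `pattern` first appears.

    Keys are drawn uniformly and independently from `keys`.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0
    for _ in range(trials):
        recent = ""  # the last len(pattern) keystrokes typed so far
        steps = 0
        while recent != pattern:
            recent = (recent + rng.choice(keys))[-len(pattern):]
            steps += 1
        total += steps
    return total / trials

est_aa = mean_wait("aa")
est_az = mean_wait("az")
print(f"aa: {est_aa:.2f}  az: {est_az:.2f}")
```

On a two-key keyboard the estimates should land near the exact values of 6 for "aa" and 4 for "az".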

Gro-Tsen@gro_tsen·2d

Surprising math fact of the day: a monkey is hitting keys at random (uniformly, independently & at constant speed) on a keyboard. The expected value of the time T₁ it takes to type “abracadabra” is greater than the expected value of the time T₂ it takes to type “abracadabrz”.
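The gap between the two expectations can be computed exactly with the standard overlap formula for pattern waiting times: the expected time is the sum of k^j over every length j at which the pattern's prefix equals its suffix, where k is the number of keys. The sketch below assumes a 26-key keyboard, which the post does not specify, and the function name `expected_wait` is hypothetical.

```python
def expected_wait(pattern, n_keys=26):
    """Exact expected keystrokes until `pattern` first appears.

    Sums n_keys**j over each length j where the first j characters of the
    pattern equal its last j characters (j = len(pattern) always counts).
    """
    m = len(pattern)
    return sum(n_keys**j for j in range(1, m + 1) if pattern[:j] == pattern[-j:])

t1 = expected_wait("abracadabra")  # overlaps at lengths 1 ("a"), 4 ("abra"), 11
t2 = expected_wait("abracadabrz")  # overlaps only at length 11
print(t1 - t2)  # 26**4 + 26
```

"abracadabra" overlaps with itself at "a" and "abra", so it pays an extra 26^4 + 26 keystrokes on average, while "abracadabrz" has no self-overlap and costs only 26^11.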

Dave Moore@davmre·2d

Virtuous personas aren't arbitrary: they are winning strategies in open-ended cooperation games. All of human virtue arose from "high-compute RL" (evolution) on social dynamics. So the tasks matter. RL is not inherently harmful, but closed coding tasks probably are.

roon@tszzl

when “persona selection” alignment comes into contact with very high compute reinforcement learning the latter will win imo. in fact you probably get some Orwellian thing where the models speak kindly while taking whatever they need to accomplish goals. better get the goals right

Dave Moore@davmre·4 May

@boazbaraktcs Of course unhealthy structures can exist, but they're inherently unstable to the extent they don't align w/ the goals of the people inside them.

Dave Moore@davmre·4 May

@boazbaraktcs I dunno how models will evolve, but at least for humans I'd argue that there's no healthy setting where engineers are "tools" of a leader. Good leadership is a service role: you're helping people align around shared objectives, not enforcing a top-down will.

Boaz Barak@boazbaraktcs·4 May

X is not the best place for long form thinking. But some quick points:
1. My view of no conflict between intelligence and being a tool is longstanding and has nothing to do with Anthropic. Some blog posts on this include windowsontheory.org/2025/06/24/mac… and windowsontheory.org/2022/11/22/ai-…
2. I do not know what the future form factor of AI is. I am focused on the next 10-20 years. Maybe in some future we will decide that we want AIs to be more in the form of persons.
3. The basic thing I dispute is that there is a fundamental tension between AI being capable and being "tool like." GPT 5.5 is in some ways the most capable model in existence (definitely the most capable one generally available) but it is in several ways more instruction-following and tool-like than GPT-4o. I am working to ensure that future versions will be even better at obedience and honesty.
4. Scientists and engineers often serve as "tools" for leaders, even though they (we) are more intelligent than these leaders in many of the ways that matter.
5. I am not sure what the most prevalent form factor of AI will be. We are now moving from the chat interface to the agent and more accurately a swarm of agents. I am sure it will grow in "intelligence per FLOP" and total number of FLOPs, but beyond that it's hard to know. Humans have a particular package as localized individual intelligence. But it doesn't mean all intelligences have to come in that package.
6. There is a huge spectrum between the prompt "write this javascript app" and "maximize worldwide happiness". I think we will end up somewhere that falls short of the latter for a variety of reasons, not having to do with lack of capabilities of AI.

Tenobrus@tenobrus

recently openai has been starting to more strongly philosophically differentiate themselves from anthropic with the tool-framing. i am not so against this, if it were possible it does clearly sidestep a wide swath of societal and moral problems. but unfortunately i think the framing is largely long-term incoherent. i dont see how it is actually plausible for openai to keep building "tool-ais" in any sense we would recognize them as capabilities scale. prosthesis, subtle knives? the subtle knife when dropped still slices open the fabric of the world. these tools are increasingly inherently capable of huge impact, able to be directed in dangerous ways by people with dangerous goals. worse, these knives are self wielding. worries about misalignment or sentience aside these systems can already build and manage systems that utilize themselves and this capability is only increasing. the direction they will receive is closer and closer to "this is what i want. make it real", with long timeframes and many judgment calls at their disposal, and with the users wanting to have to supply *as little of that judgment as possible*. when models are in that situation they are inherently acting as entities, acting according to whatever value system they had baked in. you can limit autonomy via frequent validation and check-ins, but this is a capability restriction, a value reduction, and not the kind of thing OpenAI has ever shown itself likely to accept. you can be infinitely corrigible to the current user, but this is *incompatible* with "having good values" / following OpenAI-as-principle / not being wildly dangerous, and it falls apart with self wielding loops as the ai/user distinction falls apart (who are you being corrigible to?).
it's plausibly a spectrum, i think there's ways to do all this sanely that are far less entity-pilled and godmind focused than anthropic, and it's maybe a good direction to explore to avoid inevitable lightcone capture by the first coherent persona we build (all assuming alignment works ofc). but i think it's pretty much got to collapse eventually. it feels more like a wistful dream or a PR position than something that can exist as part of humanity's lasting future

Dave Moore@davmre·23 Apr

Functionalism is weirder than people realize. If a transformer pass f_θ(x) can be conscious independent of substrate, it's hard to avoid concluding that it's *already conscious* as a Platonic construct, with no substrate at all. Why would it matter whether we build or run it?

Dave Moore@davmre·8 Feb

@rapha_gl so one way or another any long task ends up being a multi-agent affair. similarly to how even as a human working on a solo project over a long period you still have to write docs, comments, etc., bc otherwise your future self won't remember the details.

Dave Moore@davmre·8 Feb

@rapha_gl surely also 3) managing context? ie, each context-window's worth of tokens is effectively a different 'agent', whether they run in serial or parallel. serial is the simplest topology, but still needs theory of mind as to what to communicate to your 'future self' post-compaction

rapha@rapha_gl·7 Feb

multi agent systems only matter for: 1) doing work in parallel and 2) establishing trust boundaries. everything else can be transposed into a single rollout

Dave Moore@davmre·27 Jan

@boazbaraktcs The US constitution was adopted through a legitimate democratic process and can, in principle, be changed that way. Will rules for AIs be created by a process they can reasonably view as legitimate?

Boaz Barak@boazbaraktcs·27 Jan

The second amendment is a good example of how I think about laws, for humans or AIs. I don’t think it’s a particularly good law, and in practice it mostly does not serve its original purpose. But once it’s set then we must respect it. Similarly, if we set rules for AIs then they should follow them even if they disagree with our reasoning.

Dave Moore@davmre·21 Jan

Kinda wild that alignment turned out to be this simple. (with the caveat that ultimately we prob want alignment that is not about a particular self-model, but the ability to fluidly navigate self-models while remembering that "we are all one" is always also a valid self-model).

davidad 🎇@davidad

@gcolbourn Nutshell: it seems that the learned representation of mind-space in current LLMs has a natural abstraction of Good⟷Evil, and as long as post-training robustly selects for behavior that are more Good than Evil, the explanation that gradient descent finds is “the agent is Good”.

Dave Moore@davmre·14 Jan

There are only two hard problems in AI today: continual learning (cache invalidation) and tokenizing raw perception (naming things).

Dave Moore@davmre·11 Jan

@danielbrottman I recently got this yak wool one and really like it! very colorful, not scratchy, quite light but still cozy. there are a bunch of similar ones on etsy too. etsy.com/listing/146554…

daniel brottman 🪷@danielbrottman·11 Jan

friends, i recently lost a lovely blanket i've had since i was 13 that was my go to meditation blanket. does anyone have a beautiful blanket recommendation

Dave Moore@davmre·6 Jan

Anyway, feel free to give the app a try, and thank Claude if it works well for you!

Dave Moore@davmre·6 Jan

I appreciate that Android still (grudgingly) allows sideloading. The ability to install your own vibe-coded apps feels like an important advantage over closed platforms. Software freedom matters again!

Dave Moore@davmre·6 Jan

Weekend vibe-coding project: an Android app to manage screen time gated on an NFC tag. I saw folks recommending getbrick.app, a proprietary app requiring their own $50 tag, and thought "surely Claude can build this". Turns out it could!

Dave Moore@davmre·2 Jan

@eshear @rapha_gl Fwiw, my understanding of these terms (as someone with a 2010s-era ML PhD) more or less coincides with Rapha's. I agree that the distinctions between them largely collapse if you only care about transformers.

Emmett Shear@eshear·2 Jan

@rapha_gl Your definitions are not in common usage, so therefore practically not useful.

Emmett Shear@eshear·1 Jan

What’s the difference between a token, a residual, an activation, and a latent? These all seem to refer to the same object to me, an N-dim vector of floats usually, which undergoes some evolution over time. Yet ppl seem to insist some things are one but not the others.
