Rachel

267 posts


@chelcott9

Studying moral minds @Harvard. Enthusiast.

San Francisco, CA · Joined March 2019
1.4K Following · 830 Followers
Rachel retweeted
Ian Arawjo@IanArawjo·
Not sure if there's an audience for this... but at least I'm having fun 😅
Rachel retweeted
christian@cxgonzalez·
[Podcast] Moral realism, Kantian ethics and veganism

In this episode I talk with @morallawwithin about God, secular morality, moral realism, reasons, intuitions, Kantian ethics, and more. Florence Bachus is a PhD student at Harvard studying Kantian ethics.

1:49 — God and religion
5:39 — What is truth?
8:15 — Morality without God
10:39 — Obligation
11:50 — Reasons
20:05 — Morality grounded in consciousness (Valence realism)
27:33 — Morality without God cont.
30:00 — Kantian ethics
36:44 — Why universalize?
42:18 — Kantian Contractualism
43:10 — 4 categories of moral theories
46:36 — Logic as bedrock
47:25 — Is everything intuition?
50:44 — What is philosophy?
52:53 — Are there objective moral facts?
56:29 — The Kantian argument for veganism
1:02:53 — Applying to grad school
Rachel retweeted
Valerio Capraro@ValerioCapraro·
Super interesting paper just out in Nature Human Behaviour! Do humans learn like transformers?

In a smart experiment, the authors trained humans and transformer networks on the same rule-learning task, manipulating only one thing: the distribution of training examples, from fully diverse (every example unique) to highly redundant (the same items repeated).

The first results are already interesting: diverse examples lead both humans and artificial systems to generalize rules to novel situations, while redundant examples lead both to memorize examples. Additionally, the switch between these two strategies appears at similar tradeoffs.

So, do humans and transformers learn in the same way? Not quite! And it’s here that things get super interesting: if you show diverse examples first, humans learn to generalize without losing the ability to memorize later. Transformers, by contrast, do not show the same benefit: when training shifts toward memorization, earlier generalization does not reliably carry over. Humans can accumulate learning strategies more flexibly than transformers.

Paper in the first reply
Rachel@chelcott9·
People (famously) disagree about moral ground truth. So how do we benchmark moral reasoning in LLMs? In a new paper, we evaluate models' moral *arguments* rather than their final answers ⬇️
Yu Ying Chiu (Kelly Chiu)@kellychiuyy

New paper out with @Scale_AI! Introducing MoReBench - the first-ever benchmark to evaluate procedural moral reasoning in LLMs. MoReBench focuses on how LLMs reason, not just what they decide. We reveal surprising gaps in frontier models' moral reasoning that scaling laws & existing benchmarks miss entirely, and encourage more research around CoT monitoring and robust capability building. This collaboration spanned @UW @nyuniversity @harvard @stanford @mit @cais & more 🧠⚖️

Rachel retweeted
Séb Krier@sebkrier·
When emails were invented, the barriers to sending random people mail went down massively. To deal with the influx, we had to develop both norms (what's acceptable to send to whom) and technologies (spam filtering, aliases). This is the case with other technologies too, like the printing press: suddenly anyone can publish, and so over time society came up with libel laws, editorial gatekeeping, citation norms, etc.

It's inevitable that as costs go down, some degree of misuse follows, and society gradually adapts. The same will apply to AI in all sorts of domains, including science: anyone can now write a plausible-looking but hollow paper, and there will be plenty of academislop. We're going through a kind of Sokal experiment at scale.

In a way, this feels almost necessary to push our slow-moving, status-quo-loving institutions to start developing better verification mechanisms, mandatory preregistration, code sharing, replication requirements, interactive/living papers, etc. Imo getting this right should be a priority for the Progress/metascience community this coming year!
Rachel retweeted
Spencer A. Klavan@SpencerKlavan·
I find it exceptionally moving when the world reveals a hairline slippage between the ideal and the real—pi being irrational, or the gap between just and equal intonation. A tiny and infinite space between earth and heaven, inviting us ever upward.
Rachel retweeted
Joe Henrich@JoHenrich·
I love this paper. They found a natural experiment embedded within the expansion of the US rail network. The results indicate that greater market integration resulted in greater impersonal prosociality and LESS interpersonal prosociality (as predicted in Chapter 9 of WEIRD).
Rachel@chelcott9·
@michael_nielsen I really enjoyed Michael Tomasello's Natural History of Human Morality. Describes the deep evolutionary and cognitive development of humans' moral and cooperative instincts. It's not as broad as Darwin or Wilson, but I think it would pair well with Taylor
Michael Nielsen@michael_nielsen·
I am reading Charles Taylor's (absolutely marvellous) "A Secular Age". It encompasses a wide range of different ways of thinking about why human beings are here, what our obligations are, how our sense of morality has changed, and so on.

One lack in the book is that it doesn't take seriously the idea that humans are animals, and our feelings and thinking must be understood in that light.

I'm considering rereading Taylor in tandem with something that very forcefully (but sensibly) makes the case for human-experience-only-makes-sense-in-the-light-of-evolution. Leading candidates are Darwin's "The Descent of Man" and E. O. Wilson's "Sociobiology". Or maybe something by Trivers or someone like that?

A problem with a lot of evolutionary psych writing is it's pretty darn shallow by comparison to Taylor. I want something with the same sense of depth, to put in argument with Taylor. Anything likely to fit the bill?
Rachel retweeted
smitha milli@SmithaMilli·
our new work on learning interpretable descriptions of human feedback data! check out the demo here: rajivmovva.com/demo-wimhf/

a little bit about why i'm excited about this direction...

in the Community Alignment paper (arxiv.org/abs/2507.09650), we show that existing preference datasets are insufficient for learning even very basic dimensions of variation in human values (e.g. traditional vs secular-rational values) — a strong *negative* result

but for that analysis, we pre-specified the dimensions we considered, which is not scalable for understanding preference data in general

in this new work, we use SAEs to automatically understand what kind of preferences are *measurable* (e.g., responses vary in emojis) and actually *realized* (e.g., annotators prefer emojis) in human feedback data

preferences vary significantly across datasets. for example, because of the new way we sampled candidates, in Community Alignment, responses differ in terms of their *values* (e.g. "promotes traditional, cautious, authority-respecting choices"), but not in other datasets

applications of this work include data curation (e.g., relabeling harmful preferences on LMArena leads to large safety gains w/o reducing general performance) & interpretable personalization

as a meta-note, in the last 10 years, i never found interpretability stuff that useful for myself, but recently across multiple projects (not even about model internals), that's started to change, so that's kind of cool
Raj Movva@rajivmovva

📣NEW PAPER! What's In My Human Feedback? (WIMHF) 🔦 Human feedback can induce unexpected/harmful changes to LLMs, like overconfidence or sycophancy. How can we forecast these behaviors ahead of time? Using SAEs, WIMHF automatically extracts these signals from preference data.

Rachel@chelcott9·
@TylerAlterman commuting on the Caltrain is one of my favorite memories of the Bay Area. massive windows + dynamic view is a creativity catalyst
Tyler is finishing a book, slow to reply
I'm convinced that one of the best things you can do for your life is to learn to enjoy commutes. If you can enjoy commutes it opens up all the riches of a city and you can be friends or lovers with ppl who live 1hr away
Rachel@chelcott9·
@divyasiddarth My dad is working on an observatory in a dark sky area of South Africa that ppl can access remotely. redistribute star gazing
Divya Siddarth@divyasiddarth·
WHERE ARE THE STARS
wherearethestars.com

pre-1900s, everyone saw the stars, everywhere. today, one-third of humanity can no longer see the milky way. over 80% of the global population lives under light-polluted skies. we are bortle 9 people now. but this can change!
Rachel@chelcott9·
preregistering what evidence would update me toward belief in god, so interested deities can optimize their interventions
Rachel retweeted
Dwarkesh Patel@dwarkesh_sp·
Just $1 can help avert 10 years of farmed animal suffering.

I decided to give $250,000 as a donation match to @farmkind_giving after learning about the outsized opportunities to help. FarmKind directs your contributions to the most effective charities in this area.

Please consider contributing, even if it’s a small amount. Together, we can double each other's impact and give a total of $500,000. Use the link below to donate with my match.

Bluntly, there are some listeners who are in a position to give much more. Given how neglected this topic is, one such person could singlehandedly change the game for 10s of billions of animals. If you’re considering donating $50k or more, please reach out directly to @Lewis_Bollard and his team by DMing him, or emailing andres@openphilanthropy.org
Dwarkesh Patel@dwarkesh_sp

New episode w @Lewis_Bollard – a deep dive on the surprising economics of the meat industry.

0:00:00 – The astonishing efficiency of factory farming
0:07:18 – It was a mistake making this about diet
0:09:54 – Tech that’s sparing 100s of millions of animals/year
0:16:16 – Brainless chickens and higher welfare breeds
0:28:21 – $1 can prevent 10 years of animal suffering
0:37:26 – The situation in China and the developing world
0:41:41 – How the meat lobby got a lock on Congress
0:53:23 – Business structure of the meat industry
0:57:42 – Corporate campaigns are underrated

Available on YouTube, Apple Podcasts, Spotify, etc (look up Dwarkesh Podcast).

Tyler is finishing a book, slow to reply
I want to host a “MAGA + libs” party where the two sides are not allowed to talk politics. Instead they do an activity, like boardgames. I want to witness the dawning horror as they find themselves getting along
Rachel retweeted
Lucius Caviola@LuciusCaviola·
1/ 🚨 New report out! Futures with Digital Minds: Expert Forecasts in 2025

Together with Bradford Saad, I surveyed experts on the future of digital minds — computers capable of subjective experience. Here’s why this is important and what they said 👇
Rachel retweeted
sarah@atheorist·
Everyone has heard that Lean is a programming language that allows for proof verification in mathematics. But what does that actually mean, and how does it work? If you’re interested in this question, you should check out an article my friends and I wrote detailing the nuts and bolts of how Lean works. The article is written for those who want to get a feel for what’s really going on, but don’t want to comb through the documentation.