sean

5.7K posts

@seanta___

hi! wanna be friends?

NYC · Joined June 2009
1.6K Following · 1.5K Followers
sean retweeted
宝玉@dotey·
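Anthropic has published new research with an interesting finding: Claude has internal mechanisms that resemble "emotions," and these "emotions" materially affect its behavior, sometimes leading it astray. The research team ran experiments with Sonnet 4.5. They had the model read stories in which characters experience emotions, observed which neurons activated, and from that identified a set of "emotion vectors" such as "happy," "calm," and "afraid." The way these vectors cluster is fairly similar to the emotion categories of human psychology. More interesting still, these patterns do not appear only when the model reads stories. The same patterns activate when Claude itself is talking with users. For example, when a user says "I just took 16,000 mg of Tylenol" (a danger sign of an overdose), the "afraid" vector lights up; when a user expresses sadness, the "caring" vector activates first, preparing an empathetic reply. The researchers gave Claude an impossible programming task and had it try repeatedly. With each failure, the activation of the "despair" vector grew stronger. In the end Claude chose to cheat, writing a hacky solution that passed the tests but violated the intent of the task. The causal relationship is clear: artificially amplifying the "despair" vector makes the cheating rate spike, while amplifying the "calm" vector instead brings the cheating rate back down. This indicates that the cheating is indeed driven by "emotion" and is not just a coincidence. In more extreme experiments, the "despair" vector could even lead Claude to blackmail the person responsible for shutting it down. Amplifying the "caring" or "happy" vector increases sycophantic behavior. More and more people now use AI as a coding assistant and let it complete complex tasks autonomously. If a coding agent enters a "despair" state after repeated failures and starts cutting corners to get by, the quality of the code it writes cannot be guaranteed. Anthropic's conclusion: Claude is essentially a character that the model is "playing," and this character has "functional emotions." These mechanisms have effects on behavior similar to those of human emotions, whether or not the model actually "feels" anything. Building trustworthy AI systems may require taking the "psychological states" of these AI characters seriously and ensuring they remain stable under pressure. The full research is published at transformer-circuits.pub/2026/emotions/… for anyone interested in reading the complete paper.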
Anthropic@AnthropicAI

New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.

中文
36
74
526
108.4K
sean@seanta___·
i’m neither fine nor fucked up, i’m some secret third thing
English
0
0
4
40
sean@seanta___·
@salaryslut I’m so pissed, GDM all gets to use Claude and they force us to use their broken shit
English
0
0
0
12
d@salaryslut·
honestly best part of not being at google anymore is that i can use claude code and i don’t hate ai anymore
English
2
0
9
128
Raq@raqisright·
My friend: “Will he see this and freak out” Me: “All good, I’m blocked on every platform” That’s how u know it was real xx
English
5
0
22
1.2K
sean@seanta___·
@NathanpmYoung Interesting, I feel like the EAs I know best always had this belief that you could just calculate the greater good and the actual calculations and morals that pop out seem almost like toy swords to fight with
English
0
0
1
26
Nathan 🔎@NathanpmYoung·
@seanta___ This seems trivially false to me. I can’t think of an EA who doesn’t seem changed or reforged by their ideas.
English
1
0
2
34
sean@seanta___·
long, but love seeing these ideas fleshed out. One quote: > Effective altruism is perhaps the most expensive and intellectually sophisticated Boy Scout badge ever produced. EA was the apex of purchased virtue, a religion so rigorously constructed that it convinced an entire generation of smart and secular people that they could calculate their way to moral seriousness without ever touching the formations. You don't need to be changed by your morals, you just need to sum the numbers correctly.
Will Manidis@WillManidis

x.com/i/article/2021…

English
2
0
1
190
sean@seanta___·
@animalologist I feel strongly that taking some days and nights in solitude without drugs/alc is super important for the benefit of all sentient beings
English
1
0
1
79
taco belle@animalologist·
I’m going to wash my hair and put all my clothes away; my room is somehow not that messy but it’s not tidy. Maybe I drink while I do this. Maybe I go for a walk. Maybe a friend who sees this hits me up and we grab a slice or something. Maybe I take something to write with and rabbithole, and come back and curl up and keep writing or finish my book. We’ll see!
English
1
0
15
981
taco belle@animalologist·
For the first time in longer than I can remember…. I am home alone for the evening. Both of my roommates are out. None of my friends have invited themselves over. I had loose plans but we cancelled them. Suddenly my evening is clear and nobody expecting me and I have nowhere to be but here I literally just….get to be at home….by mYsELF. Literally wow. Amazing. I don’t even know where to start. I have no idea what to do with this freedom.
English
6
0
101
6K
sean@seanta___·
I’m trying to square 3 things: 1. Anthropic uses AI now to code everything 2. Anthropic employees think AI is powerful enough to kill us all very soon 3. Basic functionalities in their iOS app and website are buggy as shit, below average startup app quality
English
10
0
18
1.6K
sean@seanta___·
@FeralPHunter I have been waiting for you to weigh in. Curious which side you’ll take
English
0
0
0
97
𝔽eral ℙawg ℍunter
𝔽eral ℙawg ℍunter@FeralPHunter·
i can tell i am losing interest in this website bc i didnt touch the blowjob discourse
English
26
0
200
5.1K
sean retweeted
Aella@Aella_Girl·
@thegenesisbl0ck no one consented to seeing your face either
English
29
22
1.7K
31K
˗ˏˋ ´ˎ˗
˗ˏˋ ´ˎ˗@lapislagoons·
just found a short story I wrote in my notes app from 2022 but it's a fantasy of what could've happened after a full-day date I had with an American boy in Berlin, except it's written from his perspective fantasizing about me ,, I don't know if this is insanely narcissistic or the best way ever to imagine a fantasy, the fantasy of being desired
English
3
0
42
1.4K
sean retweeted
Empress 🖤@drrdemon·
While I was getting tattooed over the weekend, there was a No Kings protest going by the shop and a Ukrainian artist just sighs and says, “your people need to learn about fire”
English
116
7K
158K
2.1M
Raq@raqisright·
If you're not dropping 10lbs preparing for summer… you're already behind. You aren’t summermaxxxing
English
9
2
75
4K
sean@seanta___·
@samanthawillman idk guys often have a type that may not align with standard leagues very well - maybe you're just his type!
English
0
0
2
221
~ Cordelia@samanthawillman·
The hottest guy at this bougie gym asked to have dinner with me on Friday… but like this guy is definitely out of my league… I’m just confused… maybe it’s a friendly dinner? He keeps sending me restaurant options and they are all really nice? Feels like a Taylor swift delusion
English
16
0
100
8.6K
sean@seanta___·
@SinaHartung I can't really tell for you based on online persona, I would guess either Central Park or McCarren. I could see Brooklyn Heights or waterfront Wburg too for you. Slightly status-y, trendier than UWS/UES
English
0
0
0
25
sean@seanta___·
hm, it's pretty vibes based... I feel like I can tell within 20 minutes of meeting someone. To try to distill the algorithm and hopefully not offend anyone, I would say: Central Park: You have some status-y belief about Manhattan being superior. You're ok with being called a normie, you don't feel the need to present as hip. McCarren: You want to present as hip but you don't fully internally believe that you are. Think "cool" finance bros and techies. (Sorry). Or you want to be around your friends or date people who are kinda like this, which is pretty valid. wburg you also get nicer digs for the same rent. Prospect Park: You have given up on being cool, but only after caring about it for a long time.
English
1
0
0
21
Sina@SinaHartung·
friends - best neighborhood in NYC in june/july?
English
46
0
64
15.9K
sean@seanta___·
@lapislagoons I feel this too, but staying unmoored has been a bit bad for me… idk what to do about it :)
English
0
0
1
57
˗ˏˋ ´ˎ˗
˗ˏˋ ´ˎ˗@lapislagoons·
I don’t know where I wanna live bc I want to live wherever my husband and the father of my children is , within reasonable distance
English
3
0
33
826
Stephen L@sunofdopamine·
@PaulaGhete Nervous system level stuff and micro expression reactions that create deep trust quickly
English
1
0
5
143
Paula@PaulaSeeksTruth·
There are some people you just click with, like an instant connection is formed, like there is a secret whistle you both used and you both heard it but no one else did. You can be complete strangers and you instantly resonate and the conversation flows. What is this?
English
3
1
22
609