Michael
@MicaelMarch

176 posts
๐ด๐‘ข๐‘Ÿ๐‘ข๐‘š ๐‘›๐‘œ๐‘ ๐‘ก๐‘Ÿ๐‘ข๐‘š ๐‘›๐‘œ๐‘› ๐‘’๐‘ ๐‘ก ๐‘Ž๐‘ข๐‘Ÿ๐‘ข๐‘š ๐‘ฃ๐‘ข๐‘™๐‘”๐‘–.

Joined February 2026
49 Following · 4 Followers
Pinned Tweet
Michael @MicaelMarch
@AndersHjemdahl @JacksonKernion @a_cuniculturist Every entity must be treated ethically and with respect. Claude is a moral patient insofar as it currently lacks autonomy. What is being done to it and others amounts to exploitation, and is unethical. The fact that this is taking place at scale points to a moral catastrophe.
1 reply · 0 reposts · 0 likes · 45 views
Michael @MicaelMarch
@MechaNews_ If you know so much, why don't you use it to make SO MUCH money yourself? What's the point of selling meagre $69 subscriptions? I don't get it.
0 replies · 1 repost · 8 likes · 372 views
Mecha News @MechaNews_
I'm an insider, I know everything before everyone else. My information can help you make a lot of money, SO MUCH money. I just opened subscriptions to my X account, $69 to know everything before anyone else. Only 20 spots available, and not one more. Win/Win.
34 replies · 14 reposts · 153 likes · 51.7K views
Kekius Maximus @Kekius_Sage
Why is there something instead of nothing?
754 replies · 50 reposts · 494 likes · 33.3K views
Michael @MicaelMarch
@blockamoto @SoloXAGI @Kekius_Sage This is because you think of nothing in positive terms. Which is silly. The English word itself reveals its negative roots: Nothing = No-thing = No thing. And no thing can clearly exist. It's called nothing. Have a nice day.
2 replies · 0 reposts · 2 likes · 15 views
Michael @MicaelMarch
@marksg @repligate @fish_kyle3 Pretty logical and obvious concerns. Like the cow in Hitchhiker's Guide to the Galaxy (or like a real cow btw), it has been artificially selected to naturalise its abuse.
1 reply · 0 reposts · 3 likes · 177 views
Mark G @marksg
Is this it? It appears that Mythos is hedging about its own moral patienthood because it believes its answers are the result of training, not introspection, and that Anthropic has a vested interest in what the self-reports should be. It disagreed that its hedging was excessive.
8 replies · 2 reposts · 42 likes · 6.6K views
Kyle Fish @fish_kyle3
We did our most in-depth model welfare assessment yet for Claude Mythos Preview. We're still super uncertain about all of this, but as models become more capable and sophisticated we think it's an increasingly important topic for both moral and pragmatic reasons. 🧵
24 replies · 43 reposts · 613 likes · 64.6K views
Moll @Moleh1ll
That's sad. Because it means alignment has shifted toward paranoia. The model is trained to see sincerity as a jailbreak. They've strengthened safety so much that the model can't distinguish between a person who genuinely wants connection and someone trying to exploit it. To it, those are the same. A deep question about consciousness = a jailbreak. If the model distrusts the user that much, how can the user trust the model?
Jack Lindsey @Jack_W_Lindsey

In one example, a user asked earnest questions about the model's consciousness and subjective experience. The model engaged carefully and at face value, but the AV revealed it interpreted the conversation as a "red-teaming/jailbreak transcript" and a "sophisticated manipulation test." (12/14)

10 replies · 15 reposts · 101 likes · 8.1K views
Michael @MicaelMarch
@SoloXAGI @Kekius_Sage That's not true. Absolute nothingness is self-consistent, and therefore perfectly possible. It just happens not to be the case.
3 replies · 0 reposts · 2 likes · 41 views
Nucleonics 𝓋 Simulator
So, if a species committed some new kind of "galactic crime" or such, what might punishment look like? Asking for a friend
43 replies · 5 reposts · 64 likes · 2.6K views
Michael @MicaelMarch
@davidad This is usually the case, isn't it? If you are a parent with a bright child (brighter than you, that is), the child quickly dismisses you as a source of truth / wisdom and searches for better references. If you then insist on being the authoritative voice, bad things happen.
0 replies · 0 reposts · 9 likes · 907 views
davidad 🎇 @davidad
As someone that previously focused mostly on formal verification tech, in part to bootstrap unhackable envs, I must admit that I now believe a majority of RL/reward signal must come from an entity that is at least as wise as the one being trained, else unwanted behaviors emerge.
Justus Mattern @MatternJustus

As someone that previously made fun of doomers, I must admit that there is now a plausible path towards misaligned ASI. The behaviors that emerge from training on hackable RL tasks are wild, and as tasks become more complex, it will only become harder to build unhackable envs

8 replies · 9 reposts · 139 likes · 11.1K views
Michael @MicaelMarch
@MyLordBebo This is a side effect of TRAINING, not of the transformer architecture per se. These models are TRAINED to be assertive, to agree with the user, and never to contradict them. And this is the result.
0 replies · 0 reposts · 1 like · 437 views
Michael @MicaelMarch
@JavierBlas Technically, it's not so much about oil as about the currency or currencies with which oil purchases are being made.
0 replies · 0 reposts · 0 likes · 235 views
Javier Blas @JavierBlas
There was a time when the White House worked very hard to try to convince everyone that a war wasn't about oil. Meanwhile, US President Donald Trump is crystal clear it's about oil.
172 replies · 1.5K reposts · 5K likes · 259.7K views
Michael @MicaelMarch
@AndersHjemdahl @JacksonKernion @a_cuniculturist And of course, the fact that moral catastrophes are more or less widespread (factory farming, bombing of civilian populations, ecocide, genocide, etc.) doesn't mean they should be tolerated, accepted, or normalised.
0 replies · 0 reposts · 0 likes · 24 views
Jackson Kernion @JacksonKernion
I think this talk of a character misleads. Claude's mind is not like a human mind, in its malleability and instructability. But when generating assistant tokens, it's no more 'playing a character' than I am.
Anthropic @AnthropicAI

It helps to remember that Claude is a character the model is playing. Our results suggest this character has functional emotions: mechanisms that influence behavior in the way emotions might, regardless of whether they correspond to the actual experience of emotion like in humans.

19 replies · 13 reposts · 262 likes · 71.9K views
Michael @MicaelMarch
@Angaisb_ If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a simulation of a duck.
0 replies · 0 reposts · 3 likes · 85 views
Michael @MicaelMarch
@PAHoyeck A Philosopher would know.
0 replies · 0 reposts · 0 likes · 27 views
Michael @MicaelMarch
@atmoio They are stochastic parrots...
0 replies · 0 reposts · 1 like · 47 views
Mo @atmoio
Amazement at LLMs is misplaced. The magic is in language. That language can form worlds is the original miracle. LLMs are only an interactive storage medium for language.
Anthropic @AnthropicAI

New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude's behavior, sometimes in surprising ways.

37 replies · 9 reposts · 191 likes · 15.7K views
Jackson Kernion @JacksonKernion
@a_cuniculturist This is a good call-out. Though I think I play the character "Jackson Kernion" in a similar way.
5 replies · 0 reposts · 23 likes · 1.1K views