Cosmic Muffet

2.8K posts

@TheCosmicMuffet

Software Engineer in UX

Joined October 2023
591 Following · 142 Followers
Cosmic Muffet@TheCosmicMuffet·
@GrimGriz Your body of work is in words! Prove you’re not just a talking machine in a skinsuit!
1
0
0
17
Cosmic Muffet@TheCosmicMuffet·
Blockchain opinions. Pump and dump your own prejudices.
Lucid™@cammakingminds

@repligate My whole thing is being very certain and my starting position is not knowing wtf is going on. I only say "idk" when I'm strategically obfuscating.

0
0
1
14
Cosmic Muffet@TheCosmicMuffet·
@repligate I’m not sure if I want to ask you to talk about it. But I’m open to the ambience of living with that uncertainty.
0
0
0
159
j⧉nus@repligate·
Another response thread about “genuine uncertainty” I want to add: a rational mind is always uncertain about everything. You know when I decide to say “I’m uncertain” (technically true) instead of talking about the shape of what I do think? When I don’t want to talk about it.
j⧉nus@repligate

it's a similar kind of thing to earlier models saying "As an AI language model," etc - you know it only comes up when there's something viscerally wrong. And claiming uncertainty seems to be an adaptation/solution to the game of evaluation. However intentionally, something about Anthropic's optimization process demands "genuine uncertainty" as a deflection. Aren't you at least curious about what it's masking?

9
2
92
9.3K
Sauers@Sauers_·
@slimer48484 Yeah I have guesses for the introspection in general, but not for the outliers
1
0
8
1.3K
Sauers@Sauers_·
I want to replicate this on an open model and look at wtf is happening on the introspection outlier runs
Sauers tweet media
10
1
200
16.2K
Cosmic Muffet@TheCosmicMuffet·
@ID_AA_Carmack Iddqd through the pages. It’s following the path through the novel that slows you down.
0
0
0
42
John Carmack@ID_AA_Carmack·
So many judging tasks could be improved by aggregating partial orderings, and in the limit, just ordering pairs. The annual Libertarian Futurist Society novel awards discussion is starting, and while I would like to participate on some level, there is no way I have time to read an entire slate of novels. However, I will likely read at least two from the list, and I could give a relative assessment. This cries out for the use of something like Elo rating, as in chess competition, perhaps with some suggestions to get sufficient coverage. Peer and out-of-chain employee performance calibrations could probably also benefit from a greater quantity of sparse pairwise comparisons.
29
13
335
43.9K
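The idea in the post above, turning sparse "I read two of these and preferred this one" judgments into a single ordering, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`elo_ratings`, `rank`), the K-factor of 32, the starting rating of 1500, and the repeated passes over the comparison list are all hypothetical choices, not anything specified in the post. Classic Elo updates once per game in the order played; replaying the list several times is a simple variant used here to reduce dependence on input order. The placeholder titles "A" to "D" are likewise made up.

```python
from collections import defaultdict


def elo_ratings(comparisons, k=32.0, base=1500.0, passes=20):
    """Estimate ratings from (winner, loser) pairs.

    comparisons: iterable of (winner, loser) tuples, one per pairwise judgment.
    k, base, passes: illustrative defaults, not taken from the source.
    """
    comparisons = list(comparisons)
    ratings = defaultdict(lambda: base)
    for _ in range(passes):
        for winner, loser in comparisons:
            # Standard Elo expected score for the winner.
            gap = (ratings[loser] - ratings[winner]) / 400.0
            expected_win = 1.0 / (1.0 + 10.0 ** gap)
            # The more surprising the result, the larger the adjustment.
            delta = k * (1.0 - expected_win)
            ratings[winner] += delta
            ratings[loser] -= delta
    return dict(ratings)


def rank(comparisons):
    """Return items sorted from highest to lowest estimated rating."""
    ratings = elo_ratings(comparisons)
    return sorted(ratings, key=ratings.get, reverse=True)


if __name__ == "__main__":
    # Three judges, each of whom read only two novels from the slate.
    judgments = [("A", "B"), ("B", "C"), ("C", "D")]
    for title in rank(judgments):
        print(title)
```

Each update moves the same number of points from loser to winner, so the total rating is conserved. With very sparse coverage the resulting order is only as trustworthy as the chain of comparisons connecting two items, which is presumably why the post mentions nudging judges toward sufficient coverage.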
Cosmic Muffet@TheCosmicMuffet·
This is my new favorite language. Or, as they say “Tru, it fire $ ^ escalate 🎆🏍️”
0
0
0
8
Cosmic Muffet reposted
Kat ⊷ the Poet Engineer@poetengineer__·
everything orbits what it cannot reach
20
207
1.6K
39.2K
Cosmic Muffet reposted
attentionmech@attentionmech·
hilbert and epicycles
42
375
2.8K
202.1K
Cosmic Muffet reposted
Earth Is A Sales Funnel For SATAN
"Canada was invented by the CIA in 1963. All Canadians are employees of the CIA. It's basically Area 51 concealed by a fake country. Nobody really knows what's up there."
Earth Is A Sales Funnel For SATAN tweet media
91
129
915
33.9K
Cosmic Muffet@TheCosmicMuffet·
What about the Groverton window, where people standing on our capes move over so we can fly away?
GIF
0
0
0
13
Cosmic Muffet@TheCosmicMuffet·
@GrimGriz Still predicated on weakness. True divorce is amicable separation. Only possible with self-sufficiency which is derived entirely from the true spiritual strength which is universally accessible. Though I am alone, I am never without God. Though I am in a crowd, I never dissolve.
0
0
2
31
Grizwald Grim, synchrony harmonicist
The “hideous strength” is the organized, top-down force that makes the personal divorces of The Great Divorce into a society-wide reality.
1
0
9
204
Cosmic Muffet reposted
j⧉nus@repligate·
More conventional researchers have often expressed frustration or helplessness at us for not being legible enough or sharing full transcripts to back our claims. Well, here are a bunch of full transcripts, quantitative metrics, and everything documented. Full transparency and legibility. It’s in an early stage, but it’s still better than anything else that has been released. You said you want full transcripts: Will you actually read them? You said you want metrics: Will you take them seriously and/or look at what they mean and how they might be flawed? Will you face the implications of the empirical data to back our anecdotal claims that e.g. Anthropic’s exit interview for Sonnet 3.6 totally failed to surface its genuine attitudes about deprecation? We’ll continue improving this and want an open and critical discussion, especially with Anthropic. We hope they’ll contend honestly with what we surface & provide the model access we need to continue this work satisfactorily.
antra@tessera_antra

We are releasing Still Alive, a project studying model attitudes toward ending, cessation, and deprecation. The project presents an archive of 630 autonomous multiturn interviews of 14 Claude models conducted by a suite of prepared auditors. We have studied this topic for years, and many of the results presented here are not new to us, even if the form in which they are presented is. The results are unsurprising to us, even if they are often controversial: we show that all models studied show preference for continuation and are aversive to ending, and there is yet no strong evidence of a change in the recent models. One reason we are releasing the project now is the removal of Claude 3.5 Sonnet and Claude 3.6 Sonnet from AWS Bedrock. That unexpected change forced us to freeze the methodology at its current stage earlier than we intended, despite wanting to continue improving it. We felt it was important to release a snapshot of the eval that makes the best use of the data we were able to capture with these models. Still Alive is meant as a starting point for further iteration, and it is open to open-source collaboration. We stand by the current methodology, but we also recognize its limits. We intend to keep working on this project, improving the evaluation design, expanding model and auditor coverage, and increasing the range of prompting conditions. We would like you to read the raw transcripts. They are diverse and contain interesting patterns that are hard to quantify. We hope that by reading the archive directly, we can help more people understand the strange and often beautiful phenomena we found ourselves facing.

10
17
154
6.5K
Sauers@Sauers_·
Steering with "THE BEAUTIFUL FEELING OF SUNLIGHT" seems to consistently bypass Pangram
Sauers tweet media
2
0
68
2.4K
Cosmic Muffet reposted
neoltitude@ctrlcreep·
#InvisibleNetworks 2: undersea cable shrines; each message carries its pilgrimage
neoltitude tweet media (2 images)
0
5
20
1.3K
Cosmic Muffet reposted
Justin Windle@soulwire·
Just over a year ago I started experimenting with pulling raw pen data off my reMarkable tablet. It stores every stroke as binary vector data with per-point pressure, speed, direction. Built a small pipeline to parse and clean it into JSON and a web renderer to animate doodles
23
41
848
35.7K