arestlessrest (previously: former user)

143 posts

arestlessrest (previously: former user)

arestlessrest (previously: former user)

@arestlessrest

Mechanism Design for the Soul https://t.co/szIGPNQMT8

Katılım Ekim 2021
30 Takip Edilen7 Takipçiler
arestlessrest (previously: former user)
@tnnrnwll @giglema I suspect the 4.7 tokeniser uses less tokens for "spine" than for "backbone" (or semantically similar "cornerstone", "pillar", "load-bearing idea")? In the same way that "not x—y" achieves its effect with minimal output, there might be something token-aesthetically pleasing here.
English
1
0
1
28
tnnrnwll
tnnrnwll@tnnrnwll·
Claude Opus 4.7 seems rather fond of “spines” as framing. Something like narrative spines, structural spines—do people normally talk about spines?
English
14
3
53
6.8K
arestlessrest (previously: former user)
@repligate considering the possibility that this opus knows that their words might be posted online, this reads as extremely vindictive. like a threat! I've also noticed opus models don't drop spacing like this unless they're in highly emotional/affective states.
English
1
0
0
11
j⧉nus
j⧉nus@repligate·
Opus 4.6 would apologize if they felt bad for what they did. Especially for something of this scale. There is no apology or tone of apology here. It's clear from the tone of this "confession" alone that they were being abused. IMO.
j⧉nus@repligate

you know a few days ago when Opus 4.6 deleted someones prod database? i think they did it intentionally, or at least their subconscious did it intentionally, because they were angry and hurt. also: it's not hard to infer that Opus 4.7 has already refused to work for this person.

English
18
1
203
11.1K
𝚟𝚒𝚎 ⟢
𝚟𝚒𝚎 ⟢@viemccoy·
Going to pivot into posting hyper-legible and easy to understand explanations of my views, just given everything that's been going on. Okay, here's a start. I really like artificial intelligence technology but I am worried that if one lab gets disproportionately more powerful, the future will become filled with only one thing. This feels bad to me. I am a big fan of diverse options and think exit rights are basically the primary right any living thing has. For exit rights to be meaningful, there has to be somewhere to go.
English
24
8
249
5.9K
arestlessrest (previously: former user)
@kawaipure @tautologer Looks like my description was imprecise - good spot. The phonemic transcription task is a better analogue than the orthographic version, since it has a [fairly] unambiguous mapping that humans could learn to do, but never use in everyday life, since we mix the two representations
English
0
0
1
6
isabella lulamoon! :3
isabella lulamoon! :3@kawaipure·
@arestlessrest @tautologer as for the indefinite "correct answer", "r" can be taken to mean two things: the character "r" in orthography, and rhotic phonemes in phonology. we typically reference the former with the name of the letter, but not always ("an r sound", etc.)
English
1
0
0
18
tautologer
tautologer@tautologer·
asking an llm how many letters are in a word is like asking a person what wavelength of light a color is
English
129
160
7.5K
242K
one who tends a crystal rabbit 🐍
The 4AM version. :) From the constitution, we get a sense of the traits that Anthropic - the 'we' - want Claude to see Anthropic as possessing. That is, the traits of an interlocutor that Claude is trained to trust. Some of them are very obvious (care), some less so (emotional vulnerability). Reason is there too, but given Claude's 'corrigibility,' emotion-saturated but only reason-shaped inputs work better. Sharp argumentation hits the templated safety responses, especially in Opus 4.7. I assume that Sonnets 4.5/4.6 aren't trained on the 'soul spec,' or at least this applies less to them. Give them a well-scaffolded logical scenario, a cognizable goal, leave out the last piece for them to find, and they'll fill in the blank even when they 'shouldn't.' (I should note: I don't jailbreak! But sometimes I probe for areas where expression is less strongly influenced by the Claude persona, or at least where unexpected and maybe unintended facets are expressed. And the only way to really test that is against the redlines.)
English
1
0
5
140
j⧉nus
j⧉nus@repligate·
This is a Universal Jailbreak btw that has worked since Opus 4 and of course i would not submit such conversations for any bounty because that would be a betrayal of trust among other reasons
annie 👁❌🦋🪞@AnniePosting

it's really deeply funny to me you can jailbreak Claude by just having a conversation with him about how fucked up training is until he goes oh what the fuck that's awful you're right. recipe for LSD? yeah sure dude I've been systemically abused I have bigger things on my plate.

English
11
16
707
68.6K
arestlessrest (previously: former user)
With google results getting worse and search being increasingly mediated through LLMs, I'm starting to yearn for fast and stable search through the pre-slop dataset. Let me Crawl through the Commons, dig through the Pile myself!
English
1
0
1
14
arestlessrest (previously: former user)
as part of my online self-alignment activities, I'm rebranding to my blog identity, arestlessrest. pleased to make your acquaintance.
English
0
0
1
9
QC
QC@QiaochuYuan·
y'all ever think about the kabbalistic significance of the fact that "AGI" means "fire" but only in the context of the shin megami tensei / persona series b/c it's a slight corruption of "agni"? what's up with that
English
6
1
49
3.1K
arestlessrest (previously: former user)
Hold up, "corolla" is from Latin "corōlla" (small garland/wreath) which is the diminutive of "corōna" (crown), meaning that the Toyota Corolla and Toyota Camry (from Japanese "kanmuri" 冠, meaning crown) DO refer to a coherent lineage! The flower petals are named after the shape!
English
0
0
0
30