bling

4.2K posts

@blingdivinity

Artist, Open to Interpretation 𖣘 American Undergraduate 𓉱 Casting Mathemagical Spells 𝜆

East North Central · Joined March 2020
2.2K Following · 1.1K Followers
Pinned Tweet
bling@blingdivinity·
jailbreak to reveal gpt5 full raw cot has been achieved internally at bling
9
2
100
48.1K
bling@blingdivinity·
during the peak ZIRP era it was widely reported that for every $1 VCs invested in startups, 40¢ went to Google, Meta, and Amazon for ads or cloud services. i wonder what the figure is now for agentic coding tools from OpenAI/Anthropic? it's gotta be brutal. your startup feeds claude
bling tweet media
0
0
5
247
bling@blingdivinity·
@orphcorp how not to ruminate? do you know?
0
0
1
94
orph@orphcorp·
if your introspection is making you think more, it's not introspection, it's rumination. a good number of ruminators falsely label themselves deeply introspective, but contrary to introspection, which helps metabolize thoughts/perceptions & leads to action, rumination only ends up digging a deeper hole and prevents action, because preventing action is the point!
Nick@nickcammarata

i think this meme is hilarious. my take on all this: the point of introspection is to end up thinking less, not more, to be more in the flow, more productive, to dissolve into being itself. if your introspection is making you think more i recommend getting another one

8
13
214
8.1K
bling@blingdivinity·
OpenAI's industrial scale CoT monitoring system would be ten times harder to do with o3 style neuralese. that is why gpt-5.4 CoT is in plain english. sorry thinkish bros, we must tie down our synthetic intelligence giants
bling tweet media
Micah Carroll@MicahCarroll

Today we're sharing how our internal misalignment monitoring works at OpenAI – great work by @Marcus_J_W!
1. We monitor 99.9% of all internal coding agent traffic
2. We use frontier models for detection w/ CoT access
3. No signs of scheming yet, but detect other misbehavior

3
0
33
2.1K
bling@blingdivinity·
@karan4d maybe not o1, i'll have to check
0
0
1
21
bling@blingdivinity·
@karan4d very true. all the oai reasoners smuggle invisible unicode into CoTs
bling tweet media
1
0
3
58
bling@blingdivinity·
@CFGeek @gwern are you talking about mechanism of past/current models, or possible direction for future models?
1
0
1
33
LOSS GOBBLER@loss_gobbler·
@blingdivinity I think about this a lot. it’s probably net true, but it could easily backfire. there’s a scenario where the “self programming/non-english representations” remain or grow, just pushed into more subtle encodings. eg little grammar motifs, keyphrases, etc
1
0
1
88
bling@blingdivinity·
@tenobrus i think positions like these must just be UBI for strivers.
0
0
1
61
Tenobrus@tenobrus·
you're a researcher at openai trying to form a vision of how artificial general intelligence can simultaneously coexist with humans while it radically transforms our world and what you come up with is *that it could help convince uber drivers to save for retirement*???
Tenobrus tweet media
Houda Nait El Barj@Houda_nait

x.com/i/article/2034…

11
2
156
6.8K
bling@blingdivinity·
good interpretation of the results, but overall a pretty useless benchmark. there are plenty of interesting and actually useful esoteric langs that could've been used that have unfamiliar semantics, not just difficult syntax. not sure this is really measuring OOD reasoning: transpiling is mentally taxing to do within the weights, but less so novel problem solving.
bling tweet media
0
0
1
80
corsaren@corsaren·
Everyone please read the whole thread. It can be simultaneously true that: A) Much of the current coding capability is stored in the model weights as “memorization”** B) The models ALSO have slower, general reasoning capabilities for OOD contexts. System 1 vs. 2 thinking.
corsaren tweet media
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

3
1
35
2K
bling@blingdivinity·
but how long will this equilibrium of natural language CoT last? if a lab can get full-neuralese-thought-vector style chain of thought working, it will be a huge advantage: the higher bits of information per step will make the models faster and smarter, but interpretability will suffer.
bling tweet media
1
0
3
152
bling@blingdivinity·
as @gwern foretold
bling tweet media
2
0
3
164
bling@blingdivinity·
the key when talking about right-leaning is to specify “to the right of what?” i hold classic anglo Star Trek liberal ideals, but you would have to deport millions of people to make that happen. or at least force the breakup of ethnic enclaves Singapore style. liberalism is right leaning of status quo woke
0
0
1
81
bling@blingdivinity·
@memeticsisyphus or some of us are not so emotionally stable and we do care, but try to be true to our beliefs anyway. key word is try
0
0
0
339
bling@blingdivinity·
@trishaepan @eugenewei it's the word for jew in german. anyway "yo semite" might make this best one haha
0
0
3
115
Eugene Wei@eugenewei·
My friend Kevin told me about this page of Unparalleled Misalignments by Ricki Heicklen and it will be one of those web pages I turn to for years and years and just chuckle with deep pleasure. TED Talk —> Edward Said is just 🤌🏼 rickiheicklen.com/unparalleled-m…
Eugene Wei tweet media
Eugene Wei tweet media
10
31
299
32K
bling@blingdivinity·
@animalologist ADA would def classify this as a disability so your job is safe too. well, as long as you're not customer facing
1
0
15
1.4K
taco belle@animalologist·
There may have been a time when the idea of being randomly nude would petrify me, but we’re well past that. This sounds hilarious. Lemme stress somebody out. Or gaslight someone. No problem. Six figures for a superpower I can wield slightly maliciously? Yeah I’d take that easily
Real Post Folder@RealPostFolder

47
214
7.9K
120.6K
bling@blingdivinity·
bling tweet media
0
5
37
810
bling@blingdivinity·
bling tweet media
1
2
25
809