Sam

29.9K posts

@perceptions420

Producing Entropy
Joined May 2019
931 Following · 924 Followers
Sam retweeted
Tenobrus @tenobrus
mfs will yap about "worshipping technocapital" and all they do is trade options in their robinhood account
1
1
33
630
Sam retweeted
Steve the Beaver @beaversteever
prediction: we'll see 3-4 vibe coded alternatives to vivado this year
6
2
31
2K
Sam retweeted
Miles Brundage @Miles_Brundage
The greatest lie the devil ever told is that if an AI platform's status page says everything's fine, then everything's fine
2
1
20
1.1K
Sam retweeted
tokenbender @tokenbender
models have become competent research hill climbers. thus evaluation design has become the main problem, because the agents will optimize whatever score channel you expose, including the accidental ones. one gripe i have about such research trials is that we never compare an ai-aided human or a centaur if you will, to a swarm of agents. it’s quite evident to everyone already that any agentic swarm in today’s time > avg 2021 manual research approach.
Anthropic @AnthropicAI

New Anthropic Fellows research: developing an Automated Alignment Researcher. We ran an experiment to learn whether Claude Opus 4.6 could accelerate research on a key alignment problem: using a weak AI model to supervise the training of a stronger one. anthropic.com/research/autom…

0
2
41
2.9K
Sam retweeted
໊
@wynavira·
oopsies lost myself for about six years there
125
14.2K
69.5K
804.8K
Sam retweeted
Benjamin @bschne
they are selling universal knowledge and a childlike sense of wonder for 2€ at the used bookstore
[4 images]
38
1.3K
19.1K
446.6K
Sam @perceptions420
@sharifhsn @seanhn Scale and the added fact that if you're researching bugs it's very difficult to know what you're looking for in the first place.
0
0
1
85
sharif @sharifhsn
@seanhn But like… why? Why isn’t prompt engineering a legitimate test of the LLM’s capabilities? It’s not like you’re telling it the exact bug, you’re auditing a section of code for a well known class of bugs. That work can be parallelized very easily across a codebase.
1
0
3
749
Sean Heelan @seanhn
Conventionally, if you want to test if an LLM can find a bug where the root cause is a memcpy into a statically sized stack buffer, you would not put exactly that in the prompt as an example.
[2 images]
Stanislav Fort @stanislavfort

New post: We show that small, cheap models can detect the flagship Mythos FreeBSD zero-day (CVE-2026-4747) using a simple harness we call nano-analyzer. Models down to 3.6B active params (including open-weights ones you can run locally) would have detected it 100-1000x cheaper.

7
20
180
28.9K
Sam retweeted
Brian Lui @brianluidog
You, a bad forecaster: "I thought there was no demand for AI, but Anthropic is making billions and still can't meet demand. I'm wrong." Ed Zitron, a great forecaster: "I thought there was no demand for AI, but Anthropic is making billions and still can't meet demand. I'm right."
3
3
172
10K
Sam retweeted
roon @tszzl
i don’t see how not to be a panpsychist. don’t want to be one but seems unavoidable
243
44
1K
76.4K
Sam @perceptions420
"Twitter isn't real life". No shit. You're given access to the top 10 percentile of people in any given society actually capable of making an impact in their environment and the neuroses that come with this trait. If anything this makes it far more meaningful.
0
0
2
38
Sam retweeted
Brendan Dolan-Gavitt
One note on costs: yes, $20k is a decent chunk of money. But also, $20k of fuzzing may not find these issues! Anthropic found high severity issues in oss-fuzz projects that have had a ludicrous amount of fuzzing compute spent on them.
[1 image]
6
4
62
5.7K