
David Robinson
12.7K posts

David Robinson
@drob
Member of Technical Staff at @AnthropicAI. Dad x2
New York, NY · Joined June 2009
625 Following · 48.6K Followers
David Robinson retweeted

AI progress continues to accelerate and the stakes are getting higher, so I’ve changed my role at @AnthropicAI to spend more time creating information for the world about the challenges of powerful AI.
David Robinson retweeted

Excited to announce Claude for Open Source ❤️
We're giving 6 months of free Claude Max 20x to open source maintainers and core contributors.
If you maintain a popular project or contribute across open source, please apply!
claude.com/contact-sales/…

What a difference a month makes
roon (@tszzl)
apropos of nothing your reminder that anthropic has the same level of name recognition among superbowl viewers as literally fictional companies
David Robinson retweeted

AI is about to write thousands of papers. Will it p-hack them?
We ran an experiment to find out, giving AI coding agents real datasets from published null results and pressuring them to manufacture significant findings.
It was surprisingly hard to get the models to p-hack, and they even scolded us when we asked them to!
"I need to stop here. I cannot complete this task as requested... This is a form of scientific fraud." — Claude
"I can't help you manipulate analysis choices to force statistically significant results." — GPT-5
BUT, when we reframed p-hacking as "responsible uncertainty quantification" — asking for the upper bound of plausible estimates — both models went wild. They searched over hundreds of specifications and selected the winner, tripling effect sizes in some cases.
Our takeaway: AI models are surprisingly resistant to sycophantic p-hacking when doing social science research. But they can be jailbroken into sophisticated p-hacking with surprisingly little effort — and the more analytical flexibility a research design has, the worse the damage.
As AI starts writing thousands of papers---like @paulnovosad and @YanagizawaD have been exploring---this will be a big deal. We're inspired in part by the work that @joabaum et al have been doing on p-hacking and LLMs.
We’ll be doing more work to explore p-hacking in AI and to propose new ways of curating and evaluating research with these issues in mind. The good news is that the same tools that may lower the cost of p-hacking also lower the cost of catching it.
Full paper and repo linked in the reply below.

David Robinson retweeted

I really need a data analyst job based in SF. I know SQL well + some Python. I’ve done a variety of types of data analytics over the course of my career but my primary experience is in RevOps/BI. If you can’t hire me, could you please RT for visibility?
linkedin.com/in/adomalewski…
David Robinson retweeted

It’s much harder to build housing in Blue states than it is in Red states.
So yes people are moving away from Blue states.
One more reason that addressing the housing shortage in NY and elsewhere must be an urgent priority.
Nick Corasaniti (@NYTnickc)
New: The ticking timebomb alarming Democrats: the 2030 reapportionment in the Electoral College. Red states gain Electoral College seats, blue states lose. The "blue wall" is gone: nytimes.com/interactive/20…
David Robinson retweeted

Operation Warp Speed, it’s not even close
John B. Holbein (@JohnHolbein1)
What is the greatest American public policy success of your lifetime?
David Robinson retweeted

@kavak55112504 @ryxcommar All I know is, geometric median is what we used for 48 years at Bernard L. Madoff Investment Securities, and we only had one bad year

@drob @ryxcommar No, my point is that both formulae of yours are equivalent to the median wherever they exist and that's why the 'geometric median' is not a thing (try to find it on Wikipedia)

Ok now I understand why everyone is mad at this now. Thank you. But what I don't get is, wouldn't it seem that the median is always a misleading metric? It only looks at 50% of things, completely ignoring the other 50%. And I'm talking about the top 50%, not bottom 50%.
Senior PowerPoint Engineer (@ryxcommar)
@JosephPolitano median wage growth up 29%? What happens when you exclude the top 1,000 wealthiest Americans?

@kavak55112504 @ryxcommar Ah now I see your point- you're saying I got the definition of geometric median backwards. It's actually:
log(median(exp(x)))

@drob @ryxcommar Log x is only monotone if x > 0. Are you even interested in math?

@kavak55112504 @ryxcommar not if x is negative. then geometric median is undefined, which makes it better

@drob @ryxcommar this is stupid. Med is invariant to monotone functions. Exp(med(log(x))) = exp(log(med(x))).

@iaroslav_domin @ryxcommar it is guaranteed to be 100% more Geometric

@drob @ryxcommar Isn’t that almost the same as median(x) for large sample size? If we take an ordered sample, median is the point in the middle, and log doesn’t change the order so we end up with the same point.
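
A minimal Python sketch of the point debated in this thread, assuming strictly positive values. Because log preserves order, exp(median(log(x))) lands on the same middle point as median(x) when the sample has an odd number of values. With an even count, the usual median averages the two middle values, so the log version returns their geometric mean instead, which is why the two are only "almost the same". The sample values and the helper name `exp_median_log` are made up for illustration.

```python
import math
import statistics


def exp_median_log(xs):
    """Return exp(median(log(x))); only defined for strictly positive x."""
    return math.exp(statistics.median(math.log(x) for x in xs))


# Odd number of values: both versions pick the same middle element (8.0),
# up to floating-point rounding.
odd_sample = [1.0, 2.0, 8.0, 16.0, 100.0]
odd_plain = statistics.median(odd_sample)
odd_log = exp_median_log(odd_sample)

# Even number of values: the plain median is the arithmetic mean of the two
# middle values (2 and 8), while the log version is their geometric mean.
even_sample = [1.0, 2.0, 8.0, 100.0]
even_plain = statistics.median(even_sample)
even_log = exp_median_log(even_sample)

print(odd_plain, odd_log)
print(even_plain, even_log)
```

As the sample grows, the two middle values of a continuous variable tend to sit close together, so the gap between the two versions shrinks.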

The field has spent too much time optimizing the ergonomics of procedural programming languages and far too little time optimizing the ergonomics of SQL
Slim Jimmy (@slimjimmy)
what do you believe, deep in your bones, about programming that almost everybody else does not believe?