
ueaj
@_ueaj
Researcher @pangramlabs - https://t.co/LEcFvmxInz


We have spent £180m on plans for a tunnel under Stonehenge. The project is now scrapped. You can be for a tunnel & think spending is a good idea (even if you think the cost of planning is silly). You can be against a tunnel & think spending is a bad idea. But *nobody* can be for spending on this scale with zero result. And yet that is a peculiarly British outcome. Nobody will be reprimanded. Nobody will see their career affected. But that’s £180m of taxpayer money just wazzed up the wall. Totally without repercussions. Multiply this by airport expansions & train route plans and Thames crossings and power stations and other examples you can think of yourself, and… soon you’re talking serious money.


Sen̓áḵw Towers set to open 113 years after Squamish people forced from site vancouversun.com/news/local-new…





Breaking: Jeff Bezos is in talks to raise $100 billion for a new fund that would buy manufacturing companies and use AI to automate them wsj.com/tech/jeff-bezo…



🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵


Told you Trump is an EA


Girls trip be like


New post: "Sycophancy Towards Researchers Drives Performative Misalignment". We found no clear evidence that scheming is more valid than sycophancy to explain alignment faking. 🧵

Also, check out this train wreck of a spreadsheet Ed made to estimate Anthropic's revenue for 2025. He doesn't count February 1-10, counts March 1-10 twice, counts August 21-October 21 as one month instead of two, and doesn't count October 21-November 1.

Incorporating SFT data during pretraining is more effective for finetuning than the plain pretrain-then-finetune scheme, even when replay is used during finetuning. But the ratio of SFT data during pretraining should take the pretraining token budget into account. They built a scaling law for this.

seeing tweets on the tl and playing the fun game of "did they break up or is he just a shitty boyfriend"

He's crashing out on the timeline because that's his way of introspecting, you guys just don't get it

Weak ratio. Sad! The navelgazers must be distracted?




Introspection = neuroticism x narcissism x thumbsucking.



