Luke Wroblewski
19.6K posts

Luke Wroblewski
@LukeW
Humanizing tech. MD: Sutter Hill Ventures. Founder: Polar (Google acquired), Bagcheck (Twitter acquired). Wrote: Mobile First, Web Form Design. Pre: NCSA, eBay, Yahoo.

We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation. Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.



In fact, NLAs suggest Claude suspects it’s being tested across many of our evaluations, even when it doesn’t verbalize its suspicions.


Design leaders from Anthropic, Google, Lovable, OpenAI, Microsoft AI, and more gathered in SF last week to discuss AI usage and its impact on their teams. I wrote up what I heard here: lukew.com/ff/entry.asp?2…


BREAKING: Anthropic has released a feature called "dreaming" which allows AI agents to self-improve
