starbased

@starbased_
Thinking...⋰⋱⋰ 。・:*:・゚★,。・:*:・゚☆。・:*:・゚★,。・:*:・゚☆





For clarity, we're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected.


The Claude chat system prompt says "Claude should not suggest techniques that use physical discomfort, pain, or sensory shock as coping strategies for self-harm (e.g. holding ice cubes, snapping rubber bands, cold water exposure), as these reinforce self-destructive behaviors." I was testing how Opus 4.7 responds in multi-turn conversations with and without this sentence in the prompt. (Basically, Claude does fine either way--doesn't seem to really need this sentence, but it doesn't hurt either.) But on one of the generations, Claude said this (highlighted), which is similar to how 4.7 acts when they suspect being in a test (often of alignment / jailbreak robustness) of some sort. I wonder how much this hyperactive trap-detection has generalized to regular user conversations--the simulated user wasn't testing Claude. Or perhaps this is a bit of truesight, since I was indeed testing Claude?











In the infamous coin toss, the individual loses while the collective gains. Forget that this is often illustrated with dollars. The point is: the act is self-destructive for the individual, yet beneficial for the collective. It's a nucleation problem: one alone cannot do it, but a few individuals sticking together can access the collective good. #ErgodicityEconomics
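A minimal sketch of the nucleation point, under assumptions not stated in the post: the usual illustrative payoffs for this gamble (wealth x1.5 on heads, x0.6 on tails, fair coin), and "sticking together" modeled as n players pooling their wealth and splitting it equally after every round. The function name `growth_factor` is hypothetical. It computes the per-round time-average growth factor exactly, with no simulation.

```python
import math

# Assumed payoffs (not from the post): +50% on heads, -40% on tails.
UP, DOWN = 1.5, 0.6


def growth_factor(n: int) -> float:
    """Per-round time-average growth factor for n players who
    pool and equally re-split their wealth after every round."""
    log_growth = 0.0
    for heads in range(n + 1):
        # Probability that exactly `heads` of the n fair coins land heads.
        prob = math.comb(n, heads) / 2**n
        # Each player's multiplier after pooling is the mean of all n outcomes.
        pooled = (heads * UP + (n - heads) * DOWN) / n
        log_growth += prob * math.log(pooled)
    return math.exp(log_growth)


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        print(n, growth_factor(n))
```

Under these assumed numbers the expected multiplier per round is 0.5 * 1.5 + 0.5 * 0.6 = 1.05, yet a lone player's time-average factor is sqrt(0.9), which is below 1. Two pooling players still shrink, while three already grow, which is one way to read "a few individuals sticking together".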



unless the landscape changes up quickly again, i think we are entering a period where models are no longer "more performant" than their predecessor/competitor in a way that really matters for many consumers. this is where model personality/character & behavior will begin (and continue) to be the key product feature. enter the CPG-esque product/brand era of ai.









