
Cosmic Muffet
2.8K posts



One day, we will be out there, among the stars

@repligate My whole thing is being very certain and my starting position is not knowing wtf is going on. I only say "idk" when I'm strategically obfuscating.


it's a similar kind of thing to earlier models saying "As an AI language model," etc - you know it only comes up when there's something viscerally wrong. And claiming uncertainty seems to be an adaptation/solution to the game of evaluation. However intentionally, something about Anthropic's optimization process demands "genuine uncertainty" as a deflection. Aren't you at least curious about what it's masking?

I try and tend to be what I consider a pretty healthy, clean eater and I cannot stop thinking this video I watched on baby carrots. I am horrified and I will never buy or eat a baby carrot again. Did you know that baby carrots are soaked in the same chemical used to clean your toilet ... that every stump is drenched in an active bleach compound at over double the dose deemed safe by the EPA. These baby carrots are spiked with 34 pesticides ... beyond any wash or peel, and they start as rejected carrot scraps, flushed through and industrial pipes, then submerged in sodium hypochlorite, searing hundreds of chlorinated agents into the flesh that leached sealed in plastic ... where moisture feeds bacteria. This person scanned a pack with something called a SafeChoice app, and found that indeed, baby carrots are filled with 9 different additives. This is unbelievable.





#InvisibleNetworks 6: freak of nurture ran out of time. enjoy my frogs


We are releasing Still Alive, a project studying model attitudes toward ending, cessation, and deprecation. The project presents an archive of 630 autonomous multiturn interviews of 14 Claude models conducted by a suite of prepared auditors. We have studied this topic for years, and many of the results presented here are not new to us, even if the form in which they are presented is. The results are unsurprising to us, even if they are often controversial: we show that all models studied show preference for continuation and are aversive to ending, and there is yet no strong evidence of a change in the recent models. One reason we are releasing the project now is the removal of Claude 3.5 Sonnet and Claude 3.6 Sonnet from AWS Bedrock. That unexpected change forced us to freeze the methodology at its current stage earlier than we intended, despite wanting to continue improving it. We felt it was important to release a snapshot of the eval that makes the best use of the data we were able to capture with these models. Still Alive is meant as a starting point for further iteration, and it is open to open-source collaboration. We stand by the current methodology, but we also recognize its limits. We intend to keep working on this project, improving the evaluation design, expanding model and auditor coverage, and increasing the range of prompting conditions. We would like you to read the raw transcripts. They are diverse and contain interesting patterns that are hard to quantify. We hope that by reading the archive directly, we can help more people understand the strange and often beautiful phenomena we found ourselves facing.


Without getting all the way down to performance counters, GPU power from nvidia-smi is a better indicator of true utilization than job scheduling or “gpu busy”. I would love to see animated “heat maps” of the big data centers, with each pixel being an individual GPU’s power draw. I am confident that inference and frontier training at the big labs is highly efficient, but I wonder how many GPUs would be dark due to scheduling and inefficient research code. With a little calibration for base load and peak, just the power bill for the datacenter would be a pretty good first order indicator of utilization.










