
As this piece takes off, some clarification on *why* I think it’s concerning. It’s not that o1 is “evil and trying to escape”. It’s that the paper models may, and sometimes do, try to self-exfiltrate to avoid shutdown even when we don’t want that to happen. That seems … bad?






