
Strata
8.8K posts

Strata
@ChainZenit
Ex-Geologist ⚒️ → AI Builder 🤖. Vibe-coding & Web3. Documenting the journey from dirt to bots. No fluff, just the build. Join me. 🧵 https://t.co/pdIXIpfR8C


New GDM interp research: SFT is a big deal for safety relevant behaviors. We recently investigated root causes for some of Gemini’s behaviors. We were surprised to find that many behaviors actually came from the initial supervised finetuning stage, not later stages like RL! 🧵

It was in fact Amazon (CEO Andy Jassy) who reportedly helped trigger the Claude shutdown. Via The Information Amazon CEO Andy Jassy reportedly warned senior Trump administration officials about security risks in Anthropic’s newest Claude models, helping trigger late-night export restrictions on Mythos 5 and Fable 5. "An Amazon spokesperson told The Information: “As a leading cloud provider that serves a large number of private and public sector customers, it’s not uncommon for governments to seek our counsel on potential security risks. When they occur, we don’t share the details of these discussions.”" In other words: Anthropic’s own mega-backer may have played a key role in pushing the government to freeze access to its most advanced models.





Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇




DAY 0 ALERT: @MiniMax_AI M3 is now available on HuggingFace & has been added to InferenceX. The M3 architecture has ~428B parameters and ~23B activated parameters. Due to the 10x engineers from @inferact, M3 is already delivering pretty well-optimized performance on @NVIDIAAI B300 Blackwell Ultra on Day 0 @vllm_project! Furthermore, Inferact released their EAGLE3 heads, which enable even greater performance. Looking forward to Day 1, 2, and 3 performance & the team is grinding on benchmarking Day 0 MI355X performance on InferenceX too.



Really excited to open source a new project: Omnigent, a meta-harness for AI agents. It lets you build multi-agent coding and custom agents, sitting above Claude Code, Codex, Pi, and agent SDKs to let you compose them. It also adds live collaboration and rich control policies.



I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true: — As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable. — Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.) — A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused. — In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.” — In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety. — In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community. — The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority. — Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.





