
Soham Mehta
198 posts

@SohamThoughts
Research Fellow @JoinFAI | prev @NewYorkStateAG @Columbia @CFPB @knightcolumbia | Views, however unoriginal and derivative, my own



Can't believe we're doubling down on "it's just a stupid stochastic parrot" when OpenAI disproved a longstanding conjecture in discrete geometry less than 24 hours ago. We need more Boreses, and less of whatever this is.

Today, Matthew Scherer argues that the most pressing AI-driven crisis is the overestimation of AI’s capabilities and impacts, which has produced a historically large speculative AI bubble.


Building trades unions have long been considered a voice of the American worker. Now, they're intertwined with the richest companies in the world as they create America’s artificial intelligence economy. apnews.com/article/artifi…






1/5 Everyone who's seen @_panthalassa's wave-powered data centers agrees that the tech is incredible. But just as revolutionary is what it does to 400 years of legal thinking about the ocean. The law has always seen the ocean as space you move through or extract from, never a space where you produce things. Ocean compute changes that. If the ocean becomes the site of AI's industrial base, the consequences for great power competition and AI governance are enormous. A 🧵 on my recent analysis:

These are great steps! Here are 8 other things we could do:
1. Congress should fund CAISI at ~$80 million instead of $10 million. That higher figure is our internal estimate of what it'd take for CAISI to actually fulfill the purposes laid out in the AI Action Plan and other Trump admin directives.
2. The NSA, CAISI + others should plan for the moment when >Mythos-class models are distilled or trained in China, and make a real effort in preemptive cyberdefense. We called this last year, and have some ideas on what to do (ifp.org/operation-patc…, ifp.org/the-great-refa…, ifp.org/preventing-ai-…)
3. OSTP and NSC should coordinate building RAND-style SL-4/SL-5 security for frontier model weights. Distillation is one way to get somewhat capable models, but stealing model weights gets you the best model, and it's completely doable for well-resourced state-backed actors. The weights themselves are the crown jewels, and most labs aren't close to being able to defend them! We'll train a 10x Mythos soon, and when we do, we'll wish we had a secure environment to run it in. (More implementation details here: ifp.org/a-sprint-towar…)
4. Relatedly, fund + help staff an insider-threat / counter-intel program for frontier labs. It is much harder to protect model weights if adversarial people have privileged access.
5. The White House should direct Commerce/BIS to strengthen AI chip and SME export controls to adversarial countries, so that even if cyber-capable models are distilled or stolen, they can't be deployed at scale on American chips. China has huge domestic production bottlenecks (ifp.org/the-b30a-decis…), so exporting fewer chips makes a difference, pound for pound.
6. And because smuggling is still a problem, we should also be deploying chip security measures like privacy-preserving country-level location verification, which will allow us to export more chips to semi-trusted countries while verifying that they're not being smuggled to adversarial ones (more: rebuilding.tech/posts/conditio…). There is also more AI verification work to be done to enable more mutually beneficial trade without national security downsides (ifp.org/faster-ai-diff…).
7. On top of funding CAISI, we should direct it to run pre-deployment evals for CBRN and cyber uplift on a classified track. You can't hold adversaries accountable for abusing US models if we don't systematically measure what those models can do in the first place.
8. The NSC, NSA and CAISI should write the emergency-response playbook for the day a Mythos-class weight leak is confirmed, or distillation is successful. Who does what, in what order?
To be in a good place, we should've started years ago. But it'll only get more urgent with each passing month. Compute stock is growing 3.4x/year; LLM inference prices are declining 40x/year for a fixed level of capability; and software progress is so fast that the pre-training compute we need to reach a given capability is 3 times lower each passing year (epoch.ai)...
These are just some ideas for government, related to distillation and model weight theft. Philanthropy and the private sector have big roles to play as well. We have so much work to do!


1/ GPU smuggling is already big business, and GPU pirates on the high seas may be next. 🧵










