Alexander Johansen

781 posts

@AlexRoseJo

CS PhD @Stanford || Statistical Machine Learning || Proofs, Bounds, and Better agents

Stanford, CA · Joined October 2015

788 Following · 1K Followers

Pinned Tweet

Alexander Johansen@AlexRoseJo·11 May

we removed the KV cache. no drop in retrieval. way more data-efficient training. just spectral koopman things. Spectral Koopman Attention (SKA) w/ @ASridhar5954 arxiv.org/abs/2605.06997

962

Alexander Johansen@AlexRoseJo·3d

@ClementDelangue Or sponsor some students to do it

280

clem 🤗@ClementDelangue·3d

Should we try to train an open source AI building model? We obviously have interesting datasets with HF, MLintern, transformers, trl…

131

70K

Alexander Johansen@AlexRoseJo·4d

@anshulkundaje @mgdurrant I just cancelled my Claude subscription. Time to go open source.

Anshul Kundaje@anshulkundaje·4d

@mgdurrant Sounds great. Unfortunately I can't get it to answer totally harmless questions. Can you all please better calibrate it on safeguards. x.com/anshulkundaje/…

Anshul Kundaje@anshulkundaje

Tried to ask Fable questions about interpretation methods for deep learning sequence models and it decided to default to Opus cuz apparently this question triggers safeguards. Seriously. I get all the importance of safeguards but this model seems very uncalibrated.

100

19.5K

Matt Durrant@mgdurrant·4d

Mythos is an excellent biologist. After we first gained access to it, we tested its ability to perform agentic molecular biology research and propose new hypotheses. It was a significant improvement; its biological reasoning and taste are impressive. We give more examples here:

1.5K

159.3K

Alexander Johansen@AlexRoseJo·5d

@ClementDelangue @Stanford Unfortunately, running frontier local models requires a lot of on-GPU RAM for reasonable tps. You need an 8-GPU box. It will be interesting to see whether communities and companies start investing in such machines for their daily needs.

242

clem 🤗@ClementDelangue·5d

Narrative violation: according to @Stanford research, local models can answer 71.3% of real-world chat and reasoning queries accurately, up from 23.2% in 2023. Obviously at a fraction of the cost and energy consumption of frontier APIs. The obvious conclusion: you don't need a frontier model for most tasks. The future is multi-model: local, open-source, smaller and cheaper for the majority of workloads, frontier APIs when no other choices!

142

838

112.9K

Alexander Johansen@AlexRoseJo·6d

@shub0414 Moreover, frontier labs train on customer data. The new updated LLM will regurgitate your internal IP to anyone using it. I can't believe people are using these tools professionally. Local LLMs are a much better solution.

Shub@shub0414·6 Jun

AI is pushing so much garbage code into production now that very soon they'll have to rehire more humans than they laid off just to fix bugs created by AI and vibe coding.

130

470

157.5K

Alexander Johansen@AlexRoseJo·6d

@kritikakodes Now measure the diversity in those ideas. LLMs regurgitate the same content to everyone. Research shows that out of 4000 ideas only 200 are novel and it's a logarithmic scale. arxiv.org/abs/2409.04109

Kritika@kritikakodes·7 Jun

The harsh truth

385

1.8K

23.6K

1.2M

Alexander Johansen@AlexRoseJo·7 Jun

@schrep 6G infrastructure

Mike Schroepfer@schrep·5 Jun

Going on TV shortly to talk about venture dollars for atoms not bits, scaling deeptech, AI, and more - anything you want to hear about?

2.7K

Alexander Johansen@AlexRoseJo·7 Jun

@willknight The issue with MoE on local rigs is the hefty memory requirement. Shipping weights back and forth between CPU and GPU kills tps.

Will Knight@willknight·5 Jun

Great news…Now ship me a machine that can run it!

NVIDIA AI@NVIDIAAI

Today we're shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.
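
The memory concern raised in the reply above can be sketched with back-of-the-envelope arithmetic. This is a rough estimate, not a measurement: the 550B parameter count comes from the announcement, while the bytes-per-parameter values and the 80 GB per-GPU figure are assumptions chosen for illustration. It counts weights only and ignores activations, KV cache, and runtime overhead.

```python
# Back-of-the-envelope weight memory for a 550B-parameter model.
# Assumptions (not from the thread): the bytes-per-parameter table below
# and a hypothetical 80 GB of memory per GPU.
import math

PARAMS = 550e9          # parameter count quoted in the announcement
GPU_MEMORY_GB = 80      # hypothetical per-GPU memory

BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def gpus_needed(params: float, bytes_per_param: float,
                gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Minimum GPU count to hold the weights, ignoring everything else."""
    return math.ceil(weight_memory_gb(params, bytes_per_param) / gpu_memory_gb)


for name, nbytes in BYTES_PER_PARAM.items():
    print(name, weight_memory_gb(PARAMS, nbytes), "GB ->",
          gpus_needed(PARAMS, nbytes), "GPUs")
```

In an MoE model only a subset of experts is active per token, but all expert weights still have to be resident somewhere, which is why the total parameter count, not the active count, drives this estimate.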

630

Alexander Johansen@AlexRoseJo·6 Jun

How much of the “AI scientist” is unknowingly rehashing other people’s work?

Alexander Johansen@AlexRoseJo·4 Jun

@NVIDIAAI @huggingface Open source with open datasets makes it easier for academics and startups to develop frontier language models. Very excited about the heavy use of Mamba and how attention might become a thing of the past.

581

NVIDIA AI@NVIDIAAI·4 Jun

199

462

3.5K

1.2M

Alexander Johansen retweeted

Victor Zhong@hllo_wrld·3 Jun

📣 I am hiring postdoctoral fellows in agentic AI at the R2L Lab, @UWaterloo. Lead your own agenda - systems that read, reason & act. Top-venue publishing, substantial compute, weekly PI 1:1s, and a real path to faculty/industry. Rolling review. Apply 👉 academicjobsonline.org/ajo/jobs/32140

5.1K

Alexander Johansen@AlexRoseJo·3 Jun

@AradhyeAgarwal Without AllenAI there's no ChatGPT, Claude or Qwen. AllenAI started the LLM race in 2017 with the ELMo paper. Freedom of research is pivotal to retaining the talent they have. An unfortunate turn of events.

1.5K

Aradhye Agarwal@AradhyeAgarwal·2 Jun

Why is everyone leaving Ai2 ;(

41.6K

Alexander Johansen@AlexRoseJo·2 Jun

@liu_mingyu When scaling new LLM architectures, we're heavily reliant on great setups for data processing and training regimes. NVIDIA making it all public helps researchers build the next frontier of AI.

359

Ming-Yu Liu@liu_mingyu·1 Jun

Introducing NVIDIA Cosmos 3

We released NVIDIA Cosmos 3 last night. And today, seeing it take the top spots across 8+ open model leaderboards feels surreal. We spent months working towards this moment. Here’s the breakdown:

The Leaderboard Wins

World Reasoning
🏆 #1 open model on VANTAGE-Bench for vision AI
🏆 #1 overall on Traffic Anomaly Reasoning (TAR)

World Generation
🏆 #1 open model on Artificial Analysis Image-to-Video leaderboard
🏆 #1 open model on Artificial Analysis Text-to-Image leaderboard
🏆 #1 open model on PAI-Bench for physical AI synthetic data generation
🏆 #1 open model on Physics-IQ, which measures accuracy on physical laws
🏆 #1 open model on R-Bench for world generation quality

World Action
🏆 #1 on RoboArena for specialized policy
🏆 #1 on RoboLab for action generation

But the leaderboards are only part of the story. The real story is why we built Cosmos 3 in the first place.

The Problem

Training robots and autonomous systems in the real world is painfully hard. Robots need to try the same thing numerous times before they succeed reliably. Self-driving cars need rare edge cases that may never happen naturally. Smart machines need to understand physics, motion, contact, failure, and surprise. And real-world data is slow, expensive, and sometimes dangerous to collect.

At some point, the answer cannot just be “collect more data.” You can’t collect your way out of an infinite physical world. You have to generate it.

That… was the question behind Cosmos: Can one model understand the physical world deeply enough to reason about it, simulate it, and generate actions inside it?

What We Built

Cosmos 3 is the first omni-model for physical AI. It can understand and generate across: language · images · video · audio · action sequences

It is not just a VLM. Not just a video generator. Not just a robot policy model. It is all of them, in one single model.

That matters because physical AI has been fragmented for a long time. Cosmos 3 is our attempt to collapse that fragmentation. Depending on how you configure the inputs and outputs, the same model can act as a vision-language model, a video/world generator, a world simulator, or a world-action model. No separate architecture required.

The Architecture

Under the hood, Cosmos 3 uses a dual-tower Mixture-of-Transformers architecture. One tower is autoregressive, for reasoning. It handles next-token prediction for language and discrete understanding. The other tower is diffusion-based, for generation. It denoises images, video, audio, and action trajectories.

Two towers. Dual-stream joint attention. One shared world representation.

Each modality gets its own tools: visual encoders, video VAEs, audio VAEs, and action projectors that can map different embodiments into a unified action space.

Action is a first-class modality in Cosmos 3. That’s what makes it more than a video model. It doesn’t just predict and generate what the world might look like. It can connect reasoning and world modeling to physically grounded action.

Why This Matters

One of the most interesting findings from the ablation work is that training action domains together creates positive transfer. That means adding more embodiments does not just add more use cases. It can actually make the model better.

This is the heart of why omnimodal training matters. A shared world representation is not just convenient. It can make each individual task stronger. That’s the part that feels like the beginning of something much bigger.

The part I’m most excited about is that Cosmos 3 is fully open. Developers get the models, scripts, optimization, inference endpoints, post-training recipes, datasets, and benchmarks. Everything is available under the Linux Foundation’s OpenMDW 1.1 License.

You can use Cosmos 3 out of the box. You can use the VLM, world model, or world-action pieces separately. You can post-train it for your own domain, embodiment, or accuracy target. That’s what makes this feel different.

Cosmos 3 is not just a model release. It is the foundation for building intelligence for autonomous machines. For me, Cosmos 3 feels like a step toward a world where physical AI development becomes much more scalable and accessible - to a new age of developers and agents. That’s what we built Cosmos 3 for. I cannot wait to see what you build with it.

Download Models on Hugging Face huggingface.co/collections/nv…
Customize Models on GitHub github.com/NVIDIA/cosmos
Read the Tech Blog to Learn More developer.nvidia.com/blog/develop-p…

450

65.1K

Alexander Johansen@AlexRoseJo·2 Jun

@draecomino Most of AI is developed through government grants at US institutions. Anthropic researchers discussed their reliance on US academics at ACM CAIS 2026 last week.

131

James Wang@draecomino·2 Jun

Why is there always such passion for redistributing things created but zero passion for creating things in the first place??

Bernie Sanders@BernieSanders

I will soon be introducing a bill to give the public a 50% ownership stake in the largest AI companies in America. This would guarantee that the trillions created by AI are used to improve the lives of all of us — and block oligarch decisions that harm the American people.

546

579

6.2K

249.4K

Alexander Johansen@AlexRoseJo·2 Jun

@willdepue Would be great with more funding for NSF ACCESS

103

Alexander Johansen retweeted

Keshigeyan Chandrasegaran@keshigeyan·29 May

1/ Introducing GPIC: a Giant Permissive Image Corpus and benchmark for visual generation!
🚀 100M VLM-captioned image-text pairs for training
📊 1M image-text pairs for benchmarking
🖼️ ~28 trillion pixels
🤗 Centrally Hosted
✅ Fully permissive for research + commercial use
Dataset, benchmark and models 🧵👇
Co-led with @KyleSargentAI

372

144.5K

Alexander Johansen@AlexRoseJo·31 May

@MainzOnX High precision computing will be needed to avoid expensive attention KV caches.

393

Adam Mainz@MainzOnX·30 May

Are we still doing any fp64 computations? Who out there being precise

129

29.9K

Alexander Johansen@AlexRoseJo·30 May

@Im_IrushiK Tokenizers break words into subunits when encoding and decoding. Learning relationships between these subwords can be difficult, so such a "plain" task might fail.
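
The point about subword units can be illustrated with a toy greedy longest-match tokenizer. This is only a sketch: the vocabulary is hypothetical and hand-picked, and real tokenizers (BPE, WordPiece, and similar) learn their vocabularies from data and use different matching rules. It shows how a word reaches the model as a few opaque pieces rather than as individual letters.

```python
# Toy greedy longest-match subword tokenizer.
# TOY_VOCAB is hypothetical and chosen only for illustration; real
# tokenizers learn their vocabularies from data.
TOY_VOCAB = {"straw", "berry", "token", "izer", "s"}


def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split `word` into the longest vocabulary pieces, left to right.

    Characters not covered by the vocabulary become single-character tokens.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until one matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # fallback: unknown single character
            i += 1
    return pieces


print(tokenize("strawberry", TOY_VOCAB))
print(tokenize("tokenizers", TOY_VOCAB))
```

With this vocabulary, "strawberry" arrives as two pieces, so a character-level question about it requires the model to have learned what letters each piece contains.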

Irushi@Im_IrushiK·29 May

Opus 4.8 is insane, nothing will be the same after this model 💀

803

17.3K

1.5M

Alexander Johansen@AlexRoseJo·29 May

@chris_j_paxton LLMs are notoriously easy to distill from - you can just ask them a question. World models and physical engines will be different, but they are not as well developed or of general interest yet.

283

Chris Paxton@chris_j_paxton·29 May

This seems completely wrong?

Vivek Sen@Vivek4real_

LARRY ELLISON: AI IS RAPIDLY COMMODITIZING BECAUSE MOST MODELS ARE TRAINED ON THE SAME PUBLIC INTERNET DATA. THE REAL COMPETITIVE EDGE ISN’T THE MODEL ANYMORE — IT’S ACCESS TO EXCLUSIVE, PROPRIETARY DATASETS. THAT MAY BE THE ONLY MOAT LEFT.

139

33.7K

Alexander Johansen@AlexRoseJo·29 May

@GaryMarcus Optimal internal routing and LoRA fine-tuning using Claude traces will be the next step. Researchers don't consider it much because we don't have the data.

567

Gary Marcus@GaryMarcus·29 May

Hot take on what comes next, after the sudden decline of tokenmaxxing:
- OpenAI will struggle
- with the decline of tokenmaxxing Anthropic will struggle (aside from this quarter) to make a profit
- Google will catch up to Anthropic
- some Chinese companies might, too
- LLMs will become commodities; margins will be very very thin
- Most of the companies that invested massively in them will struggle to make back their investments
- SpaceX’s AI efforts will flail
- Nvidia will eventually decline, once all of the above becomes widely recognized.

175

215

430.2K

Alexander Johansen retweeted

Alex Rives@alexrives·27 May

Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.

The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics. We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity. We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.

ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling. We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences.

Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.

I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.

448

1.6K

599.2K
