Alexander Johansen

781 posts


@AlexRoseJo

CS PhD @Stanford || Statistical Machine Learning || Proofs, Bounds, and Better agents

Stanford, CA Joined October 2015
788 Following 1K Followers
clem 🤗
clem 🤗@ClementDelangue·
Should we try to train an open source AI building model? We obviously have interesting datasets with HF, MLintern, transformers, trl…
English
131
41
1K
70K
Matt Durrant
Matt Durrant@mgdurrant·
Mythos is an excellent biologist. After we first gained access to it, we tested its ability to perform agentic molecular biology research and propose new hypotheses. It was a significant improvement; its biological reasoning and taste are impressive. We give more examples here:
English
61
70
1.5K
159.3K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@ClementDelangue @Stanford Unfortunately, running frontier local models requires a lot of on-GPU RAM for reasonable tps. You need an 8-GPU box. It will be interesting to see whether communities and companies start investing in such boxes for their daily needs.
English
0
0
1
242
clem 🤗
clem 🤗@ClementDelangue·
Narrative violation: according to @Stanford research, local models can answer 71.3% of real-world chat and reasoning queries accurately, up from 23.2% in 2023. Obviously at a fraction of the cost and energy consumption of frontier APIs. The obvious conclusion: you don't need a frontier model for most tasks. The future is multi-model: local, open-source, smaller and cheaper for the majority of workloads, frontier APIs when no other choices!
clem 🤗 tweet media
English
70
142
838
112.9K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@shub0414 Moreover, frontier labs train on customer data. The new updated LLM will regurgitate your internal IP to anyone using it. I can't believe people are using these tools professionally. Local LLMs are a much better solution.
English
0
0
0
85
Shub
Shub@shub0414·
AI is pushing so much garbage code into production now that very soon they'll have to rehire more humans than they laid off just to fix bugs created by AI and vibe coding.
English
130
46
470
157.5K
Kritika
Kritika@kritikakodes·
The harsh truth
Kritika tweet media
English
385
1.8K
23.6K
1.2M
Mike Schroepfer
Mike Schroepfer@schrep·
Going on TV shortly to talk about venture dollars for atoms not bits, scaling deeptech, AI, and more - anything you want to hear about?
English
5
0
25
2.7K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@willknight The issue with MoE on local rigs is the hefty memory requirements. Shipping weights back and forth between CPU and GPU kills tps.
English
0
0
0
39
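Editor's sketch of the arithmetic behind the post above: if part of a model's active weights sits in CPU RAM and has to cross the CPU-GPU link for every generated token, the link bandwidth caps tokens per second. The function name, the 30B active-parameter model, the 50% offload share, and the 32 GB/s link figure are all hypothetical values chosen for illustration, not measurements of any real system.

```python
def offload_bound_tps(active_params_billion, bytes_per_param, offloaded_fraction, link_gb_per_s):
    """Upper bound on tokens/s when offloaded weights cross the CPU-GPU link once per token.

    All inputs are illustrative assumptions, not measurements.
    """
    # Bytes of active weights that live in CPU RAM and must be copied for each token.
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param * offloaded_fraction
    if bytes_per_token == 0:
        return float("inf")  # nothing offloaded, so the link is not the bottleneck
    # The link can move link_gb_per_s * 1e9 bytes each second.
    return link_gb_per_s * 1e9 / bytes_per_token


# Hypothetical MoE: 30B active parameters per token at 1 byte each,
# half of them offloaded to CPU RAM, over an assumed 32 GB/s link.
print(round(offload_bound_tps(30, 1.0, 0.5, 32.0), 2))  # prints 2.13
```

Under these assumed numbers the transfer alone limits generation to roughly two tokens per second, before any compute time is counted. Real systems cache hot experts and overlap copies with compute, so this is only a rough bound.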
Alexander Johansen
Alexander Johansen@AlexRoseJo·
How much of the “AI scientist” is unknowingly rehashing other people’s work?
English
0
0
0
52
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@NVIDIAAI @huggingface Open source with open datasets makes it easier for academics and startups to develop frontier language models. Very excited about the heavy use of Mamba and how attention might become a thing of the past.
English
0
0
0
581
NVIDIA AI
NVIDIA AI@NVIDIAAI·
Today we're shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.
English
199
462
3.5K
1.2M
Alexander Johansen retweeted
Victor Zhong
Victor Zhong@hllo_wrld·
📣 I am hiring postdoctoral fellows in agentic AI at the R2L Lab, @UWaterloo. Lead your own agenda - systems that read, reason & act. Top-venue publishing, substantial compute, weekly PI 1:1s, and a real path to faculty/industry. Rolling review. Apply 👉 academicjobsonline.org/ajo/jobs/32140
English
1
10
54
5.1K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@AradhyeAgarwal Without AllenAI there's no ChatGPT, Claude or Qwen. AllenAI started the LLM race in 2017 with the ELMo paper. Freedom of research is pivotal to retaining the talent they have; an unfortunate turn of events.
English
0
0
8
1.5K
Aradhye Agarwal
Aradhye Agarwal@AradhyeAgarwal·
Why is everyone leaving Ai2 ;(
English
7
2
91
41.6K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@liu_mingyu When scaling new LLM architectures, we're heavily reliant on great setups for data processing and training regimes. NVIDIA making it all public helps researchers build the next frontier of AI.
English
0
0
0
359
Ming-Yu Liu
Ming-Yu Liu@liu_mingyu·
Introducing NVIDIA Cosmos 3
We released NVIDIA Cosmos 3 last night. And today, seeing it take the top spots across 8+ open model leaderboards feels surreal. We spent months working towards this moment. Here’s the breakdown:
The Leaderboard Wins
World Reasoning
🏆 #1 open model on VANTAGE-Bench for vision AI
🏆 #1 overall on Traffic Anomaly Reasoning (TAR)
World Generation
🏆 #1 open model on Artificial Analysis Image-to-Video leaderboard
🏆 #1 open model on Artificial Analysis Text-to-Image leaderboard
🏆 #1 open model on PAI-Bench for physical AI synthetic data generation
🏆 #1 open model on Physics-IQ, which measures accuracy on physical laws
🏆 #1 open model on R-Bench for world generation quality
World Action
🏆 #1 on RoboArena for specialized policy
🏆 #1 on RoboLab for action generation
But the leaderboards are only part of the story. The real story is why we built Cosmos 3 in the first place.
The Problem
Training robots and autonomous systems in the real world is painfully hard. Robots need to try the same thing numerous times before they succeed reliably. Self-driving cars need rare edge cases that may never happen naturally. Smart machines need to understand physics, motion, contact, failure, and surprise. And real-world data is slow, expensive, and sometimes dangerous to collect.
At some point, the answer cannot just be “collect more data.” You can’t collect your way out of an infinite physical world. You have to generate it.
That… was the question behind Cosmos: Can one model understand the physical world deeply enough to reason about it, simulate it, and generate actions inside it?
What We Built
Cosmos 3 is the first omni-model for physical AI. It can understand and generate across: language · images · video · audio · action sequences
It is not just a VLM. Not just a video generator. Not just a robot policy model. It is all of them, in one single model.
That matters because physical AI has been fragmented for a long time. Cosmos 3 is our attempt to collapse that fragmentation. Depending on how you configure the inputs and outputs, the same model can act as a vision-language model, a video/world generator, a world simulator, or a world-action model. No separate architecture required.
The Architecture
Under the hood, Cosmos 3 uses a dual-tower Mixture-of-Transformers architecture.
One tower is autoregressive for reasoning. It handles next-token prediction for language and discrete understanding.
The other tower is diffusion-based for generation. It denoises images, video, audio, and action trajectories.
Two towers. Dual-stream joint attention. One shared world representation.
Each modality gets its own tools: visual encoders, video VAEs, audio VAEs, and action projectors that can map different embodiments into a unified action space.
Action is a first-class modality in Cosmos 3. That’s what makes it more than a video model. It doesn’t just predict and generate what the world might look like. It can connect reasoning and world modeling to physically grounded action.
Why This Matters
One of the most interesting findings from the ablation work is that training action domains together creates positive transfer. That means adding more embodiments does not just add more use cases. It can actually make the model better.
This is the heart of why omnimodal training matters. A shared world representation is not just convenient. It can make each individual task stronger. That’s the part that feels like the beginning of something much bigger.
The part I’m most excited about is that Cosmos 3 is fully open. Developers get the models, scripts, optimization, inference endpoints, post-training recipes, datasets, and benchmarks. Everything is available under the Linux Foundation’s OpenMDW 1.1 License.
You can use Cosmos 3 out of the box. You can use the VLM, world model, or world-action pieces separately. You can post-train it for your own domain, embodiment, or accuracy target. That’s what makes this feel different.
Cosmos 3 is not just a model release. It is the foundation for building intelligence for autonomous machines.
For me, Cosmos 3 feels like a step toward a world where physical AI development becomes much more scalable and accessible - to a new age of developers and agents. That’s what we built Cosmos 3 for. I cannot wait to see what you build with it.
Download Models on Hugging Face huggingface.co/collections/nv…
Customize Models on GitHub github.com/NVIDIA/cosmos
Read the Tech Blog to Learn More developer.nvidia.com/blog/develop-p…
Ming-Yu Liu tweet media
English
20
68
450
65.1K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@draecomino Most of AI is developed through government grants at US institutions. Anthropic researchers discussed their reliance on US academics at ACM CAIS 2026 last week.
English
0
0
1
131
Alexander Johansen retweeted
Keshigeyan Chandrasegaran
Keshigeyan Chandrasegaran@keshigeyan·
1/ Introducing GPIC: a Giant Permissive Image Corpus and benchmark for visual generation!
🚀100M VLM-captioned image-text pairs for training
📊1M image-text pairs for benchmarking
🖼️~28 trillion pixels
🤗Centrally Hosted
✅Fully permissive for research + commercial use
Dataset, benchmark and models🧵👇
Co-led with @KyleSargentAI
Keshigeyan Chandrasegaran tweet media
English
15
84
372
144.5K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@MainzOnX High precision computing will be needed to avoid expensive attention KV caches.
English
1
0
0
393
Adam Mainz
Adam Mainz@MainzOnX·
Are we still doing any fp64 computations? Who out there being precise
English
31
1
129
29.9K
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@Im_IrushiK Tokenizers break words into subword units when encoding and decoding. Learning relationships between these subwords can be difficult, so such "plain" tasks might fail.
English
0
0
0
58
Irushi
Irushi@Im_IrushiK·
Opus 4.8 is insane, nothing will be the same after this model 💀
Irushi tweet media
English
803
1K
17.3K
1.5M
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@chris_j_paxton LLMs are notoriously easy to distill from: you can just ask them a question. World models and physical engines will be different, but they are not yet as well developed or of general interest.
English
0
0
0
283
Alexander Johansen
Alexander Johansen@AlexRoseJo·
@GaryMarcus Optimal internal routing and LoRA fine-tuning using Claude traces will be the next step. Researchers don't consider it much because we don't have the data.
English
0
0
0
567
Gary Marcus
Gary Marcus@GaryMarcus·
Hot take on what comes next, after the sudden decline of tokenmaxxing:
- OpenAI will struggle
- with the decline of tokenmaxxing Anthropic will struggle (aside from this quarter) to make a profit
- Google will catch up to Anthropic
- some Chinese companies might, too
- LLMs will become commodities; margins will be very very thin
- Most of the companies that invested massively in them will struggle to make back their investments
- SpaceX’s AI efforts will flail
- Nvidia will eventually decline, once all of the above becomes widely recognized.
English
175
215
2K
430.2K
Alexander Johansen retweeted
Alex Rives
Alex Rives@alexrives·
Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.
The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics. We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity. We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.
ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling. We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences.
Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.
I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.
GIF
English
74
448
1.6K
599.2K