Alexander Lavin

1.7K posts

Alexander Lavin

@AlexLavin_C137

AI ∩ physics → *Simulation Intelligence* founder @simai4science + Digital Twin Earth @nasa @fdl_ai + 🦾 🛰 🧠 @vicariousai @numenta @cmu_robotics @cornelleng

simulation.science 👀 Joined October 2012
984 Following · 4.4K Followers
Alexander Lavin retweeted
Felix Koehler @felix_m_koehler
Are you interested in Differentiable Programming/Physics & Autodiff? @SimAI4Science is currently hosting the first Tesseract hackathon. Participate, build cool stuff and win prizes! Find more details here: pasteurlabs.ai/tesseract-hack… Submission Deadline: Jan 5 2026
0 replies · 1 repost · 12 likes · 891 views
Gill Verdon @GillVerd
We have gone from 0 to 1 for thermodynamic computing. Now it is time to scale the paradigm. Excited for what the future holds.
143 replies · 226 reposts · 2.7K likes · 252.3K views
Alexander Lavin retweeted
Ben Recht @beenwrekt
Revisiting last week’s open problems scandal, I wrote about LLMs as Lore Laundering Machines and why some are blind to the novelty whitewashing. argmin.net/p/lore-launder…
3 replies · 16 reposts · 73 likes · 15.2K views
Alexander Lavin retweeted
rohit @krishnanrohit
. @karpathy pours a little bit of cold water on the AI hype
27 replies · 48 reposts · 607 likes · 189.7K views
Alexander Lavin retweeted
David Pfau @pfau
Imagine trying to explain this image to someone at NIPS in 2010.
Charbel-Raphael @CRSegerie

Huge: Almost all members of the UN Security Council are in favor of AI regulation or setting red lines. Never before had the principle of red lines for AI been discussed so openly and at such a high diplomatic level.

UN Secretary-General Antonio Guterres opened the session with a firm call to action for red lines:
• “a ban on lethal autonomous weapons systems operating without human control, with [...] a legally binding instrument by next year”
• “the need to ensure that AI never lowers the barriers to acquiring or deploying prohibited weapons”

Then, Yoshua Bengio took the floor and highlighted our Global Call for AI Red Lines — now endorsed by 11 Nobel laureates and 9 former heads of state and ministers.

Almost all countries were favorable to some red lines:
China: “It’s essential to ensure that AI remains under human control and to prevent the emergence of lethal autonomous weapons that operate without human intervention.”
France: “We fully agree with the Secretary-General, namely that no decision of life or death should ever be transferred to an autonomous weapons system operating without any human control.”

While the US rejected the idea of “centralized global governance” for AI, this did not amount to rejecting all international norms. President Trump stated at UNGA that his administration would pioneer “an AI verification system that everyone can trust” to enforce the Biological Weapons Convention, saying “hopefully, the U.N. can play a constructive role.”

6 replies · 18 reposts · 303 likes · 30K views
Alexander Lavin retweeted
Horace He @cHHillee
I quite enjoyed this and it covers a bunch of topics without good introductory resources!
1. A bunch of GPU hardware details in one place (warp schedulers, shared memory, etc.)
2. A breakdown/walkthrough of reading PTX and SASS.
3. Some details/walkthroughs of a number of other hardware features (TMAs, wgmmas, etc.)
Aleksa Gordić (水平问题) @gordic_aleksa

New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high performance matmul kernels". If you want to deeply understand how one writes state-of-the-art matmul kernels in CUDA, read along. (Remember matmul is the single most important operation that transformers execute, both during training and inference. Most of NVIDIA compute is spent on it. Gaining 1% in efficiency translates to massive savings, on the order of many nuclear reactors :P)

I, yet again, realized I underestimated the effort. 😅 Here is one more booklet (lol). 47 figures! I covered:
* The fundamentals of the GPU architecture with an emphasis on the memory hierarchy, building mental models for GMEM, SMEM, and L1/L2, and then connecting them to the CUDA programming model. Along the way we also looked at the "speed of light," how it's bounded by power, with hardware reality leaking into our model.
* PTX/SASS, and how to steer the compiler into generating what we actually want (is that loop being unrolled, are we using vectorized loads like LDG.128, etc.). I've annotated one PTX/SASS example for a simple matmul kernel in excruciating detail. Even if you're new to compilers you should find this useful. (I actually found various inefficiencies in both compilers - fun!)
* Many core concepts such as tile/wave quantization, occupancy, ILP (instruction-level parallelism), the roofline model, etc. Also building intuition around fundamental equivalences: dot product as a sum of partial outer products, why square tiles are the right shape for high arithmetic intensity, etc.
* The warp tiling method - which is near SOTA assuming you can't use tensor cores, TMA, async mem instructions, and bf16. Just maximizing the GPU's performance using nothing but CUDA cores, registers, and shared memory.
* Finally, we step into Hopper (H100): TMA, swizzling, tensor cores and the wgmma instruction, async load/store pipelines, scheduling policies like Hilbert curves, clusters with TMA multicast, faster PTX barriers, and more.

As always lots of examples, lots of visuals. This is the first time I could see the warp tiling kernel and be like "oh I get it completely". I just needed my mental image transformed into an actual image.

A few years ago I was really inspired by @Si_Boehm's excellent blog post on how matmul works, but I also found it had several errors, some unclear explanations, and it was quite outdated. Building on @pranjalssh's amazing work (who did a great job building SOTA kernels for H100) and my own research, this is the final result.

---

Again a huge thank you to @Hyperstackcloud (GPU cloud) for giving me an H100 (PCIe) node to run some of the experiments and analysis that I needed to write this up. Also a big thank you to my friends Aroun (who did a very thorough review of the post; Aroun's doing cool GPU/AI stuff at Magic and was previously GPU architect at Apple and Imagine, he's one of the best GPU people I know and we worked together on llm.c w/ @karpathy) and the amazing @marksaroufim (PyTorch), for taking the time during the weekend when they didn't have to. :)

8 replies · 92 reposts · 974 likes · 99.6K views
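One intuition from the post above, that square output tiles maximize arithmetic intensity, can be checked with a quick back-of-envelope sketch. This is a generic Python illustration, not code from the blog post; the tile sizes and the 2-byte (bf16) element size are arbitrary example values.

```python
# Back-of-envelope arithmetic intensity (FLOPs per byte) of one output tile
# in a tiled matmul C = A @ B: producing a (tm x tn) tile multiplies a
# (tm x k) slice of A by a (k x tn) slice of B, i.e. 2*tm*tn*k FLOPs,
# while loading (tm + tn)*k elements from memory.

def tile_arithmetic_intensity(tm, tn, k, bytes_per_elem=2):
    flops = 2 * tm * tn * k
    bytes_moved = (tm + tn) * k * bytes_per_elem
    return flops / bytes_moved

# For a fixed tile area tm*tn, the ratio 2*tm*tn / ((tm + tn) * bytes_per_elem)
# is largest when tm == tn, i.e. for square tiles:
print(tile_arithmetic_intensity(128, 128, 64))  # square tile        -> 64.0 FLOPs/byte
print(tile_arithmetic_intensity(256, 64, 64))   # same area, skewed  -> 51.2 FLOPs/byte
```

Note that k cancels out of the ratio, so the intensity depends only on the tile's shape, which is the "square tiles" argument in a nutshell.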
David Pfau @pfau
I miss finding and discussing machine learning papers on this website.
27 replies · 17 reposts · 488 likes · 26.5K views
Alexander Lavin @AlexLavin_C137
Customers, partners, and Pasteurians use Tesseracts for end-to-end differentiable pipelines consisting of wildly different components: physical simulators, geometric operators, differentiable meshers / renderers, CAD + CFD, scientific data transforms, AI & multiphysics ML models.
0 replies · 0 reposts · 1 like · 222 views
Alexander Lavin @AlexLavin_C137
Now scientists & engineers can wrap complex scientific and ML code into containerized, self-documenting, differentiable functions—making them easier to use, compose, serve, share, and deploy 🚀 docs.pasteurlabs.ai/projects/tesse…
1 reply · 0 reposts · 2 likes · 239 views
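The composability described in the two posts above can be sketched generically with JAX autodiff. This is a minimal illustration of end-to-end differentiation through heterogeneous components, not the Tesseract API; the component functions and their physics are toy stand-ins made up for the example.

```python
import jax
import jax.numpy as jnp

# Toy stand-in for a geometric operator (e.g. a differentiable mesher).
def geometry_op(params):
    return jnp.stack([params, params ** 2])

# Toy stand-in for a physical simulator returning a scalar quantity of interest.
def physics_sim(fields):
    return jnp.sum(jnp.sin(fields))

# Composing the components yields one end-to-end differentiable pipeline.
def pipeline(params):
    return physics_sim(geometry_op(params))

params = jnp.ones(8)
value = pipeline(params)
grads = jax.grad(pipeline)(params)  # gradients flow through both components
```

Once each component exposes a differentiable interface, the autodiff engine can propagate sensitivities across the whole chain, which is what makes gradient-based design optimization and calibration over such pipelines practical.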
Alexander Lavin @AlexLavin_C137
Today we launched @SimAI4Science "Insights": pasteurlabs.ai/insights where we’ll share perspectives and learnings on all things Simulation Intelligence—from technical posts on autodiff and applied physics, to dialogues on frontier-tech startups and philosophy of science.
0 replies · 3 reposts · 6 likes · 775 views