Sanjoy Das

6K posts

@_sanjoydas

AI Compiler Director @NVIDIA (Tile IR, XLA), 2x girldad

Los Altos, CA · Joined April 2009
609 Following · 1.2K Followers
Sanjoy Das retweeted
NVIDIA HPC Developer @NVIDIAHPCDev
🌅 BASIC is BACK! In response to overwhelming demand from seasoned developers everywhere, we’re releasing cuTile BASIC for GPUs, bringing CUDA Tile programming to this long-overlooked language. 🧵 👇
14 replies · 28 reposts · 189 likes · 11.3K views
Sanjoy Das retweeted
Bryce Adelstein Lelbach
Today, NVIDIA is launching the next paradigm shift in GPU programming: cuTile BASIC.
Write perf-portable BASIC kernels and deploy them at any scale, from edge inference devices like your calculator to entire GPU clusters.
We're going back to BASIC. developer.nvidia.com/blog/cuda-tile…
16 replies · 39 reposts · 324 likes · 26.9K views
Sanjoy Das retweeted
NVIDIA HPC Developer @NVIDIAHPCDev
🎉 CUDA 13.2 just dropped, and GPU programming just got simpler. This release expands CUDA Tile support to Ampere and Ada GPUs while delivering a stronger CUDA Python stack for cluster-scale workloads. What's new:
✅ Install cuTile Python directly from PyPI: pip install cuda-tile
✅ Enhanced CUDA Python profiling and debugging across Numba-CUDA flows and Nsight tools
✅ Modern CUDA C++ and refreshed math libraries optimized for AI and HPC kernels
Ready to accelerate your workflows? 📝 Read the technical deep dive: nvda.ws/4rZtAq1
16 replies · 89 reposts · 844 likes · 53.6K views
Sanjoy Das retweeted
Victor Kumar @victorckumar
First we had one child and I thought I knew what children are like. Our second child was completely different; I’d overgeneralized. There are actually two types of children.
89 replies · 604 reposts · 20.6K likes · 325.1K views
Sanjoy Das @_sanjoydas
@karpathy Generating code and deploying it (in the traditional sense) creates inflexible programs that cannot “learn on the job”.
0 replies · 0 reposts · 0 likes · 20 views
Sanjoy Das @_sanjoydas
@karpathy Creating software might evolve into starting with a “blank slate” app, interacting with it until it’s sufficiently trained, and saving the “image”, which will then continue to be incrementally trained by normal usage.
1 reply · 0 reposts · 0 likes · 48 views
Andrej Karpathy @karpathy
I think it must be a very interesting time to be in programming languages and formal methods because LLMs change the whole constraints landscape of software completely. Hints of this can already be seen, e.g. in the rising momentum behind porting C to Rust or the growing interest in upgrading legacy code bases in COBOL or etc. In particular, LLMs are *especially* good at translation compared to de-novo generation because 1) the original code base acts as a kind of highly detailed prompt, and 2) as a reference to write concrete tests with respect to. That said, even Rust is nowhere near optimal for LLMs as a target language. What kind of language is optimal? What concessions (if any) are still carved out for humans? Incredibly interesting new questions and opportunities. It feels likely that we'll end up re-writing large fractions of all software ever written many times over.
Thomas Wolf @Thom_Wolf

Shifting structures in a software world dominated by AI. Some first-order reflections (TL;DR at the end):

Reducing software supply chains, the return of software monoliths – When rewriting code and understanding large foreign codebases becomes cheap, the incentive to rely on deep dependency trees collapses. Writing from scratch ¹ or extracting the relevant parts from another library is far easier when you can simply ask a code agent to handle it, rather than spending countless nights diving into an unfamiliar codebase. The reasons to reduce dependencies are compelling: a smaller attack surface for supply chain threats, smaller packaged software, improved performance, and faster boot times. By leveraging the tireless stamina of LLMs, the dream of coding an entire app from bare-metal considerations all the way up is becoming realistic.

End of the Lindy effect – The Lindy effect holds that things which have been around for a long time are there for good reason and will likely continue to persist. It's related to Chesterton's fence: before removing something, you should first understand why it exists, which means removal always carries a cost. But in a world where software can be developed from first principles and understood by a tireless agent, this logic weakens. Older codebases can be explored at will; long-standing software can be replaced with far less friction. A codebase can be fully rewritten in a new language. ² Legacy software can be carefully studied and updated in situations where humans would have given up long ago. The catch: unknown unknowns remain unknown. The true extent of AI's impact will hinge on whether complete coverage of testing, edge cases, and formal verification is achievable. In an AI-dominated world, formal verification isn't optional—it's essential.

The case for strongly typed languages – Historically, programming language adoption has been driven largely by human psychology and social dynamics. A language's success depended on a mix of factors: individual considerations like being easy to learn and simple to write correctly; community effects like how active and welcoming a community was, which in turn shaped how fast its ecosystem would grow; and fundamental properties like provable correctness, formal verification, and striking the right balance between dynamic and static checks—between the freedom to write anything and the discipline of guarding against edge cases and attacks. As the human factor diminishes, these dynamics will shift. Less dependence on human psychology will favor strongly typed, formally verifiable and/or high performance languages.³ These are often harder for humans to learn, but they're far better suited to LLMs, which thrive on formal verification and reinforcement learning environments. Expect this to reshape which languages dominate.

Economic restructuring of open source – For decades, open-source communities have been built around humans finding connection through writing, learning, and using code together. In a world where most code is written—and perhaps more importantly, read—by machines, these incentives will start to break down.⁴ Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now. If the future of open-source development becomes largely devoid of humans, alignment of AI models won't just matter—it will be decisive.

The future of new languages – Will AI agents face the same tradeoffs we do when developing or adopting new programming languages? Expressiveness vs. simplicity, safety vs. control, performance vs. abstraction, compile time vs. runtime, explicitness vs. conciseness. It's unclear that they will. In the long term, the reasons to create a new programming language will likely diverge significantly from the human-driven motivations of the past. There may well be an optimal programming language for LLMs—and there's no reason to assume it will resemble the ones humans have converged on.

TL;DR:
- Monoliths return – cheap rewriting kills dependency trees; smaller attack surface, better performance, bare-metal becomes realistic
- Lindy effect weakens – legacy code loses its moat, but unknown unknowns persist; formal verification becomes essential
- Strongly typed languages rise – human psychology mattered for adoption; now formal verification and RL environments favor types over ergonomics
- Open source restructures – human connection drove the community; AI-written/read code breaks those incentives; alignment becomes decisive
- New languages diverge – AI may not share our tradeoffs; optimal LLM programming languages may look nothing like what humans converged on

¹ x.com/mntruell/statu…
² x.com/anthropicai/st…
³ wesmckinney.com/blog/agent-erg…
⁴ github.com/tailwindlabs/t…

701 replies · 656 reposts · 8.1K likes · 1.2M views
Sanjoy Das retweeted
Greg Brockman @gdb
gb200 has really been enabling us to do some amazing things
115 replies · 83 reposts · 1.9K likes · 214.9K views
Sanjoy Das retweeted
Tianqi Chen @tqchenml
I’ll be giving a talk on TVM-FFI at @GPU_MODE this week! We will discuss how open ABI and FFI facilitate a fast, robust, and seamless framework interop experience across DSLs and kernel libraries.
GPU MODE @GPU_MODE

This Saturday Jan 31 at Noon PST we have one of the founders of the whole field of ML Systems @tqchenml who will be giving a talk on tvm-ffi - an open ABI and FFI for ML systems which has grown tremendously in relevance with the explosion of Kernel DSLs youtube.com/watch?v=xMzcs6…

1 reply · 15 reposts · 130 likes · 17.5K views
Sanjoy Das @_sanjoydas
@dccsillag
> programs&compilers
I added some short explanatory points on these in the doc.
0 replies · 0 reposts · 0 likes · 24 views
daniel csillag @dccsillag
@_sanjoydas Regarding novelty, I personally haven't seen such a result before, but to be fair I've seen fairly little theory on optimizing compilers.
1 reply · 0 reposts · 0 likes · 75 views
Sanjoy Das @_sanjoydas
I wrote a short proof showing that any self-hosting compiler cannot perform certain legal optimizations. Would love feedback from compiler folks - does the proof look correct, and is it already well known? Link: docs.google.com/document/d/17R…
15 replies · 12 reposts · 210 likes · 21.7K views
Sanjoy Das @_sanjoydas
@oisyn @sparr0 I think the argument needs to be made more rigorous - there is no requirement that the compiler constant folds Compile(#P) to optimize the comparison. E.g. a compiler can optimize `X+1-1==X` to true without constant folding the LHS (`X+1-1`).
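The tweet's point, that a compiler can prove `X+1-1 == X` without ever constant-folding or executing the left-hand side, can be sketched with a toy rewrite rule. This is a hypothetical mini-IR for illustration only (not any real compiler), and it assumes integer semantics where `(e + c) - c = e` is a legal simplification:

```python
# Toy expression IR: a rewriter proves `X + 1 - 1 == X` purely
# syntactically, by pattern matching on the tree. Nothing is evaluated
# and no constants are folded into the LHS.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Sub:
    lhs: object
    rhs: object

def simplify(e):
    """Peephole rule (e + c) - c => e, applied bottom-up.
    A purely syntactic rewrite; no expression is executed."""
    if isinstance(e, (Add, Sub)):
        e = type(e)(simplify(e.lhs), simplify(e.rhs))
    if (isinstance(e, Sub) and isinstance(e.lhs, Add)
            and isinstance(e.lhs.rhs, Const) and isinstance(e.rhs, Const)
            and e.lhs.rhs.value == e.rhs.value):
        return e.lhs.lhs
    return e

def optimize_eq(lhs, rhs):
    """Fold `lhs == rhs` to True when both sides simplify to the same
    tree; otherwise report the comparison as unknown (None)."""
    return True if simplify(lhs) == simplify(rhs) else None

x = Var("X")
# (X + 1) - 1 == X folds to True without touching the LHS's value.
assert optimize_eq(Sub(Add(x, Const(1)), Const(1)), x) is True
```

The same structural trick is how the argument sidesteps the objection: proving `Compile(#P) == ...` true does not require the compiler to run (constant-fold) `Compile(#P)` any more than this rewrite requires evaluating `X + 1 - 1`.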
1 reply · 0 reposts · 1 like · 134 views
Sanjoy Das @_sanjoydas
@oisyn @sparr0
> As #P includes itself, it needs to evaluate itself for constant folding
This could be an alternate proof of why the compiler cannot optimize the comparison `Compile(#P) == ...`. However, (contd)
2 replies · 0 reposts · 0 likes · 136 views
Sanjoy Das @_sanjoydas
@sparr0 I don’t think the program has infinite recursion. Where do you see infinite recursion?
1 reply · 0 reposts · 0 likes · 595 views
Sparr @sparr0
@_sanjoydas You have failed to define what the compiler does when faced with infinite recursion, which is what would happen when trying to compile this code. Your claim that the program sets X to False does not hold.
1 reply · 0 reposts · 0 likes · 757 views
Sanjoy Das @_sanjoydas
@ngsankha
> This reads to me as […] entire compiler?
I don’t think that’s true – e.g. if the output of the compiler (the generated code) is not used, then the entire compiler can be DCEed.
0 replies · 0 reposts · 3 likes · 1.5K views
Sankha Narayan Guria @ngsankha
@_sanjoydas This reads to me as: there cannot be a self-hosted compiler that can optimize away the entire compiler? The statement you have is a negation of the above. I have to think harder about whether both are equivalent.
1 reply · 0 reposts · 2 likes · 1.7K views
Sanjoy Das retweeted
lia @tallsnail
infant parenting: intuition, surrender, vibes
toddler parenting: a rigid system of statutes, swift enforcement mechanisms, zero tolerance policy
31 replies · 182 reposts · 4.3K likes · 110.2K views
Sanjoy Das retweeted
GPU MODE @GPU_MODE
This Saturday 10:00 AM PST, the last talk of the year before we resume again on Jan 3. NVIDIA has made a profound change to its programming model with cuTile and TileIR. They've given some shorter talks online but this will be the first deep dive youtube.com/watch?v=sjkEUh…
1 reply · 14 reposts · 114 likes · 20.4K views