Bryce Adelstein Lelbach

12K posts

Bryce Adelstein Lelbach

@blelbach

Principal Engineer at @NVIDIA working on programming languages. @adspthepodcast co-host. C++ Library Evolution chair emeritus. Frequent flyer. Horology fan.

Manhattan, NY · Joined March 2011
2.7K Following · 17.4K Followers
Pinned Tweet
Bryce Adelstein Lelbach @blelbach ·
The latest revision of @INCITS/@isostandards COBOL comes out this year.

The goals of COBOL sound normal today:
- Portable
- Freely available
- Designed by the community

In 1959 it was radical & unprecedented.

It was also conceived of & led by women.

This is the story of COBOL.
Bryce Adelstein Lelbach retweeted
Chris Lattner @clattner_llvm ·
Amazing to catch up with @WenmeiHwu, a hero to many of us in the GPU programming space, and who I was lucky to have as an advisor on my PhD committee years ago. Congratulations on the new edition of “Programming Massively Parallel Processors”. Now also in Mojo!
Bryce Adelstein Lelbach retweeted
Jump Trading @jumptrading ·
For 15+ years, Jump Trading has partnered with @nvidia to advance accelerated computing in financial research. Today, we’re deploying NVIDIA’s Vera Rubin NVL72 to support large-scale AI infrastructure. We build for research velocity. Learn more: jumptrading.com/signals/jump-t…
Bryce Adelstein Lelbach retweeted
Harley Finkelstein @harleyf ·
Montreal has the best food scene in the world right now. And it's not even close. Here's @nytimes on Rôtisserie La Lune. Rotisserie chicken. Obsessive craft. Packed every night. One of my favorite restaurants on the planet. And absolutely in Montreal. So proud of @vanyafilipovic, Marco and the whole team. @Montreal is on fire 🔥 nytimes.com/2026/03/17/din…
Bryce Adelstein Lelbach retweeted
Charles 🎉 Frye @charles_irl ·
love to see one of my heroes standing in front of our logo -- especially for a good reason, supporting development of cutting-edge blackwell kernels!
Vikram @msharmavikram

@marksaroufim @GPU_MODE @NVIDIAGTC Award ceremony for the nvfp4 kernels. Come hang out at GuildHouse!

Bryce Adelstein Lelbach retweeted
Dirhousssi Amine @DirhousssiAmine ·
GTC 2026: Jensen mentions tiles 👀👀 CuTile will be a way bigger deal than we realise. Future hardware will need better software abstraction, and tiling is a step in this direction.
Bryce Adelstein Lelbach retweeted
Bryce Adelstein Lelbach
Go to a talk about CUDA or speak to a CUDA developer at GTC 2026, and you might get one of these...
Ashvin @ashverm4 ·
@blelbach Wish I could be there! We're trying to target it with our in-house compiler, and it would've been nice to interact with the folks who made it.
Bryce Adelstein Lelbach
@dss_gabriel @NVIDIAGTC What if you had a GPU kernel programming model that was memory safe by construction? Perhaps some sort of array-oriented model where you don't explicitly program threads or do inter-thread communication.
Gabriel @dss_gabriel ·
@blelbach @NVIDIAGTC That’s the catch :( It’s been a while since I’ve last looked at GPGPU in Rust (2023) but back then, Rust just couldn’t allow writing "safe" kernels since every thread would alias as soon as you indexed in a buffer. Idk how far the folks at VectorWare have come w/ Rust-CUDA tho
δ @ptxpapi ·
@blelbach I saw Vikram Mailthody mention online that there was a cuTile Rust in development… is this still true? Is there any sort of timeline on a release for this?

Bryce Adelstein Lelbach @blelbach ·
Stay tuned!
AlexZ 🦀 @blackanger

Just discovered that the VectorWare team announced, in a technical blog post published in February 2026, that they got Rust's async/await running on the GPU. vectorware.com/blog/async-awa…

This addresses the concurrency problem on traditional GPUs. Traditional GPU programming is data-parallel: all threads perform the same operation on different data. But as GPU programs have grown more complex, developers have turned to warp specialization, where different warps run different tasks (e.g., one warp loads data while another computes). This is essentially a shift from data parallelism to task parallelism.

The problem: that concurrency and synchronization is managed entirely by hand, with no language- or runtime-level support. Like hand-written thread synchronization on the CPU, it is error-prone and hard to reason about.

The blog post surveys three existing high-level abstractions:
- JAX models GPU programs as computation graphs; the compiler analyzes dependencies in the graph to decide execution order and parallelization strategy.
- Triton uses the "block" as an independent unit of computation and manages concurrency through a multi-layer MLIR compilation pipeline.
- CUDA Tile introduces the "tile" as a first-class data unit, making data dependencies explicit.

But these approaches share a drawback: they all require developers to organize code in an entirely new way, demanding a new programming paradigm and ecosystem, which is a significant barrier to adoption. Code reuse is also hard: existing CPU and GPU libraries cannot compose directly with these frameworks.

The post's core argument is that Rust's Future trait happens to satisfy every property they wanted:

1. Futures are deferred, composable values. Like JAX's computation graphs, you first build a "description" of the program, then execute it. The compiler can analyze dependencies before execution.

2. Futures naturally express independent units of concurrency. Like Triton's blocks, multiple futures can execute serially (.await chains) or in parallel (join!, combinators).

3. Rust's ownership system makes data dependencies explicit. Like CUDA Tile's explicit tiles, futures encode data flow by capturing data, while trait bounds like Send/Sync/Pin constrain how data is shared and passed between concurrent units.

4. Most importantly: warp specialization is essentially a hand-written task state machine, and Rust futures compile to exactly that, a state machine the compiler generates and manages automatically. Since a future is just a state machine, there is no reason it cannot run on a GPU.

They ported Embassy, a no_std async executor designed for embedded systems. GPUs have no operating system and don't support the Rust standard library, which closely resembles the embedded environment, so Embassy was a natural choice. Adapting Embassy to the GPU required very few changes; this ability to reuse existing open-source libraries is far beyond what other, non-Rust GPU ecosystems offer.

On the surface this post is about async/await, but what it is really about is something bigger: using Rust's type system and zero-cost abstractions to unify the CPU and GPU programming models. A Future doesn't care where it runs: thread, core, block, or warp all work. The same async code can run on CPU and GPU without changing a line. That is fundamentally different from the JAX/Triton approach of "writing something entirely new for the GPU"; it is a bottom-up unification at the language level.

VectorWare previously published a post on enabling Rust std on the GPU. Together with this async/await work, their goal is clear: make GPU programming "ordinary Rust programming" rather than a special domain requiring an entirely new mental model.
