VectorWare

12 posts

@vectorware

Joined August 2025
0 Following · 385 Followers
VectorWare (@vectorware)
@AgileJebrim No worries, Rust is not for everyone! We're pretty bullish on github.com/nvidia/stdexec, which maps well to our async/await work and CUDA Tile. Someone will likely take what we did here for Rust threads and do it for C++ to explore the tradeoffs.
1 reply · 0 reposts · 0 likes · 110 views
Jebrim (@AgileJebrim)
@vectorware Yeah I see the heavy insistence on Rust here too. I would personally pass, sorry, but I’ll follow to see where you guys end up. vectorware.com/jobs/
1 reply · 0 reposts · 1 like · 100 views
VectorWare (@vectorware)
We are excited to announce that we can successfully use Rust's std::thread on the GPU. This has never been done before. vectorware.com/blog/threads-o… Supporting Rust's std::thread enables existing Rust code to work on the GPU and makes GPU programming more ergonomic.
17 replies · 100 reposts · 638 likes · 38.6K views
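(The announcement above doesn't include code. As a rough illustration, this is the kind of unmodified `std::thread` Rust the post implies can now target the GPU; whether this exact snippet builds under VectorWare's GPU toolchain is an assumption.)

```rust
use std::thread;

// Plain std::thread code: spawn a few workers, each summing a chunk,
// then join and combine. Nothing here is GPU-specific; that is the
// point of the announcement.
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    let chunk = (data.len() + workers - 1) / workers; // ceiling division
    let mut handles = Vec::new();
    for part in data.chunks(chunk.max(1)) {
        let part = part.to_vec();
        handles.push(thread::spawn(move || part.iter().sum::<u64>()));
    }
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    println!("{}", parallel_sum(data, 4)); // 500500
}
```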
VectorWare (@vectorware)
@AgileJebrim Agreed, software architected for the GPU will always be better than CPU software ported over. We've done this work mainly to use existing CPU libraries in GPU-native apps where it makes sense (adding some GPU-specific logic in places for perf).
1 reply · 0 reposts · 1 like · 134 views
Jebrim (@AgileJebrim)
Being GPU-native is a great goal and I’ve been doing it for years, but starting from a CPU-based API that cannot even leverage the SIMD lanes within each wavefront doesn’t sound very GPU-native to me. It’s just programming the GPU with CPU-style scalar code, giving up an order of magnitude of performance potential. I’m sure it’ll still perform better than typical multithreaded CPU code, especially with the higher memory bandwidth available on GPUs, but you’re still just emulating a CPU on top of the GPU rather than truly being GPU-native and leveraging the wider range of hardware capabilities GPUs offer. Interesting idea, though, to at least improve some of the existing codebases out there without requiring a big rewrite.
2 replies · 0 reposts · 8 likes · 630 views
VectorWare (@vectorware)
@AgileJebrim Yep, future posts will talk about SIMD lanes within each warp. Check the pedantic note on "first" if you haven't!
1 reply · 1 repost · 3 likes · 729 views
Jebrim (@AgileJebrim)
@vectorware Just skimmed this but it appears you’re doing warp-uniform behavior, not leveraging the SIMD lanes within each warp? “At VectorWare, we are building the first GPU-native software company.” I can absolutely say you’re not the first. :P
3 replies · 1 repost · 15 likes · 1.5K views
VectorWare (@vectorware)
@Leik0w0 Yep, that is the direction we've been experimenting with and will be talking about in a future post
0 replies · 0 reposts · 6 likes · 913 views
Léo (@Leik0w0)
@vectorware Very interesting work! I like the way you enforce that truly independent work runs on different warps. How would you model programming the lanes inside each warp, though? A SIMD-like model?
1 reply · 0 reposts · 5 likes · 1.2K views
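(One conceptual answer to the "SIMD-like model" question above, sketched in plain Rust: a warp is a fixed set of lanes, and every operation applies uniformly across all of them, the way a warp executes in lockstep. The `Warp` type and its methods are illustrative assumptions, not VectorWare's API.)

```rust
const WARP_WIDTH: usize = 32;

// One conceptual "warp": one element per lane, with each operation
// applied across all 32 lanes at once.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Warp([f32; WARP_WIDTH]);

impl Warp {
    // Broadcast one scalar to every lane (warp-uniform data).
    fn splat(v: f32) -> Self {
        Warp([v; WARP_WIDTH])
    }
    // Give each lane its own index: the usual way divergent per-lane
    // data enters a SIMD-style program.
    fn lane_ids() -> Self {
        Warp(std::array::from_fn(|i| i as f32))
    }
    fn add(self, rhs: Self) -> Self {
        Warp(std::array::from_fn(|i| self.0[i] + rhs.0[i]))
    }
    fn mul(self, rhs: Self) -> Self {
        Warp(std::array::from_fn(|i| self.0[i] * rhs.0[i]))
    }
}

fn main() {
    // y = 2*x + 1 across all 32 lanes, one "instruction" per operation.
    let x = Warp::lane_ids();
    let y = x.mul(Warp::splat(2.0)).add(Warp::splat(1.0));
    assert_eq!(y.0[0], 1.0);
    assert_eq!(y.0[31], 63.0);
}
```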
John Carmack (@ID_AA_Carmack)
The glory work of GPU scheduling is in the frontier data centers with hundreds of thousands of GPUs, but a lot of research work is done with single-GPU jobs on modest clusters, and the scheduling leaves much to be desired.

I wish there were a clean way to preempt GPU tasks, so long-running tasks could be transparently paused to allow higher-priority tasks to get the minimum time-to-results. Manual checkpointing and cooperative multitasking are an option, but they complicate codebases and are fertile ground for bugs.

It feels like most of the pieces are present: everything goes through page tables on the GPUs already, Nvidia UVM (Unified Virtual Memory) allows demand paging to host memory, and MPS (Multi-Process Service) could act as a CUDA shim to force everything to use a different memory allocator.

Memory page thrashing would be catastrophic for GPU tasks, but the idea would be to pause the host task of the low-priority process, then let the high-priority process force only the necessary pages out (or maybe none at all, if the memory pressure wasn’t high enough) while it is running, then resume the low-priority task on completion, allowing it to page everything back in. Task switching at the level of tens of seconds, not milliseconds.

Even if it didn’t handle absolutely all memory (kernel allocations and such) and had some overhead, that would be quite useful. Of course, Nvidia would prefer you to Just Buy More GPUs!
70 replies · 70 reposts · 1.2K likes · 98.2K views
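(The host side of the coarse-grained preemption Carmack describes can be sketched with ordinary process signals: stop the low-priority job's host process, run the high-priority job, then resume. Whether the driver actually pages the stopped job's GPU memory out via UVM depends on memory pressure; this sketch only handles the host process, is Linux-only, and is purely illustrative.)

```rust
use std::process::{Child, Command};

// Pause the low-priority job's host process. While it is stopped, a
// high-priority job can run, and UVM may page the stopped job's GPU
// memory out to host memory if pressure demands it.
fn preempt(low_priority: &Child) {
    Command::new("kill")
        .args(["-STOP", &low_priority.id().to_string()])
        .status()
        .expect("failed to send SIGSTOP");
}

// Resume the low-priority job once the high-priority work is done,
// letting it page everything back in.
fn resume(low_priority: &Child) {
    Command::new("kill")
        .args(["-CONT", &low_priority.id().to_string()])
        .status()
        .expect("failed to send SIGCONT");
}

fn main() {
    // Stand-in for a long-running, low-priority GPU job's host process.
    let mut low = Command::new("sleep").arg("30").spawn().unwrap();

    preempt(&low); // high-priority work would run here, for tens of seconds
    resume(&low);

    low.kill().unwrap();
    low.wait().unwrap();
}
```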
VectorWare (@vectorware)
We are excited to announce that we can successfully use Rust's async/await on the GPU. This has never been done before. vectorware.com/blog/async-awa… Supporting Rust's async/await (and futures) enables existing Rust code to work on the GPU and makes GPU programming more ergonomic.
5 replies · 14 reposts · 62 likes · 4K views
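(The async/await post above doesn't show code. To illustrate the kind of executor machinery involved, here is a minimal `block_on` in stable Rust that drives a future to completion with a no-op waker. This is a generic sketch of a futures executor, not VectorWare's GPU executor.)

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A no-op waker: enough to drive futures that never actually park.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Minimal block_on: poll the future on the current thread until Ready.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    // SAFETY: `fut` is a local that is never moved after being pinned.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // A real executor would park here; for ready futures we just spin.
        std::hint::spin_loop();
    }
}

async fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    let total = block_on(async {
        let x = add(1, 2).await;
        add(x, 10).await
    });
    println!("{total}"); // 13
}
```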
VectorWare (@vectorware)
@nazarpc We got an unmodified CoreMark running on the GPU, and a GPU warp is surprisingly competitive with a CPU core.
1 reply · 0 reposts · 1 like · 131 views
Nazar Mokrynskyi (@nazarpc)
@vectorware Interesting experiment, but I am still skeptical about running general purpose CPU code on GPUs efficiently. From my experience GPUs like things in a very particular way and deviation leaves a lot of performance on the table. Looking forward to more technical details.
1 reply · 0 reposts · 3 likes · 152 views
VectorWare (@vectorware)
We are excited to announce that we can successfully use Rust's standard library from the GPU. This has never been done before. vectorware.com/blog/rust-std-… Supporting Rust's standard library enables existing Rust code to work on the GPU and makes GPU programming feel normal.
14 replies · 17 reposts · 62 likes · 3.7K views
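(For the standard-library post above: ordinary std-using code, such as collections, strings, and formatting, is what the claim is about. This snippet is illustrative, not from their blog; nothing in it was written with a GPU in mind, which is what makes "std on the GPU" interesting for reusing existing crates.)

```rust
use std::collections::HashMap;

// Everyday standard-library code: HashMap, String, iterators.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("the gpu runs the std the same way");
    println!("{}", counts["the"]); // 3
}
```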
VectorWare (@vectorware)
Hello world! We are building the first GPU-native software company. Today we are sharing the thesis, people, and partners behind it. vectorware.com/blog/announcin…
4 replies · 5 reposts · 17 likes · 2.3K views