Alex Goldring
@SoftEngineer
638 posts

Building Shade — next-gen Web graphics engine | Graphics engineer | consultant

Joined August 2009
37 Following · 2.2K Followers
Alex Goldring@SoftEngineer·
@cassett31 That's why I built it this way - different animations would be just as cheap too. Or "expensive", whichever way you want to look at it. Every character gets a full copy of geometry, a complete new skeleton, and fully evaluates its animation.
mimo@cassett31·
@SoftEngineer Since it's the same instance of the same mesh, the GPU doesn't have to process everything again and again. What about different animations, then?
Alex Goldring@SoftEngineer·
Apparently animating more than ~20 characters in modern graphics engines is a big deal 😅 324 skinned characters animating independently in WebGPU (browser). Each character is animating with a completely separate skeleton and timeline; no two characters sample the same time. Each character has 66 bones and 28,106 triangles. I want to stress that there is no instancing of any kind here, and the CPU is not involved at all.
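For anyone curious what "no CPU involvement" can look like on the API side, here is a minimal TypeScript sketch of recording a single skinning compute pass over many characters in WebGPU. The pipeline setup, bind group contents, and workgroup size are hypothetical placeholders, not Shade's actual code.

```ts
// Hypothetical sketch: skin every character in one compute pass.
// Assumes `pipeline` was built from a WGSL module whose entry point skins one
// vertex per invocation, and `characterBindGroups` holds one bind group per
// character (its skeleton, base mesh and skinned output buffer).
const WORKGROUP_SIZE = 64; // must match @workgroup_size in the WGSL

function encodeSkinning(
  device: GPUDevice,
  pipeline: GPUComputePipeline,
  characterBindGroups: GPUBindGroup[],
  vertexCount: number
): GPUCommandBuffer {
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  for (const bindGroup of characterBindGroups) {
    pass.setBindGroup(0, bindGroup);
    // one thread per vertex; each character has its own skeleton and output
    pass.dispatchWorkgroups(Math.ceil(vertexCount / WORKGROUP_SIZE));
  }
  pass.end();
  return encoder.finish();
}
```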
Alex Goldring@SoftEngineer·
Why does it have to be a texture? Easy - because that's all it could be when the tech was invented. It's true that today you don't have to go that route. However, a lot of tooling was already built around textures, and formats too. So unless you're building something from scratch, or your tooling is more modern, you'll be limited by that side too. Basically - it's legacy. Nothing wrong with old tech per se, there are reasons why it can still be good. For example, you're limited in how many storage buffers you can attach in WebGPU, but textures don't share that same limit, so if you're pushing up against that limit, textures can still be a valid path.
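The storage buffer limit mentioned here is queryable: WebGPU exposes maxStorageBuffersPerShaderStage on the adapter (the spec default is 8). A small sketch of using it to decide between a buffer path and a texture path; the "already in use" count is made up for illustration.

```ts
// Sketch: check how many storage buffers a shader stage may bind before
// choosing between buffer-backed and texture-backed animation data.
async function pickAnimationDataPath(): Promise<"storage-buffer" | "texture"> {
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) throw new Error("WebGPU not available");

  const limit = adapter.limits.maxStorageBuffersPerShaderStage; // spec default: 8
  // Hypothetical budget: buffers already bound for positions, normals,
  // bone matrices, indirection tables, etc.
  const storageBuffersAlreadyInUse = 7;

  return storageBuffersAlreadyInUse < limit ? "storage-buffer" : "texture";
}
```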
Erfan@ahmadierfan999·
I'm not an expert in rendering animations, but why does it have to be in a texture?
- You could have 12-byte vertex data with 4-byte alignment and save memory if you use BDA or a byte address buffer
- Textures usually need an extra staging copy to upload, that's just an extra copy and swizzle
- AFAIK WebGPU doesn't let you decide on "Optimal" vs "Linear" tiling like Vulkan, which means you'll also be paying for extra data swizzling into an optimal layout for regular sampling (most likely a version of Z-order)
Yeah, it's brutal overall, considering that streaming that small 15 MB texture (one second of animation) in full would take 1-2 ms depending on PCIe config. I suppose there is also more advanced stuff one could do if they really wanted to do VAT, just spitting out ideas:
- Instead of storing all frames and linearly interpolating, use 4-6x less data and do polynomial (quadratic) interpolation with the nearest 3 points instead
- To support 10-100 second weird animations, one could stream only a portion; not all of it needs to be resident in VRAM
Alex Goldring@SoftEngineer·
VAT or "Vertex animation" the way everyone seems to refer to it, is just pre-recorded set of positions for each vertex. Discretized into frames. Imagine like a large table with each row being a set of vertex coordinates for every vertex the character has. Most powerful GPUs cap out at 32k texture resolution, meaning that you character can't have more than 32k vertices. If your character has more than that - it's just not going to work. A lot of cards don't support 32k texture resolution, but, say 4k instead. If you have a 4k texture and you store full vertex positions, you'll need 12 bytes per vertex. Except, texture formats don't do 3 channels today, you have 1,2 or 4. We have to go with 4. That's 16 bytes (float32 = 4 bytes). The character I was showing, with ~30k vertices, would need 30,000^16 = 480 Kb of storage per frame. Let's say your animation is very short and sweet, only 30 frames, that's about 1-2 second of playback with decent resolution. You would need 15Mb of storage for that. The dancer in my video had an 18s animation loop. At just 30 frames per second, you'd need 260Mb of texture storage for just this one animation. Now let's say you want multiple animations, for multiple characters... it's a nice piece of tech for short and complex animations. Like baking a sim for example, or taking a very complex rig with 100s of IK drivers, and baking all of that down into a VAT. Lots of modern games use VATs for just that. Typically VAT will account for a tiny fraction of animations that you would see in a game. Using VAT for a skinned skeletal animation seems like a bad fit to me. That same 18s animation using exact curve data takes around < 100Kb of storage by comparison, that's 260,000% less
Alex Goldring@SoftEngineer·
It can be, but typically those spectators have very low poly counts. You take a high-poly character with bones, animate it, then decimate it, then bake that into a vertex flipbook (VAT or whatever). The reason it's done, at least in part, is because skeletal animation is expensive in those engines.
悪魔的Z計画@akumatekiz·
@SoftEngineer Exactly. For skinning, it's better to store matrices - I do the same myself. That said, if you only need to play a single short animation, VAT is faster, so it's often used for things like large crowds of spectators or glow sticks.
Alex Goldring@SoftEngineer·
@BrookeHodgman Yep, faces are complex, it's in that realm where skeletal animations just don't cut it or are a poor fit.
Alex Goldring@SoftEngineer·
True enough. I wrote a whole Buffer abstraction system on top of WebGL textures for meep ( npmjs.com/package/@woosh… ), that's what it uses for implementation of Virtual Geometry and meshlet-based rendering. discourse.threejs.org/t/virtually-ge… It's a workable solution, it's also a massive pain in the rear, compared to just having the raw memory access via buffers. In meep, I even have a page-based allocator on top of that buffer abstraction, it *almost* feels like working with real buffers, but the cost of such abstraction is non-zero. It also took me weeks to tune and debug in total.
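For readers unfamiliar with the trick, the core of a texture-backed buffer is just index math: treat the texture as a flat array and convert a linear element index into 2D texel coordinates. A minimal sketch of that idea, not meep's actual abstraction or allocator.

```ts
// Minimal sketch of addressing a texture as a linear buffer: one element per texel.
interface TexelAddress {
  x: number;
  y: number;
}

function texelAddress(elementIndex: number, textureWidth: number): TexelAddress {
  return {
    x: elementIndex % textureWidth,
    y: Math.floor(elementIndex / textureWidth),
  };
}

// Element 70,000 in a 4096-wide texture lands at texel { x: 368, y: 17 }
console.log(texelAddress(70_000, 4096));
```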
Matthew Collison@MrCollison·
@SoftEngineer In the UK they add a Vertex Animation charge to every purchase. I love your rendering work btw. I feel like I recognize Shade from the @threejs forums from when I was researching clustered rendering. Are you that guy?
Alex Goldring@SoftEngineer·
Implemented per-vertex and per-object motion vectors for Shade #WebGPU. Integrated with TAA. Added an optional motion blur post-process. Not necessary, but it's something that requires high quality motion vectors, so why not.
Alex Goldring@SoftEngineer·
@ivanpopelyshev @BrookeHodgman @marcsh Ah, cool. Yeah, there are a lot of limits that you can overcome a lot more easily with compute, especially once you add atomics and subgroup operations into the mix.
Alex Goldring@SoftEngineer·
"Bone mixing" is a cool name, what is it? :D If you mean transform blending, I'm doing dual quat. If you mean performing hierarchical transforms - I do that too, without limits or multiple dispatches. If it's something highly specific like treating the skin as a pipe of fixed diameter...? The slides seem to have been made to accompany something. Maybe it's just me, but there are no speaker notes. Let's start with a basic set of questions:
1. What problem does this solve?
2. Who is this for? (use cases)
3. What is the main contribution?
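For context on "dual quat": this usually means dual quaternion linear blending (DLB), where each bone transform is a dual quaternion and the per-vertex blend is a sign-corrected weighted sum followed by normalization. A rough TypeScript sketch of that blend step; the layout and names are illustrative, not Shade's.

```ts
// Dual quaternion linear blending (DLB) sketch. A dual quaternion is stored as
// 8 floats: [rx, ry, rz, rw, dx, dy, dz, dw]. Names and layout are illustrative.
type DualQuat = number[]; // length 8

function blendDualQuats(bones: DualQuat[], weights: number[]): DualQuat {
  const out = new Array<number>(8).fill(0);
  const pivot = bones[0]; // sign-correct against the first influencing bone

  for (let i = 0; i < bones.length; i++) {
    const dq = bones[i];
    // q and -q encode the same rotation; pick the hemisphere closest to the
    // pivot so the weighted sum doesn't cancel itself out
    const dot = dq[0] * pivot[0] + dq[1] * pivot[1] + dq[2] * pivot[2] + dq[3] * pivot[3];
    const w = dot < 0 ? -weights[i] : weights[i];
    for (let k = 0; k < 8; k++) out[k] += w * dq[k];
  }

  // normalize by the magnitude of the real (rotation) part
  const norm = Math.hypot(out[0], out[1], out[2], out[3]);
  for (let k = 0; k < 8; k++) out[k] /= norm;
  return out;
}
```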
Alex Goldring@SoftEngineer·
@BrookeHodgman Thanks, I haven't seen many assets on the web with 6 or 8 weights yet, I guess mostly due to asset size constraints. But I plan to support more than 4.
Alex Goldring@SoftEngineer·
Depends on what you mean. A character with 66 bones and 4 weights is already quite heavy. Animating 100 characters with 16 bones each and 2 blend weights is easy. On the other hand, animating more than a few dozen fully rigged characters is hard.

When you play a game and you see a ton of characters on screen - they are typically LoDs: maybe 20 are full rigs, and the rest are drastically simplified skeletons and skins, running at a reduced animation tick rate.

The reason is quite simple: if you have 100 bones to animate, that's at least 400 animation curves to sample - 4 curves per bone quaternion, and that's just rotation. If you have some kind of displacement going on, such as for hips, shoulders, face etc. - that would be more curves. Then we need to update the node graph for bone hierarchies, that's a bunch of 4x4 matrix multiplications. Now, if we have animation blending - you have to multiply the previous work by the number of animations you're blending. Then we also have bounds calculation, because graphics engines need to know the spatial bounds that a mesh occupies. Very quickly the memory footprint explodes, and there's a ton of ALU as well. You can get far by carefully packing animation and transform data in memory, but it's inherently a problem with a massive amount of data.

I was recently playing Cyberpunk 2077, and wherever you look - there are typically no more than ~16 animated characters on screen at the same time. Why does "on screen" matter? Because a smart animation system can take advantage of that and pause animation if the character is not on screen.

GPU-driven animation and skinning systems are not really new, we've been moving in that direction in an ad-hoc way. Recent notable examples would be:
1. CDPR's Witcher 4 demo
2. Remedy's Alan Wake 2, where vegetation is driven through skinned animation
3. Warhammer 40k Space Marine 2, which uses the same idea as Alan Wake 2 for animating a huge number of Xenos
Quoting @marcsh:
@SoftEngineer @ivanpopelyshev A lot of modern games are running like 50 anims that are all being blended together in complex, expensive ways. So when folks talk about 'animations' they often mean those sorts of controllers.
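To make the curve and matrix counts above concrete, here is a rough, illustrative cost model. The factors are the ones quoted in the thread (4 rotation curves per bone, extra curves for translated bones, one hierarchy multiply plus one skinning-matrix multiply per bone); nothing here is measured from a real engine.

```ts
// Rough per-character, per-frame cost model for CPU-side skeletal animation.
interface RigCost {
  curvesSampled: number;    // animation curve evaluations
  matrixMultiplies: number; // 4x4 matrix multiplications (hierarchy + skinning)
}

function estimatePoseCost(
  boneCount: number,
  blendedClips: number,
  translatedBones: number // bones that also animate position (hips, shoulders, ...)
): RigCost {
  const rotationCurves = boneCount * 4; // 4 curves per bone quaternion
  const translationCurves = translatedBones * 3;
  return {
    // every blended clip samples its own set of curves
    curvesSampled: (rotationCurves + translationCurves) * blendedClips,
    // one local-to-world multiply per bone, plus one skinning-matrix multiply
    matrixMultiplies: boneCount * 2,
  };
}

// 100 bones, 3 blended clips, ~10 bones with positional animation:
// ~1290 curve samples and ~200 matrix multiplies per character per frame
console.log(estimatePoseCost(100, 3, 10));
```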

Alex Goldring@SoftEngineer·
@BrookeHodgman Can you show me some examples and bone counts that you consider to be typical in 2026?
Alex Goldring@SoftEngineer·
@cassett31 These, basically, are 324 different meshes. Each skin is a full copy of the character. In terms of perf, there would be no difference between this and a scene where every character is a different mesh.
Alex Goldring@SoftEngineer·
For now - blending is coming, but controls stay on the CPU side. That is, the user is responsible for changing the current time if they want the animation to play from a different point. The user also sets playback_rate, which can be negative by the way, as well as weight. IK and other animation-adjacent features are not planned for now. There is space in the architecture for these features though.
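A minimal sketch of what that CPU-side control surface could look like per playing clip; the field names are illustrative, not Shade's API.

```ts
// Illustrative per-clip playback state: the CPU writes it, the GPU-driven
// animation system reads it when sampling the clip.
interface ClipPlayback {
  clipIndex: number;    // which baked clip to sample
  time: number;         // current sample time in seconds, set by the user
  playbackRate: number; // 1 = normal speed; negative values play backwards
  weight: number;       // blend weight, once blending lands
}

// Example: play clip 2 backwards at half speed with full weight
const playback: ClipPlayback = { clipIndex: 2, time: 4.5, playbackRate: -0.5, weight: 1 };
```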
Gabriel L. Kannenberg@gabriellkann·
@SoftEngineer Have you drafted a plan of animation features you would need to support for a full game? Since it's all GPU driven, I am curious how you would implement things like state machines, IK and transition blends. That's usually gameplay driven... What about multi-actor interactions?
Alex Goldring@SoftEngineer·
Actual skinning takes the most time, so it would help a lot. I'd say about 90% of the perf budget goes towards skinning and only about 10% towards actual animation. If you dropped vertex count per character from the 28k that this uses to 14k - you could expect perf to almost double.
Kryyative@KryyssX·
@SoftEngineer Out of curiosity, how much difference in performance is there if a stylised aesthetic is used to lower the tri count per actor? Using a cel shader to justify the lack of detailing on the materials and mesh, for example.
Bartek Moniewski@BartekMoniewski·
@SoftEngineer When I hit that I immediately kill the context and start a fresh chat. There are a few scientific papers suggesting that any LLM will slowly deep-fry itself into full-blown artificial psychosis after a bit of arguing.
Alex Goldring@SoftEngineer·
As an expert, I find AI gaslighting frustrating: when Claude/Gemini/ChatGPT tries to convince you that black is white and the sky is actually green. Specific examples include hallucinating a new API for WebGPU and trying to convince me that "no, it really exists, you're just too uninformed".
Alex Goldring@SoftEngineer·
@kteleho Looks interesting. I was thinking of supporting physics and attachments later on. For that you kind of need bones, and with bones skinning makes sense. I'll check out the video later, thanks for the tip.
kt@kteleho·
@SoftEngineer youtu.be/xh0gT8acihE?si… one of the series. Not a big deal since the VAT texture method is a common practice. Bake anims to vertex colors as coords and animate using a timeline. All on GPU, instanced mesh and a state machine to control variations. Something like this. Here, 10,000 meshes.
Alex Goldring@SoftEngineer·
@AgileJebrim @Colonthreee You're right, and to answer your question - a bit of both. It's hard to find the right balance, and some things end up being judgement calls.
Jebrim@AgileJebrim·
@SoftEngineer @Colonthreee Is realism the priority or real-time performance? To optimize for one is usually to deoptimize for the other.