Bert Van den Bosch

711 posts

@bertvdbosch5

CS grad @KULeuven | Engine- & Graphics-Engineer @Cyborn3D

Belgium · Joined December 2016
582 Following · 114 Followers

Pinned Tweet
Bert Van den Bosch @bertvdbosch5 ·
The future of Blender Hair + Node workflow?😉 Been playing around with geometry nodes to get some hair interpolation, clumping and curls going! Watch the full length demo: youtu.be/2M8F8PvWiA4 #b3d #geometrynodes

Bert Van den Bosch retweeted
beginbot 🃏 @beginbot ·
wow @garrytan just exposed Anthropic as total frauds Claude Code was ONLY 512K LOC ☹️ Gary is shipping 37K LOCs PER DAY so Gary could recreate all of Claude Code in ONLY 13 days! a supposedly $380 billion is big trouble

Bert Van den Bosch @bertvdbosch5 ·
@duhonedd @SebAaltonen Interesting take, although most renderers running on Adreno 7xx are probably already saturating texture pipe hw because of the material textures?

Eric Duhon @duhonedd ·
Maybe a compromise. Positions stay available for hardware vert fetch. Rest change to load. Since positions probably benefit the most from hardware fetch, binning pass. I'd be curious to see a performance comparison on Adreno. If hit is small enough, I might take the simplicity instead. Especially since adreno will dump the fetch units at some point anyway, if they haven't already in 800 or newer

Sebastian Aaltonen @SebAaltonen ·
Setting specialization constants in our shader PSO create API. You provide a span of constants. In the example the span is filled from initializer list (all stack objects, no allocs or copy).
.specializationConstants = { gfx::shader_constant_bool(alpha_clip_enabled) }

Bert Van den Bosch @bertvdbosch5 ·
@SebAaltonen Also in the HypeHype renderer? Still lots of mobile gpus using texture hw instead of raw buffer loads when doing vertex pulling... Looking at you adreno

Sebastian Aaltonen @SebAaltonen ·
I will be nuking the vertex buffer layout soon. 100% vertex pulling in the future. I don't like having to describe my data layouts to the graphics API. It should not need to know my data layout.

Bert Van den Bosch @bertvdbosch5 ·
@SebAaltonen Wait, what is the difference between the two shader variants (old) and the new 2 hardcoded shaders?😅

Sebastian Aaltonen @SebAaltonen ·
Specialization constant discard timings (6K res):
G-buffer:
2 hardcoded shaders: 1.44ms
Spec constants: 1.45ms
Shadows:
2 hardcoded shaders: 0.10ms, 0.07ms, 0.27ms, 0.07ms
Spec constants: 0.11ms, 0.05ms, 0.27ms, 0.07ms
Identical performance. Some measurement noise of course.

Bert Van den Bosch @bertvdbosch5 ·
@SebAaltonen How does the order of the draw calls influence TBDR binning? I was under the impression that the binning pass only executes the vertex shader.

Sebastian Aaltonen @SebAaltonen ·
If you run all shaders that could perform discard last, you guarantee that rest of the scene doesn't suffer from worse Z-compression / early-Z / Hi-Z performance. And TBDR doesn't need to do extra partial tile evaluations.
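
The ordering described above can be sketched on the CPU side as a stable partition of the draw list. This is only an illustration under assumptions: the `Draw` struct, its `may_discard` flag, and `order_discard_last` are hypothetical names, not part of any renderer mentioned in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical draw record; a real engine would carry far more state.
struct Draw {
    std::string name;
    bool may_discard;  // true if the fragment shader can execute discard / alpha clip
};

// Moves every draw whose shader may discard to the end of the list while
// keeping the relative order inside each group unchanged.
// Returns the index of the first discard-capable draw.
inline std::size_t order_discard_last(std::vector<Draw>& draws) {
    auto is_opaque = [](const Draw& d) { return !d.may_discard; };
    auto first_discard = std::stable_partition(draws.begin(), draws.end(), is_opaque);
    assert(std::is_partitioned(draws.begin(), draws.end(), is_opaque));
    return static_cast<std::size_t>(first_discard - draws.begin());
}
```

A stable partition is used here so that any existing sort inside each group (for example front-to-back) is preserved; whether that matters depends on the renderer.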

Sebastian Aaltonen @SebAaltonen ·
Alpha clip timings (6K res):
G-buffer:
Discard in main shader: 2.67ms
Two shader variants: 1.86ms (70%)
Binned (discard last): 1.44ms (54%)
Shadows:
2 variants: 0.17ms, 0.10ms, 0.29ms, 0.22ms
Binned: 0.10ms, 0.07ms, 0.27ms, 0.07ms (65%)
Thread...
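
The percentages in these timings can be checked with a few lines of arithmetic. The sketch below assumes that the G-buffer figures are relative to the 2.67ms "discard in main shader" baseline, and that the shadow figure compares the sum of the four binned passes with the sum of the four two-variant passes; the tweet does not state this explicitly. `percent_of` and `total_ms` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Ratio of a timing to a baseline timing, rounded to a whole percentage.
inline int percent_of(double time_ms, double baseline_ms) {
    assert(baseline_ms > 0.0);
    return static_cast<int>(std::lround(100.0 * time_ms / baseline_ms));
}

// Sum of several per-pass timings in milliseconds.
inline double total_ms(const std::vector<double>& passes) {
    return std::accumulate(passes.begin(), passes.end(), 0.0);
}
```

Under those assumptions, `percent_of(1.86, 2.67)` and `percent_of(1.44, 2.67)` reproduce the 70% and 54% G-buffer figures, and the summed shadow passes (0.51ms against 0.78ms) reproduce the 65% figure.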

Sebastian Aaltonen @SebAaltonen ·
SSAO = 0.32ms
Bilateral = 0.03ms (9.3%)
Pretty good.

Sebastian Aaltonen @SebAaltonen ·
SSAO bilateral blur:
- Codex initial version = 0.085ms
- With gather4 optimizations = 0.03ms
fp16 optimizations didn't make it any faster, but hopefully save power.

Sebastian Aaltonen @SebAaltonen ·
@CUDAHandbook Vertices are not in SoA layout on purpose, because index buffer is not contiguous. You want AoS memory layout for vertices. In SoA layout, you end up with N cache misses instead of 1 whenever the index is not contiguous. Not good.
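
The "N cache misses instead of 1" argument can be made concrete by counting the cache lines a single indexed vertex fetch touches in each layout. This is a rough sketch under stated assumptions: a 64-byte cache line, buffers that start on a cache-line boundary, and SoA streams that live in separate allocations. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed cache line size in bytes

// Distinct cache lines touched when fetching element `index` from a tightly
// packed array whose elements are `stride` bytes each (AoS: one whole vertex).
inline std::size_t lines_touched_aos(std::size_t index, std::size_t stride) {
    assert(stride > 0);
    std::size_t first = (index * stride) / kCacheLine;
    std::size_t last = (index * stride + stride - 1) / kCacheLine;
    return last - first + 1;
}

// Distinct cache lines touched when the same vertex is split into one stream
// per attribute (SoA). Streams are assumed to be separate allocations, so
// lines from different streams never coincide.
inline std::size_t lines_touched_soa(std::size_t index,
                                     const std::vector<std::size_t>& attr_sizes) {
    std::size_t total = 0;
    for (std::size_t size : attr_sizes) total += lines_touched_aos(index, size);
    return total;
}
```

For a 32-byte vertex made of a 12-byte position, a 12-byte normal and an 8-byte UV, a fetch at index 1000 touches one line in the interleaved layout and three in the split layout, one per attribute stream.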

Sebastian Aaltonen @SebAaltonen ·
This reminded me that I should kill vertex buffers and vertex input assembly setup APIs in our RHI. We no longer need GLES backwards compatibility.
Vertex vertex = buffer.vertices[gl_VertexID];
Tech Bro Memes@techbromemes

Bert Van den Bosch retweeted
Gabriel Dechichi @gdechichi ·
@kimmonismus you aren’t though. none of these things are games, none of them have been shipped, people are not playing them, they are twitter demos. if this guy can prompt all these games into existence, why is he selling courses instead of selling games?

Bert Van den Bosch @bertvdbosch5 ·
@arno_coomans These results look very promising! Are there also any comparisons between probes and this technique in a more 'open-world/outdoor' scenario where there is possibly less variance in the GI?

Arno Coomans @arno_coomans ·
Our Neural Irradiance Volume (Eurographics 2026) permits real-time rendering of large scenes with dynamic objects and moving lights, while providing a higher quality at a given memory budget (10x improvement over probe grids!). Project page: arnocoomans.be/eg2026/.

Bert Van den Bosch @bertvdbosch5 ·
@BattleAxeVR What about using vulkan texture blitting? Provides a hw fast path on most mobile gpus struggling with vram bw

BattleAxeVR @BattleAxeVR ·
Single-pass MIPGEN in CS are certainly beneficial and I will implement it in my own engine, however very low priority. I profiled dynamic mipgen on post-DLSS upscaled images for lens unwarping on my RTX 5090 and it was very, very fast. But lower VRAM bw GPUs wd benefit, no doubt.
Kiriakos Gavras@Kiiiri7

Wrote a blog post about the Single Pass Downsampler I implemented for my depth pyramid. Code included. syllogi-graphikon.vercel.app/posts/metal-si…

Bert Van den Bosch retweeted
RejectedShotgun @RejectedShotgun ·
Daily reminder that Bungie made every Halo level a .ASS file and every object you put inside the .ASS is known as a poop. This is hard-coded, you cannot change it.

:3 @Colonthreee ·
Either my account is dead, shadowbanned, or notifications don't work.

Bert Van den Bosch @bertvdbosch5 ·
@gdechichi @panoskarabelas Personally I batch by material e.g. large indirect draw per material type. Still have to figure out some compaction if all draws get culled indirectly… This still leaves the user’s option to use a single master material and use a single batch!

Gabriel Dechichi @gdechichi ·
@panoskarabelas how are you handling material uniform buffers for different material types? is the layout determined by the largest material struct, or is there some indirection?

Panos Karabelas @panoskarabelas ·
I got GPU driven rendering working thanks to the single geometry buffer, bindless textures and samplers, and an uber shader approach that keeps PSOs down. Very low state permutations, the engine lives up to the Spartan name, and it made the implementation much easier, leaving it for last was the right call. Draw calls dropped a bit more too, at this point most of them are probably ImGui.
Panos Karabelas@panoskarabelas

I switched from per mesh vertex and index buffers to a single global geometry buffer that holds everything. This is the same idea id Software used in id Tech for Doom, and you can see the vertex and index binding counts drop, which opens the road for easier GPU driven rendering.
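
The single global geometry buffer idea can be sketched as a pair of growing arrays plus a per-mesh range record. This is a CPU-side illustration only, not the engine's actual code: `GeometryBuffer`, `MeshRange` and `Vertex` are hypothetical names, and the range fields are chosen to mirror the `firstIndex` / `indexCount` / `vertexOffset` parameters of an indexed draw in Vulkan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex { float px, py, pz; };  // hypothetical minimal vertex

// Where one mesh lives inside the shared vertex and index arrays.
struct MeshRange {
    std::uint32_t first_index;
    std::uint32_t index_count;
    std::int32_t vertex_offset;
};

class GeometryBuffer {
public:
    // Appends a mesh to the shared arrays and returns its range.
    // Indices stay mesh-local; vertex_offset rebases them at draw time.
    MeshRange add_mesh(const std::vector<Vertex>& verts,
                       const std::vector<std::uint32_t>& idx) {
        for (std::uint32_t i : idx) assert(i < verts.size());
        MeshRange range{static_cast<std::uint32_t>(indices_.size()),
                        static_cast<std::uint32_t>(idx.size()),
                        static_cast<std::int32_t>(vertices_.size())};
        vertices_.insert(vertices_.end(), verts.begin(), verts.end());
        indices_.insert(indices_.end(), idx.begin(), idx.end());
        return range;
    }

    std::size_t vertex_count() const { return vertices_.size(); }
    std::size_t index_count() const { return indices_.size(); }

private:
    std::vector<Vertex> vertices_;
    std::vector<std::uint32_t> indices_;
};
```

With every mesh addressed by a range into the same two buffers, the vertex and index buffers need to be bound only once, and each draw (direct or indirect) differs only in the three range values.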
