Nicolas Riciotti

344 posts

@twodelab

Freelance Creative Developer https://t.co/o8vjGNb19q

Paris, France · Joined August 2011
495 Following · 1.6K Followers
Nicolas Riciotti@twodelab·
@jo_chemla @BrianKaris Hey! Thanks!! - After all culling and LOD selection it renders about 30M tris, but it can handle scenes with up to 500B tris, maybe even more. - Yep, I plan on adding WebGPU support to use compute shaders to make things faster :) - plz see my reply here: twitter.com/twodelab/statu…
Jonathan Chemla@jo_chemla·
@twodelab @BrianKaris Very interesting, thanks for sharing! A few Qs: - what's the max poly count you ingested using your technique on standard gpu? - plan to translate to webgpu to use compute shaders? - why splitting the depth over 1/4th of the 0-1000 range rather than using full precision of int32?
Nicolas Riciotti@twodelab·
@edankwan hey ! sorry for the late reply. I tried using uint, but was not sure how to encode both the depth & triangleIndex without breaking the gl.MIN test. imagine you have a depth of 0.01 and a triangleIndex of 999999, how to encode it to only use depth for the gl.MIN test ?
ᴇᴅᴀɴ ᴋᴡᴀɴ@edankwan·
@twodelab Why using float instead of uint? It seems more flexible to encode the data and precision however you want if you are using webgl2 anyway.
Nicolas Riciotti@twodelab·
There’s obviously a lot more going on in the pipeline, but this is just an example of the type of hacks used to make it work in WebGL. Hope this is helpful!
Nicolas Riciotti@twodelab·
And on top of this we can use MultipleRenderTargets to store more info, with 1 MRT for each info (triangle index, cluster index, page index, instance index). Since we store the depth in the integer part, that's what will be used to decide which fragment wins the gl.MIN test.
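A minimal sketch of the per-target idea in this post, written in plain Python instead of shader code. The field names, the 15-bit info budget, and the `resolve` and `unpack` helpers are assumptions made for illustration, not the author's implementation. Each render target gets its own value with the depth bucket in the integer part and one piece of info in the fraction, and a MIN over each target is taken independently.

```python
INFO_LEVELS = 1 << 15  # assumption: 15 bits of fraction per target


def encode(bucket: int, info: int) -> float:
    """Depth bucket in the integer part, one info value in the fraction."""
    return bucket + info / INFO_LEVELS


def unpack(value: float) -> tuple[int, int]:
    """Split an encoded value back into (depth bucket, info)."""
    bucket = int(value)
    return bucket, round((value - bucket) * INFO_LEVELS)


def resolve(fragments, targets=("triangle", "cluster", "page", "instance")):
    """Emulate MIN blending on every target for fragments covering one pixel."""
    return {
        name: min(encode(f["depth"], f[name]) for f in fragments)
        for name in targets
    }


fragments = [
    {"depth": 10, "triangle": 7, "cluster": 3, "page": 1, "instance": 0},
    {"depth": 4, "triangle": 900, "cluster": 12, "page": 2, "instance": 5},
]
pixel = resolve(fragments)
```

Because the integer part dominates the comparison, every target ends up holding the fragment with the smaller depth bucket. In this sketch, two fragments that share the same bucket could win on different targets, so a real pipeline would need enough depth resolution (or another tie-break) to avoid that.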
Nicolas Riciotti@twodelab·
Using a Float32 frame buffer, storing the depth as an integer and the triangle info in the fractional part. But doing so reduces the precision of the depth value we can store, so to compensate we can split the depth into 4 ranges, and store it in either the red, green, blue or alpha channel.
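The packing described in this post can be sketched in Python as follows. The bit budget (8 bits of depth, 15 bits of payload) and all function names are assumptions chosen so that every value fits exactly in a float32 significand; the split across four colour channels is not shown.

```python
import math
import struct

DEPTH_LEVELS = 256   # assumption: 8 bits for the integer (depth) part
ID_LEVELS = 1 << 15  # assumption: 15 bits for the fractional (payload) part


def to_float32(x: float) -> float:
    """Round a Python float to the nearest float32, as a Float32 buffer would."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def encode(depth01: float, payload: int) -> float:
    """Pack a depth in [0, 1) and a payload in [0, ID_LEVELS) into one float32."""
    bucket = min(int(depth01 * DEPTH_LEVELS), DEPTH_LEVELS - 1)
    return to_float32(bucket + payload / ID_LEVELS)


def decode(value: float) -> tuple[int, int]:
    """Recover (depth bucket, payload) from an encoded value."""
    bucket = math.floor(value)
    return bucket, round((value - bucket) * ID_LEVELS)


def min_blend(fragments):
    """Emulate MIN blending: the smallest encoded value survives."""
    return min(encode(depth, payload) for depth, payload in fragments)


winner = min_blend([(0.5, 123), (0.25, 32767), (0.75, 1)])
```

Here the fragment at depth 0.25 wins even though its payload is the largest, because the payload only occupies the fraction. The cost is visible in `DEPTH_LEVELS`: the more bits the payload takes, the fewer remain for depth, which is the precision loss the post mentions.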
Nicolas Riciotti@twodelab·
But if we use a single gl_Point per triangle, compute the triangle's screen-space bounding box in the vertex shader, use that bounding box to set the gl_PointSize, and use a custom software rasterizer in the fragment shader, we can already get a 2x boost compared to the hardware rasterizer
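A CPU-side sketch of the steps in this post, assuming triangles already projected to pixel coordinates. `point_sprite` stands in for the vertex-shader work (bounding box to point centre and size) and `rasterize` stands in for the per-fragment coverage test, here done with edge functions. Both names and the choice of edge functions are assumptions for illustration; the author's shaders may differ.

```python
import math


def edge(a, b, p):
    """Signed area term: positive when p is on one side of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def point_sprite(tri):
    """Vertex-shader stand-in: bounding box -> point centre and point size."""
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    center = ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)
    size = math.ceil(max(max(xs) - min(xs), max(ys) - min(ys))) + 1
    return center, size


def rasterize(tri):
    """Fragment-shader stand-in: keep pixels whose centre lies in the triangle."""
    area = edge(tri[0], tri[1], tri[2])
    if area == 0:
        return []  # degenerate triangle covers nothing
    center, size = point_sprite(tri)
    x0 = math.floor(center[0] - size / 2)
    y0 = math.floor(center[1] - size / 2)
    covered = []
    for y in range(y0, y0 + size + 1):
        for x in range(x0, x0 + size + 1):
            p = (x + 0.5, y + 0.5)
            w = (edge(tri[1], tri[2], p), edge(tri[2], tri[0], p), edge(tri[0], tri[1], p))
            if all(wi * area >= 0 for wi in w):  # same sign as the triangle's area
                covered.append((x, y))
    return covered


tri = [(0, 0), (4, 0), (0, 4)]
covered = rasterize(tri)
```

The point square is always at least as large as the bounding box, so for very small triangles only a handful of fragments run the coverage test.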
Nicolas Riciotti@twodelab·
One of the main optimisations is the usage of a (pseudo) software rasterizer for micro triangles. I say 'pseudo' because, since WebGL 2 doesn’t support compute shaders, we cannot build a full software rasterizer like the one used in Nanite.
Nicolas Riciotti@twodelab·
@pervasivesense Hey ! Yes that’s only the automatic LOD system. I also started working on streaming+compression but want to improve the rendering before focusing more on that.
Adrian Sanchez@pervasivesense·
@twodelab What in the eff?? So is this only the auto LOD graph work that we're seeing here? Does this also implement the optimized mesh disk streaming/compression that Nanite does?
Nicolas Riciotti@twodelab·
@cemdemir Thanks ! and yes, it does. Still quite some work to do, but it's promising
Cem@cemdemir·
@twodelab Do you think it could potentially render more triangles without sacrificing performance in the browser? It works for Unreal but I wonder if it can work on the web without performance issues? Cool demo, can’t wait to hear more about it!
Nicolas Riciotti@twodelab·
@nwpointer Hey ! Not yet sorry, but I’ll give some more insights on how this works soon :)
Nicolas Riciotti@twodelab·
@N8Programs hey! From my testing, transform feedback seemed to be a lot slower than the GPGPU approach. I guess it depends on the use case, but it was 3 to 5 times slower
N8 Programs@N8Programs·
Something I'm confused about: why is transform feedback superior to simple GPGPU in fragment shaders? What does transform feedback enable that the GPGPU in fragment shaders doesn't?
Nicolas Riciotti@twodelab·
@Pjbomb2 Hey! Really nice work :) Random guess, but did you try either clamping your new samples or making sure a new sample cannot be more than x times brighter than the previously accumulated color? This helps with fireflies and à-trous filtering, so maybe it can help here
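One way to read the suggestion in this reply, as a small Python sketch: cap the luminance of each new sample at a multiple of the accumulated colour's luminance. The `max_ratio` and `floor` parameters and the Rec. 709 luminance weights are assumptions picked for the example, not values from the thread.

```python
def luminance(rgb):
    """Rec. 709 relative luminance of a linear RGB triple."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def clamp_sample(sample, accumulated, max_ratio=4.0, floor=0.05):
    """Scale `sample` down so it is at most `max_ratio` times brighter
    than `accumulated`; `floor` keeps dark history from clamping everything."""
    limit = max(luminance(accumulated), floor) * max_ratio
    lum = luminance(sample)
    if lum <= limit:
        return sample
    scale = limit / lum
    return tuple(c * scale for c in sample)


history = (0.5, 0.5, 0.5)
firefly = clamp_sample((100.0, 100.0, 100.0), history)
normal = clamp_sample((1.0, 1.0, 1.0), history)
```

Scaling the whole triple preserves the sample's hue while limiting its energy. The clamp introduces bias (bright, legitimate highlights are dimmed too), so the ratio is a trade-off between noise and accuracy.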
Payton@Pjbomb2·
So this is my current problem with ASVGF. The first picture is obviously the problem exaggerated (with all objects being almost perfect mirrors), but the second picture shows more how bad it is with even a metallic surface that's quite rough. How do you fix/improve this?
[Two images attached]
Nicolas Riciotti@twodelab·
@chrizzlibit Oh yes sorry, TLAS stands for « Top Level Acceleration Structure » so exactly like TLBVH + BLBVH :)
chris@chrizzlibit·
@twodelab Nice! What does TLAS stand for? Is it something like TLBVH + BLBVH, like you would do for instancing? My problem currently is finding a spatial hierarchy that scales well for large open worlds ^^
Nicolas Riciotti@twodelab·
@chrizzlibit Yes indeed, it’s okay to have several objects/triangles using the same Morton code, at least at first. But these duplicates need to be handled when building the binary radix tree. Also, as you guessed, it doesn’t scale to large scenes (yet), but I've got plans for this (TLAS + per-object BVH)
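A short Python sketch of the duplicate-code issue mentioned in this reply. It builds 30-bit Morton codes from centroids in the unit cube and then makes the sort keys unique by appending the primitive index in the low bits. That tie-break is a commonly used option and an assumption here; the thread does not say how the author handles duplicates, and all function names are hypothetical.

```python
def interleave3(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z (x in the highest slot)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def morton3d(point, bits: int = 10) -> int:
    """Morton code of a point whose coordinates lie in [0, 1)."""
    scale = 1 << bits
    q = [min(max(int(c * scale), 0), scale - 1) for c in point]
    return interleave3(q[0], q[1], q[2], bits)


def sorted_unique_keys(centroids, bits: int = 10):
    """Sort keys made unique by appending the primitive index (low 32 bits)."""
    keys = [(morton3d(c, bits) << 32) | i for i, c in enumerate(centroids)]
    return sorted(keys)


centroids = [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5), (0.1, 0.2, 0.3)]
keys = sorted_unique_keys(centroids)
order = [k & 0xFFFFFFFF for k in keys]  # primitive indices in sorted order
```

With unique keys, a radix-tree builder that splits on the highest differing bit between neighbouring keys never sees two identical neighbours, while primitives sharing a Morton code still end up adjacent in the sorted order.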
chris@chrizzlibit·
@twodelab Ah, you‘re getting the bits differently then! That does sound more logical for „normal size“ scenes tho. Some objects would prob get the same morton code, but i guess exact order in localised areas doesn‘t matter much, right? I wonder if this scales to a large open world though…