Jon Kennedy

281 posts

@jon_c_kennedy

Nvidia dev tech engineer. Opinions are my own, blah, blah, blah... 😁

UK · Joined August 2015
210 Following · 215 Followers

Jon Kennedy retweeted
NVIDIAGameDev @NVIDIAGameDev
Tomorrow at 9 AM PT, get your questions answered about NVIDIA #RTX Path Tracing. Join our #AMA to discover how you can take advantage of NVIDIA technologies to accurately recreate the physics of all light sources in a scene. Start asking questions now.

Jon Kennedy @jon_c_kennedy
@Dachsjaeger @D_S_O_Gaming I would think there *is* a difference in fluidity between configs. SMT/e-cores increase variance, depending on where different threads are run. If a thread is limiting perf and it suddenly runs on an e-core, or has to share resources on an SMT core, then it will cause variance.

Alexander Battaglia @Dachsjaeger
@D_S_O_Gaming Those slightly different frame-rates though are not really an interesting thing for real performance. The real performance is found in fluidity, right? There is not a difference in frame-time spikes between different E, P, HT, etc. configurations.

Jon Kennedy @jon_c_kennedy
@matiasgoldberg IIRC it can speculatively get the neighbouring cachelines. Not a good thing if you are already hitting false sharing!

Matías N. Goldberg @matiasgoldberg
Being a seasoned C++ developer means you know you need a local container variable for better performance if you're going to iterate a lot, due to how C++ rules work.
[Attached image not included]

Jon Kennedy @jon_c_kennedy
@matiasgoldberg don't forget the prefetcher which can catch you out with false sharing - might want padding[256] to be sure... 🫤
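
A minimal C++ sketch of the padding idea in this reply, assuming per-thread counters as the shared data. The 256-byte figure comes from the `padding[256]` suggestion; it is not a documented hardware constant, and `PaddedCounter`, `kPadBytes`, `kThreads`, and `count_in_parallel` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 256 bytes, following the "padding[256]" suggestion.
// Tune per target CPU; this is not a documented hardware constant.
constexpr std::size_t kPadBytes = 256;

// Each counter sits at the start of its own 256-byte aligned slot, so the
// cache lines next to it hold only padding, never another thread's counter.
struct alignas(kPadBytes) PaddedCounter {
    std::uint64_t value = 0;
};

static_assert(sizeof(PaddedCounter) == kPadBytes, "one counter per padded slot");

constexpr std::size_t kThreads = 4;

// Hypothetical workload: every thread increments only its own counter.
inline std::uint64_t count_in_parallel(std::uint64_t iterations_per_thread) {
    static std::array<PaddedCounter, kThreads> counters;  // static storage honours alignas
    for (auto& c : counters) c.value = 0;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < kThreads; ++t) {
        workers.emplace_back([t, iterations_per_thread] {
            for (std::uint64_t i = 0; i < iterations_per_thread; ++i) {
                counters[t].value += 1;  // no other thread writes to this slot
            }
        });
    }
    for (auto& w : workers) w.join();

    std::uint64_t total = 0;
    for (const auto& c : counters) total += c.value;
    return total;
}
```

Whether 64, 128, or 256 bytes is enough depends on the CPU and its prefetchers, so it is worth measuring rather than assuming.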

Matías N. Goldberg @matiasgoldberg
When you swap the container into a local variable, you make the problem go away. Technically it still exists, but the false sharing only happens at the beginning & end of the function. But not per iteration. Another solution would be to do:
[Attached image not included]
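
The code from the attached images is not available here, so the following is not the tweet's code. It is a minimal sketch of the swap-into-a-local idea described above, with a hypothetical `Worker` type and `collect` method.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only.
struct Worker {
    // This member's bookkeeping (begin/end/capacity pointers) may share a
    // cache line with data that other threads write to.
    std::vector<int> results;

    void collect(std::size_t count) {
        std::vector<int> local;
        local.swap(results);  // touch the shared object once, at the start

        for (std::size_t i = 0; i < count; ++i) {
            // push_back now updates the local's bookkeeping, which lives on
            // this thread's stack (or in registers), not in the shared object.
            local.push_back(static_cast<int>(i));
        }

        results.swap(local);  // and touch it once more at the end
    }
};
```

Existing elements survive because `swap` exchanges contents instead of discarding them; the element storage itself is heap-allocated and unaffected by where the vector object lives.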

Jon Kennedy @jon_c_kennedy
@CapFrameX @antdavis1986 @Dachsjaeger @matiasgoldberg but isn't that due, in part, to the massive benefit of the huge L3 in the 5800X3D? Combine that cache with a fast CPU and it's a killer combo! The faster CPUs get, the more other resources become the bottleneck, like L3 and DRAM.

Matías N. Goldberg @matiasgoldberg
If single core CPU usage is low and GPU usage is also low then the bottleneck is neither CPU nor GPU. Could be something else like PCIe or RAM BW bottleneck, or too many thread sync points per frame, or a buggy sleep(), or the game bugs out the more cores you have.

Jon Kennedy @jon_c_kennedy
@matiasgoldberg @Dachsjaeger HT doesn't always deliver - always worth trying to disable it to see if you get extra perf 🤔 Unless the SW threads are pinned, I wouldn't expect a core to be 100% utilised in a game.

Matías N. Goldberg @matiasgoldberg
@Dachsjaeger @jon_c_kennedy What a weird spread: The game is ignoring Hyperthreading (questionable choice, but understandable), no CPU core ever reaches 100% (?!) and 3 E cores at 18%. Maybe the missing 20% in those two 80% threads can be explained by CPU-hopping, waiting for GPU, for PCIe or E Cores

Jon Kennedy @jon_c_kennedy
@toncijukic @matiasgoldberg @Dachsjaeger I've also seen issues in a Unity title with negative scaling related to core count. Too many threads trying to get work in a badly designed job system that only has a single job queue lock - results in job queue starvation.
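
One common way to avoid a single job queue lock is to give each worker its own queue and lock, and let idle workers steal. The C++ sketch below illustrates that general technique only; it is not the design of the title mentioned above, and `ShardedJobQueues` and `run_jobs` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical job system: one queue and one lock per worker, so workers
// mostly contend only when they steal from each other.
class ShardedJobQueues {
public:
    // `workers` must be greater than zero.
    explicit ShardedJobQueues(std::size_t workers) : shards_(workers) {}

    void push(std::size_t shard, std::function<void()> job) {
        Shard& s = shards_[shard % shards_.size()];
        std::lock_guard<std::mutex> lock(s.mutex);
        s.jobs.push_back(std::move(job));
    }

    // Try the worker's own shard first, then steal from the others in turn.
    std::optional<std::function<void()>> pop(std::size_t worker) {
        for (std::size_t i = 0; i < shards_.size(); ++i) {
            Shard& s = shards_[(worker + i) % shards_.size()];
            std::lock_guard<std::mutex> lock(s.mutex);
            if (!s.jobs.empty()) {
                std::function<void()> job = std::move(s.jobs.front());
                s.jobs.pop_front();
                return job;
            }
        }
        return std::nullopt;  // every shard was empty
    }

private:
    struct Shard {
        std::mutex mutex;
        std::deque<std::function<void()>> jobs;
    };
    std::vector<Shard> shards_;  // sized once in the constructor, never resized
};

// Queues `job_count` trivial jobs up front, runs them on `workers` threads,
// and returns how many were executed.
inline std::size_t run_jobs(std::size_t workers, std::size_t job_count) {
    ShardedJobQueues queues(workers);
    std::atomic<std::size_t> executed{0};
    for (std::size_t j = 0; j < job_count; ++j) {
        queues.push(j, [&executed] { executed.fetch_add(1, std::memory_order_relaxed); });
    }
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&queues, w] {
            // All jobs were queued before the workers started, so an empty
            // result here means the work is finished.
            while (auto job = queues.pop(w)) (*job)();
        });
    }
    for (auto& t : threads) t.join();
    return executed.load();
}
```

Production job systems usually go further (lock-free deques, sleeping idle workers), but even this split removes the single lock every thread would otherwise queue behind.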

Tonči Jukić @toncijukic
There might be other issues at hand, but I'm not exactly certain about the culprit. I've seen a Unity title (stock engine and HDRP) go so low on CPU and GPU that GPU doesn't even clock to 3D mode and none of the cores reached above 30%, not even E-cores. PCIe was not loaded either. Smells like some bug other than usual suspects. API overhead? Scheduler? Cache issues? Driver? It "fixed itself" some weeks later apparently as I wasn't on that project for a while before trying again and finding it fixed.

Jon Kennedy @jon_c_kennedy
@CapFrameX @Dachsjaeger @matiasgoldberg I think single thread perf is still pretty important, but Cinebench MT is not representative at all of games, despite certain IHV's insistence that it is 😉 DRAM and L3 cache can have massive impacts. Ultimately though, it is all game dependent!

CapFrameX @CapFrameX
@Dachsjaeger @matiasgoldberg @jon_c_kennedy You cannot adapt Cinebench numbers to gaming performance. Gaming is a completely different workload. Singlethread performance isn't that important for gaming performance. Memory performance is more important.

Jon Kennedy @jon_c_kennedy
@Dachsjaeger @matiasgoldberg HT off won't help as it's not using the HT cores by the looks of it (cores 1, 3, 5, etc. are at 0%). If you can capture an ETW/ETL file, that should tell us something. You can view by thread, then by CPU to see if it is bouncing around but fully saturating 'a' single core.

Alexander Battaglia @Dachsjaeger
@matiasgoldberg @jon_c_kennedy I will test it with HT off and E-Cores off as a control for that behaviour you mention but I cannot atm as the game's DRM has literally locked me out of it for 24 hours since I tried the game on 2 different PCs back to back.

Jon Kennedy @jon_c_kennedy
@Dachsjaeger @matiasgoldberg Yes - could still be CPU bound though. Either for 70% of the workload (waiting on the GPU for the rest), or the bound thread could just hop CPUs, so it is 70% on core 15 and 30% on core N == 100%

Jon Kennedy @jon_c_kennedy
@matiasgoldberg @Dachsjaeger Note that CPU utilisation shown will be for the full CPU, so could easily be CPU bound on a single thread. What CPU is he using and what freq is it clocked at? Just open up task manager to see what's going on (per logical core)...
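
The arithmetic behind this reply and the "70% on core 15 and 30% on core N" reply above can be sketched in C++ as follows. The plain-average model and the helper names `whole_cpu_utilisation` and `thread_busy_fraction` are assumptions for illustration; real monitoring tools may sample or weight differently.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Whole-CPU utilisation modelled as a plain average over logical cores
// (assumed model; actual tools may compute it differently).
inline double whole_cpu_utilisation(const std::vector<double>& per_core) {
    if (per_core.empty()) return 0.0;
    double sum = 0.0;
    for (double u : per_core) sum += u;
    return sum / static_cast<double>(per_core.size());
}

// Fraction of wall-clock time one thread spent running, summed across every
// core it hopped between.
inline double thread_busy_fraction(const std::vector<double>& share_per_core) {
    double sum = 0.0;
    for (double s : share_per_core) sum += s;
    return sum;
}
```

Under this model, one fully saturated thread on a hypothetical 16-logical-core CPU shows up as 1/16 = 6.25% total utilisation, and a thread that spends 70% of its time on one core and 30% on another is still 100% busy even though no single core reads 100%.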

Matías N. Goldberg @matiasgoldberg
@Dachsjaeger has anyone tried disabling CPU cores? If the game sees fewer cores, it may spawn fewer threads and behave differently. Also monitor PCIe Load in GPU-Z

Laura Reznikov 🇺🇦 @TheAnimator
@KostasAAA Yup! Was gunna do rendering equation stickers too :D For the UK, I’m just trying to figure out if I need to do anything about VAT 🤔

Jon Kennedy retweeted
Laura Reznikov 🇺🇦 @TheAnimator
RenderThreads.com is open for business! 😱 Ever wanted graphics geek swag that wasn’t conference or business branded? Me too! It’s currently set up USA only as I figure out shipping etc, but if you send me a message I’ll see what I can do. #rendering #SmallBusiness

Jon Kennedy retweeted
Digital Foundry @digitalfoundry
In the space of just four years we've somehow moved from a path-traced Quake 2 to a path-traced Cyberpunk 2077... but how? Here's @Dachsjaeger with a new Tech Focus on the hardware and software advances that made this remarkable achievement possible: youtu.be/vigxRma2EPA

Jon Kennedy retweeted
Intel @intel
Today, we lost a visionary. Gordon Moore, thank you for everything.
[Attached image not included]