Pete Brubaker (profile guided optimizer)

4.8K posts

@pbrubaker

Principal Graphics Software Engineer @ Qualcomm. ISPC contributor, machinist, welder, jack-of-all-trades. Ex-LucasArts/EA/ATVI/R*/Intel. All opinions are my own.

Oregon · Joined December 2008
519 Following · 1.1K Followers
Ryan Fleury (@rfleury)
Getting closer.
Steven Savold (@atoi6664)
@rfleury Linux programmers will be getting the last 40 years of debugger technology improvements all at once. You are doing the Lord's work, Ryan! Thank you.
FFmpeg (@FFmpeg)
This post shows the speed differences (larger is better) with intrinsics vs hand-written assembly.
quink (@quink_lamy)

@FFmpeg The performance of the same functions implemented in intrinsics vs asm on ARM64

Jebrim (@AgileJebrim)
@pbrubaker @FFmpeg @3131hue Yeah and there’s been nothing else like that aside from shader languages that have drivers with a CPU target.
shaur (@xXshaurizardXx)
@FFmpeg In fact, done correctly, and assuming our compiler is functioning correctly, it should have zero difference. If it maps to the right instructions, it maps to the right instructions; it doesn't matter if you map it yourself. These sorts of benchmarks are near useless without source.
FFmpeg (@FFmpeg)
@3131hue No, but there have been lots of similar things in the past that haven't stood the test of time and are now unmaintained.
Jebrim (@AgileJebrim)
@pbrubaker That’s a bit of a ridiculous claim considering phone calls are all about streaming audio and the whole point of digital is to get better compression than analog is able to deliver?
Jebrim (@AgileJebrim)
I’m usually pretty optimistic about what computers can handle, but one thing I’m dead set on not being financially or technically feasible is a playable 120 Hz MMOFPS with 100k+ CCU. It’s simply not possible and you’d bankrupt yourself trying. Don’t bother.
Jebrim (@AgileJebrim)

That’s too optimistic. You cannot design tech like this assuming an ideal best case scenario. It’ll choke when everyone comes together into a dense area in sight of each other. Your packet sizes will explode.

Here’s what the worst case represents: 100k players being told what 100k other players are doing 128 times a second. Even if every update was only 8 bytes per player, that’s over 100 Terabits per second. You need a serious networking culling mechanism designed to cut this down to only 1/10,000th of this in the worst-case to get a potentially viable 10 Gbps. Fiddling around the edges with small gains won’t be enough.

And the task is further exacerbated by the fact that you’re now needing to transfer the workload from the network to compute instead. You want distance-based updates? You must compute at least roughly how far everyone is from everyone else within that 8ms. That’s potentially O(N^2) itself.

And what about chat messages? Those are going to take up a bunch of packet bytes too, which are already in short supply. You’ll need to of course cull out these messages by locality and clan too. And we haven’t even talked about anything dynamic in the environment.

An MMOFPS on this scale is simply not possible. I can see 1024 players. Maybe even 2048 players. But not anything beyond that.
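
The worst-case arithmetic in the quoted post can be checked with a few lines of C++. This is a rough sketch, assuming payload bytes only (no packet headers) and the post's figures of 100k players, 128 updates per second and 8 bytes per update. The function names are made up for illustration. Under those assumptions the payload alone comes to about 82 Tbit/s, the same order of magnitude as the figure quoted, and fitting it into 10 Gbps needs a cut of roughly 8,000x to 10,000x.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; payload only, no packet headers.

// Bits per second if every player receives one update about every player each tick.
std::uint64_t worst_case_bits_per_second(std::uint64_t players,
                                         std::uint64_t ticks_per_second,
                                         std::uint64_t bytes_per_update) {
    return players * players * ticks_per_second * bytes_per_update * 8;
}

// How many times the traffic must shrink to fit a bandwidth budget (rounded down).
std::uint64_t required_cull_factor(std::uint64_t worst_case_bps,
                                   std::uint64_t budget_bps) {
    assert(budget_bps > 0);
    return worst_case_bps / budget_bps;
}

// Unordered player pairs a naive all-pairs distance check would visit per tick.
std::uint64_t pair_count(std::uint64_t players) {
    return players * (players - 1) / 2;
}
```

With the post's numbers, worst_case_bits_per_second(100000, 128, 8) is 81,920,000,000,000 bits per second, and pair_count(100000) is 4,999,950,000 pairs to consider inside each tick of about 7.8 ms.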

Pete Brubaker (profile guided optimizer) reposted
Dennis Gustafsson (@voxagonlabs)
My talk from @BetterSoftwareC last week is up on youtube. I present my findings on thread synchronization and job systems that I learned while parallelizing the physics solver. youtube.com/watch?v=Kvsvd6…
Pete Brubaker (profile guided optimizer) reposted
Kecho (@kechogarcia)
Amazing talk! Another technique (common in GPU) is persistent jobs:
- let atomic C = 0
- fill machine -> N jobs:
  - do work
  - prev = InterlockAdd(C, 1)
  - while (prev != N-1) spin
  - if (prev == N-1) prep next loop & break spin
- all threads are persistent and require no scheduling!
Dennis Gustafsson (@voxagonlabs)

My talk from @BetterSoftwareC last week is up on youtube. I present my findings on thread synchronization and job systems that I learned while parallelizing the physics solver. youtube.com/watch?v=Kvsvd6…
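
The persistent-jobs loop in Kecho's reply can be sketched on the CPU with std::atomic. This is a minimal sketch, not the author's code: SpinBarrier, arrive_and_wait and run_persistent_jobs are hypothetical names. The generation counter is an added assumption, so that waiting threads spin on something the last arrival can change (spinning on the local prev value alone would never exit).

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin barrier: the last of N arrivals prepares the next loop
// and releases the others by bumping a generation counter.
struct SpinBarrier {
    explicit SpinBarrier(int n) : n_(n) {}

    template <class F>
    void arrive_and_wait(F&& prep_next) {
        const int gen = generation_.load(std::memory_order_acquire);
        const int prev = count_.fetch_add(1, std::memory_order_acq_rel);
        if (prev == n_ - 1) {
            prep_next();                                   // prep next loop
            count_.store(0, std::memory_order_relaxed);    // C = 0 again
            generation_.store(gen + 1, std::memory_order_release);  // break spin
        } else {
            while (generation_.load(std::memory_order_acquire) == gen) {
                std::this_thread::yield();                 // spin politely
            }
        }
    }

    int n_;
    std::atomic<int> count_{0};
    std::atomic<int> generation_{0};
};

// Runs n_threads persistent workers for n_rounds loops and returns how many
// rounds the last arrival saw every worker's contribution.
int run_persistent_jobs(int n_threads, int n_rounds) {
    assert(n_threads > 0 && n_rounds >= 0);
    SpinBarrier barrier(n_threads);
    std::atomic<int> work{0};
    int ok_rounds = 0;  // only touched by the last arrival of each round
    const int expected = n_threads * (n_threads + 1) / 2;

    auto worker = [&](int id) {
        for (int r = 0; r < n_rounds; ++r) {
            work.fetch_add(id + 1, std::memory_order_relaxed);  // "do work"
            barrier.arrive_and_wait([&] {
                if (work.load(std::memory_order_relaxed) == expected) ++ok_rounds;
                work.store(0, std::memory_order_relaxed);
            });
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) threads.emplace_back(worker, i);
    for (auto& t : threads) t.join();
    return ok_rounds;
}
```

Calling run_persistent_jobs(4, 100) keeps four threads alive across 100 loops with no scheduler involved. The yield in the spin loop is a CPU-side courtesy for machines with fewer cores than threads; a GPU version would spin differently.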

Pete Brubaker (profile guided optimizer) reposted
NOTimothyLottes (@NOTimothyLottes)
Re: i[ndy]Tech 7 (not 8) youtube.com/watch?v=0QYETn… - In simplified terms, the baked vintage lightmap is no more, replaced by simplified frame amortized raytracing building stable probe cascades ...
gingerBill (@TheGingerBill)
C11 is the last version of C that I will use. All other versions after that are not C any more.
Tim Sweeney (@TimSweeneyEpic)
Indeed Unreal Engine is moving to Left-Up-Forward coordinates everywhere, starting with UEFN, and coming to UE5-6 in an incrementally-adoptable way through UI settings and C++ helper functions/macros to ease the transition. This will align Unreal with Y-Up, right handed standards of USD and glTF. Why? Because future 3d tools and ecosystems will be increasingly interoperable and standards-based. There are a lot of missing standards we’ll need to propose, and Team Unreal will be far more successful proposing new things if we adopt and add to existing standards and conventions. The USD-glTF-Maya-Houdini quadrant is the center of mass for complex code-art-pipeline tooling that is highly sensitive to coordinates. (Flipping coordinates when exporting from AutoCAD or Blender is easy enough; changing a movie vfx pipeline is not). Coordinates based on project settings sound like a have-it-your-way compromise but are a combinatorial mess when projects are a mix of code modules and content packages from many independent authors. The best time to make this change would have been 1995, but I believe the second best time is now with the launch of Scene Graph in UEFN.
Inu Games (@games_inu)

They changed the orientation of axis in ortho view in #UE5.6. Is the LUF coming here too?

Khoa Pham (@KhoaPha47916056)
@phyronnaz ISPC is great; I've never tried it, but I really want to get into it one day.
Victor Careil (@phyronnaz)
Doing a talk about ISPC in Linköping :)
Forrest Smith (@ForrestTheWoods)
Linux package managers are amazing!
sudo apt install llvm
installs LLVM 14
need LLVM 19
Linux package managers are garbage trash and I'm tired of people pretending they're not.