muke

1.3K posts

@muke1010101

computer architecture phd student vroom vroom

Imperial Collage London · Joined July 2021
66 Following · 141 Followers

Pinned Tweet
muke @muke1010101
Pleased to share my first ever paper :) 'Improving Memory Dependence Prediction with Static Analysis': arxiv.org/abs/2403.08056

muke @muke1010101
@loveryteks well how cross-arch is ffmpeg code, technically? doesn't it just detect the target it's running on and jump to appropriate code at run time already?
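The run-time dispatch described in this tweet (probe the CPU once, then jump to the matching routine) can be sketched roughly as follows. This is a toy model in Python, not FFmpeg's actual code: the feature names, the `detect_features` helper that reads `/proc/cpuinfo`, and the stand-in kernels are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple

Kernel = Callable[[Sequence[int]], int]


# Hypothetical stand-ins for hand-written kernels. In a real library each
# would be a separate assembly or intrinsics routine; here they all compute
# the same sum so that only the dispatch logic differs.
def sum_scalar(xs: Sequence[int]) -> int:
    total = 0
    for x in xs:
        total += x
    return total


def sum_sse2(xs: Sequence[int]) -> int:
    return sum(xs)


def sum_avx2(xs: Sequence[int]) -> int:
    return sum(xs)


# Ordered from most to least preferred. The scalar fallback requires no
# feature (None), so selection always succeeds.
KERNELS: List[Tuple[Optional[str], Kernel]] = [
    ("avx2", sum_avx2),
    ("sse2", sum_sse2),
    (None, sum_scalar),
]


def detect_features() -> Set[str]:
    """Best-effort probe. Assumption: a Linux-style /proc/cpuinfo exists;
    on any other system this returns an empty set (scalar fallback)."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith(("flags", "Features")) and ":" in line:
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def select_kernel(features: Set[str]) -> Kernel:
    """Return the first kernel whose required feature is available."""
    for required, kernel in KERNELS:
        if required is None or required in features:
            return kernel
    return sum_scalar  # only reached if KERNELS is edited to drop the fallback


# Resolved once at start-up; later calls go straight to the chosen kernel.
sum_dispatch = select_kernel(detect_features())
```

Since the choice is made once and stored in `sum_dispatch`, callers pay for detection only at start-up. The same source then runs on any machine as long as a scalar fallback exists, which is the sense in which such code is cross-arch.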

sq @loveryteks
@muke1010101 then again, i don't keep up with the current state of the art with compilers or SIMD, the last time i was seriously into either was in high school.. maybe it's different now somehow, especially given that you can probably assume AVX2 is available everywhere now..

muke @muke1010101
@loveryteks yea I was definitely getting ahead of myself with superoptimisation there. there have been some interesting things in academia lately but nothing that translates to production yet (for instance arxiv.org/pdf/2306.00229). I do wonder how good MLIR for vectorisation could be though.

sq @loveryteks
@muke1010101 this isn't to imply endorsement of the perspective of the ffmpeg post, though.

muke @muke1010101
@csjh__ entirely from the assembly, which you could still use with Rust anyway!

muke @muke1010101
@csjh__ You're probably right (though there might also be something in knowing ints can't overflow for ffmpeg), but that just goes to show how little the difference between C and Rust matters so you should go with the memory safe option anyway. The extra performance they boast comes -

muke retweeted
LaurieWired @lauriewired
Wow, RISC-V is really gaining traction again. Alibaba just announced the Xuantie C950…which is basically claiming Apple M1 (ish) levels of performance. I don’t see a lot of people talking about it! (2.6/GHz SPECint2017, Apple M1 P-core also around ~2.6)

muke @muke1010101
big discovery: the fields on job applications that make you manually fill out all the info in your CV are pointless, i left them blank and got an interview anyway

muke @muke1010101
@asian_catfish hadn't seen the message for literally 3 minutes 😭

muke @muke1010101
tragically mcpat is useless but still

muke @muke1010101
these also contain what is apparently the only gem5tomcpat.py parser script that works on simpoint checkpoint restores

muke @muke1010101
Also anyone knowledgeable help me out here - coroutines don't mess up stack operation order right? even when they return to the middle of another function, they don't leave values pushed to the stack without cleaning them up first

muke @muke1010101
this does complicate the situation enough to where I'm more convinced this might be a novel idea though, so that's cool

muke @muke1010101
also it's possible the technique i just described is actually more aggressive than what intel really does (it looks like they still need to check for validity) so that's cool too

muke @muke1010101
today's procrastination from my existing projects is implementing an open source version of something intel probably already does called push-pop acceleration