Jeff Plaisance

23 posts

@jeffplaisance

Austin, TX · Joined March 2009
52 Following · 48 Followers
Jeff Plaisance@jeffplaisance·
@geofflangdale Yep, and the "interpreter-like" approach is generally what is done, but in many circumstances it leaves a lot of performance on the table over human-written SIMD code for the specific query. RTCG has its own issues though, like balancing compilation time against execution time.
Geoff Langdale@geofflangdale·
@jeffplaisance to always find primitives that are big enough that composing them via "interpreter-like" mechanisms isn't a particularly big penalty. But I can see that for truly optimal performance on "not known in advance" queries it might be necessary to take the RTCG path.
Geoff Langdale@geofflangdale·
Thinking a bit about "pipeline design" for software. If we've got a reasonably complex problem, we might have a pipeline with 4-5 major stages and perhaps dozens of instructions. I spend a lot of time struggling with not just how the problem should be solved, but how it ...
Jeff Plaisance@jeffplaisance·
@geofflangdale On a related note, do you happen to know why _mm512_mask_compressstoreu_epi8 is way slower than _mm512_maskz_compress_epi8 followed by _mm512_mask_storeu_epi8 on zen 4 (and possibly other microarchitectures?)
Geoff Langdale@geofflangdale·
without being mugged by some ludicrously slow operation. Most vendors seem to have perpetrated this at some stage: it's not a dig at any particular company.
Geoff Langdale@geofflangdale·
Radical idea: ISAs come with performance bounds. Implementations outside the bounds (which might be expressed in terms of 'relative cost' of operations: e.g. this double-shuffle is no more than 6x the latency and reciprocal throughput of an ADD or AND) are non-conforming.
Emil Stenström 🎛️@EmilStenstrom·
@lemire @Love2Code To add some more nuance: compute is only relevant when the code you run is in the “hot path”. For almost all _web_ applications, that’s database queries and network requests. The backend language is mostly just waiting.
Jeff Plaisance@jeffplaisance·
I've open sourced my B++ Trees library, which is a B+ tree library that I wrote in C++. It can be used as a normal B+ tree, but it also has optional mixins that can be used to access elements by index, calculate prefix sums, or find the min/max element github.com/jeffplaisance/…
Jeff Plaisance@jeffplaisance·
@xeraa @fulmicoton @papers_we_love there is a third option of using userfaultfd and madvise to get the benefits of hardware accelerated virtual address resolution with more control over eviction than is possible with mmap. still has some of the downsides of both but could be the right tradeoff for many systems.
Philipp Krenn@xeraa·
interesting take on MMAP. so is the counter take ("compares apples to camels"): ravendb.net/articles/re-ar… PS: looking forward to when we can do @papers_we_love vienna meetups again (in-person only)
Andy Pavlo (@andypavlo.bsky.social)@andy_pavlo

After many years of warning people to not use MMAP as a buffer pool in their databases, our "Never use MMAP in Your Database" @cidrdb paper + video is finally available. #CIDR2022 db.cs.cmu.edu/mmap-cidr2022/

Jeff Plaisance@jeffplaisance·
@GrahamJenson @ohunt if you squint at it hard enough it's just insertion sort with some extra unnecessary steps. it still works if the inner loop goes from 0..i although the intermediate steps turn out a bit different.
Graham Jenson@GrahamJenson·
x = [5,2,4,3,1]
for i in range(len(x)):
    for j in range(len(x)):
        if (x[i] < x[j]):
            x[i], x[j] = x[j], x[i] # swap
print(x) # [1, 2, 3, 4, 5]

A weird sorting algorithm from arxiv.org/pdf/2110.01111… via @ohunt How this works?
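A minimal sketch of the point made in the reply above, assuming Python as in the quoted tweet. The function name `weird_sort` and the `inner_stops_at_i` flag are illustrative names, not taken from the paper or the thread. It runs the quoted double loop either over the full inner range or over an inner range that stops before `i`, which is one reading of "0..i" (including `j == i` would change nothing, since `x[i] < x[i]` is never true). With the shorter inner loop, each outer step inserts `x[i]` into the already sorted prefix, which is why it reads as insertion sort.

```python
def weird_sort(values, inner_stops_at_i=False):
    """Sort a copy of `values` using the double loop from the quoted tweet.

    inner_stops_at_i=False: inner loop covers every index (the quoted code).
    inner_stops_at_i=True:  inner loop covers only indices below i
                            (assumed reading of "0..i" in the reply).
    """
    x = list(values)  # work on a copy so the caller's list is untouched
    n = len(x)
    for i in range(n):
        inner = range(i) if inner_stops_at_i else range(n)
        for j in inner:
            if x[i] < x[j]:
                # Same comparison and swap as the quoted code.
                x[i], x[j] = x[j], x[i]
    return x


if __name__ == "__main__":
    data = [5, 2, 4, 3, 1]
    print(weird_sort(data))                         # [1, 2, 3, 4, 5]
    print(weird_sort(data, inner_stops_at_i=True))  # [1, 2, 3, 4, 5]
```

Both variants end with the same sorted list. The intermediate states differ: the full inner loop first moves the maximum into `x[0]`, while the shorter loop never touches elements beyond index `i`.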
Jeff Plaisance@jeffplaisance·
@fulmicoton not sure but the best source for this type of info is the intel optimization manual
Paul Masurel 🦀@fulmicoton·
Super newbie branch prediction question. Can a CPU engage in more branches when it is doing speculative execution? If yes, is there a depth limit?
Jeff Plaisance@jeffplaisance·
@JussiMeresmaa Annoyed about European Amateur Open signup. Site said it opened at 3 pm CEST but it was full by 2:45. Wrong timezone on site.
Jeff Plaisance@jeffplaisance·
@DGWorldTour Annoyed about European Amateur Open signup. Site said it opened at 3 pm CEST but it was full by 2:45. You put wrong timezone.
Jeff Plaisance retweeted
Jed Kolko@JedKolko·
Planning your exit? Job searches on @indeed to Canada were up 10x in the hours after election was called.
Jeff Plaisance@jeffplaisance·
@jedisct1 Yes the docs only cover AWS setup right now but we will be adding instructions for non-AWS installations in the future.
Jeff Plaisance@jeffplaisance·
@jedisct1 Yes it can, it can use HDFS as an alternative to S3 for shard storage.
Jeff Plaisance@jeffplaisance·
Proud to announce that Indeed has open sourced our distributed analytics and machine learning platform, Imhotep. go.indeed.com/uvque
Jeff Plaisance@jeffplaisance·
my twitter got haxed... some of that stuff was kind of offensive but it wasn't me :(