Johannes Rudolph
@virtualvoid

487 posts

Fiddling bits since 2342 @[email protected]

Joined January 2009
297 Following · 1.5K Followers

Pinned Tweet
Johannes Rudolph @virtualvoid ·
My job as a Rust Engineering Lead at a startup fell victim to restructuring and I'm looking for a new role. I enjoy working on infra software and Open Source, with Rust, Scala, or whatever is needed. I'm also available on a contract basis to support previous libraries like Pekko HTTP.
Johannes Rudolph @virtualvoid ·
@hadilq @debasishg If you need fast append/insert/delete, the solution is usually not "just an array" but clever pre-allocation schemes and a mixture of linear and linked data that amortizes the cost of updates while keeping most of the linear-memory benefits for sequential access.
Hadi @hadilq ·
@debasishg Wait what?! Did I understand it correctly? If you use a LinkedList, with O(1) add/remove, in your LRU implementation instead of an array, with O(n) insert/delete, the array one will be faster because of CPU architecture!?
Debasish (দেবাশিস্) Ghosh 🇮🇳 @debasishg ·
Linux recently improved its page-fault handling, replacing linked lists and red-black trees with a new data structure that offers better cache friendliness. This talk has some pointers to why linked lists fail on modern CPUs and what it takes to make a cache-friendly data structure. The Maple Tree FTW .. youtu.be/TEHRMzZ01nE?si…
Johannes Rudolph @virtualvoid ·
@debasishg @hadilq Another downside is that the reference chain in a linked list adds a dependency chain between elements that prevents potential out-of-order execution benefits and SIMD optimizations.
Debasish (দেবাশিস্) Ghosh 🇮🇳 @debasishg ·
With a linked list, every time you traverse to the next item in the list, you are jumping to a totally random location in memory, and potentially it's a cache miss. With earlier processors, processor speed was roughly comparable to that of a memory access, so there was no substantial difference between accessing the next element of an array and that of a linked list. But this difference is huge with modern processors using L1, L2, or even L3 caches, so a cache miss is much costlier today. With arrays, elements are placed in contiguous locations, and hence the cache hit ratio is much higher compared to linked lists, where elements are placed at random memory locations.
Johannes Rudolph @virtualvoid ·
@forked_franz I see. I looked into uprobes again for github.com/jvm-profiling-… but JIT code / code in anonymous mappings is still not supported. It seems that you can use memory-execution watchpoints to trace function entry, but something like return probes still seems quite elusive.
Francesco Nigro @forked_franz ·
@virtualvoid But the problem is that it will observe just the Java side, and with some overhead too :/. And it will be blind to native frames and the rest of the kernel stacks, clearly... (2/2)
Francesco Nigro @forked_franz ·
Anyone aware of something like the kernel's function-graph tracer (kernel.org/doc/html/v4.18…) but in some Java profiler?
Johannes Rudolph @virtualvoid ·
Almost 50 years ago, mastermind Loriot created this visualization of what happens during a rolling upgrade if you implement sharding naively (w/o consistent hashing): dailymotion.com/video/x2x2dhm
Johannes Rudolph @virtualvoid ·
@lukasz_bialy @WojciechM_dev Great stuff, maybe I should retry llama2.scala on Scala Native as well, since it was also hampered by, e.g., ByteBuffer issues.
Łukasz Biały @lukasz_bialy ·
In these last hours of the #1brc challenge, I want to share with you my small exploration, not of how fast I can make the JVM go BRRRRR, but of whether Scala Native is now a real, usable contender in the space of functional langs compiled to native binaries. The biggest issue, of course, was whether I could parallelise the load. Scala Native did not support multithreading for the longest time, but now that has changed! In mid-January I implemented a relatively (no Panama and Unsafe, no SWAR) fast solution inspired by the work of @sampullara, Yavuz Tas, and @royvanrijn. Then I cooked the binary using scala-cli and benchmarked it against the fastest JVM in the tournament. #scala 1/*
Johannes Rudolph @virtualvoid ·
Thank you, @scrollprize, for awarding me a prize in the Vesuvius Challenge for my Open Source contributions cataloguing released segments and running OS ink detection models on them. It has been a fun journey and great collaboration! blog.virtual-void.net/2023/12/11/ves…
Johannes Rudolph @virtualvoid ·
@ernerfeldt @scrollprize Thanks for egui :) For the first time in 15 years, I have been enjoying working on a GUI, after avoiding GUIs at all costs because of the lack of good options...
Johannes Rudolph @virtualvoid ·
@hetzner The whole point of cloud volumes is being able to treat storage as detached from compute. Optimally, storage can easily be migrated to new compute when problems occur. Quite a bummer if storage cannot be relocated in cases like this... @Hetzner_Online
Johannes Rudolph @virtualvoid ·
Mmh, a @hetzner cloud volume has been hanging while unmounting for an hour. No more actions on the server are possible; after shutting the node down, other mounted volumes are also blocked, and the server cannot be restarted. A big share of cluster storage is unavailable => extended downtime.
Johannes Rudolph @virtualvoid ·
Also, the day when you first run a code-generation model on your engine, asking for suggestions on how to improve the algorithms of the underlying engine 🤯 (the suggestions are mostly weird and broken, admittedly, but there were some ideas to follow up on) gist.github.com/jrudolph/fb764…
Johannes Rudolph @virtualvoid ·
@RealNeilC @karpathy's llama2.c is interesting because it removes all abstraction and breaks inference down into a single file of C (which can be seen as a lingua franca of computation, because data structures map closely to actual memory layout and operations to CPU instructions).
Johannes Rudolph @virtualvoid ·
@RealNeilC Python's speed is mostly irrelevant, since ML means instructing the GPU which matrix op to do next. In big models, each multiplication takes so long that it dwarfs the overhead of the language. If you are looking for state-of-the-art inference on CPU, look at llama.cpp.
Johannes Rudolph retweeted
Apache Pekko @ApachePekko ·
Apache Pekko HTTP 1.0.0 has been released, see pekko.apache.org/download.html#… and pekko.apache.org/docs/pekko-htt… for more details. Apache Pekko HTTP also includes Scala 3.3 support, the result of an ongoing community effort that spanned years!