Post

@Adriksh I once implemented a ring buffer for something, no longer recall what. Worked a treat 🙂

@Adriksh Thanks 4 sharing. U explain complicated concepts well. My days as a code writer ended many years ago, and while I believe the "new generation" is far more talented, capable, & has tools we could only dream of, they lack the basic understanding that would make them 10x better.

@Adriksh "no memcpy" -> writes a bytewise memory copy manually and stores the whole data in the ring.
The way to do this right is to pair it with an object pool and pass the pointers through the ring. That is zero copy. Of course your object pool is likely to use the same structure.
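A minimal sketch of that idea, assuming a single producer and a single consumer: the payloads live in a pre-allocated pool and only pointers travel through the ring. `Message`, `PtrRing`, and `MessagePool` are hypothetical names, not taken from the original post, and the pool reuses the same ring structure for its free list, as the reply suggests.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical payload type; stands in for whatever the ring carries.
struct Message {
    int id;
    char body[256];
};

// Single-producer/single-consumer ring of pointers.
// N must be a power of two; one slot stays unused to tell full from empty.
template <typename T, std::size_t N>
class PtrRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");
public:
    bool push(T* p) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) & (N - 1);
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        slots_[t] = p;                                  // only the pointer is stored
        tail_.store(next, std::memory_order_release);   // publish the slot
        return true;
    }
    T* pop() {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return nullptr;   // empty
        T* p = slots_[h];
        head_.store((h + 1) & (N - 1), std::memory_order_release);
        return p;
    }
private:
    std::array<T*, N> slots_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Pool of N - 1 pre-allocated messages whose free list is itself a PtrRing.
// acquire() is meant for the producer thread, release() for the consumer thread.
template <std::size_t N>
class MessagePool {
public:
    MessagePool() {
        for (Message& m : storage_) {
            const bool ok = free_.push(&m);
            assert(ok);  // the ring holds exactly N - 1 entries
            (void)ok;
        }
    }
    Message* acquire() { return free_.pop(); }          // nullptr if exhausted
    bool release(Message* m) { return free_.push(m); }
private:
    std::array<Message, N - 1> storage_{};
    PtrRing<Message, N> free_;
};
```

The producer would call `acquire()`, fill the message in place, and `push` the pointer; the consumer would `pop`, read the message, and `release` it back to the pool, so the payload bytes are never copied through the ring.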

@Adriksh Nice refresher!
Looking forward to a similar write-up on MPMC implementation using CAS

@Adriksh If the slot content fits into a single word, you can use it to check both wrapping and emptiness in offer/poll, enabling interesting optimizations like looking ahead by N for emptiness. That distributes the contention across the slots rather than the producer/consumer sequences.
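One way to read this, as a hedged sketch rather than the commenter's actual design: each slot holds a single pointer, and a null slot means "empty". The producer and consumer then keep private indices and never read each other's counter. `SlotRing` is a hypothetical name, a single producer and single consumer are assumed, and the look-ahead-by-N optimization is not shown.

```cpp
#include <atomic>
#include <cstddef>

// SPSC ring where the slot itself signals state: nullptr = empty.
// N must be a power of two. Stored pointers must be non-null.
template <typename T, std::size_t N>
class SlotRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");
public:
    // Producer side: fails if the next slot has not been consumed yet.
    bool offer(T* p) {
        std::atomic<T*>& s = slots_[pidx_ & (N - 1)];
        if (s.load(std::memory_order_acquire) != nullptr) return false;  // full
        s.store(p, std::memory_order_release);
        ++pidx_;  // private to the producer
        return true;
    }
    // Consumer side: returns nullptr if the next slot is still empty.
    T* poll() {
        std::atomic<T*>& s = slots_[cidx_ & (N - 1)];
        T* p = s.load(std::memory_order_acquire);
        if (p == nullptr) return nullptr;  // empty
        s.store(nullptr, std::memory_order_release);  // hand the slot back
        ++cidx_;  // private to the consumer
        return p;
    }
private:
    std::atomic<T*> slots_[N] = {};
    alignas(64) std::size_t pidx_ = 0;  // touched only by the producer
    alignas(64) std::size_t cidx_ = 0;  // touched only by the consumer
};
```

Because the consumer clears slots in order, a producer that finds the slot N positions ahead empty could in principle skip the check for the slots in between, which is the look-ahead the reply mentions.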

@Adriksh An important point is that an acquire load lets earlier instructions be reordered after it. This means that tail can be loaded after head, but not after the instructions that follow, because of the data dependency. This guarantees correctness.

@Adriksh This leaves a lot of performance on the table. Count cache misses per message! See intel.com/content/www/us…

@Adriksh Been using ring buffers since my early days writing async communications code in the 80's.

@Adriksh In push I think you're still missing std::atomic_thread_fence(std::memory_order_release); after writing data to the ring buffer but before updating the atomic. You need to flush that data so it is visible to the other thread; otherwise only head and tail are guaranteed to be visible.
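The code from the original post is not shown in this thread, so whether a fence is missing depends on the memory order used for the index store. As a sketch under that caveat: in the C++ memory model, a store with `memory_order_release` already orders the preceding slot write before it, and a release fence followed by a relaxed store is the alternative form. `SpscRing` and its member names are hypothetical.

```cpp
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring of ints with two equivalent
// ways to publish a slot. One slot stays unused to tell full from empty.
struct SpscRing {
    static constexpr std::size_t kCap = 8;  // hypothetical capacity
    int slots[kCap] = {};
    std::atomic<std::size_t> head{0};
    std::atomic<std::size_t> tail{0};

    // Variant A: the release store itself publishes the slot write.
    bool push_release_store(int v) {
        const std::size_t t = tail.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % kCap;
        if (next == head.load(std::memory_order_acquire)) return false;  // full
        slots[t] = v;
        tail.store(next, std::memory_order_release);
        return true;
    }

    // Variant B: explicit release fence, then a relaxed store.
    bool push_fence(int v) {
        const std::size_t t = tail.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % kCap;
        if (next == head.load(std::memory_order_acquire)) return false;  // full
        slots[t] = v;
        std::atomic_thread_fence(std::memory_order_release);
        tail.store(next, std::memory_order_relaxed);
        return true;
    }

    // The acquire load of tail pairs with either publishing variant.
    bool pop(int& out) {
        const std::size_t h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire)) return false;  // empty
        out = slots[h];
        head.store((h + 1) % kCap, std::memory_order_release);
        return true;
    }
};
```

If the index store were relaxed and no fence were present, the consumer would have no guarantee of seeing the slot contents, which is the situation the reply warns about.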

@Adriksh Why don't we use memcpy instead of running a for loop?
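For comparison, a minimal sketch of what a memcpy-based write into a circular byte buffer could look like. The copy has to be split in two when it crosses the end of the buffer. `ring_write` and its parameters are hypothetical, and the sketch assumes `pos < cap` and `len <= cap`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Copies len bytes from src into the circular buffer buf of size cap,
// starting at offset pos and wrapping to the front if needed.
// Assumes pos < cap and len <= cap; free-space checks are the caller's job.
void ring_write(unsigned char* buf, std::size_t cap, std::size_t pos,
                const void* src, std::size_t len) {
    const std::size_t first = std::min(len, cap - pos);  // bytes before the wrap
    std::memcpy(buf + pos, src, first);
    // Remainder (possibly zero bytes) goes to the start of the buffer.
    std::memcpy(buf, static_cast<const unsigned char*>(src) + first, len - first);
}
```

Optimizing compilers often turn a plain byte-copy loop into the same kind of bulk copy, so the difference between the two forms may be smaller than it looks.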
