Martin Genzel

126 posts

@MartinGenzel

Staff Machine Learning Researcher @MMerantix | Applied mathematician | Interested in Deep Learning, LLMs, Model Compression & Efficiency

Berlin, Germany · Joined June 2021

614 Following · 273 Followers

Pinned Tweet

Martin Genzel @MartinGenzel · Apr 8

The LLM compression community is obsessed with quantizing to ever lower bit-widths. But below 4 bits, returns start diminishing, requiring many tricks & hacks to prevent collapse. What if 8-bit + smarter compression beats the low-bit race? 📢 Thrilled to introduce EntQuant …

Martin Genzel retweeted

Mattes Mollenhauer @gaussianmeasure · May 5

…catch the gang presenting this one in Seoul!

Mattes Mollenhauer @gaussianmeasure

New paper! We compress LLM parameters down to effective lengths of 2 bits/parameter while retaining 8-bit performance. The core idea is to first quantize with an entropy regularizer and then use lossless compression in the latent space.

Martin Genzel @MartinGenzel · Apr 8

👏 Shout-out to all co-authors for an amazing collaboration: @patrickputzky, @gaussianmeasure, Sebastian Schulze, Thomas Wollmann, and Stefan Dietzel 📄 Paper: arxiv.org/abs/2601.22787 🧑‍💻 Code: github.com/merantix-momen…

Martin Genzel @MartinGenzel · Apr 8

So what’s the price? Entropy coding adds some decode overhead at inference time. Inspired by the recent DFloat11 paper, we integrate a GPU-based ANS decoder into the forward pass, decoding weights on-the-fly, leading to a modest overhead compared to Marlin FP8 and BF16.

Martin Genzel retweeted

Dan Alistarh @DAlistarh · Sep 15

We're releasing the DASLab GGUF Quantization Toolkit! 🚀 First open-source toolkit bringing GPTQ + EvoPress to @ggerganov's GGUF format, enabling heterogeneous quantization based on importance. Result: Better models at the same file size. [1/5]

Martin Genzel @MartinGenzel · Jul 20

That was fun! Check out our work here 👉 arxiv.org/abs/2502.01717

Mattes Mollenhauer @gaussianmeasure

Wrapping up ICML - had a blast, great to see so many people again! @MartinGenzel and @patrickputzky presenting our work on variable-size model compression at the ES-FoMO workshop 😎

Martin Genzel @MartinGenzel · Jul 10

Very much looking forward to it 😀 The whole @MMerantix research team will be at ICML! Let's connect 🤝 More details about our work on LLM compression 👉 x.com/MartinGenzel/s…

Mattes Mollenhauer @gaussianmeasure

I’m in Vancouver! Reach out if you want to grab a coffee and talk learning theory, model compression, Markov processes, inverse problems… We’ll also present a poster at the Efficient Systems for Foundation Models Workshop!

Martin Genzel @MartinGenzel · Jun 25

Image sources: Llama: pixabay.com/illustrations/… Slider Tool: squoosh.app

Martin Genzel @MartinGenzel · Jun 25

👏 Big shout out to all co-authors for an amazing collab: @patrickputzky, Pengfei Zhao, Sebastian Schulze, @gaussianmeasure, Robert Seidel, Stefan Dietzel, and Thomas Wollmann 📄 Paper: arxiv.org/abs/2502.01717 🧑‍💻 Code: github.com/merantix-momen… 🤗 Models: tinyurl.com/bdacy658

Martin Genzel @MartinGenzel · Jun 25

📢 Excited to share our latest research @MMerantix on Any Compression of Foundation Models. We all know how intuitive and seamless image compression is: use a slider to specify your target size and get an instant preview. Our quest: Can compressing an LLM be just as easy? 🧵👇
