Maximilian Beck

305 posts

@maxmbeck

AI Research Scientist @Meta FAIR. Prev. ELLIS PhD Student @ JKU Linz & PhD Researcher @nx_ai_com, Research Scientist Intern @Meta FAIR

Linz, Austria · Joined June 2021
868 Following · 1.2K Followers
Pinned Tweet
Maximilian Beck (@maxmbeck)
Yesterday, we shared the details on our xLSTM 7B architecture. Now, let's go one level deeper 🧑‍🔧 We introduce ⚡️Tiled Flash Linear Attention (TFLA)⚡️, a new kernel algorithm for the mLSTM and other Linear Attention variants with gating. We find TFLA is really fast! 🧵(1/11)
Maximilian Beck (@maxmbeck)
Life update: A few weeks ago, I moved to Paris 🇫🇷 to start a new position as AI Scientist at Meta FAIR. I am excited about this new chapter and look forward to the opportunities ahead.✨
Maximilian Beck retweeted
Ai2 (@allen_ai)
Recipes for teaching language models to handle long inputs don't work equally well across model families. We wanted to know why—is it the architecture, the training data, or both? 🧵
Maximilian Beck retweeted
Günter Klambauer (@gklambauer)
GREAT news!!! 4 papers from our group got accepted at ICML 2026!!!
- 🧬 Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design
- 🔁 xLSTM Distillation: Achieving Teacher-Student Parity Through Efficient Hybrid Architectures
Maximilian Beck retweeted
Sepp Hochreiter (@HochreiterSepp)
RNNs like xLSTM with vertically chunked inference strategy for efficient memory: arxiv.org/abs/2604.18199 Chunking enables a linear-time and constant-memory like TFLA for xLSTM arxiv.org/abs/2503.14376 To chunk blocks via recurrent updates and speed up computation considerably.
Korbinian Poeppel (@KorbiPoeppel)
Well deserved! I can only thank you as well for being such an outstandingly great collaborator, too! 🙏 It's been an amazing time in Linz - thanks for putting us together, @HochreiterSepp!
Quoting Maximilian Beck (@maxmbeck):
👨‍🎓Last week, I successfully defended my PhD thesis - an incredibly exciting and rewarding milestone after 3.5 years of work on xLSTM: Recurrent Neural Network Architectures for Scalable and Efficient Large Language Models
Maximilian Beck (@maxmbeck)
And of course many thanks to @KorbiPoeppel for being an amazing co-author on nearly all xLSTM papers. I also want to thank all collaborators, friends, and family for their support.🤗
Maximilian Beck (@maxmbeck)
👨‍🎓Last week, I successfully defended my PhD thesis - an incredibly exciting and rewarding milestone after 3.5 years of work on xLSTM: Recurrent Neural Network Architectures for Scalable and Efficient Large Language Models
Maximilian Beck retweeted
Niklas Schmidinger (@smdrnks)
Excited to share our new paper: Effective Distillation to Hybrid xLSTM Architectures. TL;DR: we retrofit / graft / distill / linearize Transformers into xLSTM-SWA hybrids with fixed-size states. This gives a practical path to studying linear and hybrid architectures starting from already strong pretrained models.
Quoting Sepp Hochreiter (@HochreiterSepp):
xLSTM Distillation: arxiv.org/abs/2603.15590 Near-lossless distillation of quadratic Transformer LLMs into linear xLSTM architectures enables cost- and energy-efficient alternatives without sacrificing performance. xLSTM variants of instruction-tuned Llama, Qwen, & Olmo models.
Babak Rahmani (@babakRmni)
@maxmbeck Thanks Maximilian. Looking forward to reading your upcoming work on CWMs :)
Maximilian Beck (@maxmbeck)
Very cool in-depth prediction error analysis of Code World Model (CWM) 🌍 ⬇️⬇️⬇️ However, instead of "debugging code world models", what about debugging WITH code world models? Stay tuned for more on this soon.
Quoting Babak Rahmani (@babakRmni):
🧵Debugging Code World Models
A few months ago we started studying CWMs. The plan was post-training an LLM on code execution traces. Two weeks in, we realised a paper by Meta had already done much of this: arxiv.org/pdf/2510.02387. We, however, identified what's wrong with them!