
Training LFMs at scale means solving parallelism across every layer of the architecture, and not all layers are the same. Our CTO Mathias Lechner (@mlech26l) sits down with Liquid's founding engineer Paul Pak (@paulpak__) to talk training infrastructure: data, tensor, pipeline, expert, and context parallelism, and how they make context parallelism work across hybrid architectures that combine attention and convolution operators.