Paul Pak

65 posts

@paulpak__

Founding ML Engineer @ https://t.co/McxuNB4efY. __syncthreads();

Joined May 2022

203 Following · 383 Followers

Paul Pak@paulpak__·1d

@mlech26l and I review some of the basics of ML training infra here. These days we’ve been spending a lot of time deeper in the stack, iterating on some exciting bets in compilers, quantization, long-context optimization, communication-compute overlap, etc. Even though @liquidai models are optimized for deployment on edge, pushing the frontier of large-scale training on GPUs is just as relevant today.

Liquid AI@liquidai

Training LFMs at scale means solving parallelism across every layer of the architecture. And not all layers are the same. Our CTO Mathias Lechner (@mlech26l) sits down with Liquid's founding engineer Paul Pak (@paulpak__) to talk training infrastructure: Data, tensor, pipeline, expert, and context parallelism, and how they make context parallelism work across hybrid architectures with both attention and convolution operators.

3.3K
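The post above mentions making context parallelism work for convolution operators as well as attention. A minimal sketch of one way this can be done for a short causal convolution follows, assuming a halo exchange in which each rank borrows the last k-1 tokens of the preceding shard. This is an illustration only, not Liquid AI's implementation; the function names are hypothetical, and the "exchange" is simulated by slicing one list rather than by real inter-device communication.

```python
def causal_conv1d(x, w):
    """Causal 1D convolution: y[t] = sum_j w[j] * x[t - j], with zeros before t = 0."""
    k = len(w)
    return [sum(w[j] * x[t - j] for j in range(k) if t - j >= 0)
            for t in range(len(x))]


def context_parallel_conv(x, w, num_ranks):
    """Shard the sequence across ranks and convolve each shard independently.

    Each rank needs the k-1 tokens just before its shard (the "halo").
    Here the halo is read from the full list; on real hardware it would be
    received from the neighboring rank(s) (assumption, not shown).
    """
    k = len(w)
    chunk = (len(x) + num_ranks - 1) // num_ranks  # ceil division
    out = []
    for rank in range(num_ranks):
        start = rank * chunk
        end = min(start + chunk, len(x))
        if start >= end:
            break  # more ranks than tokens
        halo_start = max(0, start - (k - 1))
        local = x[halo_start:end]             # halo + this rank's tokens
        y_local = causal_conv1d(local, w)
        out.extend(y_local[start - halo_start:])  # drop outputs for halo positions
    return out


x = list(range(1, 11))   # toy sequence of 10 tokens
w = [1, 2, 3]            # kernel of width k = 3
sharded = context_parallel_conv(x, w, num_ranks=4)
reference = causal_conv1d(x, w)
print(sharded == reference)
```

A short convolution only needs a fixed-size halo from the left, whereas a causal attention layer lets each query attend to keys anywhere earlier in the sequence, so it needs a different exchange pattern. That difference is one reason hybrid architectures need per-operator handling.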

Paul Pak reposted

Liquid AI@liquidai·30 May

fine-tune LFM2.5-8B-A1B for your tasks and let it cook! 🧑‍🍳 huggingface.co/LiquidAI/LFM2.…

atomic.chat@atomic_chat_hq

Liquid's LFM2.5-8B-A1B smashed OpenAI's gpt-oss-20b on tool calling.
We ran both locally on a MacBook Pro M5 Max, 64GB, and gave each the same trip-planning request that only completes if the model fires all 7 tool calls: weather for 3 cities, two currency conversions, an email, and a reminder.
Outputs:
LFM2.5-8B-A1B: 4.8 GB RAM usage, 7/7 tool-calls, 266 tok/s, 6.9s
OpenAI gpt-oss-20b: 11 GB RAM usage, 3/7 tool-calls, 146 tok/s, 15.0s
The 8B used less than half the RAM and still fired all 7 calls, while the 20B silently dropped more than half of its own. It also ran ~2x faster, wrapping the full agentic request in 6.9s against 15s.
That's what 38T training tokens buy: a 1B-active MoE that nails the agentic tool calls a model 2.5x its active size keeps dropping.

443

37.6K
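The ratios claimed in the quoted benchmark post can be checked directly from the figures it reports. The sketch below only does that arithmetic; the numbers are copied from the post as stated (a single reported run, not re-measured), and the variable and function names are illustrative.

```python
# Figures as quoted in the post above; not independently measured.
lfm = {"ram_gb": 4.8, "calls_fired": 7, "tok_per_s": 266, "seconds": 6.9}
oss = {"ram_gb": 11.0, "calls_fired": 3, "tok_per_s": 146, "seconds": 15.0}
CALLS_REQUIRED = 7


def compare(small, large, calls_required):
    """Return the headline ratios between the two reported runs."""
    return {
        # RAM used by the smaller model as a fraction of the larger model's RAM
        "ram_fraction": round(small["ram_gb"] / large["ram_gb"], 2),
        # Decode throughput of the smaller model relative to the larger one
        "throughput_ratio": round(small["tok_per_s"] / large["tok_per_s"], 2),
        # How many times longer the larger model took end to end
        "wall_clock_ratio": round(large["seconds"] / small["seconds"], 2),
        # Share of required tool calls the larger model did not fire
        "large_dropped_fraction": round(
            (calls_required - large["calls_fired"]) / calls_required, 2
        ),
    }


summary = compare(lfm, oss, CALLS_REQUIRED)
print(summary)
```

The results (0.44 of the RAM, 1.82x the throughput, 2.17x shorter wall-clock time, 0.57 of calls dropped) are consistent with the post's "less than half the RAM", "~2x faster", and "more than half" wording.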

Paul Pak reposted

Liquid AI@liquidai·28 May

Today, we're releasing LFM2.5-8B-A1B, a device-optimized model designed to power real-life applications on phones, laptops, PCs, robots, and fast & lightweight server-side use-cases.
> 8B MoE, 1.5B active
> Expanded 128K context
> LFM2.5 flagship hybrid MoE architecture
> Trained on 38T tokens + large-scale RL
> fast, reliable tool calling, punching above its weight, comparable to models with up to 4x its size
> customizable on a single GPU for any specialized task
> LFM2 open-weight license
🧵

137

505

3.8K

1.2M

Paul Pak reposted

Liquid AI@liquidai·11 May

We’re on our way, Tokyo! Apply to join our 2-day hackathon co-hosted with @wayequity and @AMD. Engineers, founders, and mentors across the Liquid AI and WAY ecosystems will gather to ship real-world applications to accelerate Japanese industry, powered by our Liquid Foundation Models (LFMs). Selected participants, in teams of 1-3, will create applications/workflows to address real-world problems only possible with LFMs. Top projects will be awarded: / Gold Prize - $3K USD / Silver Prize - $2K USD / +Internship offers, community recognition, and more.

111

19.3K

Paul Pak reposted

Liquid AI@liquidai·23 Apr

We’re entering a multi-year partnership with @MercedesBenz to scale embedded, on-device intelligence for their third- and fourth-generation MBUX. Our goal: to make the driver/vehicle relationship even more natural and effortless. Read more about our partnership: liquid.ai/press/liquid-a…

225

42.4K

Paul Pak reposted

Liquid AI@liquidai·31 Mar

Today, we release LFM2.5-350M. Agentic loops at 350M parameters. A 350M model trained for reliable data extraction and tool use, where models at this scale typically struggle. <500MB when quantized, built for environments where compute, memory, and latency are constrained. 🧵

278

2.3K

346.4K

Paul Pak reposted

Liquid AI@liquidai·13 Mar

a vision language model too fast for human eyes! kudos @xenovacom 🐐

Ramin@ramin_m_h

model’s so fast, Josh had to slow down the video capture to showcase this demo! @liquidai

447

50.1K

Paul Pak reposted

Liquid AI@liquidai·24 Feb

Today, we release our largest LFM2 model: LFM2-24B-A2B 🐘
> 24B total parameters
> 2.3B active per token
> Built on our hybrid, hardware-aware LFM2 architecture
It combines LFM2’s fast, memory-efficient design with a Mixture of Experts setup, so only 2.3B parameters activate each run. The result: best-in-class efficiency, fast edge inference, and predictable log-linear scaling, all in a 32GB, 2B-active MoE footprint. 🧵

154

1.1K

229.1K

Paul Pak reposted

Liquid AI@liquidai·20 Jan

Today we release LFM2.5-1.2B-Thinking, a reasoning model that runs entirely on-device. What needed a data center two years ago now runs on any phone with 900 MB of memory.
> Trained specifically for concise reasoning
> Generates internal thinking traces before producing answers
> Enables systematic problem-solving at edge-scale latency
> Shines on tool use, math, and instruction following

253

1.8K

319.2K

Paul Pak reposted

Ramin@ramin_m_h·15 Jan

5.1M LFM downloads and going! not too bad @liquidai

San Francisco, CA 🇺🇸

116

11.2K

Paul Pak reposted

Liquid AI@liquidai·6 Jan

Today, we release LFM2.5, our most capable family of tiny on-device foundation models. It’s built to power reliable on-device agentic applications: higher quality, lower latency, and broader modality support in the ~1B parameter class.
> LFM2.5 builds on our LFM2 device-optimized hybrid architecture
> Pretraining scaled from 10T → 28T tokens
> Expanded reinforcement learning post-training
> Higher ceilings for instruction following
🧵

257

1.6K

210.7K

Paul Pak@paulpak__·6 Jan

@bnjmn_marie Hey Ben, there's an open PR in vLLM to resolve this issue and allow for quantization on conv layers. Should be merged into main relatively soon.

Benjamin Marie@bnjmn_marie·6 Jan

Quantizing LFM2.5 and running the models with vLLM: ignore the "conv" layers during quantization. vLLM will crash if you quantize them.

278
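The workaround described in this exchange (skip the "conv" layers when quantizing, until the vLLM fix lands) amounts to building an ignore list of module names. A minimal sketch of that filtering step follows. The module names here are hypothetical placeholders, not the verified names in the LFM2.5 checkpoints, and how the resulting list is passed on depends on the quantization tool in use.

```python
import re


def build_ignore_list(module_names, pattern=r"(^|\.)conv(\.|$)"):
    """Return the module names that have a path component exactly equal to 'conv'.

    The default pattern matches 'conv' as a whole dotted component, so a
    name like 'model.layers.0.conv.in_proj' is selected while an unrelated
    name such as 'deconvolver' would not be.
    """
    rx = re.compile(pattern)
    return [name for name in module_names if rx.search(name)]


# Hypothetical module names, for illustration only.
modules = [
    "model.layers.0.conv.in_proj",
    "model.layers.0.conv.out_proj",
    "model.layers.1.self_attn.q_proj",
    "model.layers.1.feed_forward.w1",
    "lm_head",
]

ignore = build_ignore_list(modules)
print(ignore)
```

In practice the names would come from the loaded model (for a PyTorch model, `model.named_modules()`), and the list would go to whatever ignore or exclude option the chosen quantizer provides.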

Paul Pak reposted

Liquid AI@liquidai·4 Dec

Today we introduce Liquid Labs, our advanced research unit, with the goal of understanding and building efficient and adaptive intelligence systems. Liquid Labs consolidates our existing research efforts at Liquid across architecture of foundation models, multimodality, training, data, and inference. The lab also will be home to new frontier research work across the broad range of foundation model build-up stack. Read the full announcement: liquid.ai/blog/introduci… We are hiring: jobs.ashbyhq.com/liquid-ai Also find us at NeurIPS 2025 exhibition hall! 🚀

245

42.7K

Paul Pak reposted

Liquid AI@liquidai·1 Dec

The LFM2 Tech Report is now live on arXiv! We share everything from our novel hardware-in-the-loop architecture design, pre-training, and knowledge distillation, to the post-training recipe for small models.
> 🤗LFM2 class of models has over 3.3M downloads
> ⚛️LFM2 nanos from 350M to 8.3B MoE
> 👁️Vision-language capabilities (LFM2-VL)
> 👄👂Multimodal speech processing (LFM2-Audio)
> 🗒️Information retrieval (LFM2-ColBERT)
We hope this serves as a useful resource and inspiration for anyone building open and efficient foundation models. 🚀

198

29.8K

Paul Pak reposted

Liquid AI@liquidai·19 Nov

This weekend, more than 70 engineers, researchers, and builders convened at our SF office for Hack the Edge, a 48-hour sprint co-hosted with @AIatAMD. Equipped with @AMD mini-PCs powered by Ryzen™ AI, Liquid Foundation Models and the ROCm™ stack, teams built, fine-tuned, and deployed edge-native applications entirely on-device. They produced more than 20 applications spanning multimodal search, sensor intelligence, real-time audio understanding, and low-latency agents – demonstrating what’s possible when speed, efficiency and capability converge on the edge. Congratulations to our Hack the Edge cash prize winners! See what they built 👇

Paul Pak reposted

Liquid AI@liquidai·13 Nov

Today, we’re announcing our partnership with @Shopify to bring Liquid Foundation Models (LFMs) to core commerce experiences. Shopify will license LFMs to enhance search and recommendations, improving relevance, conversions, and customer experience at scale. The first production deployment is a sub‑20ms LFM that enhances search. Shopify and Liquid have also co-developed a generative recommender model with a novel HSTU architecture. In controlled tests, the model beat the previous stack, leading to higher conversion rates from recommendations. 👇

154

101.4K

Paul Pak@paulpak__·7 Oct

LFM2-8B-A1B for fast, efficient inference on edge CPUs and GPUs. Gated Short-Convs + GQA + Mixture-of-Experts. Mixed-precision training w/ blockwise FP8+BF16. At Liquid, we value rigorous and robust optimization of ML systems across pretraining, post-training, and inference. Come join our movement!

Liquid AI@liquidai

Meet LFM2-8B-A1B, our first on-device Mixture-of-Experts (MoE)! 🐘
> LFM2-8B-A1B is the best on-device MoE in terms of both quality and speed.
> Performance of a 3B-4B model class, with up to 5x faster inference profile on CPUs and GPUs.
> Quantized variants fit comfortably on high-end phones, tablets, and laptops.
Enabling fast, private, low-latency applications across modern phones, tablets, laptops, and embedded systems. 1/n 🧵

1.6K

Paul Pak reposted

Liquid AI@liquidai·1 Oct

Today, we expand our LFM2 family to audio. 👂👄 LFM2-Audio is an end-to-end audio-text omni foundation model, and delivers responsive, real-time conversation on-device at just 1.5B parameters. One model. Seamless multimodal support. No chains.
> Speech-to-speech
> Speech-to-text
> Text-to-speech
> Audio classification
> Open weights
10x faster inference vs peers, with quality rivaling systems 10x larger. 1/n 🧵

455

42.3K

Paul Pak reposted

Liquid AI@liquidai·25 Sep

Introducing Liquid Nanos ⚛️, a new family of extremely tiny task-specific models that deliver GPT-4o-class performance while running directly on phones, laptops, cars, embedded devices, and GPUs with the lowest latency and fastest generation speed.
> model size: 350M to 2.6B
> built on LFM2, our v2 efficient model architecture
> perform competitively with models up to hundreds of times larger
> enable core agentic tasks: precise data extraction, multilingual translation, tool call, math, and RAG.
1/n

143

1.2K

590.8K

Paul Pak reposted

Liquid AI@liquidai·18 Aug

Big step for on-device AI: Liquid AI’s Edge Platform, LEAP, now supports @AMD Ryzen™ and Ryzen AI™ processors, bringing powerful, low-latency AI directly to laptops. Here’s what it means for developers and enterprises 🧵

186

66.9K
