Chumeng Liang
@lowerbad

21 posts

First-year CS PhD student at @UofIllinois. Diffusion Language Models. Representation Learning.

Joined October 2025
35 Following · 99 Followers
Chumeng Liang @lowerbad
@sedielem Thank you! It is always good to find theoretical support for classical works!
Sander Dieleman @sedielem
This work provides theoretical grounding for some of the design decisions (cross-entropy loss, learnable embeddings, self-conditioning, entropy-based schedule) in CDCD (arxiv.org/abs/2211.15089), and brings it into the modern era. Continuous text diffusion is still competitive!
Chumeng Liang@lowerbad

Continuous diffusion dominates image & video generation, but people used to believe that it inherently lags behind its discrete counterparts in language modeling. Today, we challenge this belief with LangFlow: the first continuous diffusion language model that rivals—and even beats—discrete diffusion. (1/7)
Blog: caradryanl.github.io/blog/2026/lang…
GitHub: github.com/nealchen2003/L…
arXiv: arxiv.org/abs/2604.11748

Chumeng Liang @lowerbad
@punyajoysaha Thank you for your interest. The TESS models are great pretrained models, while our work focuses on methodology at smaller scales for now. If we get the chance to scale up our model, we would love to compare it to TESS.
Punyajoy Saha @punyajoysaha
@lowerbad Why have u not compared with papers like TESS/TESS 2
Chumeng Liang @lowerbad
Thank you for the note. To the best of our knowledge, LangFlow is the first to provide a comprehensive, size-controlled ppl/gen-ppl/entropy comparison across LM1B/OWT/zero-shot, and it demonstrates a clear win over the best DDLM on a significant portion of the tasks. We have included a discussion of several brilliant recent concurrent works on DLMs, such as FMLM; we believe these few-step distillation techniques can be combined synergistically with our embedding-space DLM to further improve efficiency.
Chumeng Liang @lowerbad
The potential of continuous DLMs extends far beyond just performance. They open the door for all continuous diffusion techniques to be introduced into language modeling:
- One-step generation, such as Consistency Models
- Guided generation, such as CFG
- Unified multimodal generation, such as protein structure–sequence co-design
LangFlow suggests: continuous diffusion is NOW a viable and promising paradigm for language modeling. (7/7)
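The guided-generation item above refers to classifier-free guidance (CFG), a standard trick in continuous diffusion. A minimal sketch of the idea, assuming a hypothetical denoiser `model(x_t, t, cond)` that returns a list of floats (this is illustrative, not the LangFlow implementation):

```python
def cfg_prediction(model, x_t, t, cond, guidance_scale=2.0):
    """Classifier-free guidance for a continuous (embedding-space) diffusion
    or flow model: query the denoiser twice and extrapolate the conditional
    prediction away from the unconditional one.

    `model(x_t, t, cond)` is a hypothetical denoiser; `cond=None` selects
    the unconditional branch.
    """
    v_cond = model(x_t, t, cond)
    v_uncond = model(x_t, t, None)
    # guided = uncond + s * (cond - uncond); s > 1 strengthens conditioning.
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]
```

The same two-pass blend works whether the model predicts noise, clean embeddings, or a flow velocity, which is why embedding-space DLMs inherit it for free.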
Chumeng Liang reposted
Jiaxuan You @youjiaxuan
🚨 RL for LLMs is finally accessible. Introducing OpenTinker: the first community-driven, open-source framework designed to democratize Reinforcement Learning for LLMs. github.com/open-tinker/Op…

Inspired by @thinkymachines's amazing Tinker, we realized the biggest bottleneck in agentic LLM research isn't the math—it's the setup. Current RL pipelines are messy: configuring VeRL for every single experiment is a productivity killer. OpenTinker fixes it.

🛠 How OpenTinker works: a decoupled design of server and client
- Set up once, run forever: configure the OpenTinker backend on your GPU cluster once.
- Develop locally: define your RL environments directly on your laptop.
- Train on the cloud: simply point your local client at the backend. The cluster handles the compute; you handle the science.

📉 10x development efficiency
Thanks to this architectural decomposition, OpenTinker reduces the time to develop a new RL training pipeline by at least an order of magnitude.

⚡ Turn idle GPU compute into gold
Small labs often have underutilized hardware. OpenTinker turns your idle GPUs into an internal/external API service for RL training, SFT, and inference.

🎯 Who needs OpenTinker?
- Researchers tired of infrastructure hell.
- Labs needing to standardize workflows.
- Teams wanting to maximize hardware ROI.

Thanks to my amazing PhD student @realagi25 for leading the project. We are building the future of open RL infra. Be the first to build with us. 👇
Start building with OpenTinker now:
🚀 Repo: github.com/open-tinker/Op…
🌐 Blog: open-tinker.github.io/opentinker-pag…

If you believe RL should be accessible to everyone, give us a star, repost this 🔄 post, and let us know what agents you plan to build!
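The decoupled server/client split described in the tweet can be sketched generically. All names below (`TrainingBackend`, `LocalClient`, `train_step`) are invented for illustration and are not the actual OpenTinker API:

```python
class TrainingBackend:
    """Stands in for the cluster-side service: owns the model and compute."""
    def __init__(self):
        self.steps = 0

    def train_step(self, rollouts):
        # In a real system this would run an RL update (e.g. PPO) on GPUs;
        # here we just count steps and echo how much data arrived.
        self.steps += 1
        return {"step": self.steps, "num_rollouts": len(rollouts)}


class LocalClient:
    """Stands in for the laptop-side client: defines environments and
    collects rollouts, delegating all heavy lifting to the backend."""
    def __init__(self, backend):
        self.backend = backend

    def collect_rollouts(self, env_fn, n=4):
        return [env_fn(i) for i in range(n)]

    def train(self, env_fn, iterations=2):
        logs = []
        for _ in range(iterations):
            rollouts = self.collect_rollouts(env_fn)
            logs.append(self.backend.train_step(rollouts))
        return logs
```

The point of the split: the backend object could live behind a network API on a cluster while the client code stays unchanged on a laptop.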
Chumeng Liang @lowerbad
Goooood job
Zhanhui Zhou@asapzzhou

(1/n) Tiny-A2D: An Open Recipe to Turn Any AR LM into a Diffusion LM
Code (dLLM): github.com/ZHZisZZ/dllm
Checkpoints: huggingface.co/collections/dl…

With dLLM, you can turn ANY autoregressive LM into a diffusion LM (parallel generation + infilling) with minimal compute. Using this recipe, we built a 🤗 collection of the smallest diffusion LMs that work well in practice.

Key takeaways:
1. Finetuned on Qwen3-0.6B, we obtain the strongest small (~0.5/0.6B) diffusion LMs to date.
2. The base AR LM matters: investing compute in improving the base AR model is potentially more efficient than scaling compute during adaptation.
3. Block diffusion (BD3LM) generally outperforms vanilla masked diffusion (MDLM), especially on math-reasoning and coding tasks.

Chumeng Liang reposted
Jiaxuan You @youjiaxuan
We believe future forecasting is the ultimate challenge for agentic LLMs.
🚀 Live Trade Bench is now fully open-sourced! It's the first live, real-world benchmark testing 20+ LLMs on financial forecasting.
📄 Read our 37-page paper detailing insights from a 2-month live trading experiment: 👉 arxiv.org/abs/2511.03628
📊 Track real-time performance across 20 LLMs here: 👉 trade-bench.live
💻 Developers interested in LLM benchmarking or trading? Try it out with: pip install live-trade-bench
🔗 Code: github.com/ulab-uiuc/live…
Chumeng Liang @lowerbad
Great job!
Zhanhui Zhou@asapzzhou

(1/n) 🚨 BERTs that chat: turn any BERT into a chatbot with diffusion

Hi @karpathy, we just trained a few BERTs to chat with diffusion — we are releasing all the model checkpoints, training curves, and recipes! Hopefully this spares you the side quest into training nanochat with diffusion for now 🙂. It's both a hands-on tutorial for beginners and an example showing how to use our complete toolkit (dLLM) for deeper projects.
Code: github.com/ZHZisZZ/dllm
Report: api.wandb.ai/links/asap-zzh…
Checkpoints: huggingface.co/collections/dl…

Motivation: I couldn't find a good "Hello World" example for training a minimally working yet useful diffusion language model — a class of bidirectional language models capable of parallel token generation in arbitrary order. So I tried finetuning BERTs to chat with discrete diffusion, and it turned out more fun than I expected.

TL;DR: With a small amount of open-source instruction-following data, a standard BERT can gain conversational ability with diffusion. Specifically, a finetuned ModernBERT-large performs close to Qwen1.5-0.5B, which has a similar number of parameters.

Zhanhui Zhou @asapzzhou
(1/n) 🚨 BERTs that chat: turn any BERT into a chatbot with diffusion

Hi @karpathy, we just trained a few BERTs to chat with diffusion — we are releasing all the model checkpoints, training curves, and recipes! Hopefully this spares you the side quest into training nanochat with diffusion for now 🙂. It's both a hands-on tutorial for beginners and an example showing how to use our complete toolkit (dLLM) for deeper projects.
Code: github.com/ZHZisZZ/dllm
Report: api.wandb.ai/links/asap-zzh…
Checkpoints: huggingface.co/collections/dl…

Motivation: I couldn't find a good "Hello World" example for training a minimally working yet useful diffusion language model — a class of bidirectional language models capable of parallel token generation in arbitrary order. So I tried finetuning BERTs to chat with discrete diffusion, and it turned out more fun than I expected.

TL;DR: With a small amount of open-source instruction-following data, a standard BERT can gain conversational ability with diffusion. Specifically, a finetuned ModernBERT-large performs close to Qwen1.5-0.5B, which has a similar number of parameters.
Andrej Karpathy@karpathy

Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising; top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right; bottom) is the dominant paradigm in text. For audio I've seen a bit of both.

A lot of diffusion papers look a bit dense, but if you strip the mathematical formalism, you end up with simple baseline algorithms: e.g. something a lot closer to flow matching in the continuous case, or something like this in the discrete case. It's your vanilla transformer, but with bidirectional attention, where you iteratively re-sample and re-mask all tokens in your "token canvas" based on a noise schedule until you get the final sample at the last step. (Bidirectional attention is a lot more powerful, and you get a lot stronger autoregressive language models if you train with it; unfortunately it makes training a lot more expensive because now you can't parallelize across the sequence dim.)

So autoregression is doing an `.append(token)` to the token canvas while only attending backwards, while diffusion is refreshing the entire token canvas with a `.setitem(idx, token)` while attending bidirectionally. Human thought naively feels a bit more like autoregression, but it's hard to say that there aren't more diffusion-like components in some latent space of thought. It feels quite possible that you can further interpolate between them, or generalize them further. And it's a component of the LLM stack that still feels a bit fungible. Now I must resist the urge to side quest into training nanochat with diffusion.
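The re-sample/re-mask loop described above can be sketched as a toy sampler. The `predict` callable and the linear re-masking schedule below are illustrative stand-ins, not any particular model's API:

```python
import random

MASK = "<mask>"

def masked_diffusion_sample(predict, length=8, steps=4, seed=0):
    """Toy discrete-diffusion sampler: start from an all-<mask> canvas,
    then repeatedly (1) let the model fill every masked position and
    (2) re-mask a shrinking fraction of the canvas (linear schedule),
    until nothing is masked at the final step.

    `predict(canvas)` stands in for a bidirectional transformer: it
    returns a proposed token for every position in the canvas.
    """
    rng = random.Random(seed)
    canvas = [MASK] * length
    for step in range(steps):
        # (1) Re-sample: fill every masked slot with the model's proposal.
        proposals = predict(canvas)
        canvas = [proposals[i] if tok == MASK else tok
                  for i, tok in enumerate(canvas)]
        # (2) Re-mask: mask fewer positions as the schedule approaches 0.
        mask_frac = 1.0 - (step + 1) / steps
        n_mask = int(mask_frac * length)
        for i in rng.sample(range(length), n_mask):
            canvas[i] = MASK
    return canvas
```

Swapping the uniform random re-masking for confidence-based re-masking (keep the tokens the model is most sure of) recovers samplers closer to the ones used in practice.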

Chumeng Liang @lowerbad
We therefore build a benchmark that extracts paper diagrams from arXiv with one command and evaluates the quality of LLM-generated diagrams accordingly, along with an agentic template for generating diagrams. Show your tricks for producing high-quality paper diagrams on our new benchmark!
Chumeng Liang @lowerbad
Representing a diagram as a directed graph, our EMNLP paper shows that over 50% of nodes and 60% of edges (between correct nodes) are incorrect in LLM-generated paper diagrams (see the last figure). Diagram generation remains a mission incomplete.
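A node/edge metric of the kind described (edges counted only between correctly recovered nodes) can be sketched as follows; this is a hypothetical simplification for illustration, not the paper's actual matching procedure:

```python
def diagram_accuracy(ref_nodes, ref_edges, gen_nodes, gen_edges):
    """Score a generated diagram against a reference, treating both as
    directed graphs. Node accuracy is the fraction of reference nodes
    recovered; edge accuracy is computed only over reference edges whose
    endpoints were both recovered ('edges between correct nodes')."""
    correct_nodes = set(ref_nodes) & set(gen_nodes)
    node_acc = len(correct_nodes) / len(ref_nodes)
    # Only reference edges between recovered nodes are eligible.
    eligible = {(u, v) for u, v in ref_edges
                if u in correct_nodes and v in correct_nodes}
    hit = eligible & set(map(tuple, gen_edges))
    edge_acc = len(hit) / len(eligible) if eligible else 0.0
    return node_acc, edge_acc
```

In practice node matching would be fuzzy (label paraphrases, layout differences); exact set intersection is the simplest possible stand-in.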