Zhen Liu
@ItsTheZhen
116 posts

Assistant Prof @ CUHK-SZ. PhD at @Mila_Quebec & @UMontreal, MS & BS at @GeorgiaTech and ex-visitor at @MPI_IS. Deep Learning/Foundation Models/3D Generation.

Montreal, QC · Joined November 2015
322 Following · 424 Followers

Pinned Tweet
Zhen Liu @ItsTheZhen
Can LLMs design real machines — from 🚗 cars to 🏹 catapults? Can they engineer through both 🧠 agentic workflows and 🌀 reinforcement learning (RL) — learning from physical simulation instead of text alone?

We treat machine design as “machine code writing”, where LLMs assemble mechanisms from standard parts. To explore this, we built 🧩 BesiegeField — a real-time, physics-based sandbox where LLMs can build, test, and evolve machines through agentic planning or RL-based self-improvement.

Our findings:
1️⃣ Even top LLMs fail to build working catapults — easy for humans but highly dynamic ⚙️ and nonlinear.
2️⃣ RL helps — working designs emerge through interaction.
3️⃣ Aligning reasoning 🧩 with construction 🔩 remains a key challenge.

This marks the first step toward LLMs that learn to design through action — bridging reasoning, physics, and embodiment. 🛠️🤖

🌐 Project Website: besiegefield.github.io
💻 GitHub (RL & Agentic Workflow): github.com/Godheritage/Be…
👥 Joint work w/ @Besteuler & Wenqian Zhang
[media]
2 replies · 16 reposts · 78 likes · 18.8K views
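The build-test-evolve loop the tweet describes can be sketched in a few lines. Everything here is a hypothetical stand-in — `propose_design`, `simulate`, and the environment interface are illustrative names, not the actual BesiegeField API:

```python
def design_loop(llm, env, task, n_rounds=8):
    """Illustrative agentic loop: ask an LLM for a machine spec, simulate it,
    and feed the physics score back for the next revision."""
    best_spec, best_score = None, float("-inf")
    feedback = ""
    for _ in range(n_rounds):
        # The LLM emits "machine code": a list of standard parts and joints.
        spec = llm.propose_design(task, feedback)
        # A physics rollout returns a task reward plus a simulation trace.
        score, log = env.simulate(spec)
        if score > best_score:
            best_spec, best_score = spec, score
        # Ground the next proposal in what actually happened in simulation.
        feedback = f"score={score:.2f}; log={log}"
    return best_spec, best_score
```

The same loop supports both modes in the tweet: agentic planning (the LLM conditions on feedback in-context) or RL (the scores become rewards for policy updates).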
Zhen Liu reposted
Weiyang Liu @Besteuler
🚀 Excited to introduce POET-X, a scalable and highly memory-efficient algorithm for LLM pretraining. ✨ LoRA-level GPU memory, better-than-AdamW pretraining performance!

POET-X finally marries training stability (from POET's spectrum preservation) and practical scalability (from our new implementation and CUDA kernels). POET-X can pretrain billion-parameter LLMs (e.g., Llama-8B) on a single NVIDIA H100, where standard optimizers like AdamW run out of memory under the same settings.

We carefully reimplemented every computation step of POET (arxiv.org/pdf/2506.08001). POET-X combines many small checkpointing and parallelization tricks. While each may appear incremental, together they dramatically improve scalability and reduce memory usage by over 70% compared to the original POET.

The memory efficiency of POET-X comes from the unique parameter-efficient reparameterization (where sparsity comes in) of the weight update rule. POET-X bridges the gap between parameter efficiency and memory efficiency.

Code is now public. Feel free to try it!
➡️ Paper: arxiv.org/pdf/2603.05500
💻 Code: github.com/Sphere-AI-Lab/…
🌐 Website: spherelab.ai/poetx

#AI #LLM #MachineLearning #DeepLearning
[media]
1 reply · 12 reposts · 55 likes · 9.3K views
Zhen Liu reposted
Haiwen (Haven) Feng @HavenFeng
✨Thinking with Blender~ Meet VIGA: a multimodal agent that autonomously codes 3D/4D Blender scenes from any image, with no human, no training! @berkeley_ai #LLMs #Blender #Agent 🧵1/6
72 replies · 309 reposts · 2.1K likes · 332.4K views
Zhen Liu reposted
Weiyang Liu @Besteuler
Interesting work! Doing proper normalization is definitely important for training neural networks stably. We considered hyperball normalization for convolutional neural networks back in 2018, see arxiv.org/pdf/1804.08071. Besides hyperball normalization, we also proposed multiple other normalization methods for weights and activations. Quite surprisingly, we also did gradient normalization in order to make it actually work. See Section 4 of the Decoupled Networks paper.

I somehow got the impression that many old ideas are worth revisiting for LLM pretraining, especially those that stabilize the training (but may slightly hurt the performance for conventional CNNs).
[media]
Kaiyue Wen@wen_kaiyue

(1/n) Introducing Hyperball — an optimizer wrapper that keeps weight & update norm constant and lets you control the effective (angular) step size directly. Result: sustained speedups across scales + strong hyperparameter transfer.

3 replies · 27 reposts · 206 likes · 16.4K views
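The quoted Hyperball idea — keep the weight norm constant and control the angular step size directly — can be sketched for a single weight vector. This is a minimal NumPy sketch of the geometric idea only, not the actual Hyperball wrapper; `hyperball_step` and `eta_ang` are illustrative names:

```python
import numpy as np

def hyperball_step(w, grad, eta_ang, radius=1.0):
    """One hyperball-style update on a single weight vector.

    w        -- current weight, kept on the sphere of the given radius
    grad     -- raw gradient (or base-optimizer update)
    eta_ang  -- desired angular step size in radians
    """
    # Only the tangential component of the update rotates w; the radial part
    # just changes its norm, which we re-fix anyway.
    g_tan = grad - (grad @ w) / (w @ w) * w
    g_norm = np.linalg.norm(g_tan)
    if g_norm < 1e-12:
        return w  # no tangential signal; stay put
    # A tangential step of length ||w|| * tan(eta_ang) rotates w by exactly
    # eta_ang; projecting back onto the sphere keeps the weight norm constant.
    w_new = w - np.tan(eta_ang) * np.linalg.norm(w) * g_tan / g_norm
    return radius * w_new / np.linalg.norm(w_new)
```

Because the update is purely a rotation of fixed angle, the "effective step size" no longer depends on the weight or gradient magnitude — which is one plausible reading of why hyperparameters would transfer across scales.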
Michael Black @Michael_J_Black
SMPL has just won the 2025 @SIGGRAPHAsia Test of Time Award. At 10 years old, it’s still going strong, and the community continues to find creative new uses. With the increasing interest in physical AI and humanoid robotics, the need for a low-dimensional, metrically accurate 3D representation of people and their movement is crucial.

SMPL is an example of my motto — build what you need and use what you build. SMPL’s success is due, in part, to the many methods and datasets that have been built using it. But the largest impact has come from the broad community of researchers who have built an ecosystem and infrastructure that makes SMPL widely useful.

We are not done yet with SMPL and related methods. There’s lots more to come! Congratulations to my wonderful co-authors: Matthew Loper, Javier Romero, @GerardPonsMoll1, and @naureenmahmood. I am very fortunate to call them collaborators and friends.

Keep it SMPL! meshcapade.com/smpl/
[media]
16 replies · 21 reposts · 297 likes · 27.7K views
Zhen Liu reposted
Weiyang Liu @Besteuler
Something related we have done in the past: SphereNet (arxiv.org/abs/1711.03189), Decoupled Networks (arxiv.org/abs/1804.08071). The central idea is to find a bounded activation to replace normalization in CNNs at the time. We tried a few angular activation functions. Could be interesting to check them out. :)
Zhuang Liu@liuzhuang1234

Stronger Normalization-Free Transformers – new paper. We introduce Derf (Dynamic erf), a simple point-wise layer that lets norm-free Transformers not only work, but actually outperform their normalized counterparts.

0 replies · 13 reposts · 116 likes · 17.4K views
Zhen Liu @ItsTheZhen
Can we efficiently and robustly finetune flow matching models with reinforcement learning using differentiable rewards, in an amortized way? Hint: use optimal control and match your velocity field with value gradients!

Please come by our poster “Value Gradient Guidance for Flow Matching Alignment” at #NeurIPS2025 (Exhibit Hall C, D, E — #4906, Fri, Dec 5, 4:30pm – 7:40pm PST) and learn more about our VGG-Flow!

🔗 ArXiv: arxiv.org/abs/2512.05116
👥 Joint work w/ @zdhnarsil @TimZXiao @cdomingoenrich @Besteuler
[media]
2 replies · 6 reposts · 33 likes · 9.5K views
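One plausible reading of "match your velocity field with value gradients": regress the finetuned velocity field onto the pretrained drift plus a scaled value-gradient correction. This is only a guess at the mechanics from the tweet's one-line hint — the actual VGG-Flow objective may differ, and all names below are illustrative:

```python
import numpy as np

def guided_target(v_base, grad_value, scale):
    """Hypothetical target drift: pretrained velocity plus a scaled
    value-gradient term steering samples toward high-reward regions."""
    return v_base + scale * grad_value

def matching_loss(v_theta, v_base, grad_value, scale):
    """Squared error between the finetuned field and the guided target,
    evaluated at a batch of (x, t) points."""
    return float(np.mean((v_theta - guided_target(v_base, grad_value, scale)) ** 2))
```

Under this reading, the loss is zero exactly when the finetuned field equals the value-guided drift, so minimizing it amortizes the guidance into the model instead of applying it at sampling time.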
Zhen Liu reposted
Weiyang Liu @Besteuler
🤩 This is awesome. When we were doing the agentic design project (besiegefield.github.io) in the Besiege game environment, we had to hack the game to get as much feedback as possible for RL and such. However, I started to think differently after seeing the Genshin agent. We humans don’t need that much feedback to learn to master the game, and visual feedback is already sufficient.

I am wondering what will happen if the agent learns to master many games this way. Will it develop some universal skills for game playing? Will it see the world differently? 🤔
Weihao Tan@WeihaoTan64

🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: lumine-ai.org 1/6

0 replies · 7 reposts · 36 likes · 7.5K views
Zhen Liu reposted
Weiyang Liu @Besteuler
🤯 Merging many finetuned LLMs into one model, effectively? Introducing Functional Dual Anchor (FDA), a new framework for model merging.

🚀 Current merging works poorly due to underlying parameter conflicts. FDA shifts knowledge integration to the input-representation space for seamless merging. This “dual” perspective bridges the gap between post-hoc merging and joint multi-task training, reducing knowledge conflicts.
✨ FDAs are synthetic anchors that precisely capture a finetuned model's functional shift.
✨ FDAs can complement existing model merging methods and achieve SOTA performance.

➡️ Paper: arxiv.org/abs/2510.21223
💻 Code: github.com/Sphere-AI-Lab/…
🌐 Project: spherelab.ai/fda

#AI #LLM #MachineLearning #DeepLearning
[media]
10 replies · 92 reposts · 602 likes · 34.5K views
Zhen Liu @ItsTheZhen
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
2 replies · 3 reposts · 12 likes · 4.2K views
Zhen Liu @ItsTheZhen
TL;DR: Meet BesiegeField—a playground where LLMs build, test, and refine machines from standard parts in real time. We tested agentic workflows and RLVR with top LLMs: even the strongest still show limits in compositional machine design. 🔗 besiegefield.github.io 🧵 below
[media]
Zhen Liu@ItsTheZhen

Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?

0 replies · 2 reposts · 8 likes · 1.1K views