Tim Xiao

314 posts

@TimZXiao

PhD student in Machine Learning @ University of Tübingen · IMPRS-IS scholar

Joined June 2012
325 Following · 259 Followers
Pinned Tweet
Tim Xiao@TimZXiao·
✨ New paper: Flipping Against All Odds We found that large language models (LLMs) can describe probabilities—but fail to sample from them faithfully. Yes, even flipping a fair coin is hard. 🪙 🧵 Here’s what we learned—and how we fixed it. 🔗arxiv.org/abs/2506.09998 1/
4 replies · 7 reposts · 16 likes · 2.8K views
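The gap the thread describes, between *stating* a probability and *sampling* at that probability, can be made concrete by measuring empirical frequency. A minimal sketch; the `skewed` sampler here is a hypothetical stand-in for a miscalibrated model, not the paper's measured LLM behavior:

```python
import random
from collections import Counter

random.seed(0)  # reproducible demo

def empirical_bias(sampler, n=10_000):
    """How far a sampler's heads-frequency drifts from the fair value 0.5."""
    counts = Counter(sampler() for _ in range(n))
    return abs(counts["H"] / n - 0.5)

# A faithful sampler: heads with probability exactly 0.5.
fair = lambda: random.choice("HT")

# A hypothetical sampler that would *describe* p(H) = 0.5
# but over-produces heads when asked to actually flip.
skewed = lambda: "H" if random.random() < 0.7 else "T"

print(empirical_bias(fair))    # close to 0
print(empirical_bias(skewed))  # close to 0.2
```

A model can pass the first test (verbalizing "50/50") while failing the second (its flip frequency sits far from 0.5), which is the failure mode the paper probes.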
Tim Xiao@TimZXiao·
clock rate -> FLOPS -> heart rate?
0 replies · 0 reposts · 3 likes · 85 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🚀 Excited to introduce POET-X, a scalable and highly memory-efficient algorithm for LLM pretraining. ✨ LoRA-level GPU memory, better-than-AdamW pretraining performance! POET-X finally marries training stability (from POET's spectrum preservation) with practical scalability (from our new implementation and CUDA kernels). POET-X can pretrain billion-parameter LLMs (e.g., Llama-8B) on a single NVIDIA H100, where standard optimizers like AdamW run out of memory under the same settings. We carefully reimplemented every computation step of POET (arxiv.org/pdf/2506.08001). POET-X combines many small checkpointing and parallelization tricks. While each may appear incremental, together they dramatically improve scalability and reduce memory usage by over 70% compared to the original POET. The memory efficiency of POET-X comes from the unique parameter-efficient reparameterization (where sparsity comes in) of the weight update rule. POET-X bridges the gap between parameter efficiency and memory efficiency. Code is now public. Feel free to try it! ➡️ Paper: arxiv.org/pdf/2603.05500 💻 Code: github.com/Sphere-AI-Lab/… 🌐 Website: spherelab.ai/poetx #AI #LLM #MachineLearning #DeepLearning
1 reply · 12 reposts · 55 likes · 9.2K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
Interesting work! Doing proper normalization is definitely important for training neural networks stably. We considered hyperball normalization for convolutional neural networks back in 2018, see arxiv.org/pdf/1804.08071. Besides hyperball normalization, we also proposed multiple other normalization methods for weights and activations. Quite surprisingly, we also had to do gradient normalization to make it actually work. See Section 4 of the Decoupled Networks paper. I somehow got the impression that many old ideas are worth revisiting for LLM pretraining, especially those that stabilize training (but may slightly hurt performance for conventional CNNs).
Kaiyue Wen@wen_kaiyue

(1/n) Introducing Hyperball — an optimizer wrapper that keeps weight & update norm constant and lets you control the effective (angular) step size directly. Result: sustained speedups across scales + strong hyperparameter transfer.

3 replies · 27 reposts · 206 likes · 16.4K views
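The quoted Hyperball description ("keeps weight & update norm constant and lets you control the effective (angular) step size directly") can be sketched as one constrained update step. This is a minimal sketch of the simplest reading of that sentence (project the negative gradient onto the sphere's tangent space, then rotate by a fixed angle); it is not taken from the paper's actual implementation:

```python
import math

def hyperball_step(w, grad, theta=0.01):
    """Rotate w by a fixed angular step theta on the sphere ||w|| = const,
    so both the weight norm and the (angular) update size stay constant."""
    r = math.sqrt(sum(x * x for x in w))
    dot = sum(g * x for g, x in zip(grad, w))
    # Descent direction projected onto the tangent space of the sphere.
    tangent = [-(g - dot / (r * r) * x) for g, x in zip(grad, w)]
    t_norm = math.sqrt(sum(t * t for t in tangent)) + 1e-12
    # Rotate inside the plane spanned by w and the tangent direction.
    return [math.cos(theta) * x + math.sin(theta) * r * t / t_norm
            for x, t in zip(w, tangent)]

w = [3.0, 4.0]                                 # ||w|| = 5
w2 = hyperball_step(w, [1.0, 1.0], theta=0.1)
print(math.sqrt(sum(x * x for x in w2)))       # still 5.0: norm preserved
```

Because the update is a pure rotation, the only tunable quantity is the angle `theta`, which is one plausible reason such a scheme would transfer hyperparameters across scales.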
Tim Xiao reposted
Sharvaree Vadgama@SharvVadgama·
😍Excited to organize @GRaM_org_ 2.0 this time at #ICLR2026 🇧🇷 🌟 Looking forward to your best works on geometry-grounded representations, inductive bias, and structure in learning. This year, we also welcome works on 🌐open problems, ⚔️discussions on scale vs symmetry, 👊 position papers and more! Deadline: 30th January AOE
GRaM Workshop at ICLR 2026@GRaM_org_

📢The second edition of ✨GRaM workshop✨ is here this time at #ICLR26. 🌟Submit your exciting works in Geometry-grounded representations. We welcome submissions in multiple tracks i.e. 📄 Proceedings 📝extended abstract 👩‍🏫Tutorial/blogpost as well as an exciting challenge!

2 replies · 9 reposts · 29 likes · 2.5K views
Tim Xiao reposted
Yuxuan Xue@yxue_yxue·
It's time! I will present InfiniHuman at 16:30 in room S421 at #SIGGRAPHAsia2025. Please join me if you want to generate avatars with fine-grained multi-modal control! @ympradyumna will present PhySIC at 16:30 in room S221. Join him to turn a 2D image into a 3D human + scene!
Yuxuan Xue@yxue_yxue

#InfiniHuman: Infinite 3D Human Generation with Precise Control How do you want to generate a 3D avatar? From text description? With clothing images? Or some desired body shape? All can be done at once with InfiniHuman! 🔗Page: yuxuan-xue.com/infini-human/ #SIGGRAPHAsia2025 #AI

0 replies · 1 repost · 14 likes · 871 views
Tim Xiao reposted
yingzhen@liyzhen2·
An exciting PhD opportunity at StatML CDT (Imperial) + Institute of Cancer Research, with Oliver Ratmann, Richard Houlston and yours truly ☺️: "Machine Learning for Cancer Susceptibility Genetics" Oct 2026 entry, apply to StatML CDT by Jan 8 2026. RT🙏 docs.google.com/document/d/1lS…
0 replies · 3 reposts · 34 likes · 2.2K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
Can we efficiently and robustly finetune flow matching models with reinforcement learning using differentiable rewards, in an amortized way? Hint: use optimal control and match your velocity field with value gradients! Please come by our poster “Value Gradient Guidance for Flow Matching Alignment” at #NeurIPS2025 (Exhibit Hall C, D, E — #4906 Fri, Dec 5 | 4:30pm – 7:40pm PST) and learn more about our VGG-Flow! 🔗ArXiv: arxiv.org/abs/2512.05116 Joint work w/ @zdhnarsil @TimZXiao @cdomingoenrich @Besteuler
2 replies · 6 reposts · 33 likes · 9.5K views
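Schematically, "match your velocity field with value gradients" suggests guidance of the following form (my notation, not necessarily the paper's; the guidance weight $\lambda$ and value function $V$ are illustrative):

```latex
% Pretrained velocity field v_\theta(x,t); value function V(x,t)
% estimating the differentiable reward-to-go. A value-gradient-guided
% target field could take the form
\tilde{v}(x,t) = v_\theta(x,t) + \lambda \, \nabla_x V(x,t),
% and the finetuned model is trained to match \tilde{v}, so \lambda
% trades off reward against staying close to the pretrained flow.
```

This mirrors how classifier guidance steers diffusion models, with the value gradient playing the role of the classifier score; the optimal-control framing is what justifies using $\nabla_x V$ as the steering term.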
Tim Xiao@TimZXiao·
@Besteuler I guess the review quality for the coming ICML will be very high😆
0 replies · 0 reposts · 1 like · 130 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤩 This is awesome. When we were doing the agentic design project (besiegefield.github.io) using the Besiege game environment, we had to hack the game to get as much feedback as possible to do RL and stuff. However, I started to think differently after seeing the Genshin agent. We humans don’t need that much feedback to learn to master the game, and visual feedback is already sufficient. I am wondering what will happen if the agent learns to master many games this way. Will it develop some universal skills for game playing? Will it see the world differently?🤔
Weihao Tan@WeihaoTan64

🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: lumine-ai.org 1/6

0 replies · 7 reposts · 36 likes · 7.5K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤯 Merging many finetuned LLMs into one model, effectively? Introducing Functional Dual Anchor (FDA), a new framework for model merging. 🚀 Current merging works poorly due to the underlying parameter conflicts. FDA shifts knowledge integration to the input-representation space for seamless merging. This "dual" perspective bridges the gap between post-hoc merging and joint multi-task training, reducing the knowledge conflicts. ✨ FDAs are synthetic anchors that precisely capture a finetuned model's functional shift. ✨ FDAs can complement existing model merging methods and achieve SOTA performance. ➡️ Paper: arxiv.org/abs/2510.21223 💻 Code: github.com/Sphere-AI-Lab/… 🌐 Project: spherelab.ai/fda #AI #LLM #MachineLearning #DeepLearning
10 replies · 92 reposts · 602 likes · 34.5K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
The physics prior matters in molecular structures. We model potential energy between molecules for drug design. This happens to have a coincidental yet interesting connection to my past work, hyperspherical energy (arxiv.org/abs/1805.09298), which considers potential energy between imaginary electrons (i.e. neurons in neural networks). But this time we are modeling real molecules for drug design. :) Excited that our new AI-for-science paper is finally online: "Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design." Very glad to be part of the wonderful team. @Shengchao_Liu Caltech news: caltech.edu/about/news/new… Paper link: pnas.org/doi/10.1073/pn… Project page: yanliang3612.github.io/NucleusDiff/
0 replies · 2 reposts · 17 likes · 1.9K views
Tim Xiao reposted
Anna Kuzina@a_kzna·
Polymer simulations, but make them Vivace ⚡ It was a pleasure to work on Vivace architecture during my time in @MSFTResearch together with Lixin Sun and @gncsimm .
Gregor Simm@gncsimm

MLFFs 🤝 Polymers — SimPoly works! Our team at @MSFTResearch AI for Science is proud to present SimPoly (SIM-puh-lee) — a deep learning solution for polymer simulation. Polymeric materials are foundational to modern life—found in everything from the clothes we wear and the food we consume to high-performance materials in aerospace, electronics, and medicine. Today, we introduce a new way to simulate them. We built a machine learning force field (MLFF) to predict macroscopic properties across a broad range of polymers—trained only on quantum-chemical data, with no experimental fitting. Specifically, we accurately compute polymer densities via large-scale MD simulations, achieving higher accuracy than classical force fields. We also capture second-order phase transitions, enabling prediction of glass transition temperatures. These two properties are fundamental to processing and application design. Finally, we created a benchmark based on experimental data for 130 polymers plus an accompanying quantum-chemical dataset—laying the foundation for a fully in silico design pipeline for next-generation polymeric materials. The incredible team: Jean Helie, @temporaer, Yicheng Chen, Guillem Simeon, @a_kzna, @ErnestoCheco, @erunzzz, Gabriele Tocci, @chc273, @yatao_li, @SherryLixueC, @zunwang_msr, Bichlien H. Nguyen, Jake A. Smith, and Lixin Sun. 📄 Preprint: arxiv.org/abs/2510.13696 ⚙️ Data and code release: in progress⏳ #MLFFs #Polymers #AIforScience #DeepLearning #SimPoly #ScientificML #Microsoft #MicrosoftResearch #MicrosoftQuantum

0 replies · 1 repost · 11 likes · 716 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
This is an almost year-long project, led by @ItsTheZhen. My biggest takeaway is that physical simulation is very effective as a reward signal, and this efficient verification is crucial for unlocking LLMs’ design novelty. This conclusion is actually aligned with our previous work spherelab.ai/SGP-Gen, where the verification is done by a renderer.
Zhen Liu@ItsTheZhen

Can LLMs design real machines — from 🚗 cars to 🏹 catapults? Can they engineer through both 🧠 agentic workflows and 🌀 reinforcement learning (RL) — learning from physical simulation instead of text alone? We treat machine design as “machine code writing”, where LLMs assemble mechanisms from standard parts. To explore this, we built 🧩 BesiegeField — a real-time, physics-based sandbox where LLMs can build, test, and evolve machines through agentic planning or RL-based self-improvement. Our findings: 1️⃣ Even top LLMs fail to build working catapults — easy for humans but highly dynamic ⚙️ and nonlinear. 2️⃣ RL helps — working designs emerge through interaction. 3️⃣ Aligning reasoning 🧩 with construction 🔩 remains a key challenge. This marks the first step toward LLMs that learn to design through action — bridging reasoning, physics, and embodiment. 🛠️🤖 🌐 Project Website: besiegefield.github.io 💻 GitHub (RL & Agentic Workflow): github.com/Godheritage/Be… 👥 Joint work w/ @Besteuler & Wenqian Zhang

0 replies · 4 reposts · 33 likes · 5.6K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤖 Can LLMs learn to create? Introducing "Agentic Design of Compositional Machines" — a new frontier where AI builds functional machines from standardized parts. We present BesiegeField, a simulation testbed to benchmark LLMs on tasks like building cars & catapults. Key findings: 🔧 Compositional design is extremely challenging even for SOTA LLMs (humans can easily do better) 🛠️ Multi-agent workflows + RLVR boost performance ⚙️ Physical simulation as a verifiable reward is effective for eliciting LLMs’ design capabilities. 🧠 High-level planning ≠ precise execution — a core challenge Paper: arxiv.org/abs/2510.14980 Project: besiegefield.github.io #AI #Agents #LLM #GenerativeAI #AIGC #Simulation #RLVR
1 reply · 2 reposts · 14 likes · 1.1K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
TL;DR: Meet BesiegeField—a playground where LLMs build, test, and refine machines from standard parts in real time. We tested agentic workflows and RLVR with top LLMs: even the strongest still show limits in compositional machine design. 🔗 besiegefield.github.io 🧵 below
Zhen Liu@ItsTheZhen

Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?

0 replies · 2 reposts · 8 likes · 1.1K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
2 replies · 3 reposts · 12 likes · 4.2K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
This is a wonderful collaboration with @ItsTheZhen and Wenqian. I’ve long been curious whether large language models truly possess creativity -- the ability to build something genuinely novel. This project represents our first step toward answering that question. It also aligns with my recent interest in formal reasoning of LLMs, where the reasoning process can be verified by an expert engine (e.g., a compiler). In our case, the XML-based language used in BesiegeField can be directly rendered and simulated via a physics engine. I believe such automatic verification is essential for accelerating the discovery of novel mechanical designs.
Zhen Liu@ItsTheZhen

Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?

0 replies · 1 repost · 9 likes · 1.2K views
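The "verification by an expert engine" loop described here is, at its core: propose a design, run the verifier, use its score as the reward. A deliberately tiny sketch with a stub one-dimensional "simulator" standing in for the physics engine (everything here is hypothetical; the actual BesiegeField setup uses the real Besiege simulator and RL over machine-description code, not random search over a scalar):

```python
import random

def simulate(design):
    """Stand-in for the expert engine: scores a candidate design.
    Here the (pretend) optimum is design = 3.0."""
    return -abs(design - 3.0)

def improve_with_verifiable_reward(steps=200, scale=0.1, seed=0):
    """Generic verifier-in-the-loop improvement: propose a perturbed
    candidate, simulate it, and keep it only if the reward improves."""
    rng = random.Random(seed)
    best, best_r = 0.0, simulate(0.0)
    for _ in range(steps):
        cand = best + rng.gauss(0, scale)  # perturb the current best design
        r = simulate(cand)                 # the simulator IS the reward signal
        if r > best_r:
            best, best_r = cand, r
    return best

print(improve_with_verifiable_reward())  # converges near 3.0
```

The point of the sketch is the information flow, not the search method: because the engine verifies every candidate automatically, the loop needs no human labels, which is what makes simulation-based rewards attractive for design discovery.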
Tim Xiao reposted
Weiyang Liu@Besteuler·
🚀 Glad to introduce SimKO (Simple Pass@K Optimization) Current GRPO-based methods overfit to safe responses -- great Pass@1, poor Pass@K. 🔍 We find this stems from probability over-concentration: the model collapses onto its top-1 token, losing exploration. This appears to be a more accurate observation metric than the commonly used entropy. ✨ SimKO fixes this with probability redistribution: ✅ Encourage top-K candidates for high-entropy tokens in correct responses ❌ Penalize over-confident top-1s for incorrect responses 🧮 Improves Pass@K across math & logic benchmarks -- simple, stable, effective. 📄 Paper: arxiv.org/abs/2510.14807… 🌐 Project: spherelab.ai/simko #LLM #ReinforcementLearning #Reasoning #RLVR #AI
5 replies · 18 reposts · 158 likes · 10.4K views
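Read as an algorithm, the two bullets suggest per-position probability surgery on the next-token distribution. A toy sketch of that idea on a 4-token distribution (my own reading of the announcement; `k`, `tau`, and `alpha` are illustrative knobs, not the paper's actual hyperparameters or update rule):

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def redistribute(p, correct, k=3, tau=1.0, alpha=0.1):
    """Toy probability redistribution: for a high-entropy position in a
    *correct* response, spread extra mass over the top-k candidates; for
    an *incorrect* one, shave the over-confident top-1. Renormalize."""
    order = sorted(range(len(p)), key=lambda i: -p[i])
    q = list(p)
    if correct and entropy(p) > tau:
        for i in order[:k]:          # encourage the top-k candidates
            q[i] += alpha / k
    elif not correct:
        q[order[0]] *= 1 - alpha     # penalize the top-1 token
    z = sum(q)
    return [x / z for x in q]

p = [0.90, 0.05, 0.03, 0.02]         # over-concentrated on token 0
print(redistribute(p, correct=False))  # top-1 mass shrinks below 0.90
```

The over-concentration diagnosis maps directly onto `p[0] = 0.90` here: the incorrect-response branch pulls that mass back toward the alternatives, which is the mechanism the tweet credits for recovering Pass@K.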