Tim Xiao

314 posts

@TimZXiao

PhD student in Machine Learning @ University of Tübingen · IMPRS-IS scholar

Joined June 2012
325 Following · 259 Followers
Pinned Tweet
Tim Xiao@TimZXiao·
✨ New paper: Flipping Against All Odds We found that large language models (LLMs) can describe probabilities—but fail to sample from them faithfully. Yes, even flipping a fair coin is hard. 🪙 🧵 Here’s what we learned—and how we fixed it. 🔗arxiv.org/abs/2506.09998 1/
4 replies · 7 reposts · 16 likes · 2.8K views
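The gap the thread describes, between *stating* a probability and *sampling* at that probability, can be made concrete by measuring empirical frequency. A minimal sketch; the `skewed` sampler here is a hypothetical stand-in for a miscalibrated model, not the paper's measured LLM behavior:

```python
import random
from collections import Counter

random.seed(0)  # reproducible demo

def empirical_bias(sampler, n=10_000):
    """How far a sampler's heads-frequency drifts from the fair value 0.5."""
    counts = Counter(sampler() for _ in range(n))
    return abs(counts["H"] / n - 0.5)

# A faithful sampler: heads with probability exactly 0.5.
fair = lambda: random.choice("HT")

# A hypothetical sampler that would *describe* p(H) = 0.5
# but over-produces heads when asked to actually flip.
skewed = lambda: "H" if random.random() < 0.7 else "T"

print(empirical_bias(fair))    # close to 0
print(empirical_bias(skewed))  # close to 0.2
```

A model can pass the first test (verbalizing "50/50") while failing the second (its flip frequency sits far from 0.5), which is the failure mode the paper probes.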
Tim Xiao@TimZXiao·
clock rate -> FLOPS -> heart rate?
0 replies · 0 reposts · 3 likes · 85 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🚀 Excited to introduce POET-X, a scalable and highly memory-efficient algorithm for LLM pretraining. ✨ LoRA-level GPU memory, better-than-AdamW pretraining performance! POET-X finally marries training stability (from POET's spectrum preservation) with practical scalability (from our new implementation and CUDA kernels). POET-X can pretrain billion-parameter LLMs (e.g., Llama-8B) on a single NVIDIA H100, where standard optimizers like AdamW run out of memory under the same settings. We carefully reimplemented every computation step of POET (arxiv.org/pdf/2506.08001). POET-X combines many small checkpointing and parallelization tricks. While each may appear incremental, together they dramatically improve scalability and reduce memory usage by over 70% compared to the original POET. The memory efficiency of POET-X comes from the unique parameter-efficient reparameterization (where sparsity comes in) of the weight update rule. POET-X bridges the gap between parameter efficiency and memory efficiency. Code is now public. Feel free to try it! ➡️ Paper: arxiv.org/pdf/2603.05500 💻 Code: github.com/Sphere-AI-Lab/… 🌐 Website: spherelab.ai/poetx #AI #LLM #MachineLearning #DeepLearning
1 reply · 12 reposts · 55 likes · 9.2K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
Interesting work! Doing proper normalization is definitely important for training neural networks stably. We considered hyperball normalization for convolutional neural networks back in 2018, see arxiv.org/pdf/1804.08071. Besides hyperball normalization, we also proposed multiple other normalization methods for weights and activations. Quite surprisingly, we also had to do gradient normalization to make it actually work. See Section 4 of the Decoupled Networks paper. I somehow got the impression that many old ideas are worth revisiting for LLM pretraining, especially those that stabilize training (but may slightly hurt performance for conventional CNNs).
Kaiyue Wen@wen_kaiyue

(1/n) Introducing Hyperball — an optimizer wrapper that keeps weight & update norm constant and lets you control the effective (angular) step size directly. Result: sustained speedups across scales + strong hyperparameter transfer.

3 replies · 27 reposts · 206 likes · 16.4K views
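The quoted Hyperball description ("keeps weight & update norm constant and lets you control the effective (angular) step size directly") can be sketched as one constrained update step. This is a minimal sketch of the simplest reading of that sentence (project the negative gradient onto the sphere's tangent space, then rotate by a fixed angle); it is not taken from the paper's actual implementation:

```python
import math

def hyperball_step(w, grad, theta=0.01):
    """Rotate w by a fixed angular step theta on the sphere ||w|| = const,
    so both the weight norm and the (angular) update size stay constant."""
    r = math.sqrt(sum(x * x for x in w))
    dot = sum(g * x for g, x in zip(grad, w))
    # Descent direction projected onto the tangent space of the sphere.
    tangent = [-(g - dot / (r * r) * x) for g, x in zip(grad, w)]
    t_norm = math.sqrt(sum(t * t for t in tangent)) + 1e-12
    # Rotate inside the plane spanned by w and the tangent direction.
    return [math.cos(theta) * x + math.sin(theta) * r * t / t_norm
            for x, t in zip(w, tangent)]

w = [3.0, 4.0]                                 # ||w|| = 5
w2 = hyperball_step(w, [1.0, 1.0], theta=0.1)
print(math.sqrt(sum(x * x for x in w2)))       # still 5.0: norm preserved
```

Because the update is a pure rotation, the only tunable quantity is the angle `theta`, which is one plausible reason such a scheme would transfer hyperparameters across scales.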
Tim Xiao reposted
Sharvaree Vadgama@SharvVadgama·
😍Excited to organize @GRaM_org_ 2.0 this time at #ICLR2026 🇧🇷 🌟 Looking forward to your best works on geometry-grounded representations, inductive bias, and structure in learning. This year, we also welcome works on 🌐open problems, ⚔️discussions on scale vs symmetry, 👊 position papers and more! Deadline: 30th January AOE
GRaM Workshop at ICLR 2026@GRaM_org_

📢The second edition of ✨GRaM workshop✨ is here this time at #ICLR26. 🌟Submit your exciting works in Geometry-grounded representations. We welcome submissions in multiple tracks i.e. 📄 Proceedings 📝extended abstract 👩‍🏫Tutorial/blogpost as well as an exciting challenge!

2 replies · 9 reposts · 29 likes · 2.5K views
Tim Xiao reposted
Yuxuan Xue@yxue_yxue·
It's time! I will present InfiniHuman at 16:30 in room S421 at #SIGGRAPHAsia2025. Please join me if you want to generate avatars with fine-grained multi-modal control! @ympradyumna will present PhySIC at 16:30 in room S221. Join him to turn a 2D image into a 3D human + scene!
Yuxuan Xue@yxue_yxue

#InfiniHuman: Infinite 3D Human Generation with Precise Control How do you want to generate a 3D avatar? From text description? With clothing images? Or some desired body shape? All can be done at once with InfiniHuman! 🔗Page: yuxuan-xue.com/infini-human/ #SIGGRAPHAsia2025 #AI

0 replies · 1 repost · 14 likes · 871 views
Tim Xiao reposted
yingzhen@liyzhen2·
An exciting PhD opportunity at StatML CDT (Imperial) + Institute of Cancer Research, with Oliver Ratmann, Richard Houlston and yours truly ☺️: "Machine Learning for Cancer Susceptibility Genetics" Oct 2026 entry, apply to StatML CDT by Jan 8 2026. RT🙏 docs.google.com/document/d/1lS…
0 replies · 3 reposts · 34 likes · 2.2K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
Can we efficiently and robustly finetune flow matching models with reinforcement learning using differentiable rewards, in an amortized way? Hint: use optimal control and match your velocity field with value gradients! Please come by our poster “Value Gradient Guidance for Flow Matching Alignment” at #NeurIPS2025 (Exhibit Hall C, D, E — #4906 Fri, Dec 5 | 4:30pm – 7:40pm PST) and learn more about our VGG-Flow! 🔗ArXiv: arxiv.org/abs/2512.05116 Joint work w/ @zdhnarsil @TimZXiao @cdomingoenrich @Besteuler
2 replies · 6 reposts · 33 likes · 9.5K views
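Schematically, "match your velocity field with value gradients" suggests guidance of the following form (my notation, not necessarily the paper's; the guidance weight $\lambda$ and value function $V$ are illustrative):

```latex
% Pretrained velocity field v_\theta(x,t); value function V(x,t)
% estimating the differentiable reward-to-go. A value-gradient-guided
% target field could take the form
\tilde{v}(x,t) = v_\theta(x,t) + \lambda \, \nabla_x V(x,t),
% and the finetuned model is trained to match \tilde{v}, so \lambda
% trades off reward against staying close to the pretrained flow.
```

This mirrors how classifier guidance steers diffusion models, with the value gradient playing the role of the classifier score; the optimal-control framing is what justifies using $\nabla_x V$ as the steering term.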
Tim Xiao@TimZXiao·
@Besteuler I guess the review quality for the coming ICML will be very high😆
0 replies · 0 reposts · 1 like · 130 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤩 This is awesome. When we were doing the agentic design project (besiegefield.github.io) using the Besiege game environment, we had to hack the game to get as much feedback as possible to do RL and stuff. However, I started to think differently after seeing the Genshin agent. We humans don’t need that much feedback to learn to master the game, and visual feedback is already sufficient. I am wondering what will happen if the agent learns to master many games this way. Will it develop some universal skills for game playing? Will it see the world differently?🤔
Weihao Tan@WeihaoTan64

🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: lumine-ai.org 1/6

0 replies · 7 reposts · 36 likes · 7.5K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤯 Merging many finetuned LLMs into one model, effectively? Introducing Functional Dual Anchor (FDA), a new framework for model merging. 🚀 Current merging works poorly due to the underlying parameter conflicts. FDA shifts knowledge integration to the input-representation space for seamless merging. This "dual" perspective bridges the gap between post-hoc merging and joint multi-task training, reducing the knowledge conflicts. ✨ FDAs are synthetic anchors that precisely capture a finetuned model's functional shift. ✨ FDAs can complement existing model merging methods and achieve SOTA performance. ➡️ Paper: arxiv.org/abs/2510.21223 💻 Code: github.com/Sphere-AI-Lab/… 🌐 Project: spherelab.ai/fda #AI #LLM #MachineLearning #DeepLearning
10 replies · 92 reposts · 602 likes · 34.5K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
The physics prior matters in molecular structures. We model potential energy between molecules for drug design. This happens to have a coincidental yet interesting connection to my past work, hyperspherical energy (arxiv.org/abs/1805.09298), which considers potential energy between imaginary electrons (i.e. neurons in neural networks). But this time we are modeling real molecules for drug design. :) Excited that our new AI-for-science paper is finally online: "Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design." Very glad to be part of the wonderful team. @Shengchao_Liu Caltech news: caltech.edu/about/news/new… Paper link: pnas.org/doi/10.1073/pn… Project page: yanliang3612.github.io/NucleusDiff/
0 replies · 2 reposts · 17 likes · 1.9K views
Tim Xiao reposted
Anna Kuzina@a_kzna·
Polymer simulations, but make them Vivace ⚡ It was a pleasure to work on Vivace architecture during my time in @MSFTResearch together with Lixin Sun and @gncsimm .
Gregor Simm@gncsimm

MLFFs 🤝 Polymers — SimPoly works! Our team at @MSFTResearch AI for Science is proud to present SimPoly (SIM-puh-lee) — a deep learning solution for polymer simulation. Polymeric materials are foundational to modern life—found in everything from the clothes we wear and the food we consume to high-performance materials in aerospace, electronics, and medicine. Today, we introduce a new way to simulate them. We built a machine learning force field (MLFF) to predict macroscopic properties across a broad range of polymers—trained only on quantum-chemical data, with no experimental fitting. Specifically, we accurately compute polymer densities via large-scale MD simulations, achieving higher accuracy than classical force fields. We also capture second-order phase transitions, enabling prediction of glass transition temperatures. These two properties are fundamental to processing and application design. Finally, we created a benchmark based on experimental data for 130 polymers plus an accompanying quantum-chemical dataset—laying the foundation for a fully in silico design pipeline for next-generation polymeric materials. The incredible team: Jean Helie, @temporaer, Yicheng Chen, Guillem Simeon, @a_kzna, @ErnestoCheco, @erunzzz, Gabriele Tocci, @chc273, @yatao_li, @SherryLixueC, @zunwang_msr, Bichlien H. Nguyen, Jake A. Smith, and Lixin Sun. 📄 Preprint: arxiv.org/abs/2510.13696 ⚙️ Data and code release: in progress⏳ #MLFFs #Polymers #AIforScience #DeepLearning #SimPoly #ScientificML #Microsoft #MicrosoftResearch #MicrosoftQuantum

0 replies · 1 repost · 11 likes · 716 views
Tim Xiao reposted
Weiyang Liu@Besteuler·
This is an almost year-long project, led by @ItsTheZhen. My biggest takeaway is that physical simulation is very effective as a reward signal, and this efficient verification is crucial for unlocking LLMs’ design novelty. This conclusion is actually aligned with our previous work spherelab.ai/SGP-Gen, where the verification is done by a renderer.
Zhen Liu@ItsTheZhen

Can LLMs design real machines — from 🚗 cars to 🏹 catapults? Can they engineer through both 🧠 agentic workflows and 🌀 reinforcement learning (RL) — learning from physical simulation instead of text alone? We treat machine design as “machine code writing”, where LLMs assemble mechanisms from standard parts. To explore this, we built 🧩 BesiegeField — a real-time, physics-based sandbox where LLMs can build, test, and evolve machines through agentic planning or RL-based self-improvement. Our findings: 1️⃣ Even top LLMs fail to build working catapults — easy for humans but highly dynamic ⚙️ and nonlinear. 2️⃣ RL helps — working designs emerge through interaction. 3️⃣ Aligning reasoning 🧩 with construction 🔩 remains a key challenge. This marks the first step toward LLMs that learn to design through action — bridging reasoning, physics, and embodiment. 🛠️🤖 🌐 Project Website: besiegefield.github.io 💻 GitHub (RL & Agentic Workflow): github.com/Godheritage/Be… 👥 Joint work w/ @Besteuler & Wenqian Zhang

0 replies · 4 reposts · 33 likes · 5.6K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
🤖 Can LLMs learn to create? Introducing "Agentic Design of Compositional Machines" — a new frontier where AI builds functional machines from standardized parts. We present BesiegeField, a simulation testbed to benchmark LLMs on tasks like building cars & catapults. Key findings: 🔧 Compositional design is extremely challenging even for SOTA LLMs (humans can easily do better) 🛠️ Multi-agent workflows + RLVR boost performance ⚙️ Physical simulation as a verifiable reward is effective for eliciting LLMs’ design capabilities. 🧠 High-level planning ≠ precise execution — a core challenge Paper: arxiv.org/abs/2510.14980 Project: besiegefield.github.io #AI #Agents #LLM #GenerativeAI #AIGC #Simulation #RLVR
1 reply · 2 reposts · 14 likes · 1.1K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
TL;DR: Meet BesiegeField—a playground where LLMs build, test, and refine machines from standard parts in real time. We tested agentic workflows and RLVR with top LLMs: even the strongest still show limits in compositional machine design. 🔗 besiegefield.github.io 🧵 below
Zhen Liu@ItsTheZhen

Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?

0 replies · 2 reposts · 8 likes · 1.1K views
Tim Xiao reposted
Zhen Liu@ItsTheZhen·
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
2 replies · 3 reposts · 12 likes · 4.2K views
Tim Xiao reposted
Weiyang Liu@Besteuler·
This is a wonderful collaboration with @ItsTheZhen and Wenqian. I’ve long been curious whether large language models truly possess creativity -- the ability to build something genuinely novel. This project represents our first step toward answering that question. It also aligns with my recent interest in formal reasoning of LLMs, where the reasoning process can be verified by an expert engine (e.g., a compiler). In our case, the XML-based language used in BesiegeField can be directly rendered and simulated via a physics engine. I believe such automatic verification is essential for accelerating the discovery of novel mechanical designs.
Zhen Liu@ItsTheZhen

Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?

0 replies · 1 repost · 9 likes · 1.2K views
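The "verification by an expert engine" loop described here is, at its core: propose a design, run the verifier, use its score as the reward. A deliberately tiny sketch with a stub one-dimensional "simulator" standing in for the physics engine (everything here is hypothetical; the actual BesiegeField setup uses the real Besiege simulator and RL over machine-description code, not random search over a scalar):

```python
import random

def simulate(design):
    """Stand-in for the expert engine: scores a candidate design.
    Here the (pretend) optimum is design = 3.0."""
    return -abs(design - 3.0)

def improve_with_verifiable_reward(steps=200, scale=0.1, seed=0):
    """Generic verifier-in-the-loop improvement: propose a perturbed
    candidate, simulate it, and keep it only if the reward improves."""
    rng = random.Random(seed)
    best, best_r = 0.0, simulate(0.0)
    for _ in range(steps):
        cand = best + rng.gauss(0, scale)  # perturb the current best design
        r = simulate(cand)                 # the simulator IS the reward signal
        if r > best_r:
            best, best_r = cand, r
    return best

print(improve_with_verifiable_reward())  # converges near 3.0
```

The point of the sketch is the information flow, not the search method: because the engine verifies every candidate automatically, the loop needs no human labels, which is what makes simulation-based rewards attractive for design discovery.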
Tim Xiao reposted
Weiyang Liu@Besteuler·
🚀 Glad to introduce SimKO (Simple Pass@K Optimization) Current GRPO-based methods overfit to safe responses -- great Pass@1, poor Pass@K. 🔍 We find this stems from probability over-concentration: the model collapses onto its top-1 token, losing exploration. This appears to be a more accurate observation metric than the commonly used entropy. ✨ SimKO fixes this with probability redistribution: ✅ Encourage top-K candidates for high-entropy tokens in correct responses ❌ Penalize over-confident top-1s for incorrect responses 🧮 Improves Pass@K across math & logic benchmarks -- simple, stable, effective. 📄 Paper: arxiv.org/abs/2510.14807… 🌐 Project: spherelab.ai/simko #LLM #ReinforcementLearning #Reasoning #RLVR #AI
5 replies · 18 reposts · 158 likes · 10.4K views
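Read as an algorithm, the two bullets suggest per-position probability surgery on the next-token distribution. A toy sketch of that idea on a 4-token distribution (my own reading of the announcement; `k`, `tau`, and `alpha` are illustrative knobs, not the paper's actual hyperparameters or update rule):

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def redistribute(p, correct, k=3, tau=1.0, alpha=0.1):
    """Toy probability redistribution: for a high-entropy position in a
    *correct* response, spread extra mass over the top-k candidates; for
    an *incorrect* one, shave the over-confident top-1. Renormalize."""
    order = sorted(range(len(p)), key=lambda i: -p[i])
    q = list(p)
    if correct and entropy(p) > tau:
        for i in order[:k]:          # encourage the top-k candidates
            q[i] += alpha / k
    elif not correct:
        q[order[0]] *= 1 - alpha     # penalize the top-1 token
    z = sum(q)
    return [x / z for x in q]

p = [0.90, 0.05, 0.03, 0.02]         # over-concentrated on token 0
print(redistribute(p, correct=False))  # top-1 mass shrinks below 0.90
```

The over-concentration diagnosis maps directly onto `p[0] = 0.90` here: the incorrect-response branch pulls that mass back toward the alternatives, which is the mechanism the tweet credits for recovering Pass@K.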