See Right (Robustness): BiPS enforces perceptual consistency via a bi-directional KL divergence constraint.
This aligns predictive distributions between noisy and focused views, effectively mitigating hallucinations caused by visual noise. 🧠
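To make the idea concrete, here is a minimal sketch of a symmetric (bi-directional) KL penalty between the predictive distributions of a noisy view and a focused view. The function names and plain-list representation are illustrative only; BiPS's exact formulation, weighting, and where the loss is applied may differ.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bidirectional_kl(p_noisy, p_focused):
    """Symmetric consistency penalty between the model's predictions under
    a noisy view and a focused view. Illustrative sketch of the constraint
    described above, not the paper's exact loss."""
    return 0.5 * (kl(p_noisy, p_focused) + kl(p_focused, p_noisy))
```

Because the penalty is symmetric, neither view is treated as the fixed "teacher": both distributions are pulled toward each other, which is what suppresses predictions driven by noise present in only one view.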
🚀 New Research: Efficient & Robust MLLMs via Bi-directional Perceptual Shaping (BiPS)
High-res inputs in Multimodal LLMs cause compute bottlenecks & noise sensitivity. Our new framework solves the "redundancy vs. utility" trade-off.
See Less, See Right. 🧵👇
🏗️ Generative Design
Bridging the gap between parametric history and 3D geometry:
🔹 CADMorph: Geometry-Driven Parametric CAD Editing via a Plan-Generate-Verify Loop
#GenerativeAI #CAD
🧬 AI for Science & Healthcare
🔹 MIRA: Medical Time Series Foundation Model for Real-World Health Data
🔹 Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations
🔹 Functional Complexity-adaptive Temporal Tensor Decomposition
🚀 We are heading to #NeurIPS2025 in San Diego!
Excited to announce my group has 7 accepted papers this year, tackling the frontiers of Agentic AI, AI for Health & Science, and Generative Design.
A breakdown of our work 🧵👇
#AI #MachineLearning #MicrosoftResearch
Second fix: Flexible, Non-Linear Reasoning.
No more rigid, one-way chains! PixelCraft has a "Planner" and an "Image Memory" (a "cognitive whiteboard").
This lets the system adaptively revisit any prior visual step, backtrack from errors, and explore different reasoning branches.
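One way to picture the "cognitive whiteboard" is as an append-only store of visual states that the planner can index into. This is a hypothetical sketch: the class name, methods, and string-based "images" are ours, not PixelCraft's actual interfaces.

```python
class ImageMemory:
    """A 'cognitive whiteboard': keeps every intermediate visual state so a
    planner can revisit any step, branch from it, or backtrack past errors.
    Hypothetical sketch of the mechanism described above."""

    def __init__(self, initial_image):
        self.states = [initial_image]   # every visual step ever produced

    def add(self, image):
        self.states.append(image)
        return len(self.states) - 1     # index serves as a handle for revisiting

    def revisit(self, index):
        # Non-linear reasoning: any earlier state can seed a new branch.
        return self.states[index]

    def backtrack(self, to_index):
        # Discard steps after a detected error, keeping the valid prefix.
        self.states = self.states[: to_index + 1]
```

The point of the design is that reasoning steps are addressable rather than consumed: a rigid one-way chain only ever sees its latest state, while an indexed memory makes earlier states first-class.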
MLLMs are great, but they're surprisingly bad at reading charts and geometry. A tiny "perceptual slip" can wreck the whole reasoning process.
We're thrilled to introduce PixelCraft 👾, a new multi-agent system to solve this.
Key Insight: Perfect self-verification isn't required. We frame reasoning as a probabilistic process.
As long as the chance of improvement is > chance of degradation, the model can converge to the correct answer.
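The convergence claim can be illustrated with a toy biased random walk over "distance from the correct answer": whenever each revision step is more likely to improve than to degrade, the distance drifts toward zero. This simulation is our illustration of the argument, not the paper's analysis.

```python
import random

def simulate(p_improve, p_degrade, start_dist=10, steps=2000, seed=0):
    """Toy biased walk illustrating the claim above: if p_improve > p_degrade,
    the distance to the correct answer drifts to 0 and stays there."""
    rng = random.Random(seed)
    d = start_dist
    for _ in range(steps):
        if d == 0:                      # treat the correct answer as absorbing
            break
        r = rng.random()
        if r < p_improve:
            d -= 1                      # a revision improved the state
        elif r < p_improve + p_degrade:
            d += 1                      # a revision made things worse
    return d
```

Even a small edge (say 0.55 vs. 0.45) is enough for the walk to reach zero; no single step needs to be reliably verified.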
Result: An 8B model beat its 600B teacher on AIME.
💥 Thrilled to share our new work, Reinforce-Ada, which fixes signal collapse in GRPO.
🥳 No more blind oversampling or dead updates. Just sharper gradients, faster convergence, and stronger models.
⚙️ One-line drop-in. Real gains.
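For intuition: in GRPO, if every sampled response to a prompt earns the same reward, the group-normalized advantages are all zero and the update is dead. The sketch below shows the adaptive-sampling idea — keep sampling until the group contains mixed outcomes. All names here are illustrative; Reinforce-Ada's actual stopping rule and sample-allocation scheme may differ.

```python
def adaptive_sample(generate, reward, prompt, batch=4, max_rounds=8):
    """Sample responses for a prompt until rewards are mixed, so the
    group-normalized advantages are nonzero. Illustrative sketch of the
    adaptive-sampling idea, not the paper's exact algorithm."""
    responses, rewards = [], []
    for _ in range(max_rounds):
        for _ in range(batch):
            r = generate(prompt)
            responses.append(r)
            rewards.append(reward(prompt, r))
        if len(set(rewards)) > 1:       # mixed outcomes => usable signal
            break                       # identical rewards => zero advantages
    mean = sum(rewards) / len(rewards)
    advantages = [rw - mean for rw in rewards]
    return responses, advantages
```

When the first batch already contains both a success and a failure, this costs no more than naive sampling; extra samples are spent only on prompts whose signal would otherwise collapse.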
arxiv.org/html/2510.0499… github.com/RLHFlow/Reinfo…
🔑 Under the hood
• Grounded latent via a proprio Forward-Dynamics Model → deeper motion understanding
• Joint diffusion policy where latent & low-level actions co-evolve → long-horizon reasoning
• Superior performance on SIMPLER, LIBERO, and both gripper & dexterous-hand setups 🏆
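To illustrate what "grounding a latent via a forward-dynamics model" means, here is a deliberately tiny linear sketch: a latent is grounded to the extent that, pushed through the FDM, it explains the observed proprioceptive transition. The real system presumably uses a learned neural FDM; everything here (linear dynamics, list-based states) is our simplification.

```python
def fdm_predict(state, latent, weights):
    """Toy linear FDM: next_state = state + W @ latent (illustrative only)."""
    return [s + sum(w * z for w, z in zip(row, latent))
            for s, row in zip(state, weights)]

def grounding_error(state, latent, next_state, weights):
    """Grounding signal: squared error between the FDM's prediction from
    this latent and the actually observed proprio transition. A latent
    with low error encodes real motion, not arbitrary features."""
    pred = fdm_predict(state, latent, weights)
    return sum((p - t) ** 2 for p, t in zip(pred, next_state))
```

Minimizing this error ties the latent space to physically realized motion, which is what gives the policy's latent actions their "deeper motion understanding."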
The impact is clear: Geometry Forcing substantially improves visual quality and 3D consistency over baseline methods, slashing the FVD score from 364 to 243 on a long-term video generation task.
Read the full paper here: arxiv.org/pdf/2507.07982
Our solution, Geometry Forcing, aligns the video model’s internal representations with features from a pretrained geometric foundation model (VGGT).
We introduce two new objectives, Angular Alignment and Scale Alignment, to enforce geometric consistency during training.
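A plausible reading of the two objectives — direction matching and magnitude matching between the video model's internal feature and the geometric model's feature — can be sketched as follows. These closed forms are our assumption from the names alone; the paper's exact losses may differ.

```python
import math

def angular_alignment(h, g):
    """Direction term: one minus cosine similarity between the video model's
    internal feature h and the geometric foundation model's feature g.
    0 when the directions agree perfectly. Illustrative sketch."""
    dot = sum(a * b for a, b in zip(h, g))
    nh = math.sqrt(sum(a * a for a in h))
    ng = math.sqrt(sum(b * b for b in g))
    return 1.0 - dot / (nh * ng)

def scale_alignment(h, g):
    """Magnitude term: squared difference of feature norms, so scale
    information from the geometric model is preserved. Illustrative sketch."""
    nh = math.sqrt(sum(a * a for a in h))
    ng = math.sqrt(sum(b * b for b in g))
    return (nh - ng) ** 2
```

Splitting direction from magnitude means neither term can be trivially satisfied by collapsing the other: features must point the right way *and* keep a consistent scale.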
Video diffusion models are often blind to 3D geometry. We taught them to see.
Excited to share our new work, Geometry Forcing, a method for generating stunningly consistent and 3D-aware video.
Project Page 👇 geometryforcing.github.io #ComputerVision #AI #3DModeling