Jiahao Lu

46 posts

@FFzzf08

PhD student @hkust. My research interests are in 3D reconstruction, world models, and 3D perception.

Hong Kong · Joined November 2021
186 Following · 17 Followers
Jiahao Lu retweeted
Wang Zhao @WangZhao_0849 ·
🚀🚀 Introducing Pixal3D (SIGGRAPH’26), a new pixel-aligned image-to-3D generation paradigm for high-fidelity 3D asset creation.

Today’s image-to-3D has become pretty good at producing plausible 3D assets. But plausibility is not enough. Fidelity is a hidden bottleneck.

❓A generated model may look “about right,” yet still fail to truly align with the input pixels. Can we make 3D generation as faithful as reconstruction, while still allowing it to complete the unseen? Pixal3D is our answer.

💡We believe the core bottleneck behind fidelity is 2D–3D correspondence. Most 3D-native generators synthesize shapes in canonical space and inject image cues through cross-attention, forcing the model to implicitly search for which pixels correspond to which 3D regions.

🍀Pixal3D takes a different route. Instead of generating in canonical space, Pixal3D generates directly in pixel-aligned camera space: what you see is what you get. The generated 3D asset is aligned with the input view from the start.

☕️Meanwhile, Pixal3D introduces a back-projection-based image conditioning scheme that explicitly back-projects multi-scale pixel features into 3D voxels, thus resolving the 2D–3D association problem. The input image is no longer just a prompt; it becomes a geometric anchor.

🚩Pixal3D shows that pixel-aligned 3D generation is not only feasible and scalable, but also significantly improves fidelity, pushing 3D-native generation closer to reconstruction-level faithfulness. It also naturally extends to multi-view and scene-level 3D generation.

✅Faithful to the input view. ✅Generative for the unseen. Closer to reconstruction-level fidelity, with the creativity of 3D generation. Pixal3D also represents an effort towards the unification of 3D generation and reconstruction.

📢Paper, code, and demo are fully released. Try it out and let us know your feedback!
🌐Project page: ldyang694.github.io/projects/pixal… 🤗Huggingface Demo: huggingface.co/spaces/Tencent… 💻Code: github.com/TencentARC/Pix… 📄Paper: arxiv.org/abs/2605.10922
27
144
1.2K
180.5K
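The back-projection idea in the Pixal3D post above (lifting pixel features into camera-space voxels so that every voxel knows which pixel it corresponds to) can be illustrated with a small NumPy sketch. This is not Pixal3D's implementation: the function name `backproject_features`, the single-scale feature map, the pinhole intrinsics `K`, and nearest-neighbor sampling are all assumptions made here for illustration.

```python
import numpy as np

def backproject_features(feat, K, grid_min, grid_max, res):
    """Lift a 2D feature map (H, W, C) into a camera-space voxel grid.

    Hypothetical helper for illustration only. Each voxel center is projected
    into the image with a pinhole model and copies the feature of the nearest
    pixel, so all voxels along one camera ray share that pixel's feature.
    """
    H, W, C = feat.shape
    # Voxel centers in camera coordinates (x right, y down, z forward).
    axes = [np.linspace(grid_min[i], grid_max[i], res) for i in range(3)]
    x, y, z = np.meshgrid(*axes, indexing="ij")
    # Guard against division by zero; such voxels are masked out below.
    safe_z = np.where(z > 0, z, 1.0)
    # Pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy.
    u = K[0, 0] * x / safe_z + K[0, 2]
    v = K[1, 1] * y / safe_z + K[1, 2]
    ui = np.round(u).astype(int)
    vi = np.round(v).astype(int)
    # Keep only voxels in front of the camera that land inside the image.
    valid = (z > 0) & (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H)
    vox = np.zeros((res, res, res, C), dtype=feat.dtype)
    vox[valid] = feat[vi[valid], ui[valid]]
    return vox, valid

# Toy example: a 5x5 feature map with 2 channels and made-up intrinsics.
feat = np.arange(5 * 5 * 2, dtype=np.float32).reshape(5, 5, 2)
K = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
vox, valid = backproject_features(feat, K, (-0.5, -0.5, 1.0), (0.5, 0.5, 2.0), 3)
```

In this toy setup the voxels on the optical axis all receive the feature of the principal-point pixel, which is the "geometric anchor" behavior the post describes. A real system would presumably use learned multi-scale features and interpolated sampling rather than nearest-neighbor lookup.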
Jiahao Lu retweeted
Yuan Liu @YuanLiu41955461 ·
🚀 Introducing CoMoVi! From a start image and a text prompt, it simultaneously generates realistic human videos and the corresponding 3D motion sequences.

✨ No reference videos are needed to extract skeletons anymore!

🧠 By co-generating motion and video, CoMoVi directly inherits the massive generalization power of video generation models, making it adaptable to diverse text prompts!

🌍 This co-generation approach also makes CoMoVi look like a human-centric World Action Model (WAM), simulating not just the visual world but also the physical state of human actions within it.

arXiv: arxiv.org/abs/2601.10632
HF page: huggingface.co/papers/2601.10…
Project page: igl-hkust.github.io/CoMoVi/
Code: github.com/IGL-HKUST/CoMo…
1
7
46
3.3K
Jiahao Lu retweeted
⚡AI Search⚡ @aisearchio ·
What a crazy week in AI! 🚀 LTX 2.3 GPT 5.4 FireRed Edit 1.1 Kiwi Edit HY WU Qwen 3.5 small Cuda Agent CubeComposer Helios Spatial T2I Spectrum Utonia & more! Watch the full recap: youtu.be/KRE8JqTAEQk
5
14
154
8K
Jiahao Lu retweeted
Alexandre Morgand @Almorgand ·
"Track4World: Feedforward World‑centric Dense 3D Tracking of All Pixels" TL;DR: a feed‑forward model that predicts pixel‑level dense 2D and 3D flows for holistic world‑centric 3D tracking from monocular video, outperforming prior flow and tracking baselines.
2
11
87
4.6K
Jiahao Lu retweeted
Wildminder @wildmindai ·
Track4World. Feedforward world-centric dense 3D tracking:
- tracks every pixel in 3D;
- 16-frame sequences in 3.4s with 14GB VRAM;
- Depth Anything v3 as backbone.
jiah-cloud.github.io/Track4World.gi…
4
25
204
18.9K
Jiahao Lu retweeted
AIQUEST @AiquestAcademy ·
Track4World: what if you could track every single pixel's 3D movement in a video, accurately and instantly? this new model turns any regular video into a detailed 3D scene, figuring out the precise 3D path of everything moving in the frame, fast. it's like rebuilding the entire world from a single clip! 🤯 code and demo are available.
1
1
2
142
Jiahao Lu retweeted
AI Bites | YouTube Channel
CoMoVi is a co-generative framework that couples two video diffusion models (VDMs) to generate 3D human motions and videos synchronously within a single diffusion denoising loop. The generation of 3D human motions and 2D human videos is intrinsically coupled: 3D motions provide the structural prior for plausibility and consistency in videos, while pre-trained video models offer strong generalization capabilities for motions. This motivates coupling the two generation processes, which is the basis of CoMoVi.
Paper Title: CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos
Project: igl-hkust.github.io/CoMoVi/
Link: arxiv.org/abs/2601.10632
0
1
4
131
Jiahao Lu retweeted
Ying Shan @yshan2u ·
🚀🚀We’re building a new Applied Research Team in Tencent IEG for Game AI, with a research culture similar to ARC Lab.

This newly formed team focuses on research-driven Game AI, operating at the intersection of fundamental research and large-scale game environments. Our goal is to develop principled models that can understand, simulate, and act within complex virtual worlds, while remaining grounded enough to eventually shape real games.

Our research directions include (but are not limited to):
🎮 Interactive & Dynamic World Modeling: learning, simulating, and reasoning about evolving game worlds
🤖 NPC World-to-Action Modeling: connecting world understanding to decision and action, with strong ties to Embodied AI and agent behavior
🌍 Game Scene Generation: generative modeling of diverse, controllable, and scalable game scenes

We are looking for researchers with the following minimum qualifications:
✨ A recent Ph.D. in related fields
✨ 5+ top conference or journal papers
✨ 1000+ GitHub stars
🌟 Evidence of a “make it work” mindset

We are also open to strong graduate students for intern positions. Feel free to DM me or contact: hanswen@tencent.com.
6
12
152
12.8K
Jiahao Lu retweeted
Justin Ryan ᯅ @justinryanio ·
People are underestimating @Apple in AI. I just ran Apple’s new SHARP model locally and watched my photos turn into 3D Gaussian splats in seconds, then stepped inside them on Vision Pro. This feels like the beginning of something special. You really have to try it.
91
231
3K
255.2K
Jiahao Lu retweeted
AI at Meta @AIatMeta ·
🔉 Introducing SAM Audio, the first unified model that isolates any sound from complex audio mixtures using text, visual, or span prompts. We’re sharing SAM Audio with the community, along with a perception encoder model, benchmarks and research papers, to empower others to explore new forms of expression and build applications that were previously out of reach. 🔗 Learn more: go.meta.me/568e5d
406
916
6.4K
1.2M