Zichen Wang

49 posts

@Zichen2501

Joined June 2023

336 Following · 208 Followers

Pinned Tweet

Zichen Wang@Zichen2501·Sep 30

Differentiable rendering made SIMPLE❗️ Differentiating physically based renderers is hard: Dirac-delta discontinuities arise at object silhouette. Our #SIGGRAPHAsia2024 work shows how a simple relaxation can rescue the day, enabling easy 3D reconstruction and relighting! (1/N)

349

45.3K

Zichen Wang@Zichen2501·6d

DINO loss on 3D models gives better scene stability. Interesting!

Qianqian Wang@QianqianWang5

Most multi-view reconstruction models need full supervision. We show they can self-improve without any ground truth labels. Introducing SelfEvo: Self-Improving 4D Perception via Self-Distillation. Up to +36.5% in video depth, +20.1% in camera estimation, zero annotation.

184

Zichen Wang retweeted

Qianqian Wang@QianqianWang5·Apr 14

263

23.1K

Zichen Wang retweeted

Junyi Zhang@junyi42·Mar 9

One memory can’t rule them all. We present LoGeR, a new hybrid memory architecture for long-context geometric reconstruction. LoGeR enables stable reconstruction over up to 10k frames / kilometer scale, with linear-time scaling in sequence length, fully feedforward inference, and no post-optimization. Yet it matches or surpasses strong optimization-based pipelines. (1/5) @GoogleDeepMind @Berkeley_AI

450

3.4K

557K

Zichen Wang retweeted

Peter Tong@TongPetersb·Mar 4

Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]

219

1.1K

214.7K

Zichen Wang retweeted

Haiwen (Haven) Feng@HavenFeng·Jan 23

✨Thinking with Blender~ Meet VIGA: a multimodal agent that autonomously codes 3D/4D blender scenes from any image, with no human, no training! @berkeley_ai #LLMs #Blender #Agent 🧵1/6

311

2.1K

334.9K

Zichen Wang@Zichen2501·Jan 11

@YifeiZhou02 Wish you the best!

Yifei Zhou@YifeiZhou02·Jan 10

Today is my last day @xai. xAI has been an incredible place where hard work is truly rewarded. On leave from my PhD at Berkeley, 6 months here felt like 2 years anywhere else: I contributed to 4 different agent teams and had the opportunity to lead an agent research team myself. This was a difficult decision — but I’m excited for what’s next 🚀

93.5K

Zichen Wang retweeted

Kwang Moo Yi@kwangmoo_yi·Jan 9

Wang et al., "MoE3D: A Mixture-of-Experts Module for 3D Reconstruction" Flying pixels in DPT-based models are coming from the fact that DPT modules are convolutional. Introducing MoEs allows you to circumvent that. So...sort of bilateral filtering?

5.2K

Zichen Wang@Zichen2501·Jan 10

Thanks for sharing! Please check out our new paper:

Zhenjun Zhao@zhenjun_zhao

MoE3D: A Mixture-of-Experts Module for 3D Reconstruction @Zichen2501, @AngCao3, Liam J. Wang, @jjpark3D tl;dr: multiple depth predictions and weights->softmax weighting-based fusion->depth estimation arxiv.org/abs/2601.05208

154

Zichen Wang retweeted

Zhenjun Zhao@zhenjun_zhao·Jan 9

Zichen Wang@Zichen2501·Oct 26

@Haian_Jin @Jimantha Huge congrats!

116

Haian Jin@Haian_Jin·Oct 25

So excited to share that I’ve been awarded the Google PhD Fellowship in Machine Perception! Huge thanks to my PhD advisor @Jimantha and all my amazing collaborators for their support and inspiration along the way.

Google.org@Googleorg

🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: goo.gle/43wJWw8

148

32.4K

Zichen Wang retweeted

Eric Ming Chen@ericmchen1·Aug 13

Come see our talk on "Pocket Time-Lapse" at SIGGRAPH today at 4pm in the Image Representation, Editing, & Generation session! West Building, Rooms 118-120. With @zzigakovacic, @madhavaggar and @AbeDavis

8.7K

Zichen Wang retweeted

Vlad Erium 🇯🇵@ssh4net·Aug 4

Quadric-Based Silhouette Sampling for Differentiable Rendering
Mariia Soroka, Christoph Peters @MomentsInCG, Steve Marschner
Project: mariasoroka.github.io/papers/EdgeSam…
Paper: mariasoroka.github.io/papers/Data/Ed…
Code (MIT): github.com/mariasoroka/Qu…
Abstract: Physically based differentiable rendering has established itself as key to inverse rendering, in which scenes are recovered from images through gradient-based optimization. Taking the derivative of the rendering equation is made difficult by the presence of discontinuities in the integrand at object silhouettes. To obtain correct derivatives w.r.t. changing geometry, accounting e.g. for changing penumbras or silhouettes in glossy reflections, differentiable renderers must compute an integral over these silhouettes. Prior work proposed importance sampling of silhouette edges for a given shading point. The main challenge is to efficiently reject parts of the mesh without silhouettes during sampling, which has been done using top-down traversal of a tree. Inaccuracies of this existing rejection procedure result in many samples with zero contribution. Thus, variance remains high and subsequent work has focused on alternatives such as area sampling or path space differentiable rendering. We propose an improved rejection test. It reduces variance substantially, which makes edge sampling in a unidirectional path tracer competitive again. Our rejection test relies on two approximations to the triangle planes of a mesh patch: A bounding box in dual space and dual quadrics. Additionally, we improve the heuristics used for stochastic traversal of the tree. We evaluate our method in a unidirectional path tracer and achieve drastic improvements over the original edge sampling and outperform methods based on area sampling.

Zichen Wang retweeted

Gene Chou@gene_ch0u·Jun 13

I'll be presenting our work with @KaiZhang9546 at #cvpr2025. We finetune video models to be 3d consistent without any 3d supervision! Feel free to stop by our poster or reach out to chat: Sunday, Jun 15, 4-6pm ExHall D, poster #168 cvpr.thecvf.com/virtual/2025/p…

Gene Chou@gene_ch0u

We've released our paper "Generating 3D-Consistent Videos from Unposed Internet Photos"! Video models like Luma generate pretty videos, but sometimes struggle with 3D consistency. We can do better by scaling them with 3D-aware objectives. 1/N page: genechou.com/kfcw

5.7K

Zichen Wang retweeted

Jeongsoo Park@jespark0·Jun 13

Can AI image detectors keep up with new fakes? Mostly, no. Existing detectors are trained using a handful of models. But there are thousands in the wild! Our work, Community Forensics, uses 4800+ generators to train detectors that generalize to new fakes. #CVPR2025 🧵 (1/5)

1.8K

Zichen Wang retweeted

Chao Feng@chaof1234·Jun 13

Sharing our #CVPR2025 paper: "GPS as a Control Signal for Image Generation"! 🛰️+✍️ We turn the GPS tag stored in EXIF of photos into a control signal for diffusion models—so they don’t just know what you asked for, but where you want it to look like. Come to see our poster at Friday 13 Jun 10:30 a.m. — 12:30 p.m. (CT) in ExHall D, Poster #250.

3.1K

Zichen Wang@Zichen2501·Jun 10

Awesome! Have been waiting for this for a long time

Wenzel Jakob {deprecation notice}@wenzeljakob

The latest development version of Dr.Jit now provides built-in support for evaluating and training MLPs (including fusing them into rendering workloads). They compile to efficient Tensor Core operations via NVIDIA's Cooperative Vector extension. Details: drjit.readthedocs.io/en/latest/nn.h…

258

Zichen Wang retweeted

Chong Zeng@iam_NCJ·May 30

What if a Transformer could render? Not text → image. But mesh → image — with global illumination. No rasterizers. No ray-tracers. Just a Transformer without per-scene training. RenderFormer does exactly that. #SIGGRAPH2025 🔗microsoft.github.io/renderformer

556

40.4K

Zichen Wang@Zichen2501·Mar 31

Interesting to see more works on the multiple-surface representations

Vlad Erium 🇯🇵@ssh4net

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes
Stefano Esposito, Anpei Chen, Christian Reiser, Samuel Rota Bulò, Lorenzo Porzi, Katja Schwarz, Christian Richardt, Michael Zollhöfer, Peter Kontschieder, Andreas Geiger (University of Tübingen, Meta Reality Labs)
Paper: arxiv.org/abs/2409.02482
Project: autonomousvision.github.io/volsurfs/
Code (CC BY 4.0): github.com/autonomousvisi…
Abstract: High-quality view synthesis relies on volume rendering, splatting, or surface rendering. While surface rendering is typically the fastest, it struggles to accurately model fuzzy geometry like hair. In turn, alpha-blending techniques excel at representing fuzzy materials but require an unbounded number of samples per ray (P1). Further overheads are induced by empty space skipping in volume rendering (P2) and sorting input primitives in splatting (P3). We present a novel representation for real-time view synthesis where the (P1) number of sampling locations is small and bounded, (P2) sampling locations are efficiently found via rasterization, and (P3) rendering is sorting-free. We achieve this by representing objects as semi-transparent multi-layer meshes rendered in a fixed order. First, we model surface layers as signed distance function (SDF) shells with optimal spacing learned during training. Then, we bake them as meshes and fit UV textures. Unlike single-surface methods, our multi-layer representation effectively models fuzzy objects. In contrast to volume and splatting-based methods, our approach enables real-time rendering on low-power laptops and smartphones.

194

Zichen Wang retweeted

Congyue Deng@CongyueD·Mar 11

In the past, we extended the convolution operator to go from low-level image processing to high-level visual reasoning. Can we also extend physical operators for more high-level physical reasoning? Introducing the Denoising Hamiltonian Network (DHN): arxiv.org/pdf/2503.07596

315

41.1K

Zichen Wang@Zichen2501·Feb 2

Reminds me of my functional analysis class, where the prof. kept saying “gradients ≠ derivatives”

Jeremy Bernstein@jxbz

I ran this experiment to show that duality-based optimizers like Muon are not only *fast* but also *numerically different* to vanilla gradient descent. In particular, the weights move a qualitatively different amount in the same number of training steps. (1/4)

686
