Zichen Wang

49 posts

Zichen Wang
@Zichen2501

Joined June 2023
336 Following · 208 Followers

Pinned Tweet
Zichen Wang
Zichen Wang@Zichen2501·
Differentiable rendering made SIMPLE❗️ Differentiating physically based renderers is hard: Dirac-delta discontinuities arise at object silhouettes. Our #SIGGRAPHAsia2024 work shows how a simple relaxation can save the day, enabling easy 3D reconstruction and relighting! (1/N)
5 replies · 57 reposts · 349 likes · 45.3K views
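The core idea the thread alludes to, relaxing a hard silhouette test into a smooth one so the gradient stops being a Dirac delta, can be sketched in a few lines. This is a generic illustration with an assumed sigmoid relaxation and temperature `tau`, not the method from the paper:

```python
import math

def hard_visibility(x, edge):
    # Hard silhouette test: the derivative w.r.t. `edge` is a Dirac delta,
    # i.e. zero almost everywhere, so gradient descent gets no signal.
    return 1.0 if x > edge else 0.0

def relaxed_visibility(x, edge, tau=0.1):
    # Sigmoid relaxation: smooth in `edge`, so finite differences
    # (or autodiff) yield a nonzero, optimizable gradient.
    return 1.0 / (1.0 + math.exp(-(x - edge) / tau))

def fd_grad(f, x, edge, eps=1e-4):
    # Central finite difference of the pixel value w.r.t. the edge position.
    return (f(x, edge + eps) - f(x, edge - eps)) / (2 * eps)

print(fd_grad(hard_visibility, 0.05, 0.0))     # 0.0 almost everywhere
print(fd_grad(relaxed_visibility, 0.05, 0.0))  # nonzero: moving the edge matters
```

The hard test has zero gradient almost everywhere, while the relaxed test gives a usable negative gradient: moving the edge past the point reduces visibility, which is exactly the signal an inverse renderer needs.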
Zichen Wang retweeted
Qianqian Wang
Qianqian Wang@QianqianWang5·
Most multi-view reconstruction models need full supervision. We show they can self-improve without any ground truth labels. Introducing SelfEvo: Self-Improving 4D Perception via Self-Distillation. Up to +36.5% in video depth, +20.1% in camera estimation, zero annotation.
4 replies · 35 reposts · 263 likes · 23K views
Zichen Wang retweeted
Junyi Zhang
Junyi Zhang@junyi42·
One memory can't rule them all. We present LoGeR, a new hybrid memory architecture for long-context geometric reconstruction. LoGeR enables stable reconstruction over up to 10k frames / kilometer scale, with linear-time scaling in sequence length, fully feedforward inference, and no post-optimization. Yet it matches or surpasses strong optimization-based pipelines. (1/5) @GoogleDeepMind @Berkeley_AI
63 replies · 450 reposts · 3.4K likes · 556.9K views
Zichen Wang retweeted
Peter Tong
Peter Tong@TongPetersb·
Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]
36 replies · 219 reposts · 1.1K likes · 214.6K views
Zichen Wang retweeted
Haiwen (Haven) Feng
Haiwen (Haven) Feng@HavenFeng·
✨Thinking with Blender~ Meet VIGA: a multimodal agent that autonomously codes 3D/4D blender scenes from any image, with no human, no training! @berkeley_ai #LLMs #Blender #Agent 🧵1/6
72 replies · 311 reposts · 2.1K likes · 334.9K views
Yifei Zhou
Yifei Zhou@YifeiZhou02·
Today is my last day @xai. xAI has been an incredible place where hard work is truly rewarded. On leave from my PhD at Berkeley, 6 months here felt like 2 years anywhere else: I contributed to 4 different agent teams and had the opportunity to lead an agent research team myself. This was a difficult decision — but I’m excited for what’s next 🚀
70 replies · 31 reposts · 1K likes · 93.4K views
Zichen Wang retweeted
Kwang Moo Yi
Kwang Moo Yi@kwangmoo_yi·
Wang et al., "MoE3D: A Mixture-of-Experts Module for 3D Reconstruction" Flying pixels in DPT-based models come from the fact that DPT modules are convolutional. Introducing MoEs allows you to circumvent that. So...sort of bilateral filtering?
1 reply · 12 reposts · 93 likes · 5.2K views
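The "sort of bilateral filtering" intuition can be illustrated on a toy 1D depth profile: a plain convolution averages across a depth edge and produces exactly the in-between values that show up as flying pixels, while a bilateral weighting that also penalizes value differences keeps the edge sharp. A hypothetical stdlib-only sketch, not code from the paper:

```python
import math

def box_filter(signal, radius=2):
    # Plain convolution: averages across depth edges, creating
    # intermediate "flying pixel" values at discontinuities.
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius):i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def bilateral_filter_1d(signal, sigma_s=1.0, sigma_r=0.3, radius=2):
    # Weights neighbors by spatial distance AND value similarity,
    # so smoothing does not bleed across depth discontinuities.
    out = []
    for i, v in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)
                         - ((signal[j] - v) ** 2) / (2 * sigma_r ** 2))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

depth = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # a sharp depth edge
print(box_filter(depth))          # edge smeared into intermediate depths
print(bilateral_filter_1d(depth)) # edge preserved
```

The box filter puts samples at depths that exist on neither surface; the bilateral version keeps each side of the edge close to its own depth.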
Haian Jin
Haian Jin@Haian_Jin·
So excited to share that I’ve been awarded the Google PhD Fellowship in Machine Perception! Huge thanks to my PhD advisor @Jimantha and all my amazing collaborators for their support and inspiration along the way.
Google.org@Googleorg

🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: goo.gle/43wJWw8

14 replies · 6 reposts · 148 likes · 32.4K views
Zichen Wang retweeted
Eric Ming Chen
Eric Ming Chen@ericmchen1·
Come see our talk on "Pocket Time-Lapse" at SIGGRAPH today at 4pm in the Image Representation, Editing, & Generation session! West Building, Rooms 118-120. With @zzigakovacic , @madhavaggar and @AbeDavis
2 replies · 10 reposts · 58 likes · 8.7K views
Zichen Wang retweeted
Vlad Erium 🇯🇵
Vlad Erium 🇯🇵@ssh4net·
Quadric-Based Silhouette Sampling for Differentiable Rendering
Mariia Soroka, Christoph Peters @MomentsInCG, Steve Marschner
Project: mariasoroka.github.io/papers/EdgeSam…
Paper: mariasoroka.github.io/papers/Data/Ed…
Code (MIT): github.com/mariasoroka/Qu…

Abstract: Physically based differentiable rendering has established itself as key to inverse rendering, in which scenes are recovered from images through gradient-based optimization. Taking the derivative of the rendering equation is made difficult by the presence of discontinuities in the integrand at object silhouettes. To obtain correct derivatives w.r.t. changing geometry, accounting e.g. for changing penumbras or silhouettes in glossy reflections, differentiable renderers must compute an integral over these silhouettes.

Prior work proposed importance sampling of silhouette edges for a given shading point. The main challenge is to efficiently reject parts of the mesh without silhouettes during sampling, which has been done using top-down traversal of a tree. Inaccuracies of this existing rejection procedure result in many samples with zero contribution. Thus, variance remains high and subsequent work has focused on alternatives such as area sampling or path space differentiable rendering.

We propose an improved rejection test. It reduces variance substantially, which makes edge sampling in a unidirectional path tracer competitive again. Our rejection test relies on two approximations to the triangle planes of a mesh patch: a bounding box in dual space and dual quadrics. Additionally, we improve the heuristics used for stochastic traversal of the tree. We evaluate our method in a unidirectional path tracer and achieve drastic improvements over the original edge sampling and outperform methods based on area sampling.
0 replies · 3 reposts · 20 likes · 1K views
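The variance argument in the abstract can be made concrete with a toy Monte Carlo estimator: if a loose rejection test lets a large fraction of samples land on edges with zero contribution, the estimator stays unbiased but its variance grows sharply. An illustrative sketch where the `waste` fraction stands in for the rejection test's false-positive rate; this is not the paper's sampler:

```python
import random

def mc_estimate(n, waste, value=1.0, seed=0):
    # Toy edge-sampling estimator: a fraction `waste` of samples lands on
    # non-silhouette edges and contributes zero; surviving samples are
    # up-weighted so the estimator stays unbiased (expected value = `value`).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if rng.random() > waste:  # sample hit a true silhouette edge
            total += value / (1.0 - waste)
    return total / n

def empirical_variance(waste, runs=2000, n=64):
    # Variance of the estimator across many independent runs.
    ests = [mc_estimate(n, waste, seed=s) for s in range(runs)]
    mean = sum(ests) / runs
    return sum((e - mean) ** 2 for e in ests) / runs

print(empirical_variance(0.9))  # loose rejection test: high variance
print(empirical_variance(0.1))  # tight rejection test: low variance
```

Both settings converge to the same mean, but the 90%-waste estimator's variance is roughly 80x higher, which is why a better rejection test makes edge sampling competitive again.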
Zichen Wang retweeted
Gene Chou
Gene Chou@gene_ch0u·
I'll be presenting our work with @KaiZhang9546 at #cvpr2025. We finetune video models to be 3d consistent without any 3d supervision!  Feel free to stop by our poster or reach out to chat: Sunday, Jun 15, 4-6pm ExHall D, poster #168 cvpr.thecvf.com/virtual/2025/p…
Gene Chou@gene_ch0u

We've released our paper "Generating 3D-Consistent Videos from Unposed Internet Photos"! Video models like Luma generate pretty videos, but sometimes struggle with 3D consistency. We can do better by scaling them with 3D-aware objectives. 1/N page: genechou.com/kfcw

0 replies · 8 reposts · 68 likes · 5.7K views
Zichen Wang retweeted
Jeongsoo Park
Jeongsoo Park@jespark0·
Can AI image detectors keep up with new fakes? Mostly, no. Existing detectors are trained using a handful of models. But there are thousands in the wild! Our work, Community Forensics, uses 4800+ generators to train detectors that generalize to new fakes. #CVPR2025 🧵 (1/5)
1 reply · 9 reposts · 24 likes · 1.8K views
Zichen Wang retweeted
Chao Feng
Chao Feng@chaof1234·
Sharing our #CVPR2025 paper: "GPS as a Control Signal for Image Generation"! 🛰️+✍️ We turn the GPS tag stored in a photo's EXIF data into a control signal for diffusion models, so they don't just know what you asked for, but also where it should look like it was taken. Come see our poster on Friday, 13 Jun, 10:30 a.m.–12:30 p.m. (CT) in ExHall D, Poster #250.
2 replies · 10 reposts · 37 likes · 3.1K views
Zichen Wang retweeted
Chong Zeng
Chong Zeng@iam_NCJ·
What if a Transformer could render? Not text → image. But mesh → image — with global illumination. No rasterizers. No ray-tracers. Just a Transformer without per-scene training. RenderFormer does exactly that. #SIGGRAPH2025 🔗microsoft.github.io/renderformer
13 replies · 85 reposts · 556 likes · 40.4K views
Zichen Wang
Zichen Wang@Zichen2501·
Interesting to see more work on multi-surface representations
Vlad Erium 🇯🇵@ssh4net

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes
Stefano Esposito, Anpei Chen, Christian Reiser, Samuel Rota Bulò, Lorenzo Porzi, Katja Schwarz, Christian Richardt, Michael Zollhöfer, Peter Kontschieder, Andreas Geiger (University of Tübingen, Meta Reality Labs)
Paper: arxiv.org/abs/2409.02482
Project: autonomousvision.github.io/volsurfs/
Code (CC BY 4.0): github.com/autonomousvisi…

Abstract: High-quality view synthesis relies on volume rendering, splatting, or surface rendering. While surface rendering is typically the fastest, it struggles to accurately model fuzzy geometry like hair. In turn, alpha-blending techniques excel at representing fuzzy materials but require an unbounded number of samples per ray (P1). Further overheads are induced by empty-space skipping in volume rendering (P2) and sorting input primitives in splatting (P3).

We present a novel representation for real-time view synthesis where (P1) the number of sampling locations is small and bounded, (P2) sampling locations are efficiently found via rasterization, and (P3) rendering is sorting-free. We achieve this by representing objects as semi-transparent multi-layer meshes rendered in a fixed order. First, we model surface layers as signed distance function (SDF) shells with optimal spacing learned during training. Then, we bake them as meshes and fit UV textures. Unlike single-surface methods, our multi-layer representation effectively models fuzzy objects. In contrast to volume- and splatting-based methods, our approach enables real-time rendering on low-power laptops and smartphones.

0 replies · 0 reposts · 3 likes · 194 views
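The sorting-free property in the abstract comes from compositing the semi-transparent layer meshes in a fixed front-to-back order with the standard "over" operator, so no per-frame depth sort is needed. A minimal scalar sketch of that compositing, assuming grayscale colors for simplicity:

```python
def composite_layers(layers, background=0.0):
    # Front-to-back "over" compositing of (color, alpha) layers that are
    # always rendered in the same fixed order, so no depth sorting is needed.
    color, transmittance = 0.0, 1.0
    for layer_color, alpha in layers:
        color += transmittance * alpha * layer_color
        transmittance *= 1.0 - alpha
    return color + transmittance * background

# Two layers along a ray: a half-transparent outer shell over an opaque core.
print(composite_layers([(1.0, 0.5), (0.5, 1.0)]))  # 0.75
```

With an opaque inner layer, transmittance drops to zero and the background never contributes, which matches the small, bounded per-ray sample count the abstract claims.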
Zichen Wang retweeted
Congyue Deng
Congyue Deng@CongyueD·
In the past, we extended the convolution operator to go from low-level image processing to high-level visual reasoning. Can we also extend physical operators for more high-level physical reasoning? Introducing the Denoising Hamiltonian Network (DHN): arxiv.org/pdf/2503.07596
6 replies · 58 reposts · 315 likes · 41.1K views