Umangi Jain

31 posts

@JainUmangi

CS PhD student @UofT

Toronto, Canada · Joined April 2019

419 Following · 191 Followers

Umangi Jain@JainUmangi·12 May

For evaluation, we create a 100-mesh benchmark with manually refined material groups. Material Magic Wand outperforms geometry, vision foundation embeddings, and 3D part-feature baselines. We hope this work enables faster material assignment workflows!

Umangi Jain@JainUmangi·12 May

For training, we curate supervision from material IDs in Objaverse. Raw data is highly imbalanced: many materials appear in only one part, while some meshes have a single material covering nearly all parts. We use data balancing to mitigate both within-mesh and across-mesh imbalance.

Umangi Jain@JainUmangi·12 May

🪄 Introducing Material Magic Wand: Material-Aware Grouping of 3D Parts in Untextured Meshes, accepted to #CVPR2026. We present a tool for selecting material-consistent parts in untextured meshes. Click on one part and retrieve other parts likely to share the same material.

Umangi Jain reposted

Ziyi Wu@Dazitu_616·5 Jun

📢 Introducing DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models. Compared to vanilla DPO, we improve paired data construction and preference label granularity, leading to better visual quality and motion strength with only 1/3 of the data. 🧵

Umangi Jain reposted

Kai He@Kai__He·3 Mar

🚀Excited to announce that our paper “CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion” has been accepted to #CVPR2025 ! 🌟We introduce a simple yet effective framework for controllable and consistent editing in dynamic 3D scenes. (1/5)

Umangi Jain reposted

Yash Kant@yash2kant·12 Feb

🚀 Introducing Pippo – our diffusion transformer pre-trained on 3B Human Images and post-trained with 400M high-res studio images! ✨Pippo can generate 1K resolution turnaround video from a single iPhone photo! 🧵👀 Full deep dive thread coming up next!

Aran Komatsuzaki@arankomatsuzaki

Meta presents: Pippo: High-Resolution Multi-View Humans from a Single Image. Generates 1K resolution, multi-view, studio-quality images from a single photo in one forward pass

Umangi Jain@JainUmangi·12 Dec

This enables using the graph-cut algorithm, which has been used extensively in image segmentation, to minimize an energy function and effectively partition the Gaussians into foreground and background.

Umangi Jain@JainUmangi·12 Dec

We propose a method for extracting objects in scenes obtained from 3DGS without modifications to the Gaussian optimization process. Our method interprets the Gaussians in 3DGS as nodes in a graph and introduces weighted edges based on proximity and perceptual similarity.
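As a rough illustration of the idea in this thread (Gaussians as graph nodes, weighted edges, and a graph cut that splits foreground from background), here is a minimal Python sketch. It is not the GaussianCut implementation: the 1-D positions, the scalar "appearance" values, the exponential edge-weight formula, and the choice of a plain Edmonds-Karp max-flow solver are all assumptions made for this toy example.

```python
import math
from collections import deque

def source_side_of_min_cut(num_nodes, edges, source, sink):
    """Return the nodes on the source side of a minimum s-t cut.

    edges: list of (u, v, weight) treated as undirected.
    Uses Edmonds-Karp (BFS augmenting paths) as a generic solver.
    """
    cap = [dict() for _ in range(num_nodes)]
    for u, v, w in edges:
        cap[u][v] = cap[u].get(v, 0.0) + w
        cap[v][u] = cap[v].get(u, 0.0) + w

    while True:
        # BFS over edges that still have residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return set(parent)
        # Recover the path and push the bottleneck flow along it.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] = cap[v].get(u, 0.0) + flow

# Toy "Gaussians": hypothetical 1-D positions and scalar appearance values.
positions = [0.0, 1.0, 2.0, 3.0]
appearance = [0.9, 0.8, 0.1, 0.2]

# Assumed edge weight: large when neighbours are close AND look alike.
edges = []
for i in range(len(positions) - 1):
    j = i + 1
    w = math.exp(-abs(positions[i] - positions[j])) * math.exp(
        -abs(appearance[i] - appearance[j])
    )
    edges.append((i, j, w))

# Terminal nodes: user input ties Gaussian 0 to foreground, Gaussian 3 to background.
SOURCE, SINK = 4, 5
edges.append((SOURCE, 0, 1e6))
edges.append((3, SINK, 1e6))

foreground = source_side_of_min_cut(6, edges, SOURCE, SINK) - {SOURCE}
print(sorted(foreground))
```

In this toy setup the cheapest cut falls on the weakest edge, the one between Gaussians 1 and 2, whose appearance values differ most, so Gaussians 0 and 1 end up on the foreground side.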

Umangi Jain@JainUmangi·12 Dec

Excited to be at #NeurIPS2024 this week! I’ll be presenting our paper “GaussianCut: Interactive Segmentation via Graph Cut for 3D Gaussian Splatting” 🗓️Thursday, 11-2 PM PST 📍 East Exhibit Hall A-C #1303 Paper: arxiv.org/abs/2411.07555 Project: umangi-jain.github.io/gaussiancut/

Umangi Jain reposted

Andrea Tagliasacchi 🇨🇦@taiyasaki·2 Dec

📢📢📢 RoMo: Robust Motion Segmentation Improves Structure from Motion romosfm.github.io arxiv.org/pdf/2411.18650 TL;DR: boost your SfM pipeline on dynamic scenes. We use epipolar cues + SAMv2 features to find robust masks for moving objects in a zero-shot manner. 🧵👇

Umangi Jain reposted

MrNeRF@janusch_patas·13 Nov

GaussianCut: Interactive Segmentation via Graph Cut for 3D Gaussian Splatting. Contributions: 1) We propose a method for graph construction from a 3DGS model that utilizes the properties of the corresponding Gaussians to obtain edge weights, and 2) based on this graph, we propose and minimize an energy function (Equation 3) that combines the user inputs with the inherent representation of the scene. Our experimental evaluations show that GaussianCut obtains high-fidelity segmentation outperforming previous segmentation baselines.

Umangi Jain reposted

Ziyi Wu@Dazitu_616·25 Sep

Super excited to share that Neural Assets is accepted by NeurIPS as a Spotlight! See you all in Vancouver!

Thomas Kipf@tkipf

Excited to share our work on Neural Assets: a new method for enabling 3D asset-level control in image diffusion models – scalable & without any 3D inductive biases. Neural Assets goes beyond text or pixel-based control & provides an interface inspired by 3D graphics tools. 🧵

Umangi Jain reposted

Yash Kant@yash2kant·14 Jun

📢🔍 Super excited to present Spatially Aware Multiview Diffusers (SPAD) at #CVPR2024! SPAD enables 3D consistent multi-view image generation from text or image inputs. It is trained using a high-quality Objaverse subset on 32 H100s! Code & Paper links at the end! 🧵👇

Umangi Jain reposted

Ziyi Wu@Dazitu_616·14 Jun

1/ Excited to share our #CVPR2024 work LEOD! We propose a label-efficient learning framework for object detection with event cameras, which performs on par with SOTA models with **10x fewer labels**! Paper: arxiv.org/abs/2311.17286 Code: github.com/Wuziyi616/LEOD

Umangi Jain reposted

Ashkan Mirzaei@ashmrz10·18 Apr

📢We introduce “RefFusion”, a novel inpainting method for scenes reconstructed using 3D Gaussian Splatting. 🔗reffusion.github.io TLDR: we personalize an image diffusion model to a given reference image and distill its knowledge to 3D through score distillation sampling.

Umangi Jain reposted

Prateek Jain@jainprateek_·4 Apr

Gecko, a new text embedding model, is released with surprisingly strong MTEB performance despite using a 1B-sized encoder. It is equipped with MRL -- nested 256 and 512 dimensional embeddings! Provides nearly SOTA performance for 256 dimensional embeddings as well. [1/2]

Jinhyuk Lee@leejnhk

Introducing Gecko 🦎, a new text embedding model from Google DeepMind! Distilled from LLMs, Gecko offers powerful embeddings for various NLP tasks. Gecko is now available in Google Cloud API 👉bit.ly/google-gecko-a… Paper: bit.ly/google-gecko Colab: bit.ly/google-gecko-c…

Umangi Jain reposted

Sherwin Bahmani@sherwinbahmani·28 Mar

Happy to share our new work 🥳 TC4D: Trajectory-Conditioned Text-to-4D Generation Project page: sherwinbahmani.github.io/tc4d Code: github.com/sherwinbahmani… We show controllable motion for text-to-4D object generation and compositional text-to-4D scenes! Thanks @_akhaliq for sharing!

AK@_akhaliq

TC4D Trajectory-Conditioned Text-to-4D Generation Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using supervision from pre-trained text-to-video models. However, existing representations for motion, such as deformation models or time-dependent

