Jiraphon Yenphraphai

8 posts


@JYenphraphai

New York, NY · Joined December 2022
102 Following · 36 Followers
Jiraphon Yenphraphai @JYenphraphai ·
[1/3] 🚀 Introducing ShapeGen4D: a native, end-to-end video-to-4D model that turns monocular videos into high-quality 4D mesh sequences, with no per-frame optimization. Details 👉 shapegen4d.github.io
3 replies · 22 reposts · 142 likes · 9.1K views
Jiraphon Yenphraphai @JYenphraphai ·
[2/3] How?
• Add spatiotemporal attention to a pretrained image-to-mesh DiT
• Time-aware point sampling + 4D latent anchoring → aligned latents across frames
• Shared noise across frames → stable pose & less flickering
→ Directly outputs a sequence of meshes
1 reply · 0 reposts · 1 like · 324 views
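The shared-noise idea from the thread (every frame's diffusion process starts from the same initial noise, so poses stay consistent and flicker is reduced) can be sketched in miniature. `sample_noise` and its arguments are hypothetical stand-ins for illustration, not the actual ShapeGen4D code:

```python
import random

def sample_noise(num_frames, dim, shared=True, seed=0):
    """Sample initial diffusion noise for a sequence of frames.

    shared=True: every frame starts from an identical latent noise
    vector, which per the thread stabilizes pose across frames.
    shared=False: each frame draws independent noise.
    """
    rng = random.Random(seed)
    if shared:
        base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        return [list(base) for _ in range(num_frames)]
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_frames)]

frames_shared = sample_noise(4, 8, shared=True)
frames_indep = sample_noise(4, 8, shared=False)
print(frames_shared[0] == frames_shared[3])  # True: identical starting noise
print(frames_indep[0] == frames_indep[3])    # False: independent per frame
```

In a real sampler, the per-frame denoising trajectories then diverge only as much as the conditioning (the input video frames) demands, rather than from unrelated starting noise.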
Jiraphon Yenphraphai reposted
Raymond A. Yeh @RaymondYeh ·
Tomorrow, we are presenting "Model Immunization from a Condition Number Perspective" at ICML:
📢 Oral: Jul 17, 1:45–2:00 p.m. EDT @ West Exhib. Hall C
📌 Poster: 2:00–4:30 p.m. EDT @ East Exhib. Hall A-B (E-1604)
Come talk to Cedar and learn more about reducing model misuse!
1 reply · 5 reposts · 7 likes · 858 views
Jiraphon Yenphraphai reposted
Saining Xie @sainingxie ·
Here's my take on the Sora technical report, with a good dose of speculation that could be totally off. First of all, I really appreciate the team for sharing helpful insights and design decisions; Sora is incredible and is set to transform the video-generation community. What we have learned so far:
- Architecture: Sora is built on our diffusion transformer (DiT) model (published in ICCV 2023). It's a diffusion model with a transformer backbone; in short, DiT = [VAE encoder + ViT + DDPM + VAE decoder]. According to the report, there aren't many additional bells and whistles.
- "Video compressor network": Looks like it's just a VAE, but trained on raw video data. Tokenization probably plays a significant role in getting good temporal consistency. By the way, the VAE is a ConvNet, so DiT is technically a hybrid model ;) (1/n)
39 replies · 523 reposts · 2.6K likes · 1.3M views
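The ViT half of the pipeline above operates on the VAE latent split into patch tokens. A generic toy `patchify` (this is an illustrative sketch of ViT-style tokenization, not Sora's or the DiT paper's actual implementation) shows the idea:

```python
def patchify(latent, p):
    """Split an H x W latent grid (list of lists) into non-overlapping
    p x p patches, each flattened into one token -- the ViT-style
    tokenization a DiT applies to the VAE latent before the transformer.
    """
    h, w = len(latent), len(latent[0])
    tokens = []
    for i in range(0, h, p):
        for j in range(0, w, p):
            tokens.append([latent[i + di][j + dj]
                           for di in range(p) for dj in range(p)])
    return tokens

grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 latent
tokens = patchify(grid, 2)
print(len(tokens), len(tokens[0]))  # 4 tokens, each a flattened 2x2 patch
```

For video, the same idea extends to spacetime patches, which is one plausible reading of how tokenization helps temporal consistency.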
Jiraphon Yenphraphai reposted
Saining Xie @sainingxie ·
Really enjoyed working on this project; some thoughts on why I believe combining the creative freedom of generative models with the precision of the 3D graphics pipeline could be the future. (1/n)🧵
AK @_akhaliq:

Intel and NYU present Image Sculpting: Precise Object Editing with 3D Geometry Control
paper page: huggingface.co/papers/2401.01…
The paper presents Image Sculpting, a new framework for editing 2D images by incorporating tools from 3D geometry and graphics. This approach differs markedly from existing methods, which are confined to 2D spaces and typically rely on textual instructions, leading to ambiguity and limited control. Image Sculpting converts 2D objects into 3D, enabling direct interaction with their 3D geometry. Post-editing, these objects are re-rendered into 2D and merged back into the original image to produce high-fidelity results through a coarse-to-fine enhancement process. The framework supports precise, quantifiable, and physically plausible editing options such as pose editing, rotation, translation, 3D composition, carving, and serial addition. It marks an initial step toward combining the creative freedom of generative models with the precision of graphics pipelines.

3 replies · 18 reposts · 142 likes · 31.5K views
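The "lift to 3D, edit, re-render to 2D" loop described in the abstract can be sketched in miniature. `rotate_y` and `project` below are hypothetical toy stand-ins for the pose-editing and re-rendering stages, not the paper's actual pipeline:

```python
import math

def rotate_y(points, angle_deg):
    """Rotate 3-D points about the y-axis -- the kind of direct
    geometric edit (pose change, rotation) applied after lifting
    a 2-D object into 3-D."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def project(points):
    """Toy orthographic re-projection back to 2-D after editing.
    (The real pipeline re-renders and then runs a coarse-to-fine
    generative enhancement to merge the result into the image.)"""
    return [(x, y) for x, y, _ in points]

pts = [(1.0, 0.0, 0.0)]
edited_2d = project(rotate_y(pts, 90))  # a point on the x-axis swings toward the z-axis
print(edited_2d)
```

The point of the framework is that edits like this rotation are exact and quantifiable in 3D, unlike text-prompt-driven 2D editing.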