Ta-Ying Cheng

49 posts

@ChengTim0708

Research Scientist @Netflix | D.Phil. in Computer Science @UniofOxford

Joined October 2020

221 Following · 176 Followers

Pinned Tweet

Ta-Ying Cheng@ChengTim0708·6 Jun

Imagine a Van Gogh-style teapot turning into glass with one simple slider🎨 Introducing MARBLE, material edits by simply changing CLIP embedding! 🔗 marblecontrol.github.io 👏 Internship project with @prafull7, @markb_boss , @jampani_varun at @StabilityAI

Ta-Ying Cheng retweeted

Phillip Isola@phillip_isola·16 Jun

Slides from my talk on "Language as a Visual Format" at the Visual Generative Modeling workshop at CVPR (mostly derived from slides made by @hyojinbahng and @carolinemchan): dropbox.com/scl/fi/5ok46te…

Ta-Ying Cheng@ChengTim0708·6 Jun

Thanks @_akhaliq ‼️ Be sure to check out our Hugging Face Demo 🤗: huggingface.co/spaces/stabili…

AK@_akhaliq

Stability AI just released MARBLE on Hugging Face: Material Recomposition and Blending in CLIP-Space

Ta-Ying Cheng@ChengTim0708·6 Jun

Check out more about MARBLE 👇 🌐 Project Page: marblecontrol.github.io 📝 Paper: arxiv.org/abs/2506.05313 🧑‍💻 Code: github.com/Stability-AI/m… 🤗 HF demo: huggingface.co/spaces/stabili…

Ta-Ying Cheng@ChengTim0708·6 Jun

Dial roughness down, crank metallic up, and stack multiple attributes at once, all in a single forward pass!

Ta-Ying Cheng retweeted

Chun-Hsiao (Daniel) Yeh@danielyehhh·14 May

🚀 Glad to see our All-Angles Bench (github.com/Chenyu-Wang567…) being adopted to evaluate 3D spatial understanding in Seed-1.5-VL-thinking along with OpenAI (o1) and Gemini 2.5 Pro..!

Yujia Qin@TsingYoga

Introducing Seed-1.5-VL-thinking, the model achieves SOTA on 38 out of 60 VLM benchmarks🥳🥳🥳 github.com/ByteDance-Seed…

Ta-Ying Cheng retweeted

Yi Ma@YiMaTweets·7 May

It seems there is still a long way to go for multi-modal large models to truly understand space and scene.

Chun-Hsiao (Daniel) Yeh@danielyehhh

❗️❗️ Can MLLMs understand scenes from multiple camera viewpoints — like humans? 🧭 We introduce All-Angles Bench — 2,100+ QA pairs on multi-view scenes. 📊 We evaluate 27 top MLLMs, including Gemini-2.0-Flash, Claude-3.7-Sonnet, and GPT-4o. 🌐 Project: danielchyeh.github.io/All-Angles-Ben…

Ta-Ying Cheng retweeted

The Nobel Prize@NobelPrize·8 Oct

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”

Ta-Ying Cheng retweeted

AI at Meta@AIatMeta·4 Oct

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

More details and examples of what Movie Gen can do ➡️ go.fb.me/kx1nqm

🛠️ Movie Gen models and capabilities

Movie Gen Video: 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt.

Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound — delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.

Precise video editing: Using a generated or existing video and accompanying text instructions as an input it can perform localized edits such as adding, removing or replacing elements — or global changes like background or style changes.

Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.

We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

Ta-Ying Cheng@ChengTim0708·26 Sep

Amazing work combining a variety of 3D models with LLMs for better spatial reasoning! @ChenyangMa119

Chenyang Ma@ChenyangMa119

#NeurIPS #NeurIPSConf Thrilled to share that our paper SpatialPIN has been accepted at #NeurIPS2024! We introduce a modular plug-and-play framework that progressively enhances VLMs' 3D reasoning by prompting and interacting with 3D foundational models. (1/8)

Ta-Ying Cheng@ChengTim0708·2 Jul

@CMHungSteven @prafull7 @jampani_varun Thank you!!

Min-Hung (Steve) Chen@CMHungSteven·2 Jul

@ChengTim0708 @prafull7 @jampani_varun Congrats 🎉

Ta-Ying Cheng@ChengTim0708·1 Jul

Thrilled to share that ZeST has been accepted to #ECCV2024 !! A huge thanks to my collaborators/mentors @prafull7 , @jampani_varun, and my supervisors Niki Trigoni and Andrew Markham for the amazing support!

Ta-Ying Cheng@ChengTim0708

Today, with my collaborators @prafull7 (MIT CSAIL), @jampani_varun (@StabilityAI ), and my supervisors Niki Trigoni and Andrew Markham, we share with you ZeST, a zero-shot, training-free method for image-to-image material transfer! Project Page: ttchengab.github.io/zest/ 1/8

Ta-Ying Cheng@ChengTim0708·19 Jun

I will be presenting today at #CVPR2024 ! Drop by our poster session to learn more about Continuous 3D Words for text-to-image generation: Time & Place: 19 Jun 5pm / Arch 4A-E P190 Project page: ttchengab.github.io/continuous_3d_…

Ta-Ying Cheng@ChengTim0708·6 Jun

@prafull7 @MIT Congrats Prafull!!

Prafull Sharma@prafull7·6 Jun

Graduated with a PhD in Computer Science @MIT! Grateful to my advisors and teachers who helped me learn and grow in this journey! Thanks to all my friends and family members for their support.