Adam Polyak

44 posts

@adam_polyak90

Joined November 2014
259 Following · 160 Followers
Pinned Tweet
Adam Polyak @adam_polyak90
Excited to share our progress on Movie Gen, a SOTA model for video generation! 🎥✨ I worked on this project as part of a cutting-edge team 🔥, pushing the boundaries of video editing ✂️— all without supervised data. Can’t wait to show you what’s next! 🚀🎬
AI at Meta @AIatMeta

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike. More details and examples of what Movie Gen can do ➡️ go.fb.me/kx1nqm

🛠️ Movie Gen models and capabilities

Movie Gen Video: a 30B-parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.

Movie Gen Audio: a 13B-parameter transformer model that takes a video input, along with optional text prompts for controllability, and generates high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music, and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment, and text-to-audio alignment.

Precise video editing: using a generated or existing video and accompanying text instructions as input, it can perform localized edits, such as adding, removing, or replacing elements, or global changes like background or style changes.

Personalized videos: using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement.

We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

3 replies · 8 reposts · 47 likes · 4.5K views
Adam Polyak retweeted
AI at Meta @AIatMeta
Introducing Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool use, visual chain of thought, and multi-agent orchestration. Muse Spark is available today at meta.ai and in the Meta AI app. We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model. Learn more: go.meta.me/43ea00
[image]
472 replies · 1.1K reposts · 9K likes · 2.9M views
Adam Polyak retweeted
Alexandr Wang @alexandr_wang
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
[image]
719 replies · 1.2K reposts · 10.3K likes · 4.4M views
Adam Polyak retweeted
moab.arar @ArarMoab
Video models as physics simulators. 🌍🎥 [1/] In our latest work, WinDiNet, we fine-tuned a pre-trained video model into a differentiable physics engine that is 1000x faster than traditional CFD solvers. Project page: rbischof.github.io/windinet_web/ Abs: arxiv.org/abs/2603.21210
1 reply · 39 reposts · 213 likes · 13.3K views
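To make the idea above concrete: a learned surrogate is differentiable end-to-end, so it can sit inside a gradient-based design loop where a classical CFD solver cannot. A minimal PyTorch sketch, assuming a hypothetical FlowSurrogate as a stand-in for the fine-tuned video model (the real WinDiNet operates on video frames) and a toy drag objective:

```python
import torch
import torch.nn as nn

class FlowSurrogate(nn.Module):
    """Toy stand-in for a fine-tuned video model that maps design
    parameters to a predicted (flattened) flow field."""
    def __init__(self, n_params=4, field_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_params, 128), nn.SiLU(),
            nn.Linear(128, field_dim),
        )

    def forward(self, design):
        return self.net(design)

surrogate = FlowSurrogate()
design = torch.randn(1, 4, requires_grad=True)  # e.g. shape parameters
opt = torch.optim.Adam([design], lr=1e-2)

for step in range(100):
    field = surrogate(design)    # fast, differentiable "simulation"
    drag = field.pow(2).mean()   # toy objective standing in for drag
    opt.zero_grad()
    drag.backward()              # gradients flow through the surrogate
    opt.step()

print(f"final toy drag: {drag.item():.4f}")
```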
Adam Polyak retweeted
Yaron Lipman @lipmanya
**Transition Matching** is a new iterative generative paradigm that uses Flow Matching or AR models to transition between intermediate generation states, leading to improved generation quality and speed!
[GIF]
Neta Shaul @shaulneta

[1/n] New paper alert! 🚀 Excited to introduce **Transition Matching (TM)**! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model 🤯, achieving SOTA text-to-image generation! @urielsinger @itai_gat @lipmanya

0 replies · 19 reposts · 132 likes · 10.8K views
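To illustrate the contrast the thread describes: standard Flow Matching sampling takes many small steps along a learned velocity field, while Transition Matching replaces the short-timestep kernel with a learned generative transition between a few intermediate states. A toy sketch, where velocity and transition are hypothetical closed-form stand-ins for trained networks (this is not the paper’s implementation):

```python
import math
import torch

def velocity(x, t):
    # Toy velocity field; in Flow Matching this is a trained network.
    return -x

def transition(x, t, t_next):
    # In Transition Matching this kernel is itself a generative model
    # (e.g. a small flow or an AR model); here it is a closed-form toy.
    return x * math.exp(-(t_next - t))

# Flow Matching / diffusion-style sampling: many small Euler steps.
x = torch.randn(8, 2)
steps = 100
for i in range(steps):
    x = x + (1.0 / steps) * velocity(x, i / steps)

# Transition Matching-style sampling: a few learned transitions
# between intermediate generation states.
y = torch.randn(8, 2)
for t, t_next in [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]:
    y = transition(y, t, t_next)

print(x.norm().item(), y.norm().item())
```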
Adam Polyak retweeted
Ahmad Al-Dahle @Ahmad_Al_Dahle
Introducing our first set of Llama 4 models! We’ve been hard at work on a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 collection 🦙. Here are some highlights:

📌 The Llama series has been re-designed to use a state-of-the-art mixture-of-experts (MoE) architecture and is natively trained with multimodality. We’re dropping Llama 4 Scout & Llama 4 Maverick, and previewing Llama 4 Behemoth.

📌 Llama 4 Scout is the highest-performing small model, with 17B activated parameters and 16 experts. It’s crazy fast, natively multimodal, and very smart. It achieves an industry-leading 10M+ token context window and can also run on a single GPU!

📌 Llama 4 Maverick is the best multimodal model in its class, beating GPT-4o and Gemini 2.0 Flash across a broad range of widely reported benchmarks, while achieving comparable results to the new DeepSeek v3 on reasoning and coding at less than half the active parameters. It offers a best-in-class performance-to-cost ratio, with an experimental chat version scoring an ELO of 1417 on LMArena. It can also run on a single host!

📌 Previewing Llama 4 Behemoth, our most powerful model yet and among the world’s smartest LLMs. Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. Llama 4 Behemoth is still training, and we’re excited to share more details about it even while it’s still in flight.

A big thanks to all of our launch partners (full list in the blog) for helping us bring Llama 4 to developers everywhere, including @huggingface, @togethercompute, @SnowflakeDB, @ollama, @databricks and many others 👏

This is just the start; we have more models coming, and the team is really cooking. Look out for Llama 4 Reasoning 😉

A few weeks ago, we celebrated Llama being downloaded over 1 billion times. Llama 4 demonstrates our long-term commitment to open source AI, the entire open source AI community, and our unwavering belief that open systems will produce the best small, mid-size, and soon frontier models. Llama would be nothing without the global open source AI community, and we are so ready to begin this next chapter with you. 🦙

Read more about the release here: llama.com, and try it in our products today.
[image]
316 replies · 889 reposts · 5.6K likes · 1.1M views
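For readers unfamiliar with the “17B activated parameters with 16 experts” phrasing: in a mixture-of-experts layer, a router sends each token to a small subset of expert sub-networks, so only a fraction of the total weights is active per token. A minimal top-1 routing sketch follows; it is a generic illustration of MoE routing, not Llama 4’s actual implementation:

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    """Generic top-1 mixture-of-experts layer: each token runs through
    exactly one expert chosen by a learned router."""
    def __init__(self, d_model=64, d_ff=256, n_experts=16):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        gates = self.router(x).softmax(dim=-1)  # routing probabilities
        expert_idx = gates.argmax(dim=-1)       # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():                      # only chosen experts run
                out[mask] = gates[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(Top1MoE()(tokens).shape)  # torch.Size([10, 64])
```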
Adam Polyak retweeted
Guy Yariv @guy_yariv
I'm thrilled to announce that Through-The-Mask (TTM) has been accepted to #CVPR2025! TTM is an I2V generation framework that leverages mask-based motion trajectories to enhance object-specific motion and maintain consistency, especially in multi-object scenarios. More details 👇
Guy Yariv @guy_yariv

[1/8] Recent work has shown impressive Image-to-Video (I2V) generation results. However, accurately articulating multiple interacting objects and complex motions remains challenging. In our new work, we take a step toward addressing this challenge.

7 replies · 7 reposts · 44 likes · 3K views
Adam Polyak @adam_polyak90
🚀 Introducing VideoJAM – a framework that instills a strong motion prior into any video model! By denoising an optical flow derivative alongside pixels, VideoJAM teaches models to generate coherent motion and physics with high-quality visuals. 📽️
Hila Chefer @hila_chefer

VideoJAM is our new framework for improved motion generation from @AIatMeta. We show that video generators struggle with motion because the training objective favors appearance over dynamics. VideoJAM directly addresses this **without any extra data or scaling** 👇🧵

2 replies · 0 reposts · 11 likes · 552 views
Adam Polyak retweeted
Lucas Beyer (bl16) @giffmana
This is extremely cool! They find diffusion loss is not very sensitive to motion. Thus they fine-tune videogen models with additional explicit motion prediction, making the model generate much more coherent videos. Also, Hila has been doing consistently good work, follow her!
[Quoted tweet: Hila Chefer @hila_chefer’s VideoJAM announcement, quoted in full above.]
6 replies · 23 reposts · 275 likes · 22.5K views
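The mechanism described here (fine-tuning with an explicit motion-prediction term so the objective is no longer dominated by appearance) can be sketched as a two-headed denoiser trained on jointly noised pixel and optical-flow latents. A hedged toy version, not the released VideoJAM code; the model and all shapes are placeholders:

```python
import torch
import torch.nn as nn

class JointDenoiser(nn.Module):
    """Toy denoiser with two heads: one predicts clean pixels,
    one predicts a clean motion (optical-flow) representation."""
    def __init__(self, c=8):
        super().__init__()
        self.trunk = nn.Conv3d(2 * c, 32, 3, padding=1)
        self.pixel_head = nn.Conv3d(32, c, 3, padding=1)
        self.motion_head = nn.Conv3d(32, c, 3, padding=1)

    def forward(self, noisy_pixels, noisy_motion):
        h = torch.relu(self.trunk(torch.cat([noisy_pixels, noisy_motion], dim=1)))
        return self.pixel_head(h), self.motion_head(h)

model = JointDenoiser()
pixels = torch.randn(2, 8, 4, 16, 16)  # (batch, channels, frames, H, W)
motion = torch.randn(2, 8, 4, 16, 16)  # flow representation, same shape

t = 0.5  # a single noise level, for brevity
noisy_pixels = (1 - t) * pixels + t * torch.randn_like(pixels)
noisy_motion = (1 - t) * motion + t * torch.randn_like(motion)

pred_pix, pred_mot = model(noisy_pixels, noisy_motion)
# Joint objective: appearance term plus an explicit motion term,
# so training no longer favors appearance over dynamics.
loss = (pred_pix - pixels).pow(2).mean() + (pred_mot - motion).pow(2).mean()
loss.backward()
print(f"toy joint loss: {loss.item():.4f}")
```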
Adam Polyak retweeted
AK @_akhaliq
Meta just dropped VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models. Comparison with OpenAI Sora and Kling.
17 replies · 120 reposts · 679 likes · 60.9K views
Adam Polyak retweeted
Hila Chefer @hila_chefer
[Original VideoJAM announcement; full text quoted above under Adam Polyak’s VideoJAM post.]
61 replies · 197 reposts · 1.1K likes · 168.4K views
Adam Polyak retweeted
Guy Yariv @guy_yariv
[Thread opener on I2V generation; full text quoted above under the Through-The-Mask acceptance post.]
7 replies · 26 reposts · 80 likes · 9.2K views
Adam Polyak retweeted
Danny Trinh @dtrinh
VERY excited about the era of generative AR we're bringing to life. Check out this preview! It's early but so damn promising. This isn't "AI slop"... it's unlocking creators' imaginations on their own videos. Change your wardrobe, scene, lighting, etc., with little expertise. PS: it's been so damn special to navigate this idea maze with some of the best & brightest folks from all across Meta. A highlight of my time here so far.
23 replies · 18 reposts · 213 likes · 31.8K views
Adam Polyak retweeted
Andrew Brown @Andrew__Brown__
So how did we get to these amazing videos for Meta Movie Gen? One of the things I’m proudest of is that we released a very detailed technical report (ai.meta.com/research/movie……). Let’s dive into a technical summary of what we did & learnt 🧵 1/n x.com/AIatMeta/statu…
[Quoted tweet: AI at Meta’s Movie Gen announcement, quoted in full in the pinned tweet above.]
25 replies · 151 reposts · 1.2K likes · 333.4K views
Adam Polyak retweeted
Joelle Pineau @jpineau1
Sharing some of our latest work on generative AI! The video editing features and sound generation are especially exciting. And it comes with a full research paper.
[Quoted tweet: AI at Meta’s Movie Gen announcement, quoted in full in the pinned tweet above.]
3 replies · 13 reposts · 90 likes · 9.3K views