Saketh Rambhatla
88 posts

Saketh Rambhatla
@rssaketh

PhD student at University of Maryland, College Park

College Park, MD · Joined October 2010
613 Following · 206 Followers

Pinned Tweet
Saketh Rambhatla @rssaketh
Do your supervised models fail to adapt to scenarios with novel classes? Sick of re-training your model for every new category introduced in the dataset? Check out our #ICCV2021 paper "The Pursuit of Knowledge: Discovering and Localizing Novel Categories Using Dual Memory". (1/4)
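The idea of discovering new categories without re-training can be illustrated with a prototype-memory sketch. This is not the paper's actual dual-memory method, just a minimal toy, assuming cosine-similarity matching, a hypothetical threshold `tau`, and a running-average prototype update: detections that match no known-class prototype are matched against previously discovered "novel" prototypes, and failing that, seed a new candidate category.

```python
import numpy as np

def assign_or_discover(feat, known_protos, novel_protos, tau=0.7):
    """Match a detection feature against known-class prototypes;
    if no match, try previously discovered novel prototypes;
    otherwise start a new candidate category.
    Returns ("known"|"novel", prototype index)."""
    feat = feat / np.linalg.norm(feat)

    def best(protos):
        # Highest cosine similarity within a prototype list.
        if len(protos) == 0:
            return -1, -1.0
        P = np.stack(protos)
        P = P / np.linalg.norm(P, axis=1, keepdims=True)
        sims = P @ feat
        i = int(np.argmax(sims))
        return i, float(sims[i])

    i, s = best(known_protos)
    if s >= tau:
        return "known", i
    j, s2 = best(novel_protos)
    if s2 >= tau:
        # Refine the matched novel prototype with a running average.
        novel_protos[j] = 0.9 * novel_protos[j] + 0.1 * feat
        return "novel", j
    novel_protos.append(feat.copy())
    return "novel", len(novel_protos) - 1
```

New categories accumulate in `novel_protos` without touching the known-class model, which is the property the tweet is advertising.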
Saketh Rambhatla retweeted
Ananye Agarwal @anag004
Warehouses are a critical part of the modern economy, but are still bottlenecked by human labor. We've acquired Fetch Robotics to tackle previously hard-to-automate parts of warehouse logistics with our omni-bodied AI! (Our Factorio players are delighted)
Quoting Skild AI @SkildAI:
We have acquired Zebra Technologies' robotics arm (formerly Fetch Robotics). This is what happens when orchestration meets intelligence -- a major step toward fully autonomous warehouses. More robots. More environments. One unified brain.
Saketh Rambhatla retweeted
Skild AI @SkildAI
We have acquired Zebra Technologies' robotics arm (formerly Fetch Robotics). This is what happens when orchestration meets intelligence -- a major step toward fully autonomous warehouses. More robots. More environments. One unified brain.
Saketh Rambhatla retweeted
Deepak Pathak @pathak2206
We hosted Prof. Alyosha Efros (UC Berkeley) at @SkildAI! He didn't believe that robots could actually cook eggs reliably. :) Tested back-to-back 5 times without fail! One batch of scrambled eggs every ~2.5 minutes nonstop. The same model assembles a GPU on a server rack too.
Saketh Rambhatla retweeted
Deepak Pathak @pathak2206
At @SkildAI, we've raised $1.4B, bringing our valuation to over $14B. We're on a generational mission, and I'm grateful to be working alongside an exceptional team. Thanks to our investors for their long-term conviction in omni-bodied intelligence 🚀 bloomberg.com/news/articles/…
Quoting Skild AI @SkildAI:
Announcing Series C. We've raised $1.4B, valuing the company at over $14B. With this capital, we will accelerate our mission to build omni-bodied intelligence 🚀 skild.ai/blogs/series-c
Saketh Rambhatla retweeted
Skild AI @SkildAI
Announcing Series C. We've raised $1.4B, valuing the company at over $14B. With this capital, we will accelerate our mission to build omni-bodied intelligence 🚀 skild.ai/blogs/series-c
Saketh Rambhatla retweeted
Skild AI @SkildAI
See our robot open doors, water plants, assemble a box, and more by learning from watching humans. Using <1 hr of robot data.
Saketh Rambhatla retweeted
Skild AI @SkildAI
Let robots clean in peace 🧺🤖✨
Saketh Rambhatla retweeted
Skild AI @SkildAI
Humans learn by watching. Robots should too.
Saketh Rambhatla retweeted
Ishan Misra @imisra_
Inference-time objectives are amazing :) We show that LLMs can be upgraded to multimodal beings by a simple trick :) No training needed! Works on image generation, editing, style transfer, and more!
Quoting Rohit Girdhar @_rohitgirdhar_:
Super excited to share some recent work that shows that pure, text-only LLMs can see and hear without any training! Our approach, called "MILS", uses LLMs with off-the-shelf multimodal models to caption images/videos/audio, improve image generation, style transfer, and more!
Saketh Rambhatla retweeted
Rohit Girdhar @_rohitgirdhar_
Super excited to share some recent work that shows that pure, text-only LLMs can see and hear without any training! Our approach, called "MILS", uses LLMs with off-the-shelf multimodal models to caption images/videos/audio, improve image generation, style transfer, and more!
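The "no training needed" claim comes from running optimization purely at inference time: a text-only generator proposes candidates, an off-the-shelf multimodal scorer (e.g. CLIP image-text similarity) ranks them, and the best candidates are fed back as context for the next round. Below is a minimal sketch of that generate-score-feedback loop, not the actual MILS implementation; `propose` and `score` are hypothetical callables standing in for the LLM and the scorer.

```python
def mils_loop(propose, score, steps=3, keep=2):
    """Generator-scorer loop in the spirit of MILS: `propose` generates
    candidate texts conditioned on the best candidates so far, and
    `score` (e.g. a CLIP-style image-text similarity) ranks them.
    No gradients, no training: all optimization is at inference time."""
    best = []  # list of (score, text), highest score first
    for _ in range(steps):
        candidates = propose([t for _, t in best])
        scored = sorted(((score(t), t) for t in candidates), reverse=True)
        pool = sorted(best + scored, reverse=True)
        # Deduplicate while keeping the top-`keep` candidates.
        seen, best = set(), []
        for s, t in pool:
            if t not in seen:
                seen.add(t)
                best.append((s, t))
            if len(best) == keep:
                break
    return best[0]
```

Because the scorer is only called as a black box, any pretrained multimodal model can be dropped in without touching the LLM's weights.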
Saketh Rambhatla retweeted
Shijie Wang @ShijieWang20
How can we better animate images solely from text descriptions? We present Motion Focal Loss (MotiF) (arxiv.org/abs/2412.16153) to better align motion with text descriptions in the text-image-to-video (TI2V) task, and release TI2V-Bench, a comprehensive TI2V benchmark. (1/n)
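The core idea of focusing the training objective on motion can be sketched as a reweighted per-pixel loss. This is a toy illustration under my own assumptions, not MotiF's exact formulation: a motion heatmap (e.g. optical-flow magnitude) is normalized into weights with mean 1, so static regions are down-weighted and moving regions dominate the objective.

```python
import numpy as np

def motion_focal_loss(pred, target, motion, eps=1e-8):
    """Per-pixel squared error reweighted toward high-motion regions.
    `motion` is a non-negative heatmap (e.g. optical-flow magnitude);
    normalizing by its mean keeps the overall loss scale comparable
    to plain MSE while emphasizing pixels that actually move."""
    w = motion / (motion.mean() + eps)  # mean weight == 1
    return float(np.mean(w * (pred - target) ** 2))
```

With this weighting, the same prediction error costs more in a moving region than in a static one, which is the alignment pressure the tweet describes.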
Saketh Rambhatla retweeted
Mannat Singh @mannat_singh
Flow matching can transform one distribution into another. So why do text-to-image models map noise to images instead of directly mapping text to images? Wouldn't it be cool to directly connect modalities? CrossFlow accomplishes exactly that! cross-flow.github.io
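The flow-matching mechanics behind this can be shown in a few lines. A minimal sketch, assuming the standard linear interpolation path and Euler integration; the only CrossFlow-specific twist (per the tweet) is that the source point x0 is a source-modality encoding rather than Gaussian noise. The velocity field here is a hypothetical callable standing in for the learned model.

```python
import numpy as np

def fm_pair(x0, x1, t):
    """Linear flow-matching path: the point x_t on the straight line
    from x0 to x1, and the regression target velocity (x1 - x0)."""
    xt = (1 - t) * x0 + t * x1
    v = x1 - x0
    return xt, v

def euler_sample(x0, v_field, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps.
    In a CrossFlow-style setup, x0 would be a text encoding rather
    than noise, so sampling walks directly from text to image space."""
    x, dt = np.array(x0, dtype=float), 1.0 / steps
    for i in range(steps):
        x = x + dt * v_field(x, i * dt)
    return x
```

Since the straight-line path has constant velocity, a model that regresses (x1 - x0) transports any source distribution, noise or text embeddings alike, onto the target.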
Saketh Rambhatla retweeted
Mara Levy @mlevy1221
How can we make imitation learning generalize? In my latest work, we show that a keypoint-based representation can generalize to novel instances of an object and is agnostic to background changes.
Saketh Rambhatla retweeted
Andrew Brown @Andrew__Brown__
🚨 Internship at Meta GenAI NYC 🚨 I have an open PhD internship position for 2025! Interested in exploring visual generative models (or any other exciting ideas) inside the team that brought you Movie Gen and Emu Video? 📩 Send me a DM with your CV, website, and Google Scholar profile.
Quoting AI at Meta @AIatMeta:
🎥 Today we're premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We're excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike. More details and examples of what Movie Gen can do ➡️ go.fb.me/kx1nqm

🛠️ Movie Gen models and capabilities:
Movie Gen Video: a 30B-parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.
Movie Gen Audio: a 13B-parameter transformer model that takes a video input, along with optional text prompts for controllability, and generates high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music, and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment, and text-to-audio alignment.
Precise video editing: using a generated or existing video and accompanying text instructions as input, it can perform localized edits such as adding, removing, or replacing elements, or global changes like background or style changes.
Personalized videos: using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement.

We're continuing to work closely with creative professionals from across the field to integrate their feedback as we work toward a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.
Saketh Rambhatla retweeted
Manohar Paluri @manohar_paluri
Meta Movie Gen is just freakin' cool! Generative video foundation models with this quality, precise editing, and personalization unlock value for creators and new creative tools, and enable agents that can interact in richer ways, closing the loop on learning to unlock world models!
Quoting AI at Meta @AIatMeta (same Movie Gen announcement as above)
Saketh Rambhatla retweeted
Shelly Sheynin @ShellySheynin
I'm thrilled and proud to share our model, Movie Gen, which we've been working on for the past year, and in particular Movie Gen Edit, for precise video editing. 😍 Look how Movie Gen edited my video!
Quoting AI at Meta @AIatMeta (same Movie Gen announcement as above)
Saketh Rambhatla retweeted
Roshan Sumbaly @rsumbaly
Lights, camera, action: introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a detailed 92-page report of what we learned, along with evaluation prompts that we hope push the community forward. 📽️ Examples: youtube.com/playlist?list=… 📜 Paper: ai.meta.com/static-resourc… 🖥️ Site: ai.meta.com/research/movie…
Saketh Rambhatla retweeted
Mannat Singh @mannat_singh
Check out Movie Gen 🎥 Our latest media generation models for video generation, editing, and personalization, with audio generation! 16-second 1080p videos generated by a simple Llama-style 30B transformer. Demo + detailed 92-page technical report 📝⬇️
Quoting AI at Meta @AIatMeta (same Movie Gen announcement as above)