Rotem Shalev-Arkushin

19 posts

@rotemsh3

CS PhD student @ Tel-Aviv University

Joined August 2018
74 Following · 35 Followers
Pinned Tweet
Rotem Shalev-Arkushin @rotemsh3
Thrilled to share that our paper ImageRAG has been accepted to #ICLR2026 🤩🇧🇷 Check it out: rotem-shalev.github.io/ImageRAG/ @RinonGal @ohadf @bermano_h
Rotem Shalev-Arkushin @rotemsh3

Excited to introduce our new work: ImageRAG 🖼️✨ rotem-shalev.github.io/ImageRAG We enhance off-the-shelf generative models with Retrieval-Augmented Generation (RAG) for unknown concept generation, using a VLM-based approach that’s easy to integrate with new & existing models! [1/3]

Rotem Shalev-Arkushin reposted
shahar sarfaty @shaharsarfaty
The GenAI LoRA ecosystem is a dense jungle. 🌿 Introducing CARLoS 🕵️‍♂️ - a system that retrieves LoRAs by how they alter diffusion behavior, and links these metrics to key concepts in copyright law. ⚖️ 🔗 shahar-sarfaty.github.io/CARLoS/ 📄 arxiv.org/abs/2512.08826 🧵[1/6]
Rotem Shalev-Arkushin reposted
Delip Rao e/σ @deliprao
Hey @iclr_conf, reverting scores is unnecessary punishment for the majority of the authors who had nothing to do with this incident and had successful rebuttals. Instead of detecting collusions on your end (you have a ton of metadata) why is this everyone’s burden to bear?
Rotem Shalev-Arkushin reposted
Guy Tevet @GuyTvt
(1/4) [HOIDiNi] hoidini.github.io 🧵: Diffusion models are great at generating free-form human motion but tend to break down when objects enter the scene. Human–object interaction demands millimetric precision, and even tiny errors cause hands to float or penetrate surfaces
Rotem Shalev-Arkushin reposted
Shelly Golan @Shelly_Golan1
T2I models excel at realism, but true creativity means generating what doesn't exist yet. How do you prompt for something you can't describe? 🎨 We introduce VLM-Guided Adaptive Negative Prompting: an inference-time method that promotes creative image generation. 1/6
Rotem Shalev-Arkushin reposted
jaron1990 @jaron1990
1/ What if you could animate a face directly from text? 🎭 Meet Express4D - a dataset of expressive 4D facial motions captured from natural language prompts, designed for generative models and animation pipelines. 🔗jaron1990.github.io/Express4D/ 📹👇
Rotem Shalev-Arkushin reposted
Roi Bar-On @roibar_on
1/9 Excited to share EditP23! 🎨 Finally, a single tool for ALL your 3D editing needs: ✅ Pose & Geometry Changes ✅ Object Additions ✅ Global Style Transformations ✅ Local Modifications All driven by one simple 2D image edit. It's mask-free ✨ and works in seconds ⚡️. 🧵
Rotem Shalev-Arkushin reposted
Guy Tevet @GuyTvt
1/ Can we teach a motion model to "dance like a chicken"? Or better: can LoRA help motion diffusion models learn expressive, editable styles without forgetting how to move? Led by @HSawdayee, @chuan_guo92603, we explore this in our latest work. 🎥 haimsaw.github.io/LoRA-MDM/ 🧵👇
Rotem Shalev-Arkushin reposted
Elad Richardson @EladRichardson
Really impressive results for human-object interaction. They use a two-phase process where they optimize the diffusion noise, instead of the motion itself, to get to sub-centimeter precision while staying on manifold 🧠 HOIDiNi - hoidini.github.io
Rotem Shalev-Arkushin reposted
Omer Dahary @OmerDahary
Excited to share that our new work, Be Decisive, has been accepted to SIGGRAPH! We improve multi-subject generation by extracting a layout directly from noise, resulting in more diverse and accurate compositions. Website: omer11a.github.io/be-decisive/ Paper: arxiv.org/abs/2505.21488
Rotem Shalev-Arkushin reposted
Sara Dorfman @Sara__Dorfman
Excited to share that "IP-Composer: Semantic Composition of Visual Concepts" got accepted to #SIGGRAPH2025!🥳 We show how to combine visual concepts from multiple input images by projecting them into CLIP subspaces - no training, just neat embedding math✨ Really enjoyed working on this one with the amazing @DanaCohenBar, @RinonGal & @DanielCohenOr1
Linoy Tsaban @linoy_tsaban

🔔just landed: IP Composer🎨 semantically mix & match visual concepts from images ❌ text prompts can't always capture visual nuances ❌ visual input based methods often need training / don't allow fine grained control over *which* concepts to extract from our input images So👇

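The "neat embedding math" behind IP-Composer can be illustrated with a toy sketch. This is purely illustrative, not the paper's code: the 4-d space, the basis, and the vectors below are hand-made stand-ins, whereas the real method derives its subspaces from CLIP embeddings of concept exemplars.

```python
# Toy sketch of concept composition via subspace projection.
# Assumption: `concept_basis` has orthonormal rows spanning the concept subspace.
import numpy as np

def project(v, basis):
    """Orthogonally project v onto the subspace spanned by the rows of basis."""
    B = np.asarray(basis, dtype=float)  # shape (k, d), rows orthonormal
    return B.T @ (B @ v)

def compose(base_emb, concept_emb, concept_basis):
    """Replace base_emb's component in the concept subspace with concept_emb's."""
    return base_emb - project(base_emb, concept_basis) + project(concept_emb, concept_basis)

# Toy 4-d "CLIP" space: pretend the concept lives in the last two coordinates.
concept_basis = np.array([[0.0, 0.0, 1.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0]])
base = np.array([1.0, 2.0, 3.0, 4.0])   # embedding of the scene image
style = np.array([5.0, 6.0, 7.0, 8.0])  # image supplying the concept

out = compose(base, style, concept_basis)
print(out)  # first two coords kept from base, last two taken from style
```

Because projection is linear and training-free, swapping which `concept_basis` is used controls *which* concept gets extracted from each input image.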
Rotem Shalev-Arkushin reposted
Sigal Raab @sigal_raab
🔔Excited to announce that #AnyTop has been accepted to #SIGGRAPH2025!🥳 ✅ A diffusion model that generates motion for arbitrary skeletons ✅ Using only a skeletal structure as input ✅ Learns semantic correspondences across diverse skeletons 🌐 Project: anytop2025.github.io/Anytop-page
Rotem Shalev-Arkushin reposted
AK @_akhaliq
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Rotem Shalev-Arkushin reposted
Aharon Azulay @AharonAzulay
How well do LLMs memorize obscure details from scientific papers? I created a benchmark for that! Full code, dataset, and data-creation method included. tl;dr: GPT-4.5 is a major jump in scientific-fact memorization. Thread below 👇
Rotem Shalev-Arkushin @rotemsh3
Given a text prompt, we utilize a VLM to dynamically identify concepts the models struggle to generate on their own, and retrieve compatible images. We use the images to guide the models to generate the missing concepts. [2/3]
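The retrieval loop described in this thread, sketched as toy code. This is NOT the ImageRAG implementation: the embedding table, the `KNOWN_CONCEPTS` set, and `vlm_identify_missing` are stand-in stubs for what the real system does with a VLM and a large image database.

```python
# Toy sketch: retrieve reference images only for concepts the generator misses.
import math

# Stand-in "image embeddings": a tiny hand-made table.
IMAGE_DB = {
    "shoebill.jpg": [0.9, 0.1, 0.0],
    "tuk-tuk.jpg": [0.1, 0.9, 0.1],
    "golden_retriever.jpg": [0.0, 0.2, 0.9],
}

# Stand-in "concept embeddings" for concepts mentioned in a prompt.
CONCEPT_VECS = {
    "shoebill": [1.0, 0.0, 0.0],
    "tuk-tuk": [0.0, 1.0, 0.0],
    "dog": [0.0, 0.1, 1.0],
}

KNOWN_CONCEPTS = {"dog"}  # pretend the base model can already draw dogs


def vlm_identify_missing(prompt_concepts):
    """Stub for the VLM step: flag concepts the generator struggles with."""
    return [c for c in prompt_concepts if c not in KNOWN_CONCEPTS]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)


def retrieve(concept, k=1):
    """Return the k database images most similar to the concept vector."""
    q = CONCEPT_VECS[concept]
    ranked = sorted(IMAGE_DB, key=lambda img: cosine(q, IMAGE_DB[img]), reverse=True)
    return ranked[:k]


def image_rag_references(prompt_concepts):
    """Collect reference images for the missing concepts only."""
    return {c: retrieve(c) for c in vlm_identify_missing(prompt_concepts)}


print(image_rag_references(["shoebill", "dog"]))
# -> {'shoebill': ['shoebill.jpg']}
```

The returned references would then condition the generator (e.g. via image prompting), which is the part this stub leaves out.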
Rotem Shalev-Arkushin @rotemsh3
Excited to introduce our new work: ImageRAG 🖼️✨ rotem-shalev.github.io/ImageRAG We enhance off-the-shelf generative models with Retrieval-Augmented Generation (RAG) for unknown concept generation, using a VLM-based approach that’s easy to integrate with new & existing models! [1/3]
Rotem Shalev-Arkushin reposted
Guy Tevet @GuyTvt
🚀 Meet DiP: our newest text-to-motion diffusion model! ✨ Ultra-fast generation ♾️ Creates endless, dynamic motions 🔄 Seamlessly switch prompts on the fly Best of all, it's now available in the MDM codebase: github.com/GuyTevet/motio… [1/3]