Yael Vinker🎗

766 posts

@YVinker

Postdoctoral Associate at @MIT_csail

Boston · Joined July 2021
369 Following · 1.5K Followers
Pinned Tweet
Yael Vinker🎗@YVinker·
I am *very* excited to announce our SIGGRAPH 2026 workshop: Lines & Minds: Visual Abstraction in Art, Psychology, and Computer Graphics 🎨🧠🫖 🔗 lines-and-minds.github.io 📅 Sunday, July 19 Join us to explore how visual abstraction shapes how we think, create, and communicate.
6
18
102
10.2K
Yael Vinker🎗 retweeted
Elad Richardson@EladRichardson·
Excited to share that we received an 🌱 Honorable Mention award for Inspiration Seeds 🌱 Was really a pleasure working on that one 😊
1
2
8
327
Yael Vinker🎗 retweeted
MIT CSAIL@MIT_CSAIL·
Classic computer animation techniques from the 90s, as used in Toy Story. v/@0xmitsurii
5
33
237
26.9K
Runway@runwayml·
Aleph 2.0 is here. Now you can edit a single frame in your video, preview the change and then Aleph 2.0 carries that edit across the rest of your video. Try it now in the new Edit Studio on web at the link below.
133
263
1.9K
4.9M
Yuval Alaluf@yuvalalaluf·
Playing around some more with Aleph 2.0. All edits here are guided from a single anchor editing frame, even across longer multi-shot videos. Excited about the jump to 1080p and more precise, localized editing. Can’t wait to share more results soon!
3
1
25
1K
Lior Yariv@YarivLior·
High-resolution denoising at every step is computationally redundant, especially since diffusion has spectral autoregressive behavior. We show that you can dynamically grow the resolution as the denoising happens, giving a plug-and-play way to accelerate generation.
Quoting Gordon Wetzstein @GordonWetzstein:

High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step? We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising. [1/7]

6
8
58
7.8K
Gordon Wetzstein@GordonWetzstein·
High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step? We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising. [1/7]
22
64
405
83.9K
Nataniel Ruiz@natanielruizg·
It's been so unique to work on Gemini Omni pre-training with the best team on earth. Omni has outstanding reference-based generation and novel multimodal capabilities. It feels to me like a new paradigm. Native video editing, visual and vocal personalization, high quality outputs, multimodal understanding. All in one. Approaching the important AGI milestone of an anything-in-anything-out model. Here is me doing my best impression of directing, and acting in, a Lovecraftian horror film mixed with a Magnolia-like narration. Notice the expressions, the facial detail - it looks like me and it feels like me. I made this in less than an hour.
5
6
96
19.9K
#CVPR2026@CVPR·
We are grateful to all of the 17,491 reviewers who helped make #CVPR2026 possible. We are especially pleased to recognize the following Outstanding Reviewers, whose high-quality reviews (as judged by their Area Chairs) placed them among the top 5% of reviewers.
4
37
195
85.9K
Yael Vinker🎗 retweeted
Zain Shah@zan2434·
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see. @eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
1.1K
3.7K
28.6K
5.9M
Yael Vinker🎗 retweeted
Andrej Karpathy@karpathy·
This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.

More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. Around a ~third of our brains are a massively parallel processor dedicated to vision, it is the 10-lane superhighway of information into brain. As AI improves, I think we'll see a progression that takes advantage:
1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations

Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral x.com/zan2434/status…

There are also improvements necessary and pending at the input. Audio nor text nor video alone are not enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.

TLDR The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip try ask for HTML.
Quoting Thariq @trq212:

x.com/i/article/2052…

997
2K
18.8K
3.6M