Mez

4.9K posts

@mez_gebre

Palo Alto, CA · Joined July 2009

213 Following · 336 Followers

Mez retweeted

Songyou Peng@songyoupeng·23 Apr

Yay, finally! Introducing Vision Banana🍌 from @GoogleDeepMind, our unified model that outperforms SoTA specialist models on various vision tasks! By treating 2D/3D vision tasks as image generation, we unlock a new foundation for CV. Project page: vision-banana.github.io (1/5)

309

2.2K

280K

Mez retweeted

SkalskiP@skalskip92·27 Mar

RF-DETR + Trackers is such a strong open-source combo. I fine-tuned RF-DETR on the VisDrone dataset, plugged in the OC-SORT tracker, and now I’m going to build some cool smart city demos. Link: github.com/roboflow/track…

106

722

102.7K

Mez retweeted

Hugging Models@HuggingModels·27 Mar

Somebody just trained an LTX 2.3 LoRA of George Costanza at home on a 5090 in about a day with AI Toolkit. Then generated a 30-second video with ComfyUI on the same setup in just 6 minutes. Open source is, always has been, and always will be, the future of generative AI.

869

81.8K

Mez retweeted

Jia-Bin Huang@jbhuang0604·26 Mar

A great example that medium shapes impact.
A research paper on arXiv 11 months ago: 👉 2 citations so far.
An accessible blog post one day ago: 👉 12M views, instant community adoption.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

1.1K

162.2K

Mez retweeted

Hugging Models@HuggingModels·25 Mar

Qwen3.5 0.8B running real-time video captioning on a Mac Studio M2 Ultra. <1s per frame. 269 frames from a 3m49s video. Streaming descriptions as it plays. Pause anywhere, it actually understands the scene. ~1GB model. Local AI is getting unreasonably capable. Video credit: @stevibe

284

264.7K

Mez retweeted

Peter Holderrieth@peholderrieth·18 Mar

🚀 MIT Flow Matching and Diffusion Lecture 2026 Released (diffusion.csail.mit.edu)! We just released our new MIT 2026 course on flow matching and diffusion models! We teach the full stack of modern AI image, video, and protein generators - theory and practice. We include:
📺 Videos: Step-by-step derivations.
📝 Notes: Mathematically self-contained lecture notes.
💻 Coding: Hands-on exercises for every component.
We fully improved last year's iteration and added new topics: latent spaces, diffusion transformers, and building language models with discrete diffusion models. Everything is available here: diffusion.csail.mit.edu
A huge thanks to Tommi Jaakkola for his support in making this class possible and Ashay Athalye (MIT SOUL) for the incredible production! Was fun to do this with @RShprints! #MachineLearning #GenerativeAI #MIT #DiffusionModels #AI

396

2.3K

528.1K

Mez retweeted

alphaXiv@askalphaxiv·19 Mar

Yann LeCun and his team dropped yet another paper: "V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning". In this V-JEPA upgrade, they show that if you make a video model predict every patch, not just the masked ones, and do so at multiple layers, you can turn vague scene understanding into dense, temporally stable features that actually understand "what is where". This key insight drove improvements in segmentation, depth, anticipation, and even robot planning.

220

1.4K

121.9K

Mez@mez_gebre·20 Mar

How AI Impacts Skill Formation arxiv.org/abs/2601.20245

Mez retweeted

Nataniel Ruiz@natanielruizg·6 Mar

Excited to show some surprising inventions on generative multiplayer games we made at Google with Stanford. We call the work MultiGen.
I've always been inspired by early studios like id Software with Doom or Blizzard with Warcraft bringing networked video games to the next level. We are at the point in history where we can make strides like them, but for generative games. It's a strange feeling to be in the age of generative video games while still discovering how exactly to train the models and design the tools that make them useful.
All of the tools that have been invented for classic game engines need to be redesigned for generative games. For example, level and world design is not entirely possible with existing technology. We introduce editable memory to diffusion game engines, which allows for the design of new levels via a minimap. But we can easily imagine how this can be expanded with different creation tools. The end goal of this research direction is to allow game designers to guide the generation process of their world at the granularity that they prefer.
Editable memory also allows us to add multiplayer to Generative Doom. We were amazed when we saw GameNGen some years ago, and now you can play it live with friends in real time, on your couch or even online. Shared representations like our editable memory seem like the future for this type of experience.
Models are, in some cases, expensive and approximate encoders but great interpolators and extrapolators. Leveraging their strengths lets you have completely new experiences that can be realized now and not in the distant future.
This work was started at my previous team and continued in collaboration with Stanford. Congratulations to all for the discoveries.

577

103.9K

Mez@mez_gebre·7 Mar

I've been messing around with different generative models around flow matching. Variational Rectified Flow Matching is a cool variant that solves the mean collapse issue with multi-modal target distributions!

Mez@mez_gebre·27 Feb

Chatting with a friend working on full-duplex audio models (audio in → audio out) got me curious about how to work with audio. Did a weekend of experiments using audio classification as a “hello world” to learn the space. Notes + deep dive-ish 👇 mez.sh/2026/02/17/aud…

Mez retweeted

Sakana AI@SakanaAILabs·27 Feb

We’re excited to introduce Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible. pub.sakana.ai/doc-to-lora/
By training a Hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.
Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.
To bypass these limitations, our work focuses on the concept of cost amortization. We pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.
In our experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights. Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates.
This approach is a step towards lowering the technical barriers of model customization, allowing end-users to specialize foundation models via simple text inputs. We have released our code and papers for the community to explore.
Doc-to-LoRA Paper: arxiv.org/abs/2602.15902 Code: github.com/SakanaAI/Doc-t…
Text-to-LoRA Paper: arxiv.org/abs/2506.06105 Code: github.com/SakanaAI/Text-…

349

2.2K

604.8K

Mez@mez_gebre·2 Nov

The @nextjs App Router Course was pretty cool. I went through the course using @pocketbase to give any new Next.js user who also happens to be using PocketBase a reference implementation. Here is the repo: github.com/mez/nextjs-poc… #pocketbase #nextjs14

200

Mez retweeted

Modular@Modular·7 Sep

Mojo🔥 is now available for download locally to your machine! ❤️‍🔥🚀 Beyond a compiler, the Mojo SDK includes a full set of developer and IDE tools 🛠 that make it easy to build and iterate on Mojo applications. Let’s build the future together!🔥 modular.com/blog/mojo-its-…

422

1.7K

428.3K

Mez retweeted

Paul Graham@paulg·23 Jun

"Sperm count appeared to have declined 52 per cent in 38 years, or something over 1 per cent a year." ft.com/content/f14ab2…

156

352

2.1K

995K

Mez retweeted

Brendan Dolan-Gavitt@moyix·1 Dec

ChatGPT exploits a buffer overflow 😳

947

5.8K

Mez retweeted

Jens Axboe@axboe·12 Aug

"Running a successful open source project is just Good Will Hunting in reverse, where you start out as a respected genius and end up being a janitor who gets into fights." Quote attributed to @cra, and I don't think I've ever seen anything more true posted.

787

4.5K

Mez retweeted

Paul Graham@paulg·7 Aug

Effective organizations are unnatural. The natural state of organizations is bureaucracy and turf wars, and once deprived of effective leadership they revert to their natural state with shocking speed.

459

3.8K

Mez retweeted

Bojan Tunguz@tunguz·5 Aug

A very good paper I came across this morning by the @DeepMind researchers. For the past five years Transformers have been one of the most dominant approaches to Deep Learning problems, especially in the #NLP domain. 1/5

187

1.1K
