
Ruben Tous
@rubentous1
Associate Professor at the Department of Computer Architecture of the Universitat Politècnica de Catalunya (UPC)

Why are LLMs non-deterministic? What stops the model from traversing the neurons in the same order given the same input each time?
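A common two-part answer: sampling is stochastic by design, and even "deterministic" greedy decoding can drift because parallel floating-point reductions on GPUs sum in varying orders. A minimal toy sketch of both effects (illustrative only, not any real inference stack):

```python
import math
import random

# (1) Stochastic decoding: identical logits, potentially different draws.
def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token from softmax(logits / temperature)."""
    weights = {t: math.exp(l / temperature) for t, l in logits.items()}
    total = sum(weights.values())
    r = rng.random() * total
    for token, w in weights.items():
        r -= w
        if r <= 0:
            return token
    return token  # numerical fallback

logits = {"cat": 2.0, "dog": 1.9, "rat": 0.5}
print(sample_token(logits))  # may differ from run to run

# (2) Float addition is not associative: the reduction order a parallel
# kernel happens to use can change the result, flipping an argmax in
# borderline cases even at temperature 0.
xs = [1e16, 1.0, -1e16, 1.0]
sequential = ((xs[0] + xs[1]) + xs[2]) + xs[3]  # left-to-right
pairwise = (xs[0] + xs[1]) + (xs[2] + xs[3])    # tree reduction
print(sequential, pairwise)  # 1.0 vs 0.0 on IEEE-754 doubles
```

Set temperature to 0 (greedy argmax) and fix every reduction order and you do get repeatable output; production serving stacks usually don't, for throughput reasons.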


The girl on a server in Iowa… 🤭 Jeff you are a legend.

I crossed an interesting threshold yesterday, which I think many other mathematicians have been crossing recently as well. In the middle of trying to prove a result, I identified a statement that looked true and that would, if true, be useful to me. 1/3




SCIENCE: Two nights of limited sleep (four hours per night) is enough to make people feel over 4.4 years older than peers who slept adequately. (Source: Proceedings of the Royal Society)

MMaDA: Multimodal Large Diffusion Language Models
- UniGRPO, a unified RL algorithm tailored for diffusion foundation models
- MMaDA-8B surpasses Show-o and SEED-X in multimodal understanding, and outperforms SDXL and Janus in text-to-image generation

We present MMaDA, the first diffusion foundation model that unifies text reasoning, multimodal understanding, and image generation through Mixed Long-CoT fine-tuning and a unified RL algorithm, UniGRPO.
📚 Paper: arxiv.org/abs/2505.15809
💻 Code: github.com/Gen-Verse/MMaDA
📦 Model: huggingface.co/Gen-Verse/MMaD…
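GRPO-style training, which UniGRPO adapts to diffusion models, scores a group of sampled responses per prompt and normalizes each reward against its group instead of using a learned critic. A minimal sketch of that group-relative advantage step (my illustration of vanilla GRPO, not the MMaDA code):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: z-score each reward within its sampled group."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard: constant group -> all-zero advantages
    return [(r - mu) / sigma for r in rewards]

# Four sampled responses to one prompt, scored by a reward model:
advs = group_relative_advantages([0.1, 0.4, 0.9, 0.2])
```

Responses scoring above the group mean get positive advantages and are reinforced; the rest are pushed down, with no value network required.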


For the past two years, I've been consistently saying that these CEOs and their influencer larvae are lying. Machine learning simply cannot do what they claimed it would achieve. Yet I've been labeled a naysayer, a doomer, a denier, and an all-around negative person spoiling the fiesta. Today, only an imbecile fails to see that they've been lied to. Altman has gone quiet about AGI, Nadella doesn't want to hear about it, and only Amodei just can't shut his mouth, probably for medical reasons. Do you really think they'll ever come and apologize for attacking me for speaking the truth?



PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

We present PhysGen, a novel image-to-video generation method that converts a single image and an input condition (e.g., a force and torque applied to an object in the image) into a realistic, physically plausible, and temporally consistent video. Our key insight is to integrate model-based physical simulation with a data-driven video generation process, enabling plausible image-space dynamics.

At the heart of our system are three core components:
(i) an image understanding module that effectively captures the geometry, materials, and physical parameters of the image;
(ii) an image-space dynamics simulation model that utilizes rigid-body physics and inferred parameters to simulate realistic behaviors;
(iii) an image-based rendering and refinement module that leverages generative video diffusion to produce realistic video footage featuring the simulated motion.

The resulting videos are realistic in both physics and appearance and are even precisely controllable, showing superior results over existing data-driven image-to-video generation works in quantitative comparisons and a comprehensive user study. PhysGen's videos can be used in various downstream applications, such as turning an image into a realistic animation or allowing users to interact with the image and create various dynamics.
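The rigid-body core of component (ii) comes down to integrating an applied force and torque over time. A toy semi-implicit Euler integrator for a single 2D body (a minimal sketch of the physics idea, not the PhysGen implementation):

```python
from dataclasses import dataclass

@dataclass
class Body2D:
    mass: float          # kg
    inertia: float       # kg*m^2, rotational inertia about the center of mass
    x: float = 0.0       # position (m)
    y: float = 0.0
    theta: float = 0.0   # orientation (rad)
    vx: float = 0.0      # linear velocity (m/s)
    vy: float = 0.0
    omega: float = 0.0   # angular velocity (rad/s)

def step(body, fx, fy, torque, dt):
    """One semi-implicit Euler step: F = m*a, tau = I*alpha,
    then advance position/orientation with the updated velocities."""
    body.vx += (fx / body.mass) * dt
    body.vy += (fy / body.mass) * dt
    body.omega += (torque / body.inertia) * dt
    body.x += body.vx * dt
    body.y += body.vy * dt
    body.theta += body.omega * dt

# Push a 1 kg body to the right with 2 N for 10 frames at 60 fps:
b = Body2D(mass=1.0, inertia=0.1)
for _ in range(10):
    step(b, fx=2.0, fy=0.0, torque=0.0, dt=1 / 60)
```

In the full system the mass, inertia, and applied condition would come from the image understanding module, and the resulting trajectory of poses would drive the video diffusion renderer.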








