Pinned Tweet
Ramón Calvo
20 posts

Ramón Calvo
@noctrog
PhD student under @francoisfleuret. Prev. Robotics ESOP @eth, intern @NVIDIA, @sony
Switzerland. Joined July 2015
981 Following, 140 Followers
Ramón Calvo retweeted

Leveraging the True Depth of LLMs
Ramón Calvo González, Daniele Paliotta, Matteo Pagliardini, Martin Jaggi, François Fleuret.
Action editor: Changyou Chen.
openreview.net/forum?id=JccJ6…
#parallelized #benchmark #llms

I have successfully defended my dissertation "Animal Motion Imitation For Adaptive and Lifelike Control of Legged Robots" at ETH Zurich. A huge thanks to my supervisors, committee members, amazing collaborators, and peers at CRL @crl_ethz who made this possible!



@gwenzek In our implementation, MHA heads are “concatenated” in the sense that all heads are processed by the same call to the attention kernel on each GPU. Note that since layers are merged in pairs, and TP needs n_gpus = 2*n where n >= 1, each GPU will only process heads from MHA1 or from MHA2.
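
The head placement described in this reply can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `assign_heads`, the equal number of heads per layer, and the choice to put MHA1 on the first half of the GPUs are all assumptions made for the example.

```python
def assign_heads(heads_per_layer: int, n_gpus: int) -> dict[int, tuple[str, list[int]]]:
    """Map each GPU to the attention heads it handles for one merged layer pair.

    Hypothetical helper: the names and the placement order are assumptions.
    """
    # The reply states that tensor parallelism needs n_gpus = 2*n with n >= 1.
    if n_gpus < 2 or n_gpus % 2 != 0:
        raise ValueError("n_gpus must be 2*n with n >= 1")
    gpus_per_layer = n_gpus // 2
    if heads_per_layer % gpus_per_layer != 0:
        raise ValueError("heads must divide evenly across the GPUs of one layer")
    heads_per_gpu = heads_per_layer // gpus_per_layer

    placement = {}
    for gpu in range(n_gpus):
        # Assumption: the first half of the GPUs serves MHA1, the second half MHA2,
        # so no GPU ever mixes heads from both layers.
        layer = "MHA1" if gpu < gpus_per_layer else "MHA2"
        start = (gpu % gpus_per_layer) * heads_per_gpu
        placement[gpu] = (layer, list(range(start, start + heads_per_gpu)))
    return placement


if __name__ == "__main__":
    for gpu, (layer, heads) in assign_heads(heads_per_layer=8, n_gpus=4).items():
        print(f"GPU {gpu}: {layer} heads {heads}")
```

With 8 heads per layer and 4 GPUs, each GPU ends up with 4 heads from exactly one of the two merged layers, which it could then pass to a single attention kernel call.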

@noctrog Isn't that a complicated way of concatenating heads of two layers?

What is the true depth of an LLM?
Together with @DanielePaliotta, @MatPagliardini, M. Jaggi and @francoisfleuret we show that LLMs may have a smaller effective depth, and that it can be exploited to increase inference speed in multi-GPU settings!
arxiv.org/abs/2502.02790
(1/N)


I would like to thank @dj_jiben for the thoughtful discussions and help with some plots! :)

You can find the reference LP implementation here: github.com/noctrog/effect…
(10/10)
Ramón Calvo retweeted

With the awesome @noctrog, @DanielePaliotta, @MatPagliardini, and Martin Jaggi.
@sciences_UNIGE @ICepfl
TL;DR: you can shuffle the middle layers of a transformer without retraining it. We take advantage of that to compute layers in parallel.
arxiv.org/abs/2502.02790
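
To make the TL;DR concrete, here is a toy sketch of what computing two residual layers in parallel could look like compared with running them one after the other. It is not taken from the paper or its code: the scalar "layers" `block_a` and `block_b`, their 0.05 scaling, and the form of the parallel update (both branches read the same input and their outputs are summed) are assumptions chosen for illustration.

```python
import math


def block_a(x: float) -> float:
    # Hypothetical stand-in for the residual branch of one layer.
    return 0.05 * math.tanh(x)


def block_b(x: float) -> float:
    # Hypothetical stand-in for the residual branch of the next layer.
    return 0.05 * math.sin(x)


def sequential(x: float) -> float:
    # Standard residual stack: the second layer sees the output of the first.
    h = x + block_a(x)
    return h + block_b(h)


def parallel(x: float) -> float:
    # Assumed parallel form: both branches read the same input,
    # so they could run at the same time on different devices.
    return x + block_a(x) + block_b(x)


if __name__ == "__main__":
    x = 1.0
    print("sequential:", sequential(x))
    print("parallel:  ", parallel(x))
    print("abs gap:   ", abs(sequential(x) - parallel(x)))
```

In this toy setting the two results stay close because each branch only nudges its input slightly. Whether that holds for the layers of a real transformer is an empirical question that the linked paper addresses.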
Ramón Calvo retweeted

As a comparison to #GameNGen, our model was trained on only 0.5% of the number of frames, with 1 GPU (compared to 128 TPUs).
And our code, model and data are completely open-source! You can play it on your local machine.
github.com/eloialonso/dia…
(3/n)
Ramón Calvo retweeted

Diffusion world models! With @EloiAlonso1 @AdamJelley2 and @micheli_vincent and colleagues.
Counter-Strike, trained on 4090, 5M frames.
You can install it and *play* in it at 10fps.
@UNIGEnews @sciences_UNIGE
Eloi Alonso @EloiAlonso1
Ever wanted to play Counter-Strike in a neural network? These videos show people playing (with keyboard & mouse) in 💎 DIAMOND's diffusion world model, trained to simulate the game Counter-Strike: Global Offensive. 💻 Download and play it yourself → github.com/eloialonso/dia… 🧵
