Edouard Grave

129 posts

@EXGRV

large language models @kyutai_labs

Paris, France · Joined October 2012

166 Following · 2.9K Followers

Edouard Grave retweeted

Omar Sanseviero@osanseviero·5h

Gemma 4 is here!
🧠 31B and 26B A4B for models with impressive intelligence per parameter
🤏 E2B and E4B for mobile and IoT
🤗 Apache 2.0
🤖 Base and IT checkpoints available
Available in AI Studio, Hugging Face, Ollama, Android, and your favorite OS tools
🚀 Download it today!

Edouard Grave@EXGRV·Aug 7

@giffmana This reminds me of the XLNet paper: arxiv.org/abs/1906.08237

Lucas Beyer (bl16)@giffmana·Aug 7

Amazing! Truly open review, through which we all gained more insights, i love it!
Result: in multi epoch setting, making AR learn multiple orderings ~closes the gap to diffusion, explaining much of the difference.
How the truly open review happened (from my vague memory): Mihir posted paper on diffusion being more sample efficient than AR and benefit from more epochs. Hypothesis in paper and from most commenters was due to augs, but no experiment to check this hypothesis in paper. After some discussion (on his thread, here on x) about this and what experiment might check this, including a first negative result Mihir mentioned and tried, we converged towards thinking it might be ordering instead. I suggested an experiment to check, but it had some issues. @YouJiacheng chimed in with connections to another paper, based on which we got the idea of how to make the experiment perfect. Then Mihir ran it, and it now looks like we have reasonably conclusive evidence PLUS potentially more confidence in a method to make AR better at multi-epoch.
All in the open in realtime here on x dot com the everything app™

Mihir Prabhudesai@mihirp98

We ran more experiments to better understand “why” diffusion models do better in data-constrained settings than autoregressive. Our findings support the hypothesis that diffusion models benefit from learning over multiple token orderings, which contributes to their robustness and reduced overfitting. To test this, we trained autoregressive (AR) models with varying numbers of token orderings: N=1 corresponds to the standard left-to-right ordering, while N=k includes the left-to-right order plus k−1 additional random permutations. As N increases, we observe that AR models become more data-efficient, exhibiting improved validation loss and reduced overfitting. All models were trained for 100 epochs, and were evaluated using the standard left-to-right factorization. We also experimented with related approaches, such as RAR and σ-GPT, and observed consistent trends --introducing more random factorizations led to better generalization and less overfitting. We have updated our arXiv submission with these new results. We thank @giffmana and @YouJiacheng for suggesting these experiments. Original paper post - x.com/mihirp98/statu…
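
The N-orderings setup described above can be illustrated on the data side with a minimal sketch. This is not the authors' code: the helper names `make_orderings` and `reorder`, the seed handling, and the choice to fix the permutations up front are all assumptions made for illustration. A real model would presumably also need to be told which position it is predicting next, which is not shown here.

```python
import random


def make_orderings(seq_len, n_orderings, seed=0):
    """Return n_orderings index orders over a sequence of length seq_len.

    The first order is always left-to-right (the N=1 case); the remaining
    n_orderings - 1 are random permutations drawn from a seeded RNG.
    Hypothetical helper, not taken from the paper.
    """
    if n_orderings < 1:
        raise ValueError("n_orderings must be at least 1")
    rng = random.Random(seed)
    orderings = [list(range(seq_len))]  # standard left-to-right order
    for _ in range(n_orderings - 1):
        perm = list(range(seq_len))
        rng.shuffle(perm)  # in-place random permutation
        orderings.append(perm)
    return orderings


def reorder(tokens, order):
    """Present the same tokens in the given order (hypothetical helper)."""
    return [tokens[i] for i in order]


tokens = ["the", "cat", "sat", "down"]
orderings = make_orderings(len(tokens), n_orderings=3, seed=0)
views = [reorder(tokens, order) for order in orderings]
```

With N=3, each training sequence is seen as the left-to-right view plus two permuted views, while evaluation would use only the first (left-to-right) view, matching the protocol described in the post.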

Edouard Grave retweeted

kyutai@kyutai_labs·Jun 6

Unmute meets Moshi 🫂💖 Talk to unmute.sh!

Edouard Grave@EXGRV·Feb 6

Today, we release our 🇫🇷 to 🇬🇧 simultaneous speech-to-speech translation system, called Hibiki. It runs on-device & the model, inference code and tech report are available. This is built using the same audio LLM as Moshi, showing its versatility. 🟢

kyutai@kyutai_labs

Meet Hibiki, our simultaneous speech-to-speech translation model, currently supporting 🇫🇷➡️🇬🇧. Hibiki produces spoken and text translations of the input speech in real-time, while preserving the speaker’s voice and optimally adapting its pace based on the semantic content of the source speech. Based on objective and human evaluations, Hibiki outperforms previous systems for quality, naturalness and speaker similarity and approaches human interpreters. 🧵

Edouard Grave@EXGRV·Jan 13

Excited to release a preview of Helium-1, our 2B LLM targeting edge and mobile devices. 🚀 More to come in the future: training code, support for more languages, data pipeline, tech report & more… 🟢

kyutai@kyutai_labs

Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today! huggingface.co/kyutai/helium-…

Edouard Grave retweeted

kyutai@kyutai_labs·Jan 13

Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today! huggingface.co/kyutai/helium-…

Edouard Grave@EXGRV·Sep 18

Local voice models FTW 🚀

kyutai@kyutai_labs

Talking to Moshi locally on a Macbook M series (python 3.12) in 2 lines:
pip install moshi_mlx
python -m moshi_mlx.local_web -q 4

Edouard Grave@EXGRV·Jul 24

Moshi goes to #ICML2024 in Vienna! Try the demo at moshi.chat

Edouard Grave@EXGRV·Jul 23

I am at ICML in Vienna! Let me know if you want to chat about (or to) Moshi, multimodal LLMs, Kyutai & more.

Edouard Grave@EXGRV·Jul 4

@soumithchintala @kyutai_labs Thank you Soumith!

Soumith Chintala@soumithchintala·Jul 3

this is amazing, but more importantly, liberating! @kyutai_labs leading the charge on real time voice assistants and as a true open-science non-profit, will release the code and details. congrats to a bunch of my former FAIR colleagues at Kyutai on the launch!

kyutai@kyutai_labs

Moshi and Neil on stage giving some emotional improv.

Edouard Grave@EXGRV·Jul 4

@Thom_Wolf @kyutai_labs Thanks Thomas!

Thomas Wolf@Thom_Wolf·Jul 3

The @kyutai_labs fully end-to-end audio model demo of today is a huge deal that many people missed in the room
Mostly irrelevant are the facts that:
- they come a few weeks after OpenAI ChatGPT-4o
- the demo was less polished than the 4o one (in terms of voice quality, voice timing…)
Relevant:
- the model training pipeline and model archi are simple and hugely scalable, with a tiny 8+ people team like Kyutai building it in 4 months. Synthetic data is a huge enabler here
- laser focus on local devices: Moshi will soon be everywhere. Frontier model builders have low incentive to let you run smaller models locally (price per token…) but non-profits like Kyutai have very different incentives. The Moshi demo is already online while the OpenAI 4o one is still in limbo.
- going under 300 ms of latency while keeping Llama 8B or above quality of answers is a key enabler in terms of interactivity, it’s game changing. This feeling when the model answers your question before you even finished asking is quite crazy, or when you interrupt the model while it’s talking and it reacts… Predictive coding in a model, instantly updated model of what you’re about to say...
Basically they nailed the fundamentals. It’s here. This interactive voice tech will be everywhere. It will soon be an obvious commodity.

Edouard Grave@EXGRV·Jul 4

@giffmana Thanks Lucas!

Lucas Beyer (bl16)@giffmana·Jul 3

Kyutai Moshi - first real-time Audio LLM. Basically no delay - the LLM even interrupted the speaker a few times. It was actually a bit eager to answer very quick. :) All to be open-sourced. Quality still a bit robotic though, but ok for v1. Pretty cool overall, congrats!

kyutai@kyutai_labs

Join us live tomorrow at 2:30pm CET for some exciting updates on our research! youtube.com/live/hm2IJSKcY…

Edouard Grave retweeted

Alexandre Défossez@honualx·Dec 13

Looking forward to discussing open research at @kyutai_labs. If you want to work on large scale multimodal LLMs, come and talk to us, this is what we look like 👇☕️

Neil Zeghidour@neilzegh

Look for my @kyutai_labs colleagues at #NeurIPS2023 if you want to learn more about our mission. We are recruiting permanent staff, post-docs and interns!

Edouard Grave@EXGRV·Dec 8

✈️ I will be attending #NeurIPS2023: let me know if you want to chat about the future of LLMs, and how to democratize them. 🌐 We are also hiring members of technical staff and interns @kyutai_labs. Happy to talk about the lab and our mission.

Edouard Grave@EXGRV·Nov 20

@ylecun Thank you, Yann!

Yann LeCun@ylecun·Nov 18

@EXGRV Congratulations!

Edouard Grave@EXGRV·Nov 17

/kyutai has landed! Super excited to build this new research lab. Pure focus on research. As open as it gets.

kyutai@kyutai_labs

Announcing Kyutai: a non-profit AI lab dedicated to open science. Thanks to Xavier Niel (@GroupeIliad), Rodolphe Saadé (@cmacgm) and Eric Schmidt (@SchmidtFutures ), we are starting with almost 300M€ of philanthropic support. Meet the team ⬇️

Edouard Grave@EXGRV·Nov 17

@soumithchintala Thanks for the kind words Soumith! Really excited by this new lab.

Soumith Chintala@soumithchintala·Nov 17

a great new open-science lab out of Paris, with a solid amount of initial funding! A very strong talent bench. i had the privilege of working with most of them before, and they're awesome researchers and great human beings!

kyutai@kyutai_labs

Our founding team is covering many AI fields from vision, with Patrick Pérez and Hervé Jégou (@hjegou) to LLMs with Edouard Grave (@EXGRV), audio with Neil Zeghidour (@neilzegh) and Alexandre Défossez (@honualx) and infra with Laurent Mazaré (@lmazare).

Edouard Grave@EXGRV·Sep 3

@abacaj Yes, we have a couple of papers on that exact topic with @gizacard and @PSH_Lewis. Combining these advances led to the Atlas language model (paper: arxiv.org/abs/2208.03299, code: github.com/facebookresear…).

anton@abacaj·Sep 2

Anyone have success fine tuning models with retrieval? Fine tuning the model to answer the question based on the context, trying this for code will see how it goes

Edouard Grave@EXGRV·Jul 15

@francoisfleuret @elonmusk The posters definitely deserve more than 6 likes (or 28 now)!

François Fleuret@francoisfleuret·Jul 14

That I get 500 likes for criticizing @elonmusk and 6 for this masterpiece is quite sad.

François Fleuret@francoisfleuret

How awesome is that. Poster of the scifi movie "The invasion of the Large Language Models (1961)"

Edouard Grave@EXGRV·Jul 14

@yoavgo The idea is cute, but I would not take the experimental results too seriously as the baseline numbers seem to be off.

(((ل()(ل() 'yoav))))👾@yoavgo·Jul 14

so, two cents on the gzip classification thing: apparently the idea has been around in some form or another for a while. but was treated as a curiosity because of how inefficient it was compared to all other classification methods. enter BERT.
