Thomas Chaton

89 posts

@chaton_thomas

Research Engineering Manager at @PyTorchLightnin | @gridai_

Joined May 2020
22 Following · 107 Followers
Thomas Chaton retweeted
Dan Biderman (@dan_biderman)
How can we use small LLMs to shift more AI workloads onto our laptops and phones? In our paper and open-source code, we pair on-device LLMs (@ollama) with frontier LLMs in the cloud (@openai, @together) to solve token-intensive workloads on your 💻 at 17.5% of the cloud cost while maintaining 97.9% of the accuracy. See Gru and the Minions in action below, 🔉 on, please (h/t @cartesia)!
41 replies · 170 reposts · 634 likes · 192.2K views
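The division of labor described above is simple to prototype. Below is a minimal sketch, assuming the `ollama` and `openai` Python clients; the chunking scheme, prompts, and model ids are illustrative stand-ins, not the paper's actual Minions protocol:

```python
import ollama                  # local on-device model server
from openai import OpenAI     # frontier model in the cloud

cloud = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def local_read(chunk: str, question: str) -> str:
    """Token-intensive reading happens on-device with a small model."""
    resp = ollama.chat(
        model="llama3.2:3b",  # placeholder: any small local model
        messages=[{
            "role": "user",
            "content": f"From this text:\n{chunk}\n\nExtract facts relevant to: {question}",
        }],
    )
    return resp["message"]["content"]

def cloud_aggregate(notes: list[str], question: str) -> str:
    """Only the short distilled notes reach the expensive frontier model."""
    joined = "\n".join(notes)
    resp = cloud.chat.completions.create(
        model="gpt-4o",  # placeholder frontier model
        messages=[{"role": "user",
                   "content": f"Using these notes:\n{joined}\n\nAnswer: {question}"}],
    )
    return resp.choices[0].message.content

def answer(document: str, question: str, chunk_size: int = 4000) -> str:
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    notes = [local_read(c, question) for c in chunks]  # many cheap local tokens
    return cloud_aggregate(notes, question)            # few expensive cloud tokens
```

The cost saving comes from the cloud model only ever seeing the short distilled notes rather than the full document.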
Thomas Chaton retweeted
NVIDIA AI Developer (@NVIDIAAIDev)
Introducing DeepSeek-R1 optimizations for Blackwell, delivering 25x more revenue at 20x lower cost per token compared with NVIDIA H100 just four weeks ago. Fueled by TensorRT DeepSeek optimizations for our Blackwell architecture, including FP4 performance with state-of-the-art production accuracy, it scores 99.8% of FP8 accuracy on the MMLU general-intelligence benchmark. The FP4-optimized DeepSeek checkpoint is now available on @huggingface: huggingface.co/nvidia/DeepSee…
106 replies · 412 reposts · 2.9K likes · 500.8K views
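For reference, fetching such a checkpoint from Hugging Face typically looks like the snippet below; the repo id is a placeholder since the tweet's link is truncated:

```python
from huggingface_hub import snapshot_download

# Placeholder repo id: the tweet's URL is truncated, so check the model
# card under the nvidia org on huggingface.co for the exact name.
path = snapshot_download(repo_id="nvidia/<fp4-deepseek-checkpoint>")
print(path)  # point a TensorRT-LLM / Blackwell-capable runtime at this directory
```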
Thomas Chaton retweeted
William Falcon ⚡️ (@williamfalcon)
Here I show you how to fine-tune and deploy DeepSeek R1 (8B) for < $1.00 in 8 minutes using the AI Hub from @LightningAI ⚡️⚡️
1 reply · 17 reposts · 66 likes · 4.8K views
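The tweet's workflow runs inside Lightning AI's hosted AI Hub, which isn't shown here. As a generic stand-in, a LoRA fine-tune of the distilled 8B R1 model with `transformers` + `peft` looks roughly like this; the model id and hyperparameters are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"  # distilled 8B variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# LoRA keeps the trainable parameter count tiny, which is what makes
# sub-$1 fine-tuning runs on a single cloud GPU plausible.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ...then train with your preferred Trainer / dataset and deploy the adapter.
```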
Thomas Chaton retweeted
DeepSeek (@deepseek_ai)
🚀 Introducing NSA: a hardware-aligned and natively trainable sparse attention mechanism for ultra-fast long-context training & inference!
Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection
💡 With a design optimized for modern hardware, NSA speeds up inference while reducing pre-training costs, without compromising performance. It matches or outperforms Full Attention models on general benchmarks, long-context tasks, and instruction-based reasoning.
📖 For more details, check out our paper here: arxiv.org/abs/2502.11089
885 replies · 2.1K reposts · 15.4K likes · 2.6M views
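A toy, single-query illustration of the compression and selection ideas in PyTorch; this shows the concept only and is nothing like the paper's hardware-aligned kernels:

```python
import torch
import torch.nn.functional as F

def nsa_like_attention(q, k, v, block=64, topk=4):
    # q: (1, d) single query; k, v: (T, d) keys/values
    T, d = k.shape
    nb = T // block
    kb = k[: nb * block].reshape(nb, block, d)
    vb = v[: nb * block].reshape(nb, block, d)

    # Coarse-grained compression: one mean vector summarizes each block.
    k_coarse = kb.mean(dim=1)                         # (nb, d)
    scores = (q @ k_coarse.T).squeeze(0) / d ** 0.5   # (nb,) cheap block scores

    # Fine-grained selection: full attention inside the top-k blocks only.
    sel = scores.topk(min(topk, nb)).indices          # chosen block ids
    k_sel = kb[sel].reshape(-1, d)                    # (topk*block, d)
    v_sel = vb[sel].reshape(-1, d)
    attn = F.softmax((q @ k_sel.T) / d ** 0.5, dim=-1)
    return attn @ v_sel                               # (1, d)

q = torch.randn(1, 64)
k, v = torch.randn(4096, 64), torch.randn(4096, 64)
out = nsa_like_attention(q, k, v)  # attends to 256 of 4096 keys
```

The speedup intuition: the query scores 64 block summaries instead of 4096 keys, then pays full attention cost on only the few blocks that matter.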
Thomas Chaton (@chaton_thomas)
@ThomasScialom It would be fantastic if the data and pre/post-training code were open-sourced too.
0 replies · 0 reposts · 0 likes · 104 views
Thomas Scialom (@ThomasScialom)
The team worked really hard to make history. Voilà, finally: the Llama-3.1 herd of models. Have fun with it!
• open 405B, insane 70B
• 128K context length, improved reasoning & coding capabilities
• detailed paper: ai.meta.com/research/publi…
3 replies · 17 reposts · 105 likes · 5.8K views
Thomas Chaton (@chaton_thomas)
@Thom_Wolf It would be fantastic if the data and pre/post-training code were open-sourced too.
1 reply · 0 reposts · 0 likes · 1.4K views
Thomas Wolf (@Thom_Wolf)
Among the most impressive aspects of the Llama 3.1 release is the accompanying research paper! Close to 100 pages of deep knowledge-sharing on LLMs like we haven't seen very often recently. What a treat! It covers everything: pretraining data, filtering, annealing, synthetic data, scaling laws, infrastructure, parallelism, training recipes, post-training adaptation, tool use, benchmarking, inference strategies, quantization, vision, speech, videos... Mind blown! Maybe the single paper you can read today to go from zero right to the frontier of LLMs. Read it here and feel the open science: ai.meta.com/research/publi…
15 replies · 250 reposts · 1.1K likes · 76.1K views
Bhimraj Yadav (@bhimrazy)
Use LitData with MinIO, a high-performance, S3-compatible object store designed for large-scale AI/ML, data lakes, and databases. It's a great library from the @LightningAI team.
4 replies · 2 reposts · 15 likes · 3.2K views
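A minimal sketch of what this pairing might look like, assuming litdata forwards `storage_options` to its S3 client (check the docs for your version); the endpoint, bucket, and credentials are placeholders:

```python
from litdata import StreamingDataset, StreamingDataLoader

dataset = StreamingDataset(
    input_dir="s3://my-bucket/optimized-imagenet",   # placeholder bucket/path
    storage_options={
        "endpoint_url": "http://localhost:9000",     # MinIO server, not AWS
        "aws_access_key_id": "minioadmin",
        "aws_secret_access_key": "minioadmin",
    },
)
loader = StreamingDataLoader(dataset, batch_size=64, num_workers=4)
for batch in loader:
    ...  # train as usual; chunks stream from MinIO on demand
```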
Jeffrey 杰弗瑞 (@tomcocobrico)
When you get $2,000 in cloud credits for the fine-tuning course but the first website you sign up for is actually @LightningAI's new Studio. I have to say it looks really neat: 24/7 free CPU with persistent storage, easy switch to GPUs, reasonable auto-sleep.
3 replies · 2 reposts · 10 likes · 5.9K views
Thomas Chaton retweeted
Linus (@thesephist)
A while ago I complained here about persistent storage in Google Colab. Have been using @LightningAI Studios for a while now for:
- Full VSCode (incl. GH Copilot)
- Persisted files shared across notebooks
- Multi-GPU/node (!!)
It's been great. Feels like a remote ML workstation.
7 replies · 32 reposts · 260 likes · 56.2K views
Bhimraj Yadav (@bhimrazy)
I was able to process almost 100 GB of image data in less than 5 minutes using the concurrent_task_executor function on @LightningAI Studios. Feel free to drop any suggestions or questions.
2 replies · 7 reposts · 24 likes · 5.1K views
Bhimraj Yadav (@bhimrazy)
🚀 Boost your Python code's speed with `concurrent_task_executor`! 🏎️💨 No more waiting for slow processing. Just throw your tasks at this function, sit back, and watch your code go "Brrrrr" through your data! 💥✨ #Python #Coding #Efficiency #GoBrrrrr 🐍💻
2 replies · 0 reposts · 2 likes · 195 views
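The tweets don't show the function's source, so here is a plausible stand-in built on the standard library's `concurrent.futures`, suited to I/O-bound work like image processing:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Iterable

def concurrent_task_executor(fn: Callable[..., Any], items: Iterable[Any],
                             max_workers: int = 16) -> list[Any]:
    """Fan fn out over items with a thread pool; gather results as they finish."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn, item) for item in items]
        for future in as_completed(futures):
            results.append(future.result())  # re-raises worker exceptions
    return results

# Example (I/O-bound, so threads help):
#   from pathlib import Path
#   sizes = concurrent_task_executor(lambda p: p.stat().st_size,
#                                    Path("images/").glob("*.jpg"))
```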
Thomas Chaton (@chaton_thomas)
@karpathy Give Lightning Studio a try. You won't ever use your local computer again!
0 replies · 0 reposts · 0 likes · 47 views
Andrej Karpathy (@karpathy)
Setting up my shiny new fully maxed out Space Black MacBook Pro M3 Max 128GB 16-inch (upgrading from an M1 Air). I always like to set up the new one with a clean slate, from scratch - this time I will not allow my dev configuration to get out of hand. Then we'll talk to it.
349 replies · 126 reposts · 5.6K likes · 598K views
pax (@elitepax)
Just got invited to Studio. It blows my mind how much value is packed into it. I've been riding the AI wave for the past year, learned a ton in the process, and launched some production apps, but my stuff is scattered around because I'm moving quickly and always have to deal with some kind of friction when I start prototyping new ideas. With Lightning Studio I was able to rapidly pick up best practices, fine-tune a model on a custom dataset, serve it, and now I'm chatting with it, all in under an hour. Hats off, true product & engineering! 💪🏻
1 reply · 1 repost · 5 likes · 1.8K views
Lightning AI ⚡️ (@LightningAI)
Introducing Lightning AI Studios: a persistent GPU cloud environment. Set up once, ready any time. Code online or from your local IDE. Prototype. Train. Serve. Multi-node. All from the same place. No credit card. 6 free GPU hours/month. lightning.ai
50 replies · 111 reposts · 479 likes · 200.7K views
Thomas Chaton (@chaton_thomas)
If you duplicate the Studio, you get everything: the dependencies, the data, the code, etc. Finally, a benchmark you can reproduce yourself with a click!
0 replies · 0 reposts · 1 like · 59 views
Thomas Chaton (@chaton_thomas)
We just finished benchmarking cloud data-loading libraries on ImageNet (1.2M images):
- Lightning AI Streaming Dataset
- WebDataset
- MosaicML Streaming
Conclusion: Lightning AI is the fastest (up to 80% faster) 🚀 lightning.ai/lightning-ai/s…
1 reply · 0 reposts · 1 like · 88 views
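The linked Studio contains the actual benchmark; the measurement pattern behind numbers like these is usually a warmed-up timing loop, sketched here:

```python
import time

def images_per_sec(loader, n_batches: int = 100, batch_size: int = 256) -> float:
    """Throughput of a data loader in images/sec over a fixed batch budget."""
    it = iter(loader)
    next(it)                    # warm-up: the first batch pays startup cost
    t0 = time.perf_counter()
    for _ in range(n_batches):  # loader must yield at least n_batches more
        next(it)
    dt = time.perf_counter() - t0
    return n_batches * batch_size / dt
```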
Thomas Chaton (@chaton_thomas)
Prepare a 1-trillion-token dataset for training LLMs from scratch in under 4 hours instead of days with @LightningAI Studio! Everything is included: the final datasets, the code, dependencies, etc. Get started in seconds, as no setup is needed. lightning.ai/lightning-ai/s…
0 replies · 0 reposts · 4 likes · 84 views
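A rough sketch of what that preparation step looks like with litdata's `optimize()`; the paths, tokenizer, and chunk size below are assumptions rather than the Studio's exact code:

```python
from pathlib import Path
from litdata import optimize
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

def tokenize_file(path):
    """Tokenize one raw text file; optimize() runs this across workers."""
    text = Path(path).read_text(errors="ignore")
    yield tokenizer.encode(text)  # one token list per document

if __name__ == "__main__":
    optimize(
        fn=tokenize_file,
        inputs=[str(p) for p in Path("raw_text/").rglob("*.txt")],
        output_dir="optimized_tokens/",  # binary chunks, ready for streaming
        chunk_size=2049 * 1024,          # tokens per chunk (assumed block budget)
        num_workers=32,                  # scale with available CPUs
    )
```

Parallelizing the tokenization across many workers (and many machines) is what turns a multi-day job into hours.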