Sam Ade Jacobs

194 posts

@samadejacobs

PhD, Computer Science (Texas A&M University). R&D expertise and experience in large-scale big-data (graph) analytics, machine (deep) learning, and robotics.

San Francisco, CA · Joined September 2010
119 Following · 92 Followers
Pinned Tweet
Sam Ade Jacobs (@samadejacobs)
#SC20 starts today! It is exciting to have our work on AI/HPC-enabled drug design for COVID-19 named a finalist for the prestigious Gordon Bell Special Prize. Congratulations to our team; the "sleepless" nights in a chaotic summer were not in vain!
3 replies · 2 retweets · 8 likes
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
🚀 Introducing Ulysses-Offload 🚀
- Unlock the power of long-context LLM training and finetuning with our latest system optimizations
- Train LLaMA3-8B on a 2M-token context using 4x A100-80GB
- Achieve over 55% MFU
Blog: shorturl.at/Spx6Y
Tutorial: shorturl.at/bAWu5
1 reply · 30 retweets · 97 likes · 5.8K views
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
Announcing that DeepSpeed now runs natively on Windows. This exciting combination unlocks DeepSpeed optimizations for Windows users and empowers more people and organizations with AI innovations.
- HF inference & finetuning
- LoRA
- CPU offload
Blog: shorturl.at/a7TF8
1 reply · 6 retweets · 38 likes · 4.3K views
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
Introducing Universal Checkpointing for boosting training efficiency.
- Change parallelism (PP, SP, TP, ZeRO-DP) or GPU count mid-stream
- Improve resilience by scaling down to healthy nodes 💪
- Increase throughput by scaling up to elastic nodes 🚀
Blog: rb.gy/aup3pn
0 replies · 5 retweets · 23 likes · 4.3K views
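The "change parallelism or GPU count mid-stream" workflow announced above can be sketched roughly as follows: a saved ZeRO checkpoint is first converted to universal format (DeepSpeed ships a `ds_to_universal.py` conversion script), and training then resumes, possibly on a different number of GPUs, with a config that requests the universal format. The `"load_universal"` key and the values below follow the public universal-checkpointing examples; treat the details as assumptions, not a definitive recipe.

```python
# Hedged sketch of resuming training from a universal checkpoint.
# Step 1 (outside this snippet): convert a saved ZeRO checkpoint with
# DeepSpeed's ds_to_universal.py script.
# Step 2: restart training, possibly with a new GPU count or parallelism
# layout, using a config that asks DeepSpeed to load the converted
# checkpoint. Key name per the universal-checkpointing examples.
ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "zero_optimization": {"stage": 1},
    "checkpoint": {"load_universal": True},  # load the converted checkpoint
}
```

The point of the conversion step is that the universal format is parallelism-agnostic, so the resumed run is free to repartition states across however many healthy nodes remain.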
Sam Ade Jacobs retweeted
Stas Bekman (@StasBekman)
If you were holding off on trying @MSFTDeepSpeed ZeRO++, it looks like deepspeed@master should work well now: github.com/microsoft/Deep…

ZeRO++'s main feature is allowing you to use a hybrid approach if you can fit the model on a single node of 8 GPUs: it takes advantage of the super-fast NVLink within the node and only needs to reduce gradients across nodes over the slow link. So if the slow inter-node network was impacting your TFLOPS, enabling ZeRO++ should give you a sizeable boost. The numbers will vary with your situation, but in my experiments I saw a 5%+ boost with a 7B LLaMA. This is similar to hybrid FSDP. To try it, see: deepspeed.ai/tutorials/zero…

I was talking about the hybrid solution; I have yet to try the quantized weights/grads also offered by ZeRO++, which should speed things up even further, since they put even less stress on the network. Just remember: until the next release is made, you want deepspeed@master.
3 replies · 12 retweets · 77 likes · 7.9K views
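The hybrid approach described above is enabled through the DeepSpeed config. A minimal sketch, with key names following the public ZeRO++ tutorial and all values illustrative:

```python
# Sketch of a DeepSpeed config enabling ZeRO++ on top of ZeRO stage 3.
# Key names follow the public ZeRO++ tutorial; values are illustrative.
zero_pp_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        # Hierarchical partitioning (hpZ): keep a secondary copy of weights
        # within each 8-GPU node so all-gathers stay on fast intra-node
        # NVLink -- the "hybrid" part of the tweet above.
        "zero_hpz_partition_size": 8,
        # Quantized communication (the part not yet tried in the tweet):
        "zero_quantized_weights": True,
        "zero_quantized_gradients": True,
    },
}
```

Setting `zero_hpz_partition_size` to the number of GPUs per node is what confines the weight all-gather to the fast link; the two quantization flags further shrink inter-node traffic.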
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
Introducing Mixtral, Phi-2, Falcon, and Qwen support in #DeepSpeed-FastGen!
- Up to 2.5x faster LLM inference
- Optimized SplitFuse and token sampling
- Exciting new features like a RESTful API and more!
For more details: github.com/microsoft/Deep… #DeepSpeed #AI
10 replies · 88 retweets · 416 likes · 49.5K views
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
🚀 Excited to announce our paper "ZeRO++: Extremely Efficient Collective Communication for Large Model Training" has been accepted at #ICLR2024! 🔍 ZeRO++ significantly reduces communication volume by 4x, achieving up to 3.3x speedup. microsoft.com/en-us/research… #DeepSpeed #AI
2 replies · 20 retweets · 93 likes · 5.7K views
Sam Ade Jacobs retweeted
OpenAI (@OpenAI)
We're rolling out new features and improvements that developers have been asking for:
1. Our new model GPT-4 Turbo supports 128K context and has fresher knowledge than GPT-4. Its input and output tokens are respectively 3× and 2× less expensive than GPT-4. It's available now to all developers in preview.
2. Assistants API and new tools (Retrieval, Code Interpreter) will help developers build world-class AI assistants within their own apps.
3. The platform is becoming multimodal. GPT-4 Turbo with Vision, DALL·E 3, and text-to-speech are all now available to developers.
Oh… and we're doubling GPT-4 rate limits.
openai.com/blog/new-model…
894 replies · 2.7K retweets · 14.5K likes · 4M views
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
Introducing DeepSpeed-FastGen 🚀 Serve LLMs and generative AI models with
- 2.3x higher throughput
- 2x lower average latency
- 4x lower tail latency
w. Dynamic SplitFuse batching. Auto TP, load balancing w. perfect linear scaling, plus an easy-to-use API.
github.com/microsoft/Deep…
6 replies · 115 retweets · 548 likes · 112.8K views
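The easy-to-use API mentioned above is exposed through the DeepSpeed-MII library. A minimal sketch, assuming `pip install deepspeed-mii`; the model name, prompt, and generation arguments are illustrative, and the model-loading part is guarded so the snippet only runs FastGen when MII and a CUDA GPU are actually present:

```python
# Hedged sketch of serving a model with DeepSpeed-FastGen via DeepSpeed-MII.
# Model name and generation arguments are illustrative; loading the model
# requires a CUDA GPU and downloads the weights, so that part is guarded.
import importlib.util

prompt = "DeepSpeed is"
gen_kwargs = {"max_new_tokens": 64}

have_mii = importlib.util.find_spec("mii") is not None
have_gpu = False
if importlib.util.find_spec("torch") is not None:
    import torch
    have_gpu = torch.cuda.is_available()

if have_mii and have_gpu:
    import mii

    # Non-persistent pipeline: Dynamic SplitFuse batching runs under the
    # hood of this simple call.
    pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")
    responses = pipe([prompt], **gen_kwargs)
    print(responses[0].generated_text)
```

For production serving, MII also offers a persistent deployment (`mii.serve`) that the RESTful API from the later announcement builds on.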
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
🚀 Exciting new updates on #DeepSpeed ZeRO-Inference, with 20X faster generation!
- 4x lower memory usage through 4-bit weight quantization, with no code change needed
- 4x larger batch sizes through KV-cache offloading
Available in DeepSpeed v0.10.3: aka.ms/z3-inference
2 replies · 28 retweets · 167 likes · 18.2K views
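ZeRO-Inference's core trick is running inference under ZeRO stage 3 with parameters offloaded to CPU, so models larger than GPU memory can still generate. A minimal config sketch, assuming that setup; the 4-bit weight quantization and KV-cache offload announced above are additional options documented at aka.ms/z3-inference and are not reproduced here:

```python
# Sketch of a ZeRO-Inference config: ZeRO stage 3 with parameters
# offloaded to CPU so a model larger than one GPU's memory can generate.
# The announced 4-bit quantization and KV-cache offload are extra options
# described in the linked blog and are intentionally omitted here.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # required field; unused at inference
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}
```

This config would typically be passed as `deepspeed.initialize(model=model, config=ds_config)`, with the returned engine run in eval mode for generation.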
Sam Ade Jacobs retweeted
Jaime Teevan (@jteevan)
Sometimes it takes an external push to really recognize you're in the middle of something big. Just seeing how many people I know and respect are on the first-ever #TIME100 AI list makes me feel like I'm a part of history. time.com/collection/tim…
8 replies · 2 retweets · 81 likes · 6.1K views
Sam Ade Jacobs retweeted
DeepSpeed (@DeepSpeedAI)
Want to train 1 million token context lengths (all 7 of the Harry Potter books!📚) on a GPT-like model w. 64 GPUs? Announcing DeepSpeed-Ulysses🚀 This release enables highly efficient and scalable LLM training with extremely long sequence lengths🤯 github.com/microsoft/Deep…
1 reply · 40 retweets · 142 likes · 15.7K views
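DeepSpeed-Ulysses makes the "1M tokens on 64 GPUs" claim work by splitting inputs along the sequence dimension, then using an all-to-all so each GPU attends over the full sequence for a subset of heads. A back-of-envelope sketch of that partitioning, with the head count assumed for illustration:

```python
# Back-of-envelope sketch of DeepSpeed-Ulysses partitioning: activations
# are split along the sequence dimension; an all-to-all re-partitions
# along the head dimension so each rank computes attention over the FULL
# sequence for a subset of heads. num_heads is an assumption (it must be
# divisible by the sequence-parallel group size).
seq_len   = 1_000_000   # total context length (per the tweet)
sp_size   = 64          # GPUs in the sequence-parallel group (per the tweet)
num_heads = 64          # assumed attention-head count

tokens_per_rank = seq_len // sp_size    # sequence shard each GPU holds
heads_per_rank  = num_heads // sp_size  # heads each GPU attends with

print(tokens_per_rank, heads_per_rank)  # 15625 1
```

Because attention for each head still sees the full sequence, the math is exact; only the communication pattern changes, which is what keeps the approach efficient at extreme sequence lengths.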
Sam Ade Jacobs retweeted
OpenAI (@OpenAI)
We trained an AI using process supervision (rewarding the thought process rather than the outcome) to achieve a new state of the art in mathematical reasoning. An encouraging sign for alignment of advanced AIs: openai.com/research/impro…
410 replies · 802 retweets · 4.5K likes · 1.8M views