dheevatsa

158 posts

@dheevatsa

AI systems at NVIDIA

California, USA Joined February 2009
454 Following 192 Followers
dheevatsa
dheevatsa@dheevatsa·
Muon is having its moment — Kimi K2, GLM 5, and now DeepSeek V4! More broadly, it feels like the time for advanced optimizers is finally here — reiterating that they are an important component for efficient training systems at scale! Our recent work: performant Muon/SOAP-class optimizers in NVIDIA NeMo/Megatron-Core — layer-wise distributed optimizer, TP-aware Newton-Schulz, SYRK kernels. Muon ≥ AdamW on GB300-NVL72. developer.nvidia.com/blog/advancing…
English
2
14
81
5.8K
dheevatsa
dheevatsa@dheevatsa·
@airindia Thanks for the prompt response; it's unfortunate that it came only after such an escalation.
English
1
0
0
33
Air India
Air India@airindia·
@dheevatsa Dear Sir, we've responded to your concern via DM. You may wish to check and acknowledge the same.
English
1
0
1
62
dheevatsa
dheevatsa@dheevatsa·
🧵 1/ @airindia has unilaterally downgraded my elderly parents from premium to economy on their confirmed SFO→BLR flight. No notice. No apology. No explanation. This was booked early with extra legroom specifically to accommodate their needs. Absolutely unacceptable!
English
2
0
0
170
dheevatsa
dheevatsa@dheevatsa·
@airindia Sent. I have also sent an email with all the details to appellateauthority@airindia.com. Please review and expedite the response/resolution!
English
1
0
0
41
Air India
Air India@airindia·
@dheevatsa Dear Sir, we hear you. Please help us with your parents' booking details (6-digit alphanumeric PNR / 13-digit e-ticket number starting with 098) via DM for us to check and assist you.
English
1
0
0
99
dheevatsa
dheevatsa@dheevatsa·
@airindia 4/ These are senior citizens, and this last-minute downgrade is not only irresponsible but inhumane. We chose Air India precisely to ease their travel. This is more than poor service. This is operational negligence with real human cost. @airindia - plz fix this!
English
0
0
0
81
dheevatsa
dheevatsa@dheevatsa·
3/ But @airindia refuses to issue the authorization to allow rebooking to those flights — even though that would honor the original ticket class. No help. No information. Just deflecting and canned responses.
English
1
0
0
87
dheevatsa
dheevatsa@dheevatsa·
Very excited to share the work that we've been doing to keep pushing the boundary for AI training by efficiently scaling across multiple data centers that are thousands of kilometers apart, using NeMo/M-Core. bit.ly/3StaYi9
English
0
0
0
73
arun
arun@xprunie·
@AravSrinivas love this -> would love a new discover page to have all the new perplexity specific niches you guys have built for example: perplexity news, finance, sports scores, etc. makes it much easier to click into than type and hope the finance platform pops up
English
2
1
32
4.5K
Aravind Srinivas
Aravind Srinivas@AravSrinivas·
perplexity dot ai / finance - Perplexity’s Finance Dashboard - stocks, earnings, daily market movements and summaries all in one place. Use it and go make some 💰everyday.
English
128
97
1.7K
144.4K
Sakana AI
Sakana AI@SakanaAILabs·
Introducing ASAL: Automating the Search for Artificial Life with Foundation Models sakana.ai/asal/ Artificial Life (ALife) research holds key insights that can transform and accelerate progress in AI. By speeding up ALife discovery with AI, we accelerate our understanding of emergence, evolution, and intelligence–core principles that can inspire the next generation of AI systems! We proudly collaborated with MIT, OpenAI, Swiss AI Lab IDSIA, and Ken Stanley on this exciting project. Full Paper (Website): pub.sakana.ai/asal/ Full Paper (arxiv): asal.sakana.ai/paper/ Code: github.com/SakanaAI/asal/ In this work, we propose a new algorithm called Automated Search for Artificial Life (“ASAL”) to automate the discovery of artificial life using vision-language foundation models. Instead of tediously hand-designing every tiny rule of an Alife simulation, simply describe the space of simulations to search over, and ASAL will automatically discover the most interesting and open-ended artificial lifeforms! Because of the generality of foundation models, ASAL can discover new lifeforms across a diverse range of seminal ALife simulations, including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. ASAL even discovered novel cellular automata rules that are more open-ended and expressive than the original Conway’s Game of Life. We believe this new paradigm may reignite ALife research by overcoming the bottleneck of manually designed simulations, thus advancing beyond the limits of human ingenuity.
English
75
630
2.8K
749.2K
dheevatsa
dheevatsa@dheevatsa·
Awesome to see the Distributed Shampoo optimizer top AlgoPerf! “28% faster training than baseline ... 19% faster than 2nd place” Kudos to the team's tenacity for persistently improving over many months, not only surpassing strong baselines but also making it practically viable!
MLCommons@MLCommons

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI

English
1
0
1
215
rohan anil
rohan anil@_arohan_·
Meta researchers just dropped PyTorch distributed shampoo🧴 a few days ago: arxiv.org/pdf/2309.06497… 💥 Train neural networks with a second order method for better performance. The underlying work it is based on has been a passion project for the last 5 years while swimming upstream with @GuptaVineetG - with no love from any conference chairs. Distributed Shampoo in PyTorch with solid results means that, as a co-author of the method, I trust the implementation! Lastly, given the effort they have put in, my guess is it is already in production (:
English
8
70
544
113.4K
dheevatsa
dheevatsa@dheevatsa·
Grand Teton - Meta’s next-gen compute platform for AI! It embodies a lot of exciting things that we’ve co-designed over the past couple of years, enabling us to push our AI workloads further and beyond #ai #ocpsummit22 #codesign lnkd.in/gFG9PTBF
English
0
1
1
0
Naveen Rao
Naveen Rao@NaveenGRao·
Would you rather invest $200m+ into a new computing arch and get maybe 2x perf, or <$5m and get 7x with better algos? We @MosaicML did just that! We released Mosaic ResNet and it achieves SOTA perf in just 27min on standard HW. You read that right👇mosaicml.com/blog/mosaic-re…
English
8
56
551
0
dheevatsa
dheevatsa@dheevatsa·
Very happy to share that our paper on “Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models” has been accepted for the industry track at ISCA this year :) It’s really great to be able to showcase this work, that…lnkd.in/g9e5FJsj
English
0
0
4
0
dheevatsa
dheevatsa@dheevatsa·
It was great to be able to present one of the exec talks at the OCP Global Summit last week along with Whitney Zhao, introducing our AI training cluster and talking about the general challenges/opportunities we are seeing with buildin…lnkd.in/gHgjrmEy lnkd.in/g6B4y9i9
English
0
0
1
0
dheevatsa reposted
Engineering at Meta
Engineering at Meta@Meta_Engineers·
💻@Meta’s Director of Engineering, Omar Baldonado, spoke on stage today at the 2021 OCP Global Summit, sharing the incredible work our Meta Infrastructure teams have done over the past 10 years through the Open Compute Project. Learn more about our work: bit.ly/3ojwSFk
English
5
4
18
0