dheevatsa

158 posts

@dheevatsa

AI systems at NVIDIA

California, USA · Joined February 2009

454 Following · 192 Followers

dheevatsa@dheevatsa·25 Apr

Muon is having its moment — Kimi K2, GLM 5, and now DeepSeek V4! More broadly, it feels like the time for advanced optimizers is finally here — reiterating that they are an important component for efficient training systems at scale! Our recent work: performant Muon/SOAP-class optimizers in NVIDIA NeMo/Megatron-Core — layer-wise distributed optimizer, TP-aware Newton-Schulz, SYRK kernels. Muon ≥ AdamW on GB300-NVL72. developer.nvidia.com/blog/advancing…

dheevatsa@dheevatsa·22 Aug

@airindia Thanks for the prompt response, unfortunate it came after such an escalation

Air India@airindia·22 Aug

@dheevatsa Dear Sir, we've responded to your concern via DM. You may wish to check and acknowledge the same.

dheevatsa@dheevatsa·22 Aug

🧵 1/ @airindia has unilaterally downgraded my elderly parents from premium to economy on their confirmed SFO→BLR flight. No notice. No apology. No explanation. This was booked early with extra legroom specifically to accommodate their needs. Absolutely unacceptable !

dheevatsa@dheevatsa·22 Aug

@airindia Sent, also have sent an email regarding this with all the details to appellateauthority@airindia.com. Please review and expedite the response/resolution !

Air India@airindia·22 Aug

@dheevatsa Dear Sir, we hear you. Please help us with your parent's booking details (6 digit alpha-numeric PNR / 13 digit e-ticket number starting with 098) via DM for us to check and assist you.

dheevatsa@dheevatsa·22 Aug

@airindia 4/ These are senior citizens — and this last-minute downgrade is not only irresponsible but inhumane. We chose Air India precisely to ease their travel. This is more than poor service. This is operational negligence with real human cost. @airindia - plz fix this !

dheevatsa@dheevatsa·22 Aug

3/ But @airindia refuses to issue the authorization to allow rebooking to those flights — even though that would honor the original ticket class. No help. No information. Just deflecting and canned responses.

dheevatsa@dheevatsa·9 May

Very excited to share the work that we've been doing on continuing to push the boundary for AI training by efficiently scaling across multiple data centers that are thousands of kilometers apart, using NeMo/M-Core. bit.ly/3StaYi9

dheevatsa@dheevatsa·13 Feb

@xprunie @AravSrinivas and also have all available as a tab in the app

arun@xprunie·12 Feb

@AravSrinivas love this -> would love a new discover page to have all the new perplexity specific niches you guys have built for example: perplexity news, finance, sports scores, etc. makes it much easier to click into than type and hope the finance platform pops up

Aravind Srinivas@AravSrinivas·12 Feb

perplexity dot ai / finance - Perplexity’s Finance Dashboard - stocks, earnings, daily market movements and summaries all in one place. Use it and go make some 💰everyday.

dheevatsa@dheevatsa·25 Dec

@BharatBKaul @SakanaAILabs @navikm emergent behavior ftw :)

Bharat Kaul@BharatBKaul·24 Dec

@SakanaAILabs @dheevatsa @navikm

Sakana AI@SakanaAILabs·24 Dec

Introducing ASAL: Automating the Search for Artificial Life with Foundation Models sakana.ai/asal/

Artificial Life (ALife) research holds key insights that can transform and accelerate progress in AI. By speeding up ALife discovery with AI, we accelerate our understanding of emergence, evolution, and intelligence: core principles that can inspire the next generation of AI systems!

We proudly collaborated with MIT, OpenAI, Swiss AI Lab IDSIA, and Ken Stanley on this exciting project.

Full Paper (Website): pub.sakana.ai/asal/
Full Paper (arXiv): asal.sakana.ai/paper/
Code: github.com/SakanaAI/asal/

In this work, we propose a new algorithm called Automated Search for Artificial Life (“ASAL”) to automate the discovery of artificial life using vision-language foundation models. Instead of tediously hand-designing every tiny rule of an ALife simulation, simply describe the space of simulations to search over, and ASAL will automatically discover the most interesting and open-ended artificial lifeforms!

Because of the generality of foundation models, ASAL can discover new lifeforms across a diverse range of seminal ALife simulations, including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. ASAL even discovered novel cellular automata rules that are more open-ended and expressive than the original Conway’s Game of Life.

We believe this new paradigm may reignite ALife research by overcoming the bottleneck of manually designed simulations, thus advancing beyond the limits of human ingenuity.

dheevatsa@dheevatsa·2 Aug

Awesome to see the Distributed Shampoo optimizer top AlgoPerf! “28% faster training than baseline ... 19% faster than 2nd place.” Kudos to the team's tenacity for persistently improving over many months, not only surpassing strong baselines but also making it practically viable!

MLCommons@MLCommons

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI

dheevatsa@dheevatsa·18 Sep

@_arohan_ for those who want to try it out - github.com/facebookresear… and @_arohan_ very astute guess on prod rollout ! :)

rohan anil@_arohan_·16 Sep

Meta researchers just dropped PyTorch distributed shampoo🧴few days ago: arxiv.org/pdf/2309.06497… 💥 Train neural networks with a second order method for better performance. This underlying work which it is based on has been a passion project for last 5 years while swimming upstream with @GuptaVineetG - with no love from any conferences chairs. Distributed Shampoo in Pytorch with solid results means as a co-author of the method trust the implementation! Lastly given the effort they have put it in, my guess is it is already in production (:

dheevatsa@dheevatsa·18 Oct

Grand Teton - Meta’s next-gen compute platform for AI ! Embodies a lot of exciting things that we’ve co-designed over past couple of years, enabling pushing our AI workloads further and beyond #ai #ocpsummit22 #codesign lnkd.in/gFG9PTBF

dheevatsa@dheevatsa·12 Jun

@hanlintang @_arohan_ @jefrankle @NaveenGRao @MosaicML @leavittron @hjmshi should be, we’ve been doing some experiments with ResNets

Hanlin Tang@hanlintang·11 Jun

@dheevatsa @_arohan_ @jefrankle @NaveenGRao @MosaicML @leavittron @hjmshi Thanks @dheevatsa is it stable enough to maybe try ResNet-50 runs with?

Naveen Rao@NaveenGRao·9 Jun

Would you rather invest $200m+ into a new computing arch and get maybe 2x perf, or <$5m and get 7x with better algos? We @MosaicML did just that! We released Mosaic ResNet and it achieves SOTA perf in just 27min on standard HW. You read that right👇mosaicml.com/blog/mosaic-re…

dheevatsa@dheevatsa·11 Jun

@hanlintang @_arohan_ @jefrankle @NaveenGRao @MosaicML @leavittron we are :) (@hjmshi with Mike Rabbat and others) - github.com/facebookresear… , feel free to try it out ! [disclaimer: still experimental and wip]

Hanlin Tang@hanlintang·9 Jun

@_arohan_ @jefrankle @NaveenGRao @MosaicML @leavittron Oh boy would love to get DistributedShampoo into PyTorch so we can test combinations... is anyone working on that?

dheevatsa@dheevatsa·19 Mar

Very happy to share that our paper on “Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models” has been accepted for the industry track at ISCA this year :) It’s really great to be able to showcase this work, that…lnkd.in/g9e5FJsj

dheevatsa retweeted

Christy Pambianchi@ChristyP_CPO·4 Feb

Despite being smuggled out of his hometown as a teen, @intel’s @BharatBKaul says the “universe has been kind” to him. Inspiring. #IAmIntel intel.ly/3ANXJ29

dheevatsa@dheevatsa·22 Nov

It was great to be able to present one of the exec talks at the OCP Global Summit last week along with Whitney Zhao, introducing our AI training cluster and talking about the general challenges/opportunities we are seeing with buildin…lnkd.in/gHgjrmEy lnkd.in/g6B4y9i9

dheevatsa retweeted

Engineering at Meta@Meta_Engineers·9 Nov

💻@Meta’s Director of Engineering, Omar Baldonado, spoke on stage today at the 2021 OCP Global Summit, sharing the incredible work our Meta Infrastructure teams have done over the past 10 years through the Open Compute Project. Learn more about our work: bit.ly/3ojwSFk

dheevatsa@dheevatsa·7 Jun

I’m looking forward to speaking at the AI Hardware summit this year! Perfect venue to talk about all of the interesting work on Co-designing AI HW/SW at scale at FB :) AI HW Summit 2021 - hubs.ly/H0NQMV40 #AIHWSummit #codesign lnkd.in/dzaXzz8
