Aarush Sah

2.6K posts


@aarush

Superintelligence @Meta. prev @NVIDIA LPU, @GroqInc

Menlo Park, CA · Joined September 2022
705 Following · 8.1K Followers
adi @adonis_singh
muse-spark vs llama-4-maverick meta is so back it's unreal
5 replies · 0 reposts · 87 likes · 4.1K views
Aarush Sah @aarush
Very excited for you all to see Muse Spark! It’s a capable model that excels in multimodal and agentic tool use tasks. Spark is the first (of many) steps towards Personal Superintelligence, and I look forward to seeing how our models evolve as we get closer to that goal :)
Alexandr Wang @alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

1 reply · 0 reposts · 19 likes · 926 views
Kath Korevec @simpsoka
Can’t wait to join the team at @openai building codex. Would love to hear what you love about it or want changed. We’re moving fast. DMs open.
279 replies · 24 reposts · 1.4K likes · 286.8K views
anirudh bv @anirudhbv_ce
I implemented @GoogleResearch's TurboQuant as a CUDA-native compression engine on Blackwell B200. 5x KV cache compression on Qwen 2.5-1.5B, near-lossless attention scores, generating live from compressed memory. 5 custom cuTile CUDA kernels ft:
- fused attention (with QJL corrections)
- online softmax
- on-chip cache decompression
- pipelined TMA loads
Try it out: devtechjr.github.io/turboquant_cut…
s/o @blelbach and the cuTile team at @nvidia for lending me Blackwell GPU access :) cc @sundeep @GavinSherry
142 replies · 305 reposts · 3.3K likes · 776K views
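One of the kernels listed above, online softmax, computes a numerically stable softmax in a single streaming pass: it tracks a running maximum and rescales the running sum whenever a new maximum appears. A minimal NumPy sketch of the idea (the function name and the fact that it materializes the full output at the end are illustrative, not the author's CUDA kernel):

```python
import numpy as np

def online_softmax(scores):
    """Softmax over `scores` computed in one streaming pass.

    Keeps a running max and a running normalizer; when a new max
    appears, the accumulated sum is rescaled so no overflow occurs
    and the full score vector never needs a second max-finding pass.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for x in scores:
        new_max = max(running_max, x)
        # Rescale the sum accumulated under the old max to the new max.
        running_sum = running_sum * np.exp(running_max - new_max) \
            + np.exp(x - new_max)
        running_max = new_max
    return np.exp(np.asarray(scores) - running_max) / running_sum
```

In a fused attention kernel the same recurrence is applied per query row, with the rescaling folded into the accumulated output so the score vector is never stored.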
Angelina Lue @angelina_lue
Hey twitter/x, one of my goals this year is to share more things that excite me with the world. I'm starting here so let me introduce myself:

My name is Angelina, I'm 22, and I currently live in SF! For the past six months, I've been working at Meta Superintelligence Labs on model training infra and data strategy👩🏻‍💻

Before that I was at UCLA studying CS and Econ and spent a lot of my time in college building in fintech and investing in early stage companies (General Catalyst Venture Fellows, NEA, Mantis VC).

I love food, traveling to new places, a good story, snowboarding, and hosting dinners and game nights🕺🏻 I also love meeting new people, feel free to say hi :)
103 replies · 5 reposts · 575 likes · 196.7K views
Aarush Sah @aarush
The scale of a frontier lab's operations is humbling to experience firsthand
2 replies · 0 reposts · 36 likes · 5.6K views
Charlie Marsh @charliermarsh
We've entered into an agreement to join OpenAI as part of the Codex team. I'm incredibly proud of the work we've done so far, incredibly grateful to everyone that's supported us, and incredibly excited to keep building tools that make programming feel different.
285 replies · 142 reposts · 3.1K likes · 479.5K views
Aarush Sah @aarush
Sometimes it’s easy to forget that LLMs are marvels of engineering. Like what do you mean we have a machine that can actually understand the meaning behind a bunch of characters, the same way a human can?!
0 replies · 0 reposts · 16 likes · 1.2K views
Gavin @GavinSherry
Congratulations and thanks to everyone who participated in the @GroqInc journey and brought us to this incredible moment at @nvidia GTC
5 replies · 17 reposts · 178 likes · 13.8K views
Aarush Sah reposted
Jonathan Ross @JonathanRoss321
LPU in Ian Buck's hand, sitting in the audience at GTC
9 replies · 21 reposts · 323 likes · 17.4K views
Florian Brand @xeophon
how those groq lpus look at me
2 replies · 0 reposts · 38 likes · 1.6K views
Jai Relan @JaiRelan
@aarush They’re also so deep into back orders that the next lot you can get is in 4 weeks ://
1 reply · 0 reposts · 0 likes · 85 views
Aarush Sah @aarush
Went to an Apple store yesterday and they're still sold out of mac minis 💀
3 replies · 0 reposts · 8 likes · 1.3K views
Aarush Sah reposted
Jonathan Ross @JonathanRoss321
GPU ♥ LPU: Everything You Wanted to Know

I'm joining David Senra (@FoundersPodcast) at @NVIDIAGTC for a conversation about the reality of modern inference. This is your opportunity to learn why Nvidia and Groq partnered, and what it means for the future of inference.
17 replies · 42 reposts · 447 likes · 67.8K views
JingyuanLiu @JingyuanLiu123
Some updates: I've always been bullish on TML, and I actually joined TML this Monday.

Looking back, I feel so lucky that I had the privilege to work closely with the best optimization experts on the Muon optimizer (@Jianlin_S from Kimi and @clu_cheng from Meta). Now I am so excited to be able to work with @jxbz and build new cool things!

(On the other hand, there have always been some bad rumors about Meta TBD's potential failure. That's not true! From my personal experience, it really has the best talent in the field, and I really enjoyed learning from the lab. The avocado model will for sure be great!)
JingyuanLiu @JingyuanLiu123

hmm I sort of disagree and I am bullish for TML. I think they really have the top talent that I admire in the field, e.g. Jeremy and Sam for optimization, Songlin for Attn, Lia for MoE, Andrew for FSDPv2, and a bunch more folks. It's just natural that it takes a while to publish good models:
- dpsk started publishing papers in 2023, even published dpskv2 (which I think is already amazing) in mid 2024, and nobody cared until dpskv3 and r1
- msh took 10+ months to deliver a first not-bad long-ctx model in 2023, was silent for the whole of 2024, and started to catch up gradually in 2025
- qwen didn't become a much better model than llama until qwen2.5, mid or late 2024, while the lab had been there forever
It takes time to get infra and data done, but as long as you have good folks and principled ways of doing science and experiments, sooner or later scaling laws will pay back.

41 replies · 8 reposts · 273 likes · 54K views
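The Muon optimizer discussed above replaces the raw momentum matrix with an approximately orthogonalized version of itself before applying it as a weight update, usually via a quintic Newton-Schulz iteration. A minimal NumPy sketch of that orthogonalization step, assuming the coefficients match commonly published Muon reference values (the function name, step count, and epsilon here are illustrative):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximately map G to the orthogonal matrix with the same
    singular vectors (i.e., U V^T from its SVD), without computing
    an SVD, using a quintic Newton-Schulz iteration.
    """
    a, b, c = 3.4445, -4.7750, 2.0315  # quintic coefficients
    # Scale by the Frobenius norm so all singular values start in (0, 1].
    X = G / (np.linalg.norm(G) + 1e-7)
    transposed = X.shape[0] > X.shape[1]
    if transposed:  # iterate on the wide orientation for a smaller Gram matrix
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        # X <- a*X + (b*A + c*A^2) X drives singular values toward 1.
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X
```

The iteration only converges to rough orthogonality (singular values land in a loose band around 1), which is reportedly sufficient for the optimizer's purposes and much cheaper than an exact SVD per step.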