Aarush Sah

2.6K posts

@aarush

Superintelligence @Meta. prev @NVIDIA LPU, @GroqInc

Menlo Park, CA · Joined September 2022

705 Following · 8.1K Followers

Aarush Sah@aarush·3h

@Chad_calcagno @AnthropicAI 😂😂

Chad Calcagno@Chad_calcagno·1d

@aarush @AnthropicAI I just wanna know where you rank on the token leaderboard

Aarush Sah@aarush·1d

Wow. Claude Mythos represents a 20% jump on SWE-bench Pro over the next best model, GPT-5.4. Congrats to the @AnthropicAI team!

Anthropic@AnthropicAI

The Claude Mythos Preview system card is available here: anthropic.com/claude-mythos-…

832

Aarush Sah@aarush·3h

@Nottlespike @AnthropicAI Interesting. Haven’t read the technical report in full yet, will have to do some poking around

Kearm h/eng@Nottlespike·1d

@aarush @AnthropicAI Given your benchmark work isn't this concerning?

Aarush Sah@aarush·8h

@adonis_singh 🔥

319

adi@adonis_singh·8h

muse-spark vs llama-4-maverick meta is so back its unreal

3.8K

Aarush Sah@aarush·16h

Very excited for you all to see Muse Spark! It’s a capable model that excels in multimodal and agentic tool use tasks. Spark is the first (of many) steps towards Personal Superintelligence, and I look forward to seeing how our models evolve as we get closer to that goal :)

Alexandr Wang@alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

922

Aarush Sah@aarush·1d

I would argue that the capabilities of Claude Mythos are much more likely to cause immediate harm if widely released without appropriate safeguards. And even if not, it’s much harder to put the genie back into the bottle rather than simply release the genie later

Julien Chaumond@julien_c

“gpt2-large is too powerful to be publicly released” vibes

578

Aarush Sah@aarush·4d

@simpsoka @OpenAI Congratulations!

Kath Korevec@simpsoka·5d

Can’t wait to join the team at @openai building codex. Would love to hear what you love about it or want changed. We’re moving fast. DMs open.

279

1.4K

286.7K

Aarush Sah@aarush·5d

@anirudhbv_ce @GoogleResearch Nice work!

1.4K

anirudh bv@anirudhbv_ce·6d

I implemented @GoogleResearch's TurboQuant as a CUDA-native compression engine on Blackwell B200. 5x KV cache compression on Qwen 2.5-1.5B, near-lossless attention scores, generating live from compressed memory.
5 custom cuTile CUDA kernels ft:
- fused attention (with QJL corrections)
- online softmax
- on-chip cache decompression
- pipelined TMA loads
Try it out: devtechjr.github.io/turboquant_cut…
s/o @blelbach and the cuTile team at @nvidia for lending me Blackwell GPU access :) cc @sundeep @GavinSherry

142

305

3.3K

776K

Aarush Sah@aarush·27 Mar

@angelina_lue Welcome!

2.5K

Angelina Lue@angelina_lue·26 Mar

Hey twitter/x, one of my goals this year is to share more things that excite me with the world. I’m starting here so let me introduce myself: My name is Angelina, I’m 22, and I currently live in SF! For the past six months, I’ve been working at Meta Superintelligence Labs on model training infra and data strategy👩🏻‍💻 Before that I was at UCLA studying CS and Econ and spent a lot of my time in college building in fintech and investing in early stage companies (General Catalyst Venture Fellows, NEA, Mantis VC). I love food, traveling to new places, a good story, snowboarding, and hosting dinners and game nights🕺🏻 I also love meeting new people, feel free to say hi :)

103

576

196.7K

Aarush Sah@aarush·23 Mar

The scale of a frontier lab's operations is humbling to experience firsthand

5.6K

Aarush Sah@aarush·20 Mar

@charliermarsh Huge congrats!

Charlie Marsh@charliermarsh·19 Mar

We've entered into an agreement to join OpenAI as part of the Codex team. I'm incredibly proud of the work we've done so far, incredibly grateful to everyone that's supported us, and incredibly excited to keep building tools that make programming feel different.

285

142

3.1K

479.5K

Aarush Sah@aarush·19 Mar

Sometimes it’s easy to forget that LLMs are marvels of engineering. Like what do you mean we have a machine that can actually understand the meaning behind a bunch of characters, the same way a human can?!

1.2K

Aarush Sah@aarush·16 Mar

Amazing work from @JonathanRoss321, @sundeep, @GavinSherry and the team. Very excited to see this work come to fruition 💚🤝🧡

Ryan Shrout@ryanshrout

On the right: Vera Rubin. Middle: NVLink 6th Gen. Left: The brand new Groq system.

2.4K

Aarush Sah@aarush·16 Mar

@GavinSherry @GroqInc @nvidia ❤️

281

Gavin@GavinSherry·16 Mar

Congratulations and thanks to everyone who participated in the @GroqInc journey and brought us to this incredible moment at @nvidia GTC

178

13.8K

Aarush Sah retweeted

Jonathan Ross@JonathanRoss321·16 Mar

LPU in Ian Buck's hand, sitting in the audience at GTC

323

17.4K

Aarush Sah@aarush·16 Mar

@xeophon 👀

148

Florian Brand@xeophon·16 Mar

how those groq lpus look at me

1.6K

Aarush Sah@aarush·16 Mar

@JaiRelan Oh man 💀

Jai Relan@JaiRelan·16 Mar

@aarush They’re also so deep into back orders that the next lot you can get is in 4 weeks ://

Aarush Sah@aarush·16 Mar

Went to an Apple store yesterday and they're still sold out of mac minis 💀

1.3K

Aarush Sah retweeted

Jonathan Ross@JonathanRoss321·13 Mar

GPU ♥ LPU: Everything You Wanted to Know. I’m joining David Senra (@FoundersPodcast) at @NVIDIAGTC for a conversation about the reality of modern inference. This is your opportunity to learn why Nvidia and Groq partnered together, and what it means for the future of inference.

447

67.8K

Aarush Sah@aarush·6 Mar

@JingyuanLiu123 @Jianlin_S @clu_cheng Congratulations, and best of luck at Thinking Machines! 🫡

153

JingyuanLiu@JingyuanLiu123·6 Mar

Some updates: I've always been bullish on TML, and I actually joined TML this Monday. Looking back, I feel so lucky to have had the privilege of working closely with the best optimization experts on the Muon optimizer (@Jianlin_S from Kimi and @clu_cheng from Meta). Now I am so excited to be able to work with @jxbz and build new cool things! (On the other hand, there have always been some bad rumors about Meta TBD's potential failure. That's not true! From my personal experience, it really has the best talent in the field, and I really enjoyed learning from the lab. The avocado model will for sure be great!)

JingyuanLiu@JingyuanLiu123

hmm I sort of disagree and I am bullish for TML. I think they really really have the top talents that I admire in the field, e.g. Jeremy and Sam for optimization, Songlin for Attn, Lia for MoE, Andrew for FSDPv2, and a bunch more folks.
it's just natural that it takes a while to publish good models:
- dpsk starts to publish papers in 2023, even published dpskv2 (which I think is already amazing) in mid 2024 and nobody cares, until dpskv3 and r1
- msh took 10+ months to deliver a first not bad long ctx model in 2023 and be silent for the whole 2024 year, and starts to catch up gradually in 2025
- qwen starts to be a much better model than llama until qwen2.5, mid or late 2024, while the lab has been there forever
it takes time to get infra and data done, but as long as you have good folks, and principled ways of doing science and experiments, sooner or later, scaling laws will pay back

273

54K
