Abhi Venigalla

941 posts

@ml_hardware

Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.

San Francisco, CA · Joined October 2018

1.5K Following · 8.1K Followers

Abhi Venigalla retweeted

Cerebras@cerebras·13 Mar

x.com/i/article/2032…

146

1.2K

270.7K

Abhi Venigalla retweeted

Tanishq Kumar@tanishqkumar07·4 Mar

I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.

135

456

4.1K

608.6K

Abhi Venigalla retweeted

Davis Blalock@davisblalock·4 Mar

🚀 Today we’re releasing FlashOptim: better implementations of Adam, SGD, etc, that compute the same updates but save tons of memory. You can use it right now via `pip install flashoptim`. 🚀 arxiv.org/abs/2602.23349 A bunch of cool ideas make this possible: [1/n]

228

1.6K

216.6K

Abhi Venigalla retweeted

EE Times | Electronic Engineering Times@eetimes·19 Feb

Taalas Specializes to Extremes for Extraordinary Token Speed eetimes.com/taalas-special…

2.8K

Abhi Venigalla retweeted

ken@aquariusacquah·17 Feb

2 weeks ago, we rebuilt our entire product. "Browser automation" fell short of our mission to eliminate all repetitive knowledge work. The new Kaizen is the ultimate digital employee: always on, extremely capable, continually learning. Sign up for access in the tweet below.

7.1K

Abhi Venigalla retweeted

SemiAnalysis@SemiAnalysis_·16 Feb

InferenceX v2: NVIDIA Blackwell Vs AMD vs Hopper - Formerly InferenceMAX, GB300 NVL72, MI355X, B200, H100, Disaggregated Serving, Wide Expert Parallelism, Large Mixture of Experts, SGLang, vLLM, TRTLLM semianalysis.substack.com/p/inferencex-v…

243

230.1K

Abhi Venigalla retweeted

Cerebras@cerebras·12 Feb

OpenAI Codex-Spark, powered by Cerebras. You can now just build things faster, at 1,000 tokens/s.

141

286.7K

Abhi Venigalla retweeted

Ishani Thakur@ishanit5·12 Feb

sim the people, sim the world, join simile. they kick ass. @joon_s_pk @msbernst @percyliang @ElainaYallen @mihikapoor

Simile@simile_ai

x.com/i/article/2021…

13.9K

Abhi Venigalla retweeted

Rohan Kodialam@KodialamRo·5 Feb

The world’s most powerful data agent releases today. Sphinx 1.0 is here to power elite data teams.

177

1.9K

1.7M

Abhi Venigalla retweeted

Thomas Sohmers@trsohmers·4 Feb

Excited to announce today that my startup, @positron_ai, has closed a $230M Series B financing round at an over $1B valuation, co-led by great folks at @jumptrading, Arena, Unless Ventures, and strategic backing by @Arm! bloomberg.com/news/articles/…

191

51K

Abhi Venigalla retweeted

Cerebras@cerebras·14 Jan

OpenAI🤝Cerebras openai.com/index/cerebras…

179

346

2.9K

1.6M

Abhi Venigalla@ml_hardware·26 Nov

@Arunabh85992578 @firstadopter Your GB200 flop/s are 2x higher than they should be. You're quoting the sparse flop/s number not dense

tae kim@firstadopter·25 Nov

After ten years, can someone give me an estimate of TPU 2025 revenue for external customers? Bueller?

13.4K

Abhi Venigalla retweeted

Cody Blakeney@code_star·16 Nov

I've got something new for everyone. My first substack article! Not the one I planned to do first, but a fun one! I have made a handy calculator based on the DeepSeek v1 coefficients for finding optimal LR and batch sizes for dense LLMs.

169

40.8K

Abhi Venigalla retweeted

Horace He@cHHillee·10 Sep

Apologies that I haven't written anything since joining Thinking Machines but I hope this blog post on a topic very near and dear to my heart (reproducible floating point numerics in LLM inference) will make up for it!

Thinking Machines@thinkymachines

Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference”. We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to prompt engineering. Here we share what we are working on and connect with the research community frequently and openly. The name Connectionism is a throwback to an earlier era of AI; it was the name of the subfield in the 1980s that studied neural networks and their similarity to biological brains. thinkingmachines.ai/blog/defeating…

197

2.9K

532.4K

Abhi Venigalla retweeted

Sphinx AI@getsphinx·9 Sep

🚀 Thrilled to announce our $9.5M funding round led by @buckymoore at @lightspeedvp, alongside an incredible group of investors from the Valley and New York. ✨ With this announcement, we’re also moving Sphinx Copilot -- the state-of-the-art AI agent for data science -- out of closed beta. It’s now available at sphinx.ai (with a generous free tier!). Our early partners have gone from raw data ➝ commercial insights in minutes instead of days. We can’t wait to see what the data community builds with Sphinx. 🌱 This is just the beginning for Sphinx. We’re redefining how AI works with data, from copilots to fully autonomous researchers and analysts. We're excited to keep building best-in-class machine intelligence for a new generation of data-driven innovation. sphinx.ai/blog/sphinx-la…

32.6K

Abhi Venigalla retweeted

typedfemale@typedfemale·22 Aug

alexander wang with yann lecun

1.9K

129.2K

Abhi Venigalla retweeted

Sasha Doubov@sashadoubov·23 Jul

memory-bound gf, compute-bound bf

113

11.7K

Abhi Venigalla retweeted

typedfemale@typedfemale·17 Jul

presenting: big jeff's trainium hell

114

566

4.7K

670.5K

Abhi Venigalla retweeted

Cerebras@cerebras·28 May

Cerebras just beat NVIDIA Blackwell. Last week: Blackwell hit 1,000 t/s on Llama 4. Today: Cerebras hit 2,500 t/s on the same model, same benchmarks by @ArtificialAnlys. Blackwell smoked Groq, AMD, Google – everyone. Only Cerebras stands – and we smoked Blackwell.

486

143.5K

Abhi Venigalla@ml_hardware·14 May

@chase1440 @code_star @mvpatel2000 Good times :)

130

Chase Holmes 🇺🇸@chase1440·14 May

@code_star @mvpatel2000 @ml_hardware tired: spec sheets. wired: MFU mic-drop, aka @ml_hardware ripping a live training job on 1024 H100s and screensharing the evals with a skeptical customer

511

Mihir Patel@mvpatel2000·13 May

A great way to tell if an org has good ML eng is by backing out their MFU and checking if it's actually good when they brag about their training stack. Super useful to know 1) all the numbers (memorize hardware stats!) and 2) how to drive the math

Horace He@cHHillee

The fundamental question here (computing MFU) is a very reasonable question to ask in an interview (and I'd recommend learning it if you don't know how). However, the real interview question I would like to ask is this: "I see 3 assumptions in this question that range from somewhat misleading to kinda unusual to flat out wrong. What are they?"

9.8K
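
For readers unfamiliar with the MFU (model FLOPs utilization) math mentioned in the posts above, here is a minimal sketch that is not taken from any of the posts. It assumes the common approximation of about 6 FLOPs per parameter per token for dense transformer training, which ignores attention FLOPs. The function name `mfu` and every number in the example are hypothetical placeholders. It also illustrates the dense-versus-sparse point from the GB200 reply: dividing by a sparse peak that is 2x the dense peak halves the apparent MFU.

```python
def mfu(num_params, tokens_per_sec, num_gpus, peak_flops_per_gpu):
    """Approximate model FLOPs utilization for dense transformer training.

    Assumes ~6 FLOPs per parameter per token (forward + backward pass),
    which ignores attention FLOPs. This is an approximation, not an exact count.
    """
    achieved_flops_per_sec = 6 * num_params * tokens_per_sec
    peak_flops_per_sec = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_sec / peak_flops_per_sec


# Hypothetical example: a 7B-parameter model on 1024 GPUs.
# Throughput and peak figures are placeholders; take the dense peak for
# your training precision from the vendor's spec sheet.
dense_peak = 1.0e15            # placeholder dense peak FLOP/s per GPU
sparse_peak = 2 * dense_peak   # assumed: sparse peak quoted at 2x dense

tokens_per_sec = 10_000_000    # placeholder cluster-wide throughput
mfu_dense = mfu(7e9, tokens_per_sec, 1024, dense_peak)
mfu_sparse = mfu(7e9, tokens_per_sec, 1024, sparse_peak)

print(f"MFU vs dense peak:  {mfu_dense:.1%}")
print(f"MFU vs sparse peak: {mfu_sparse:.1%}")
```

The same achieved throughput looks half as efficient against the sparse figure, so it matters which peak number goes in the denominator.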
