Eric Quinnell

582 posts

@divBy_zero

Founding Chip Architect, Stealth Startup. Fmr AWS Trainium, Tesla Dojo, CPUs (x86+ARM). PhD Computer Arithmetic

Joined December 2022

391 Following · 2.4K Followers

Eric Quinnell@divBy_zero·1d

@opinali @corsix Definitely not. It’s all hard. Software, hardware, all of it

Osvaldo Pinali Doederlein@opinali·2d

@divBy_zero @corsix Lol so what everyone is teaching me is that there's no hope for a much saner world if I switch career to silicon design.

Osvaldo Pinali Doederlein@opinali·3d

As a SW guy what puzzles me most in chip design is that it's a total monopoly of one programming language: Verilog. Yeah there's some higher-level stuff, some variants, but all ultimately translated to Verilog. Where's the fun without a hundred different PLs to confuse you

Eric Quinnell@divBy_zero·2d

@corsix @opinali This. Also all the different EDA tools only support different random subsections of Verilog. Part of the fun is finding the sub-sub-set that actually works in them all without losing chip intent

Pete Cawley@corsix·2d

@opinali Verilog already comes with the confusion level of a hundred different PLs

Eric Quinnell@divBy_zero·15 May

@Leik0w0 It’s usually better for hardware systolic arrays. Hw doesn’t execute matmuls in the same physical directions as paper and pencil matrices - the row and col weights come from the same physical place, so it has to be row+row or col+col. Someone is doing the xpose somewhere

Léo@Leik0w0·15 May

WHO DECIDED WE WOULD USE COL MAJOR BY DEFAULT ????
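The layout point in this exchange can be illustrated with a small sketch. It is not taken from the thread: the helper names (`flatten_row_major`, `flatten_col_major`, `transpose`) are hypothetical, and it only shows the general fact that a column-major buffer of a matrix is byte-for-byte the row-major buffer of its transpose, so picking a layout amounts to deciding where the transpose happens.

```python
def flatten_row_major(m):
    # Walk each row left to right, rows top to bottom.
    return [x for row in m for x in row]


def flatten_col_major(m):
    # Walk each column top to bottom, columns left to right.
    rows, cols = len(m), len(m[0])
    return [m[r][c] for c in range(cols) for r in range(rows)]


def transpose(m):
    # Swap rows and columns.
    return [list(col) for col in zip(*m)]


A = [[1, 2, 3],
     [4, 5, 6]]

col_major_A = flatten_col_major(A)
row_major_At = flatten_row_major(transpose(A))

# The two buffers are identical: storing A column-major is the same
# as storing its transpose row-major.
print(col_major_A == row_major_At)
```

Under these assumptions, a kernel that wants both operands streamed in the same direction can read one of them in the "other" layout instead of physically transposing it.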

Eric Quinnell@divBy_zero·12 May

@insane_analyst He who controls the SPICE controls the universe

Irrational Analysis@insane_analyst·11 May

It's been over an hour and I still can't import external SPICE models correctly. WTF am I doing wrong I made a spice directive like the tutorials said.

Eric Quinnell@divBy_zero·8 May

To be fair, at low enough FP quantization it does indeed return to associative behavior. Bc it’s INT

SemiAnalysis@SemiAnalysis_

Floating point math is not associative! And many of the highest performance kernels split the workload among SMs and accumulate partial results in a nondeterministic order. Many AI labs just accept this, or pay a huge performance penalty for determinism. DeepSeek decided to do neither. (1/4) 🧵
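A minimal sketch of the claim in these two posts, under assumptions of my own: the three sample values and the scale factor of 10 are illustrative only, and the helpers `accumulate` and `distinct_sums` are hypothetical names. It sums the same values in every possible order, once as floats and once after quantizing them to integers.

```python
from itertools import permutations


def accumulate(values, order):
    # Add the values one at a time in the given index order.
    total = 0
    for i in order:
        total += values[i]
    return total


def distinct_sums(values):
    # Collect the set of results over every accumulation order.
    orders = permutations(range(len(values)))
    return {accumulate(values, order) for order in orders}


floats = [0.1, 0.2, 0.3]
# Illustrative quantization: scale by 10 and round to integers.
ints = [round(v * 10) for v in floats]  # [1, 2, 3]

float_results = distinct_sums(floats)
int_results = distinct_sums(ints)

# Floats: more than one distinct result, depending on order.
print(len(float_results))
# Integers: a single result regardless of order.
print(len(int_results))
```

Floating-point addition rounds after every step, so the order of partial sums can change the final bits. Integer addition (absent overflow) is exact, which is why accumulation over integer-quantized values is order-independent.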

Eric Quinnell@divBy_zero·20 Apr

@itsclivetime Still true

Eric Quinnell retweeted

Clive Chan@itsclivetime·19 Apr

i believe it was @divBy_zero that wisely said that, contrary to common sense, it is always easier to fix performance problems in silicon than change an entrenched sw stack

Clive Chan@itsclivetime·19 Apr

why isn't there a startup doing a RISC-V extension for the Python Virtual Machine? put a bunch of PyCores on a chip and you win the whole "Agentic CPU" market

Eric Quinnell@divBy_zero·17 Apr

@yacineMTB Confirmed anecdotal data point. (Mid 40s, but close enough. This vibe coding stuff is legit)

kache@yacineMTB·17 Apr

Old computer professionals, people in their 50s, 60s, that started with assembly, are about to become weapons of mass destruction as they discover what they can do with codex-level tools

Eric Quinnell@divBy_zero·15 Apr

Callout to the many leads and engineers who worked this over the years, esp @rawat_ritvik @aaronsrogers and Pete. This will be a massive (and needed) upgrade to all cars and bots.

Eric Quinnell@divBy_zero·15 Apr

Congrats AI5 team, I know it was a rocky road

Elon Musk@elonmusk

Congrats to the @Tesla_AI chip design team on taping out AI5! AI6, Dojo3 & other exciting chips in work.

Eric Quinnell retweeted

NASA@NASA·7 Apr

Hello, Moon. It’s great to be back. Here’s a taste of what the Artemis II astronauts photographed during their flight around the Moon. Check out more photos from the mission: nasa.gov/artemis-ii-mul…

Eric Quinnell@divBy_zero·27 Mar

@LottoLabs @ptremblay Pedantically you are correct, yes. Way less loss than current quantization, and that detail would derail non-technical folks even further, so I didn’t split the hairs

Lotto@LottoLabs·27 Mar

@ptremblay @divBy_zero Fair, it's no loss in accuracy, but not lossless in the traditional sense of recovering bit for bit

Lotto@LottoLabs·27 Mar

@divBy_zero Lossless is the kicker here

Eric Quinnell@divBy_zero·27 Mar

@CliffLattner Yes, exactly. If doing inference, the compute will hide under the dram loads, even at high batch. For training, it is extra compute to pay.

Cliff Lattner@CliffLattner·27 Mar

@divBy_zero IMO it's a no-brainer that if you are willing to spend more cycles on quantization/dequantization, and forgo the savings you get from computing on the quantized data, you can do better than ordinary abs-max. I doubt that it is anywhere close to 8x over, say, fp8.
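For readers unfamiliar with the "ordinary abs-max" baseline mentioned here, a rough sketch follows. It is an assumption-laden illustration, not the scheme discussed in the thread: the symmetric signed 4-bit range, the sample values, and the function names `absmax_quantize` and `dequantize` are all hypothetical choices.

```python
def absmax_quantize(values, bits=4):
    # Largest magnitude representable in a symmetric signed range,
    # e.g. 7 for 4 bits (assumed range: -7..7).
    qmax = 2 ** (bits - 1) - 1
    amax = max(abs(v) for v in values)
    # One scale for the whole block, set by its largest magnitude.
    scale = amax / qmax if amax else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    # Map integer codes back to approximate real values.
    return [q * scale for q in quantized]


values = [0.5, -1.75, 0.25, 3.5]
codes, scale = absmax_quantize(values, bits=4)
restored = dequantize(codes, scale)

# Worst-case reconstruction error is bounded by half a quantization step.
max_error = max(abs(v - r) for v, r in zip(values, restored))
print(codes, scale, max_error)
```

Because a single outlier sets the scale for the whole block, small values lose most of their resolution. That is the weakness a more expensive quantize/dequantize step can trade extra cycles to avoid.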

Eric Quinnell@divBy_zero·27 Mar

Two days of weird takes, so I must:
“8x perf” is 32-bit baseline vs 4-bit compressed.
“KV cache” is merely a use case and hard to capitalize full perf. And yes, many already compress here.
But it’s lossless. The others aren’t. We should have been using this all along.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

Eric Quinnell@divBy_zero·26 Mar

My flight lands, I exit the plane, look for the Baggage Claim sign, and then there it was: A 448Gbps PAM4 Keysight waveform analyzer advertisement, on an LED backlight mega screen. Thank you SJC

Eric Quinnell@divBy_zero·25 Mar

@GoogleResearch Classic compressor, seems obvious in retrospect, we should have all done this earlier. Well done Google Research, truly

Google Research@GoogleResearch·24 Mar

GIF

Eric Quinnell@divBy_zero·25 Mar

@rpoo @GoogleResearch Hahaha, this

Ross@rpoo·25 Mar

@GoogleResearch 🫡 for helping bring down ram prices

Eric Quinnell retweeted

kache@yacineMTB·23 Mar

ahahahahahahahaahhaahahahahahaha okay you guys were right this hardware shit is hard.

Eric Quinnell@divBy_zero·23 Feb

Levels should end asking you “is it dark outside? No looking at your watch”, after cabling and manually PXE booting for 16 hours in a row

P.M@p_misirov

there is a game called "data center" on steam which lets you build and manage your own data center. this is lowkey genius, the best way to educate people on a new trait. hyperscalers should learn a thing or two from "edutainment".
