Trevor McCourt

1.3K posts


@trevormccrt1

"Life is a beautiful, magnificent thing, even to a jellyfish"

Cambridge, MA · Joined March 2020
268 Following · 13.9K Followers
Trevor McCourt retweeted
Extropic @extropic ·
If you are attending @APSphysics March meeting, come learn more about thermodynamic computing! Our work on taming non-equilibrium thermal electron fluctuations in silicon is now accepted in Physical Review Applied. Read more here: journals.aps.org/prapplied/abst…
Danielle Fong 🔆 @DanielleFong ·
lmfao
David Shapiro (L/0)@DaveShapi

Okay, here's the first thing I did with THRML by @extropic. It's just a basic sudoku solver. Thermodynamic computing is a bit overkill for this task, but since humans can actually do sudoku, it's a good intuition for what's going on under the hood.

With sudoku, there are many overlapping constraints. You start with a partially filled puzzle (the initial conditions), and the other rules are: no duplicates in any row, column, or square. With a sudoku problem, you know there is ONE singular solution, a "low energy state" where there are no rule violations or collisions.

So what you do is program those "clamped" initial values into the TSU, bake in the rules (no duplicates), and then, due to the laws of thermodynamics and electricity... it just sort of settles into the correct solution (this is "annealing").

The reason I think this is such a good example of what TSUs do is that for humans (and classical computers) it's more or less a "guess and check" process. No matter what method you use with classical or human computation, it's an iterative refinement process of sequential steps. But with sudoku, as you can see in the output below, it's a single step. That's because the TSU looks at the whole problem globally.

Here's how I did this: ChatGPT PRO 🤣 No joke, ChatGPT Pro one-shotted this entire problem. We made several refinements, though mostly around UI and validation (not the core logic). However, we did do an optimization step to make sure we were using the correct block batching from the THRML library.
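The clamp-the-givens, penalize-violations, settle-to-low-energy loop described in the tweet above can be sketched in plain Python with simulated annealing. To be clear, this is not the THRML API and not the poster's actual code: the 4x4 grid, the energy function, and the cooling schedule below are all illustrative assumptions.

```python
import math
import random

# Illustrative simulated-annealing sudoku sketch (4x4 grid, 2x2 blocks).
# NOT the THRML API -- a plain-Python stand-in for the idea in the tweet:
# clamp the given cells, count rule violations as "energy", and let random
# moves settle the grid into the zero-energy (solved) state.

N, B = 4, 2  # grid size and block size

def energy(grid):
    """Number of duplicate-value violations across rows, columns, and blocks."""
    e = 0
    for i in range(N):
        e += N - len({grid[i][j] for j in range(N)})  # row duplicates
        e += N - len({grid[j][i] for j in range(N)})  # column duplicates
    for bi in range(0, N, B):
        for bj in range(0, N, B):
            block = {grid[bi + di][bj + dj] for di in range(B) for dj in range(B)}
            e += N - len(block)
    return e

def anneal(puzzle, steps=50_000, t0=2.0, seed=0):
    """Clamp nonzero cells, randomize the rest, anneal toward energy 0."""
    rng = random.Random(seed)
    grid = [[v or rng.randint(1, N) for v in row] for row in puzzle]
    free = [(i, j) for i in range(N) for j in range(N) if puzzle[i][j] == 0]
    e = energy(grid)
    for step in range(steps):
        if e == 0:
            break  # settled into a violation-free state
        t = max(t0 * (1 - step / steps), 0.05)  # linear cooling schedule
        i, j = free[rng.randrange(len(free))]
        old = grid[i][j]
        grid[i][j] = rng.randint(1, N)
        e_new = energy(grid)
        # Metropolis rule: always accept downhill, sometimes accept uphill.
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            e = e_new
        else:
            grid[i][j] = old
    return grid, e

def solve(puzzle, restarts=10):
    """Re-anneal from fresh seeds until a zero-energy grid is found."""
    for seed in range(restarts):
        grid, e = anneal(puzzle, seed=seed)
        if e == 0:
            break
    return grid, e

puzzle = [  # 0 marks a blank (free) cell; nonzero cells are clamped
    [1, 0, 3, 4],
    [0, 4, 1, 2],
    [2, 1, 0, 3],
    [4, 0, 2, 1],
]
solved, final_energy = solve(puzzle)
```

Unlike the TSU's physical relaxation, this software version is explicitly iterative, which is exactly the contrast the tweet is drawing.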

Trevor McCourt retweeted
Rohan Pandey @khoomeik ·
surprised that no one has posted @sarahookr's hardware lottery essay since @GillVerd's TSU announcement. Maybe EBMs were just waiting for the right chips to come along.
Ric Lewis @keylimesoda ·
@trevormccrt1 @carl_feynman Trevor, to that end, would your random sampling-based algorithms run faster and more efficiently than matrix-based algorithms on regular CPUs as well?
carl feynman @carl_feynman ·
I read some of the marketing material on their website. They have a way of sampling simple probability distributions in a very energy-efficient way. They claim this will scale to better algorithms. I happen to have worked extensively in the field of random sampling, for sampling high-dimensional polyhedra and Markov chain Monte Carlo algorithms. Unfortunately, my experience indicates that this innovation in random sampling is not worth much. For the algorithms I’m familiar with, only a small fraction of the time is spent generating random numbers. The rest of the time is spent processing these numbers in various ways, using regular computer arithmetic. Speeding up the generation will not make much difference. It’s possible that some algorithm of which I am unaware will really need a very fast Gibbs sampler from simple distributions. But I don’t think it’s likely. 1/N
Extropic@extropic

Hello Thermo World.

Trevor McCourt @trevormccrt1 ·
tbf the answer very well could be "extremely far". We are doing something very hard that could take a ton of resources to make work. This isn't ambiguous to anyone, and is made pretty clear in all of our recent material for anyone who bothers to read. However, the cost of not trying things like what we are doing is fucking massive. Bringing advanced AI to everyone by brute-force scaling of GPUs is going to cost tens or even hundreds of trillions of dollars. IMO way more people should be taking big shots at more energy efficient AI rn.
David Duvenaud @DavidDuvenaud ·
@dshoopy @1v100000 @trevormccrt1 @liron The answer to @1v100000's question, about how far it is from here to practical models, is what's not clear. The claims I'm saying are misleading are Extropic's claims, quoted above, about what they've achieved so far.
Liron Shapira @liron ·
Today's Extropic launch raises some new red flags. I started following this company when they refused to explain the input/output spec of what they're building, leaving us waiting to get clarification. Here are 3 red flags from today:

1. From extropic.ai/writing/inside…: "Generative AI is Sampling. All generative AI algorithms are essentially procedures for sampling from probability distributions. Training a generative AI model corresponds to inferring the probability distribution that underlies some training data, and running inference corresponds to generating samples from the learned distribution. Because TSUs sample, they can run generative AI algorithms natively."

This is a highly misleading claim about the algorithms that power the most useful modern AIs, on the same level of gaslighting as calling the human brain a thermodynamic computer. IIUC, as far as anyone knows, the majority of AI computation work doesn't match the kind of input/output that you can feed into Extropic's chip.

The page says: "The next challenge is to figure out how to combine these primitives in a way that allows for capabilities to be scaled up to something comparable to today’s LLMs. To do this, we will need to build very large TSUs, and invent new algorithms that can consume an arbitrary amount of probabilistic computing resources."

Do you really need to build large TSUs to research whether it's possible for LLM-like applications to benefit from this hardware? I would've thought it'd be worth spending a couple $million on investigating that question via a combination of theory and modern cloud supercomputing hardware, instead of spending over $30M on building hardware that might be a bridge to nowhere.

Their own documentation for THRML (their open-source library) says: "THRML provides GPU‑accelerated tools for block sampling on sparse, heterogeneous graphs, making it a natural place to prototype today and experiment with future Extropic hardware."

You're saying you lack a way your hardware primitives could *in principle* be applied toward useful applications of some kind, and you created this library to help do that kind of research using today's GPUs… Why would you not just release the Python library (THRML) earlier, do the bottlenecking research you said needs to be done earlier, and engage the community to help get you an answer to this key question by now? Why were you waiting all this time, until the launch of this extremely niche tiny-scale hardware prototype, to come forward explaining this make-or-break bottleneck, and only now publicize your search for potential partners who have some kind of relevant "probabilistic workloads", when the cost of not doing so was $30M and 18 months?

2. From extropic.ai/writing/tsu-10…: "We developed a model of our TSU architecture and used it to estimate how much energy it would take to run the denoising process shown in the above animation. What we found is that DTMs running on TSUs can be about 10,000x more energy efficient than standard image generation algorithms on GPUs."

I'm already seeing people on Twitter hyping the 10,000x claim. But for anyone who's followed the decades-long saga of quantum computing companies claiming to achieve "quantum supremacy" with similar kinds of hype figures, you know how much care needs to go into defining that kind of benchmark. In practice, it tends to be extremely hard to point to situations where a classical computing approach *isn't* much faster than the claimed "10,000x faster thermodynamic computing" approach. The Extropic team knows this, but opted not to elaborate on the kind of conditions that could reproduce this hype benchmark that they wanted to see go viral.

3. The terminology they're using has been switched to "probabilistic computer": "We designed the world’s first scalable probabilistic computer."

Until today, they were using "thermodynamic computer" as their term, and claimed in writing that "the brain is a thermodynamic computer". One could give them the benefit of the doubt for pivoting their terminology. It's just that they were always talking nonsense about the brain being a "thermodynamic computer" (in my view the brain is neither that nor a "quantum computer"; it's very much a neural net algorithm running on a classical computer architecture). And this sudden terminology pivot is consistent with them having been talking nonsense on that front.

Now for the positives:
* Some hardware actually got built!
* They explain how its input/output potentially has an application in denoising, though, as mentioned, they are vague on the details of the supposed "10,000x thermodynamic supremacy" they achieved on this front.

Overall: This is about what I expected when I first started asking about the input/output 18 months ago. They had a legitimately cool idea for a piece of hardware, but didn't have a plan for making it useful, just some vague beginnings of theoretical research that had a chance to make it useful. They seem to have made respectable progress getting the hardware into production (the amount that $30M buys you), and seemingly less progress finding reasons why this particular hardware, even after 10 generations of successor refinements, is going to be of use to anyone.

Going forward, instead of responding to questions about your device's input/output by "mogging" people and saying it's a company secret, and tweeting hyperstitions about your thermodynamic god, I'd recommend being more open about the seemingly giant life-or-death question that the tech community might actually be interested in helping you answer: whether someone can write a Python program in your simulator providing stronger evidence that some kind of useful "thermodynamic supremacy" with your hardware concept can ever be a thing.
Trevor McCourt retweeted
Extropic @extropic ·
From sim to silicon. X0 validated our novel probabilistic hardware primitives: • pbits • pdits • pmodes • pMoG To learn more, read the X0 breakdown here: extropic.ai/writing/inside…
Trevor McCourt retweeted
json @JsonBasedman ·
The Extropic TSU blog post is great. I LOVE when interactive visualizations are included. Brings me the same childlike joy I got when I first found that neural cellular automata blog post on distill.pub (RIP) extropic.ai/writing/tsu-10…
Trevor McCourt @trevormccrt1 ·
@carl_feynman Carl, as mentioned in our paper, 100ns was just for the particular circuit we were measuring at the time. This is nowhere near the limits of our transistor process, which is closer to 1ns given other practical constraints. arxiv.org/pdf/2510.23972
carl feynman @carl_feynman ·
Their random number generator is very energy-efficient in part because it is very slow. It takes several hundred nanoseconds for it to generate a random bit, with about five or six bits of precision in the probability. (They mention 100 ns as the correlation time, but you have to wait several correlation times for the bits to be adequately independent.) I’m not sure if I can beat that speed in software on a regular CPU, but I’d be close. Someone who’s better at GPU programming can probably beat that speed. I could definitely beat that speed with a hardware random number generator, even using the 30-year-old semiconductor technology that was the last one I built chips in. Of course all those would use more energy. (2/N)
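Carl's argument in 1/N, that random-number generation is a small slice of each sampling step, is visible in even a minimal software Gibbs sampler. The 1D Ising chain below is an illustrative example of my own, not anything from Extropic's materials: each site update consumes exactly one uniform random draw, while the surrounding neighbor sum, exponential, and comparison are the ordinary arithmetic that dominates the cost on a CPU.

```python
import math
import random

# Minimal Gibbs sampler for a 1D Ising chain (periodic boundary).
# Illustrative only: each site update uses ONE uniform random draw,
# while the rest of the step is ordinary arithmetic (neighbor sum,
# exp, divide, compare) -- the part a faster RNG would not speed up.

def cond_prob_up(h, beta):
    """P(spin = +1 | local field h) at inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * h))

def gibbs_sweep(spins, beta, rng):
    """One in-place Gibbs sweep over the chain; returns the spins."""
    n = len(spins)
    for i in range(n):
        h = spins[(i - 1) % n] + spins[(i + 1) % n]  # field from neighbors
        spins[i] = 1 if rng.random() < cond_prob_up(h, beta) else -1
    return spins

rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(64)]
for _ in range(100):
    gibbs_sweep(spins, beta=1.0, rng=rng)
```

Even if the `rng.random()` call were free, the per-site conditional-probability arithmetic would remain, which is the crux of the "speeding up generation won't help much" point.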
steve jang @stevejang ·
more thoughts per watt
Kerem Y Çamsarı @KeremCamsari ·
Correct, but you can make that argument for pre-GPU backprop too. I am not saying the papers as they are will be as big as backprop if and when the hardware arrives. All I am saying is that, by itself, that can't be a convincing argument; you have to dig a little deeper and articulate why (accounting for the HW lottery).
zach @blip_tm

@trevormccrt1 the big flaw in both papers is that they only run weird EBMs that nobody wants to use! imo new hardware needs to be able to run conventional SOTA models to get meaningful adoption (though i realize that's not your view)
