Trevor McCourt

1.3K posts


@trevormccrt1

"Life is a beautiful, magnificent thing, even to a jellyfish"

Cambridge, MA · Joined March 2020
268 Following · 13.9K Followers
Trevor McCourt retweeted
Extropic @extropic ·
If you are attending @APSphysics March meeting, come learn more about thermodynamic computing! Our work on taming non-equilibrium thermal electron fluctuations in silicon is now accepted in Physical Review Applied. Read more here: journals.aps.org/prapplied/abst…
Danielle Fong 🔆 @DanielleFong ·
lmfao
David Shapiro (L/0)@DaveShapi

Okay, here's the first thing I did with THRML by @extropic. It's just a basic sudoku solver. Thermodynamic computing is a bit overkill for this task, but since humans can actually do sudoku, it's a good intuition for what's going on under the hood.

With sudoku, there are many overlapping constraints. You start with a partially filled puzzle (the initial conditions), and the other rules are: no duplicates in any row, column, or square. With a sudoku problem, you know there is ONE singular solution, a "low energy state" where there are no rule violations or collisions.

So what you do is program those "clamped" initial values into the TSU, bake in the rules (no duplicates), and then, due to the laws of thermodynamics and electricity... it just sort of settles into the correct solution (this is "annealing").

The reason I think this is such a good example of what TSUs do is that for humans (and classical computers) it's more or less a "guess and check" process. No matter what method you use with classical or human computation, it's an iterative refinement process of sequential steps. But with sudoku, as you can see in the output below, it's a single step. That's because the TSU looks at the whole problem globally.

Here's how I did this: ChatGPT PRO 🤣 No joke, ChatGPT Pro one-shotted this entire problem. We made several refinements, though mostly around UI and validation (not the core logic). However, we did do an optimization step to make sure we were using the correct block batching from the THRML library.
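The clamp-the-givens, penalize-violations, settle-to-low-energy loop described in the tweet above can be sketched in plain Python with simulated annealing. To be clear, this is not the THRML API and not the poster's actual code: the 4x4 grid, the energy function, and the cooling schedule below are all illustrative assumptions.

```python
import math
import random

# Illustrative simulated-annealing sudoku sketch (4x4 grid, 2x2 blocks).
# NOT the THRML API -- a plain-Python stand-in for the idea in the tweet:
# clamp the given cells, count rule violations as "energy", and let random
# moves settle the grid into the zero-energy (solved) state.

N, B = 4, 2  # grid size and block size

def energy(grid):
    """Number of duplicate-value violations across rows, columns, and blocks."""
    e = 0
    for i in range(N):
        e += N - len({grid[i][j] for j in range(N)})  # row duplicates
        e += N - len({grid[j][i] for j in range(N)})  # column duplicates
    for bi in range(0, N, B):
        for bj in range(0, N, B):
            block = {grid[bi + di][bj + dj] for di in range(B) for dj in range(B)}
            e += N - len(block)
    return e

def anneal(puzzle, steps=50_000, t0=2.0, seed=0):
    """Clamp nonzero cells, randomize the rest, anneal toward energy 0."""
    rng = random.Random(seed)
    grid = [[v or rng.randint(1, N) for v in row] for row in puzzle]
    free = [(i, j) for i in range(N) for j in range(N) if puzzle[i][j] == 0]
    e = energy(grid)
    for step in range(steps):
        if e == 0:
            break  # settled into a violation-free state
        t = max(t0 * (1 - step / steps), 0.05)  # linear cooling schedule
        i, j = free[rng.randrange(len(free))]
        old = grid[i][j]
        grid[i][j] = rng.randint(1, N)
        e_new = energy(grid)
        # Metropolis rule: always accept downhill, sometimes accept uphill.
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            e = e_new
        else:
            grid[i][j] = old
    return grid, e

def solve(puzzle, restarts=10):
    """Re-anneal from fresh seeds until a zero-energy grid is found."""
    for seed in range(restarts):
        grid, e = anneal(puzzle, seed=seed)
        if e == 0:
            break
    return grid, e

puzzle = [  # 0 marks a blank (free) cell; nonzero cells are clamped
    [1, 0, 3, 4],
    [0, 4, 1, 2],
    [2, 1, 0, 3],
    [4, 0, 2, 1],
]
solved, final_energy = solve(puzzle)
```

Unlike the TSU's physical relaxation, this software version is explicitly iterative, which is exactly the contrast the tweet is drawing.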

Trevor McCourt retweeted
Rohan Pandey @khoomeik ·
surprised that no one has posted @sarahookr's hardware lottery essay since @GillVerd's TSU announcement. Maybe EBMs were just waiting for the right chips to come along.
Ric Lewis @keylimesoda ·
@trevormccrt1 @carl_feynman Trevor, to that end, would your random sampling-based algorithms run faster and more efficiently than matrix-based algorithms on regular CPUs as well?
carl feynman @carl_feynman ·
I read some of the marketing material on their website. They have a way of sampling simple probability distributions in a very energy-efficient way. They claim this will scale to better algorithms. I happen to have worked extensively in the field of random sampling, for sampling high-dimensional polyhedra and Markov chain Monte Carlo algorithms. Unfortunately, my experience indicates that this innovation in random sampling is not worth much. For the algorithms I’m familiar with, only a small fraction of the time is spent generating random numbers. The rest of the time is spent processing these numbers in various ways, using regular computer arithmetic. Speeding up the generation will not make much difference. It’s possible that some algorithm of which I am unaware will really need a very fast Gibbs sampler from simple distributions. But I don’t think it’s likely. 1/N
Extropic@extropic

Hello Thermo World.

Trevor McCourt @trevormccrt1 ·
tbf the answer very well could be "extremely far". We are doing something very hard that could take a ton of resources to make work. This isn't ambiguous to anyone, and is made pretty clear in all of our recent material for anyone who bothers to read. However, the cost of not trying things like what we are doing is fucking massive. Bringing advanced AI to everyone by brute-force scaling of GPUs is going to cost tens or even hundreds of trillions of dollars. IMO way more people should be taking big shots at more energy efficient AI rn.
David Duvenaud @DavidDuvenaud ·
@dshoopy @1v100000 @trevormccrt1 @liron The answer to @1v100000's question, about how far it is from here to practical models, is what's not clear. The claims I'm saying are misleading are Extropic's claims, quoted above, about what they've achieved so far.
Liron Shapira @liron ·
Today's Extropic launch raises some new red flags. I started following this company when they refused to explain the input/output spec of what they're building, leaving us waiting to get clarification. Here are 3 red flags from today:

1. From extropic.ai/writing/inside…: "Generative AI is Sampling. All generative AI algorithms are essentially procedures for sampling from probability distributions. Training a generative AI model corresponds to inferring the probability distribution that underlies some training data, and running inference corresponds to generating samples from the learned distribution. Because TSUs sample, they can run generative AI algorithms natively."

This is a highly misleading claim about the algorithms that power the most useful modern AIs, on the same level of gaslighting as calling the human brain a thermodynamic computer. IIUC, as far as anyone knows, the majority of AI computation work doesn't match the kind of input/output that you can feed into Extropic's chip.

The page says: "The next challenge is to figure out how to combine these primitives in a way that allows for capabilities to be scaled up to something comparable to today’s LLMs. To do this, we will need to build very large TSUs, and invent new algorithms that can consume an arbitrary amount of probabilistic computing resources."

Do you really need to build large TSUs to research whether it's possible for LLM-like applications to benefit from this hardware? I would've thought it'd be worth spending a couple $million on investigating that question via a combination of theory and modern cloud supercomputing hardware, instead of spending over $30M on building hardware that might be a bridge to nowhere.

Their own documentation for THRML (their open-source library) says: "THRML provides GPU‑accelerated tools for block sampling on sparse, heterogeneous graphs, making it a natural place to prototype today and experiment with future Extropic hardware."

You're saying you lack a way your hardware primitives could *in principle* be applied toward useful applications of some kind, and you created this library to help do that kind of research using today's GPUs… Why would you not just release the Python library (THRML) earlier, do the bottlenecking research you said needs to be done earlier, and engage the community to help get you an answer to this key question by now? Why were you waiting all this time, until the launch of this extremely niche tiny-scale hardware prototype, to come forward explaining this make-or-break bottleneck, and only now publicize your search for potential partners who have some kind of relevant "probabilistic workloads", when the cost of not doing so was $30M and 18 months?

2. From extropic.ai/writing/tsu-10…: "We developed a model of our TSU architecture and used it to estimate how much energy it would take to run the denoising process shown in the above animation. What we found is that DTMs running on TSUs can be about 10,000x more energy efficient than standard image generation algorithms on GPUs."

I'm already seeing people on Twitter hyping the 10,000x claim. But for anyone who's followed the decades-long saga of quantum computing companies claiming to achieve "quantum supremacy" with similar kinds of hype figures, you know how much care needs to go into defining that kind of benchmark. In practice, it tends to be extremely hard to point to situations where a classical computing approach *isn't* much faster than the claimed "10,000x faster thermodynamic computing" approach. The Extropic team knows this, but opted not to elaborate on the kind of conditions that could reproduce this hype benchmark that they wanted to see go viral.

3. The terminology they're using has been switched to "probabilistic computer": "We designed the world’s first scalable probabilistic computer."

Until today, they were using "thermodynamic computer" as their term, and claimed in writing that "the brain is a thermodynamic computer". One could give them the benefit of the doubt for pivoting their terminology. It's just that they were always talking nonsense about the brain being a "thermodynamic computer" (in my view the brain is neither that nor a "quantum computer"; it's very much a neural net algorithm running on a classical computer architecture). And this sudden terminology pivot is consistent with them having been talking nonsense on that front.

Now for the positives:
* Some hardware actually got built!
* They explain how its input/output potentially has an application in denoising, though, as mentioned, they are vague on the details of the supposed "10,000x thermodynamic supremacy" they achieved on this front.

Overall: This is about what I expected when I first started asking about the input/output 18 months ago. They had a legitimately cool idea for a piece of hardware, but didn't have a plan for making it useful, just some vague beginnings of theoretical research that had a chance to make it useful. They seem to have made respectable progress getting the hardware into production (the amount that $30M buys you), and seemingly less progress finding reasons why this particular hardware, even after 10 generations of successor refinements, is going to be of use to anyone.

Going forward, instead of responding to questions about your device's input/output by "mogging" people and saying it's a company secret, and tweeting hyperstitions about your thermodynamic god, I'd recommend being more open about the seemingly giant life-or-death question that the tech community might actually be interested in helping you answer: whether someone can write a Python program in your simulator providing stronger evidence that some kind of useful "thermodynamic supremacy" with your hardware concept can ever be a thing.
Trevor McCourt retweeted
Extropic @extropic ·
From sim to silicon. X0 validated our novel probabilistic hardware primitives: • pbits • pdits • pmodes • pMoG To learn more, read the X0 breakdown here: extropic.ai/writing/inside…
Trevor McCourt retweeted
json @JsonBasedman ·
The Extropic TSU blog post is great. I LOVE when interactive visualizations are included. Brings me the same childlike joy I got when I first found that neural cellular automata blog post on distill.pub (RIP) extropic.ai/writing/tsu-10…
Trevor McCourt @trevormccrt1 ·
@carl_feynman Carl, as mentioned in our paper, 100ns was just for the particular circuit we were measuring at the time. This is nowhere near the limits of our transistor process, which is closer to 1ns given other practical constraints. arxiv.org/pdf/2510.23972
carl feynman @carl_feynman ·
Their random number generator is very energy-efficient in part because it is very slow. It takes several hundred nanoseconds for it to generate a random bit, with about five or six bits of precision in the probability. (They mention 100 ns as the correlation time, but you have to wait several correlation times for the bits to be adequately independent.) I’m not sure if I can beat that speed in software on a regular CPU, but I’d be close. Someone who’s better at GPU programming can probably beat that speed. I could definitely beat that speed with a hardware random number generator, even using the 30-year-old semiconductor technology that was the last one I built chips in. Of course all those would use more energy. (2/N)
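Carl's argument in 1/N, that random-number generation is a small slice of each sampling step, is visible in even a minimal software Gibbs sampler. The 1D Ising chain below is an illustrative example of my own, not anything from Extropic's materials: each site update consumes exactly one uniform random draw, while the surrounding neighbor sum, exponential, and comparison are the ordinary arithmetic that dominates the cost on a CPU.

```python
import math
import random

# Minimal Gibbs sampler for a 1D Ising chain (periodic boundary).
# Illustrative only: each site update uses ONE uniform random draw,
# while the rest of the step is ordinary arithmetic (neighbor sum,
# exp, divide, compare) -- the part a faster RNG would not speed up.

def cond_prob_up(h, beta):
    """P(spin = +1 | local field h) at inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * h))

def gibbs_sweep(spins, beta, rng):
    """One in-place Gibbs sweep over the chain; returns the spins."""
    n = len(spins)
    for i in range(n):
        h = spins[(i - 1) % n] + spins[(i + 1) % n]  # field from neighbors
        spins[i] = 1 if rng.random() < cond_prob_up(h, beta) else -1
    return spins

rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(64)]
for _ in range(100):
    gibbs_sweep(spins, beta=1.0, rng=rng)
```

Even if the `rng.random()` call were free, the per-site conditional-probability arithmetic would remain, which is the crux of the "speeding up generation won't help much" point.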
steve jang @stevejang ·
more thoughts per watt
Kerem Y Çamsarı @KeremCamsari ·
Correct, but you can make that argument for pre-GPU backprop too. I am not saying the papers as they are will be as big as backprop if and when the hardware arrives. All I am saying is that, by itself, that can't be a convincing argument; you have to dig a little deeper and articulate why (accounting for the HW lottery).
zach @blip_tm

@trevormccrt1 the big flaw in both papers is that they only run weird EBMs that nobody wants to use! imo new hardware needs to be able to run conventional SOTA models to get meaningful adoption (though i realize that's not your view)
