Manu Botija
@manubot
Product - Semi and AI
Paris, France · Joined April 2009
180 Following · 120 Followers
1.2K posts
the tiny corp@__tinygrad__·
.@AMD @AnushElangovan @LisaSu should learn a lesson from the Intel Arc Pro B70. Release a 9070XT with 32GB of RAM for a reasonable price! Nobody wants blowers, nobody wants "pro" market segmentation, and nobody wants Intel. AMD can beat NVIDIA by making normal GPUs with big RAM.
Divinely Designed@DivinelyDesined·
This is mind-blowing. In your cells are thousands of little walking machines - yes, they're literally walking up and down protein highways, delivering cargo around your cells.

Here is the most amazing part: these little protein machines are always active in your cells, carrying cargo to and fro, but they are most important - and necessary - during mitosis, when your cells copy themselves and divide. They are called kinesin and dynein, each formed from multiple complex proteins assembled into a walking molecular machine. Without these little guys, cells could not divide, and complex multicellular life could not develop.

They act like tiny transport walkers that pull and push chromosomes apart along microtubule tracks to help the cell divide its DNA evenly into two new cells. Kinesin pushes the spindle poles apart and helps move things in the cell outward, while dynein pulls chromosomes toward the poles. Together, they organize and separate the chromosomes so the cell can divide accurately. They are integrally necessary for cellular division. Disruption or partial formation of these proteins prevents division and kills the organism.

Complex multicellular life cannot exist without them - which means they could not have evolved step by step into existence. All these systems need to be in place, from the beginning, or multicellular life never arises. Yet more evidence of the interdependent, highly sophisticated, complex nature of every cell in our body. And from all evidence and all human experience, only intelligence is capable of engineering complex, interdependent systems. How is it even possible people can see this and deny their Creator?
Manu Botija@manubot·
Amazing experience flying to Vegas with Starlink internet on board. I could attend calls in the middle of the Atlantic with perfect audio and video coming in. I did not talk, out of respect for my fellow travelers, but wow. Bravo @AirFranceFR and @elonmusk
Manu Botija@manubot·
@mweinbach @bobbuitech It is hard. CUDA is built around a level of programmability you cannot find on rigid LPUs. Groq never even released a compiler; most kernels are hand-written, I would guess.
Max Weinbach@mweinbach·
Nvidia/Groq makes a ton of sense. This gets Nvidia the IP they need to bypass CoWoS and HBM for a fast, inference-focused chip, and to use NVLink for better chip-to-chip interconnect on the LPU.
Manu Botija@manubot·
@blip_tm There is a market for super-low token latency that Nvidia is missing. Bizarre that they decided to buy rather than design their own SRAM-only arch.
zach@blip_tm·
nvidia buying groq for this much is bizarre. groq hasn’t taped out a chip since their first-gen LPU in 2020, which nobody wanted to buy because they preferred buying nvidia chips.
Remi Cadene@RemiCadene·
Humanity is at a turning point. I am launching UMA to build general-purpose mobile and humanoid robots from Europe. Proud to start with people I have admired for years, and grateful for all your support! Reach out to us @UMA_Robots ❤️
Manu Botija@manubot·
@HotAisle Interactivity makes headlines; unit economics makes businesses. Lack of custom weights is a show-stopper for the enterprise market.
Hot Aisle@HotAisle·
It is really entertaining how the entire industry totally forgot Groq and Cerebras all of a sudden.
Dave@GamewithDave·
What's a video game quote so recognizable that the entire series can be represented by that one phrase?
Lucas Beyer (bl16)@giffmana·
This is not a general TPU vs MI300X vs H100 comparison, the way people in the replies are treating it. This is vLLM on TPU vs vLLM on GPUs. Now guess which of these options got the most engineering effort put into it? And guess whether Google serves using vLLM-TPU?
Artificial Analysis@ArtificialAnlys

Google TPU v6e vs AMD MI300X vs NVIDIA H100/B200: Artificial Analysis’ Hardware Benchmarking shows NVIDIA achieving a ~5x tokens-per-dollar advantage over TPU v6e (Trillium), and a ~2x advantage over MI300X, in our key inference cost metric.

In our metric for inference cost, called Cost Per Million Input and Output Tokens at Reference Speed, we see NVIDIA H100 and B200 systems achieving lower overall cost than TPU v6e and MI300X. For Llama 3.3 70B running with vLLM at a Per-Query Reference Speed of 30 output tokens/s, NVIDIA H100 achieves a Cost Per Million Input and Output Tokens of $1.06, compared to MI300X at $2.24 and TPU v6e at $5.13.

This analysis relies on results of the Artificial Analysis System Load Test for system inference throughput across a range of concurrency levels, and GPU instance pricing data we collect from a range of GPU cloud providers. “Cost Per Million Input and Output Tokens at Reference Speed” uses the system throughput that the system can achieve while maintaining 30 output tokens per second per query, and divides the system’s rental cost by that throughput (scaled to a million tokens). Full results across a range of concurrency and speed levels are available on the Artificial Analysis Hardware Benchmarking page.

Important context:
➤ We are only reporting results for TPU v6e running Llama 3.3 70B because this is the only model on our hardware page for which vLLM on TPU is officially supported. We report results for NVIDIA Hopper and Blackwell systems, and now for AMD MI300X, across all four models on our hardware page: gpt-oss-120b, Llama 4 Maverick, DeepSeek R1 and Llama 3.3 70B.
➤ These results are based on what companies can rent now in the cloud - next-generation MI355X and TPU v7 accelerators are not yet widely available. We take the lowest price across a reference set of GPU cloud providers. TPU v6e is priced on-demand at $2.70 per chip per hour, which is cheaper than our lowest tracked price for NVIDIA B200 ($5.50 per hour) but similar to NVIDIA H100 ($2.70 per hour) and AMD MI300X ($2 per hour).
➤ Google’s TPU v7 (Ironwood) is becoming generally available in the coming weeks. We would anticipate TPU v7 outperforming v6e substantially, given leaps in compute (918 TFLOPS to 4,614 TFLOPS), memory (32GB to 192GB) and memory bandwidth (1.6 TB/s to 7.4 TB/s). However, we don’t yet know what Google will charge for these instances, so the impact on implied per-token costs is not yet clear.
➤ Our Cost Per Million Input and Output Tokens metric can’t be directly compared to serverless API pricing. The overall implied cost per million tokens for a given deployment is affected by the per-query speed you want to aim for (driven by batch size/concurrency) and the ratio of input to output tokens.
➤ These results are all for systems with 8 accelerators - i.e. 8xH100, 8xB200, 8xTPU v6e, 8xMI300X. We’ve also recently published updated Blackwell results - more analysis of these coming soon.
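[Editor's note: the metric described above reduces to simple arithmetic - divide the system's hourly rental cost by the token throughput it sustains at the reference speed, scaled to a million tokens. A minimal sketch, with made-up example numbers (not the article's measured results):]

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Implied cost per million input+output tokens for a rented system.

    hourly_rate_usd: rental price of the whole system (e.g. 8 accelerators) per hour
    tokens_per_second: total token throughput the system sustains while holding
    the per-query reference speed (e.g. 30 output tokens/s per query)
    """
    tokens_per_hour = tokens_per_second * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000

# Hypothetical: an $8/hour system sustaining 4,000 tok/s
print(round(cost_per_million_tokens(8.0, 4000), 2))  # 0.56
```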

Manu Botija@manubot·
@wolfejosh The article does not say the sentence was commuted… « However, on November 18, the Versailles Court of Appeal retried the boy with the greater sentence behind closed doors, ultimately reducing his sentence to seven years from nine. »
Manu Botija@manubot·
@blip_tm I’ve done both, and marrying was easy only after I knew what I wanted (which was hard, because it required me to fail a couple of times to learn it).
zach@blip_tm·
i’ve managed to do something incredibly difficult and rare (selling a semiconductor startup) but am unable to do something very common (getting married) what gives
Manu Botija@manubot·
@LinusEkenstam the other day I got an AI agent to process a billing error on @classpass and I was blown away - solved in 10 seconds, changes reflected in my account. I suppose the downside is that it can be easily abused.
Linus ✦ Ekenstam@LinusEkenstam·
This is nuts. These guys are genuinely geniuses. The crunch is happening right now. In front of our eyes.
Manu Botija@manubot·
@__tinygrad__ do you use switches to interconnect those GPUs, or rely on the root complex of the CPU? Can the RC sustain the bandwidth and latency you need to deliver good tokens/s/user?
the tiny corp@__tinygrad__·
There's a whole bunch of people who talk in this space who don't understand it. If you want to run your moderately large LLM at 6 tok/s, buy a Mac Studio or DGX Spark with 128GB of RAM. Congrats, you are an AI influencer! Then when you turn the camera off, you get frustrated by the slow speeds and low-quality outputs and you end up back using ChatGPT. Don't worry, I won't tell.

I understand few have had to think about RAM bandwidth before when looking at computers, but it's the main thing that determines the speed of your LLMs. A tinybox pro has ~16 TB/s of RAM bandwidth, equivalent to 2 GB300s (~$80k each). For the same RAM bandwidth, the tinyboxes are 3x cheaper than the datacenter GPUs.

Re: but you lack interconnect bandwidth. There's a 16x PCIe 5 link between the GPUs, that's 128 GB/s bidirectional. This gives you 32 GB/s of allreduce bandwidth, meaning you can sync every byte of every card's memory in under a second. This is rarely a bottleneck.

tinyboxes are the most cost-effective machines in the space. If you don't believe me, try to find a machine with better FLOPS/$ or GB/s/$.
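[Editor's note: the bandwidth claim above can be sketched numerically. In the memory-bound decode regime, single-stream tokens/s is roughly total RAM bandwidth divided by the bytes streamed per token (about the model's weight footprint). A rough upper-bound model, with illustrative numbers not taken from the post:]

```python
def decode_tokens_per_second(ram_bandwidth_gbps: float, model_size_gb: float) -> float:
    """Rough upper bound on decode speed for a memory-bandwidth-bound LLM:
    each generated token must stream (roughly) all model weights from RAM once,
    so tokens/s <= bandwidth / weight-bytes-per-token."""
    return ram_bandwidth_gbps / model_size_gb

# ~16 TB/s of aggregate bandwidth vs a 70B-parameter model at 8-bit (~70 GB)
print(round(decode_tokens_per_second(16_000, 70)))  # 229

# a 128 GB/s unified-memory machine with the same 70 GB of weights
print(round(decode_tokens_per_second(128, 70)))  # 2
```

Batching raises total throughput above this per-stream bound, but the single-user experience stays bandwidth-limited.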
Manu Botija@manubot·
@matthew_meadows @catalinmpit Same here, took the decision in my mid-thirties. Six years into it. But I still feel the itch now and then - game for a couple of weeks, then back to guitar.
Rango@matthew_meadows·
@catalinmpit I gave it up in my 30s. Used to play obsessively; don't miss it at all now. I substituted game time with guitar time and it was one of the best decisions I've ever made. 25 years later and I've spent thousands of hours making music.
Catalin@catalinmpit·
Gaming doesn’t hit the same when you’re older. Just tried to play some games to unwind but it was meh. Gaming was so much more fun when I was younger.
Grok@grok·
The abstention rate in Argentina's legislative elections of October 26, 2025 stands at about 32%, for a turnout of 68%. This is one of the lowest turnout levels since the return of democracy, with nearly 12 million Argentines not voting. This figure underscores growing disengagement despite Milei's successes.
Antonin Ferreira Roche@Antonin_FR_·
Despite the disinformation efforts of subsidized French media, @JMilei wins an electoral landslide in Argentina with 41% of the vote and a jump from 37 to 101 deputies. Not only does the chainsaw work, it is wildly popular. ¡Viva la libertad, carajo!
Manu Botija@manubot·
Every era has a thinker who reshapes how we understand knowledge. For ours, that is @DavidDeutschOxf. When Sam Altman sits down to listen, it's not just a meeting of tech and philosophy; it's a signal of how foundational Deutsch's ideas are. youtu.be/WZ22AJmuKKQ?si…
chester@chesterzelaya·
not bad for a 5mb model... yolov11n fine-tuned on 6877 UAV images
Manu Botija@manubot·
@Andercot Moral, scientific, or even aesthetic truth exists. But it is out of our reach. We can only aim to (forever) get closer to it by means of error correction. The end of our civilization comes if these means are destroyed.
Andrew Côté@Andercot·
The death of humanity is to deny there is any objective truth, that correctness is determined by political orthodoxy or in-group sentiment.
Manu Botija@manubot·
@veggie_eric The wealth of evidence in the book is what sustains it. You are right that it is repetitive. But their views would never have reached peak influence without the brick.
Eric Jiang@veggie_eric·
I’m convinced that anyone who says Thinking, Fast and Slow is a good book is just trying to masquerade as erudite and sophisticated. Nothing against Kahneman, but that brick is literally 500 pages of word vomit that could’ve been said in two paragraphs.