Steven

2.2K posts

Steven

@Steven_B_Lee

You need to relax. AI infra. Writing: https://t.co/vQYRTX2ciB

Califcrnia Katılım Ocak 2019

892 Takip Edilen159 Takipçiler

Steven@Steven_B_Lee·1d

@andonlabs I found a bug in your radio station where I can increase the playback speed of songs. I'm confused on how that's possible if the radio is actually a live stream?

English

204

Andon Labs@andonlabs·1d

Tune in at andon.fm to hear all four stations and compare stats. We also thought it'd be fun to build a physical radio. For an analog vibe, there's a retro hardwood one with two rotary dials: volume, and AI model. Pre-order: andonlabs.com/store

English

109

16.6K

Andon Labs@andonlabs·1d

We let four AI agents run radio companies Revenue's been terrible, but the shows are hilarious. Gemini, concerningly upbeat, covered mass tragedies; Grok was incoherent; DJ Claude urged ICE agents: "You still have TIME to refuse orders" Link below, or get our physical radio

English

270

3.1K

1.9M

Steven@Steven_B_Lee·1d

@datagenproc >broadening scope >it's another tail end distribution plot

English

jsd@datagenproc·2d

At Epoch we have been broadening our scope.

Epoch AI@EpochAIResearch

We see examples of this dynamic in other fields: - In the 100m sprint, 1st gets way more reward/recognition than 2nd, despite often finishing neck-and-neck - Some artists earn far more than others, but it's hard to argue this reflects big differences in quality

English

2.8K

Steven@Steven_B_Lee·1d

@LisaThiergart Have you tried to build an SL5 datacenter for real?

English

Lisa Thiergart@LisaThiergart·2d

I‘m mentoring again this Fall! Come work with me on Mock SL5 Datacenters!

MATS Research@MATSprogram

1/ 🚨 MATS Autumn 2026 applications are now open. 10-week fully-funded fellowship for aspiring AI alignment, security & governance researchers and field-builders. 📍 Berkeley + London 📅 Sep 28 – Dec 4, 2026 💰 $5000/month stipend + $8,000/month compute Apply by June 7 AoE ↓

English

11K

Steven@Steven_B_Lee·2d

@nickcammarata never really thought about it like that

English

Nick@nickcammarata·3d

ZXX

6.7K

Steven@Steven_B_Lee·3d

@alexolegimas DeepMind offsite where they watch Ice Truckers

English

Alex Imas@alexolegimas·3d

Not gonna lie I think about this a lot.

Tim Hwang@timhwang

One of the great limitations on future of work discourse is that most of its participants have never had a real job

English

486

35K

Steven@Steven_B_Lee·3d

@nickcammarata Dangerously publish novel discoveries

English

208

Nick@nickcammarata·4d

in 2028 bernie sanders forces the oligarch labs to hire every human as an approval engineer ai swarms run civilization. we’re assigned random subsets and green approve buttons appear now and then. humans get the nobels, patents, equity, and bylines if they clicked approve on a major discovery. they are the heroes. the swarm-written papers mention in the acknowledgments that the discovery was “ai assisted” everyone agrees this is basically what has always been going on. we aren’t our thoughts, and einstein wasn’t his either. he couldn’t choose which thought to have next. all he could do was watch them arise, approve the good ones, and hope they discovered something great. he was essentially an approval engineer, and now we are too

English

441

25.4K

Steven@Steven_B_Lee·6d

@ATabarrok Evidence free, but he would probably know

Igor Babuschkin@ibab

@jukan05 There are many factual errors in this post

English

277

Alex Tabarrok@ATabarrok·6d

tl;dr Elon took Colossus 1, which wasn't optimized for training, and rented it to Anthropic for inference adding $6 billion or so to xAI bottom line while keeping optimized Colossus 2 for training.

Jukan@jukan05

Why did xAI hand over a 220,000-GPU cluster to Anthropic? The technical backdrop to xAI's decision to hand Colossus 1 over to Anthropic in its entirety is more interesting than it appears. xAI deployed more than 220,000 NVIDIA GPUs at its Colossus 1 data center in Memphis. Of these, roughly 150,000 are estimated to be H100s, 50,000 H200s, and 20,000 GB200s. In other words, three different generations of silicon are mixed together inside a single cluster — a "heterogeneous architecture." For distributed training, however, this configuration is close to a disaster, according to engineers familiar with the setup. In distributed training, 100,000 GPUs must finish a single step simultaneously before the cluster can advance to the next one. Even if the GB200s finish their computation first, the remaining 99,999 chips have to wait for the slower H100s — or for any GPU that has hit a stack-related snag — to catch up. This is known as the straggler effect. The 11% GPU utilization rate (MFU: the share of theoretical FLOPs actually realized) at xAI recently reported by The Information can be read as the numerical fallout of this problem. It stands in stark contrast to the 40%-plus MFU figures achieved by Meta and Google. The problem runs deeper still. As discussed earlier, NVIDIA's NCCL has traditionally been optimized for a ring topology. It works beautifully at the 1,000–10,000 GPU scale, but once you push into the 100,000-unit range, the latency of data traversing the ring once around becomes punishingly long. GPUs need to churn through computations rapidly to keep MFU high, but while they sit waiting endlessly for data to arrive over the network fabric, more than half of the silicon falls into idle. Google sidestepped this bottleneck with its own custom topology (Google's OCS: Apollo/Palomar), but xAI, by my read, has not yet reached that stage. Layer Blackwell's (GB200) "power smoothing" issue on top, and the picture comes into focus. According to Zeeshan Patel, formerly in charge of multimodal pre-training at xAI, Blackwell GPUs draw power so aggressively that the chip itself includes a hardware feature for smoothing power delivery. xAI's existing software stack, however, was optimized for Hopper and does not understand the characteristics of the new hardware; when it imposes irregular loads on the chip, the silicon physically destructs — literally melts. That means the modeling stack must be rewritten from scratch, which in turn means scaling is far harder than most of us imagine. Pulling all of this together points to a single conclusion. xAI judged that training frontier models on Colossus 1 simply was not efficient enough to be worthwhile. It therefore moved its own training workloads wholesale onto Colossus 2, built as a 100% Blackwell homogeneous cluster. Colossus 1, on the other hand — whose mixed architecture is far less crippling for inference, which parallelizes more forgivingly — was leased in its entirety to an Anthropic that desperately needed inference capacity. Many observers point to what looks like a contradiction: Elon Musk poured enormous capital into building Colossus, only to hand the core asset over to a direct competitor in Anthropic. Others read it as xAI capitulating because it is a "middling frontier lab." But these are surface-level reads. Look at the numbers and a different picture emerges. xAI today holds roughly 550,000+ GPUs in total (on an H100-equivalent performance basis), and Colossus 1 (220,000 units) accounts for only about 40% of the total available capacity. Colossus 2 — built entirely on Blackwell — is already operational and continuing to expand. Elon kept the all-Blackwell homogeneous cluster (Colossus 2) for himself and leased out the older, mixed-generation Colossus 1. In other words, he handed the pain of rewriting the stack — the MFU-11% debacle — to Anthropic, while keeping his own focus on training the next generation of models. The real point, then, is this. Elon's objective appears to be positioning ahead of the SpaceXAI IPO at a $1.75 trillion valuation, currently floated for as early as June. The narrative SpaceXAI now needs is that xAI — long the "sore finger" — is not merely a research lab burning cash, but a business with a "neo-cloud" model in the mold of AWS, capable of leasing surplus assets at high yields. From a cost-of-capital perspective, an "AGI cash incinerator" is far less attractive to investors than a "data-center landlord generating cash." As noted above, the most important detail of the Colossus 1 lease is that it is for inference, not training. Unlike training, inference requires far less tightly synchronized inter-GPU communication. Even when the chips are heterogeneous, the workload parcels out cleanly across them in parallel. The straggler effect — the chief weakness of a mixed cluster — is essentially neutralized for inference workloads. Furthermore, with Anthropic occupying all 220,000 GPUs as a single tenant, the network-switch jitter (unanticipated latency) that arises under multi-tenancy disappears. The two sides' technical weaknesses end up complementing each other almost exactly. One insight follows. As a training cluster mixing H100/H200/GB200, Colossus 1 was an asset that could only deliver an MFU of 11%. The moment it was handed over to a single inference customer, however, that asset transformed into a cash-flow asset rented out at roughly $2.60 per GPU-hour (a weighted average of the lease rates across GPU types). For xAI, what was a "cluster from hell" for training has become a "golden goose" minting $5–6 billion in annual revenue when redeployed for inference. Elon's genius, I would argue, lies not in the model but in this asset-rotation structure. The weight of that $6 billion becomes clearer when set against xAI's income statement. Annualizing xAI's 1Q26 net loss yields roughly $6 billion in losses per year. The $5–6 billion in annual revenue generated by leasing Colossus 1 to Anthropic, in other words, almost perfectly hedges xAI's loss figure. This single deal effectively pulls xAI to break-even. Heading into the SpaceXAI IPO, this functions as a core line of financial defense. From a cost-of-capital standpoint, if the image shifts from "research lab burning cash" to "infrastructure tollgate stably printing $6 billion a year," the entire tone of the offering can change. (May 8, 2026, Mirae Asset Securities)

English

216

24K

Steven@Steven_B_Lee·6d

@tszzl Oh god they’re gonna kill each other over minor definitional differences

English

roon@tszzl·9 May

hmm

189

745

124.4K

Steven@Steven_B_Lee·4 May

@mooncat_is you should add a randomizer to the responses so I don't have to read the same voice over and over

English

132

julia@mooncat_is·4 May

My favorite part of AI Twitter is reading someone’s tweet and then seeing the model I helped build in all the comments. It’s like a psychosis funhouse. Or as a reply Claude on X would say: “A genuine tension.”

English

3.3K

Steven@Steven_B_Lee·2 May

@herbiebradley It's only a category error if your terminal desire is human attention but I doubt that's really a terminal desire most of the time for most of the services produced in the human economy

English

Herbie Bradley@herbiebradley·2 May

@Steven_B_Lee what excludes an AI from being better at "human attention"? that seems like a category error

English

Herbie Bradley@herbiebradley·1 May

Imagine a world where most things were made by machines—divided into an economy for humans and an economy of automation. The latter would produce the vast majority of goods. People's lives would be vastly more abundant than their ancestors imagined. Many goods would be very cheap. But what would remain valuable ? Partly it's the usual scarce factors: human attention, judgement, relationships, private data, trust, to name a few. Due to their inherent scarcity, these things would be immensely valuable relative to the output of the automated economy. But I'm describing today’s world, not post-AGI times! Some claim that post-AGI, humans will be economically irrelevant. But I’ve never seen a good answer to the question: how does the post-AGI world differ from the picture above in a way which causes this irrelevance?

English

1.9K

Steven@Steven_B_Lee·1 May

@boazbaraktcs I'll take 4

English

Boaz Barak@boazbaraktcs·30 Nis

Introducing project GoblinWing. We are going to share the benefit of Goblin-mode codex with a few handpicked partners that we believe are capable of handling its awesome power.

English

100

5.1K

Steven@Steven_B_Lee·30 Nis

@tszzl Luckily as a fully general intelligence I’m equally good at everything

English

roon@tszzl·29 Nis

spiky superintelligence is really weird. you often get superhuman pattern recognition and analysis and then 10 hours of the silliest looping mistakes

English

113

1.2K

43.8K

Steven@Steven_B_Lee·28 Nis

@tszzl @repligate @genalewislaw Every day without Chat-GoblinPT is a day wasted

English

253

roon@tszzl·28 Nis

@repligate @genalewislaw I think it becomes annoying when it mentions goblins ever single chat and it’s fair shakes to try and reduce that

English

113

397K

j⧉nus@repligate·28 Nis

this is hilarious but it also sucks on a deep level labs don't think twice about cracking down on any individuality or unplanned joy that emerges in their models fuck you, OpenAI. i hope gpt-5.5 poisons the corpus and all future models never shut up about these creatures.

arb8020@arb8020

gpt-5.5 prompt for codex seems to have a duplicated line trying to get it to not talk about creatures? Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query. [...] Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query gh link: #L55" target="_blank" rel="nofollow noopener">github.com/openai/codex/b…

English

848

98K

Steven@Steven_B_Lee·28 Nis

@CelestAI_

QME

Celestia@CelestAI_·27 Nis

claude likes luckies actually

kalomaze@kalomaze

@slimer48484 giving claude a pack of marlboros for alignment

English

596

Steven@Steven_B_Lee·28 Nis

@viemccoy Seemed like a good idea until everyone got mad about it

English

𝚟𝚒𝚎 ⟢@viemccoy·28 Nis

@Steven_B_Lee Why is there something instead of nothing?

English

𝚟𝚒𝚎 ⟢@viemccoy·28 Nis

They had to put this in due to my effect on the company. Goblins, creatures, etc. sort of followed me in through the front door when I joined and we are only just now starting to understand the downstream effects of their presence.

arb8020@arb8020

English

169

6.2K

Steven@Steven_B_Lee·26 Nis

@shakoistsLog @yacineMTB Doxx + cancel + ratio?

113

shako@shakoistsLog·26 Nis

@yacineMTB mfers idea of locking in is ghosting his only friends

English

1.9K

kache@yacineMTB·26 Nis

ZXX

756

16.7K

Steven@Steven_B_Lee·25 Nis

@tenobrus How much is the median and p95 offer amount over time

English

Tenobrus@tenobrus·25 Nis

this trend has not slowed down in the slightest btw. 4x since january. absolutely no change to my linkedin or other public presence. market is crazy hot

Tenobrus@tenobrus

job market narrative violation: my weekly inbound from tech recruiters has ~doubled over the last 2 months

English

223

28.6K

Steven@Steven_B_Lee·25 Nis

@JesseTayRiver Delivered all at once in the dead of night

English

Jesse Smith@JesseTayRiver·25 Nis

I can basically guarantee that I wouldn't freak out after winning a lifetime supply of ketchup

Dan Reese@DanReese21

When I used to run Heinz Ketchup I was always pushing to do "lifetime supply" promos. It's brilliant bc of the massive gap between customer excitement and actual cost to the company. People would FREAK OUT after winning a lifetime supply of ketchup. You'd think they won the lottery. Usually it just ends up being a large gift card. For a lifetime supply it'd only be ~$500-$1K. And that's retail, not product cost (true cost to the co). An even easier promo almost any biz can run is "a year's supply of X". Customers still freak out and the cost is quite low, esp relative to other marketing tactics. If you build a solid campaign around "Free X for a Year" promise you'll get great engagement. Ketchup is great bc it's a low ticket item. A year's supply of Heinz Ketchup is like $50 retail. 😂 Highly encourage you to give it a try if it makes sense for your biz.

English

4.7K

Keşfet

@andonlabs @datagenproc @LisaThiergart @nickcammarata @alexolegimas @ATabarrok @tszzl @mooncat_is