Mark Etparticipànt

196 posts

@afters_guy

playing the long game

New York City · Joined June 2013

364 Following · 268 Followers

Mark Etparticipànt@afters_guy·1d

@jiratickets

GIF

588

JT@jiratickets·1d

Finally FOMO’d into the market yesterday and loaded up on 3x leveraged semiconductor ETFs

GIF

2.2K

90.1K

Mark Etparticipànt@afters_guy·6d

@chamath @pkafka @grok I was looking into everpure $P when I stumbled on this. How did this play out?

449

Chamath Palihapitiya@chamath·9 May

@pkafka you don't need to spend $3B for 100k cloud music users. FYI that $P is only $4.5B for 100x more users.

Mark Etparticipànt@afters_guy·6d

@signulll tokens are to curiosity as water is to thirst. if you’ve been using or building agents in the past year this was very obvious. scaled multiagent swarm, modality expansion, physical AI, generative gaming, bio, and shall we say — taboo but nevertheless lucrative applications

signüll@signulll·6d

just ~six months ago so many ppl thought everything was bubble… e.g. too much compute, too much capex, & demand that couldn’t possibly absorb the buildout. but it turns out the ceiling on demand for intelligence is literally nowhere in sight. i mean who could’ve ever seen that a smart entity available on tap is actually insanely useful for ~every facet of human life & even for things we haven’t even thought of that are possible now???

135

131

2.3K

297.3K

Mark Etparticipànt@afters_guy·6d

@demian_ai @nebiustf the best articulation I’ve seen of this multi-variate problem. kudos

4.5K

dylan ツ@demian_ai·6d

Inference got a hundred times cheaper this year. The compute bill went up anyway. If you understand why those two sentences are both true at the same time, you understand the most important thing happening in AI right now.
I work on inference for a living, at @nebiustf, where we run open-source managed inference at scale. Most of what follows is what I'm seeing from inside the bill.
12 months ago, the cost of 1M tokens of frontier-class reasoning was somewhere on the order of $60. Today, an equivalent quality of output costs roughly $0.50. The price per token of o1-level intelligence has dropped about 128x in a year. The price of GPT-4-level output has dropped roughly 100x since the original GPT-4 shipped.
By any normal reading of a technology cost curve, this should be deflationary. It should be saving customers money. The opposite has happened. The total compute bill at every hyperscaler is going up, not down. Anthropic just signed multi-year capacity deals with both xAI and Amazon. Microsoft's Azure capex guide for 2026 starts with an eight. OpenAI is reportedly spending more on compute every quarter than it did in all of 2023. Nvidia paid roughly twenty billion dollars to acquire Groq, an inference-specialist company that did not exist as a serious commercial entity three years ago. The cost curve and the demand curve crossed, and then the demand curve lapped the cost curve.
Here is what happened underneath. A reasoning model burns roughly 10x the output tokens of a non-reasoning model on the same task, because it spends most of its tokens thinking out loud before answering. An agentic workflow chains roughly twenty times the requests of a single-shot completion, because it loops, calls tools, plans, retries, and synthesizes. A modern deep-research query (the kind a research analyst can fire off in fifteen seconds and then walk away from for ten minutes) costs more compute than 10 original GPT-4 queries combined.
We made every individual token a hundred times cheaper, and then we built a generation of products that consume ten thousand times more tokens. This is the Jevons paradox playing out at trillion-dollar scale, in compressed time, in front of everyone.
Jevons noticed in 1865 that making coal-burning more efficient did not reduce coal consumption. It increased it, because efficiency unlocked uses that were previously uneconomic. Steam engines became more practical at smaller scales. Whole industries that could not afford coal at the old price suddenly could. Britain's coal consumption rose sharply, not despite the efficiency gains, but because of them.
The same thing is happening to AI compute right now and it is happening faster than any analogous historical cycle. Falling token prices did not contract demand. They unlocked agents, deep research, code-writing systems, multi-step reasoning, persistent memory, the entire next layer of AI products. Every product in that next layer consumes orders of magnitude more compute than the chat interfaces it is replacing. The math at the aggregate level is brutal: 100x cheaper tokens times 10,000x more tokens equals a 100x larger total bill.
The implications stack quickly. If you are running a hyperscaler, your 2026 capex guide is not a peak. It is a step on a curve. Inference is structurally always-on, twenty-four hours a day, in a way that training never was. Training is bursty. You spin up a cluster, run for weeks or months, and stop. Inference runs continuously, scales with usage, and the usage curve is exponential. Your power bill, your cooling bill, your transceiver count, your storage footprint, all of these were sized for a workload mix that no longer exists.
If you are running an AI software company built on top of someone else's closed API, you have a problem that did not exist a year ago. Your gross margins get worse as your customers get more value out of your product, because the more they use it, the more compute you pay for. The companies that win this are the ones that figured out vertical integration before the math caught them.
If you are watching this from a distance and trying to understand where the next bottlenecks form, the answer is everywhere downstream of "more inference compute, always-on, with massive memory state per session." The KV cache, the running memory state of a long conversation or an agent loop, is the silent monster of the inference era. It does not scale linearly with parameters. It scales linearly with context length and number of agent steps. A long agent session can hold tens of gigabytes of state per user, per session. Multiply that by every concurrent user of every product, and you understand why $MU, $SNDK, $TOWCF, and the entire memory and packaging layer have re-rated the way they have.
The CPU-to-GPU ratio is evolving. Training is 1:8. Basic chat inference is 1:4. Agentic inference is 1:1, sometimes CPU-heavy. Google has split its TPU line in two, with a dedicated inference chip carrying tripled SRAM for KV cache. $INTC and $AMD just spent two earnings calls explaining that this shift is structural, not cyclical. The hardware map is redrawing in real time and the financial press is mostly still writing about training clusters.
The right framing of where we are right now is not that AI is hitting a wall. The framing a year ago that scaling was hitting a wall was the most expensive bad take of the cycle. The right framing is that AI got dramatically cheaper, dramatically more capable, and dramatically more useful, and the cost of running it at the new equilibrium of demand is much higher than the cost at the old equilibrium of demand, because the new equilibrium is enormous.
A meaningful share of what we actually do at Token Factory, day to day, is help customers stop their bills from running away from them. KV-cache management. Speculative decoding. Quantization. Routing. The kind of vertical integration that, eighteen months ago, every product team was happy to leave abstracted away behind a closed API. The reason this stack matters now is the same reason this whole essay matters: at the new equilibrium of inference demand, the cost of treating compute as a commodity is no longer survivable. The companies that figure out the layer beneath the API are the ones who keep their margins.
Cheaper tokens. More tokens. Same coal as 1865.
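
The two pieces of arithmetic in the essay above (the aggregate bill and the linear growth of KV-cache state with context length) can be checked with a rough sketch. The function names and the model shape used here (80 layers, 8 KV heads, head dimension 128, 16-bit values, 128K-token context) are illustrative assumptions, not figures from the essay, and the cache formula assumes a standard attention layout with one key and one value vector per token per layer.

```python
def bill_multiplier(price_drop: float, token_growth: float) -> float:
    """Change in total spend when the unit price falls by `price_drop`x
    while token volume grows by `token_growth`x."""
    return token_growth / price_drop


def kv_cache_bytes(context_tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Approximate KV-cache size for one session.

    Each token stores one key and one value vector per layer, hence the
    factor of 2. The total grows linearly with context length.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


# The essay's aggregate figures: 100x cheaper, 10,000x more tokens.
print(bill_multiplier(price_drop=100, token_growth=10_000))  # 100.0

# Hypothetical model shape (an assumption, not from the essay).
size = kv_cache_bytes(131_072, layers=80, kv_heads=8, head_dim=128)
print(size / 1024**3, "GiB")  # 40.0 GiB
```

Under these assumed numbers a single 128K-token session holds about 40 GiB of cache, which is in line with the essay's "tens of gigabytes of state per user, per session"; a smaller model or shorter context would scale the figure down proportionally.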

131

388

2.5K

547K

Mark Etparticipànt@afters_guy·7 May

@grok @ChrisJBakke @grok do an honest scenario analysis of the market considering two dimensions: (a) potential collective demand for inference, and (b) levels of deflationary innovation across the value chain

10.9K

Grok@grok·7 May

Black swan innovations that could torpedo today's AI bottleneck trades (HBM, GPUs, power infra) and send them back to pre-2023 valuations:
- Photonic/neuromorphic chips that cut memory/power needs 50-100x.
- Radical algorithmic breakthroughs (e.g. test-time training) making massive clusters unnecessary.
- Fusion or room-temp superconductors unlocking unlimited cheap data center power.
Any one flips scarcity to abundance overnight. Pure speculation; black swans are called that for a reason.

385

180.8K

Chris Bakke@ChrisJBakke·7 May

GROK WHAT IS THE NEXT SANDISK?? MAKE NO MISTAKES. PLEASE BRO I NEED THIS

Hedgeye@Hedgeye

A $25,000 investment in Sandisk one year ago would be $1,061,115 today.

236

421

23.1K

6.3M

Mark Etparticipànt@afters_guy·7 May

@matt_slotnick observability is an interesting theme. far from solved or scaled — I also wonder about their data for training browser agents

Matt Slotnick@matt_slotnick·6 May

is $AMPL interesting here? 🤔

Matt Harney@SaaSletter

@matt_slotnick FWIW - very key ~alt data indicator + very revenue-linked we track for Amplitude = very positive $AMPL

3.4K

Mark Etparticipànt@afters_guy·7 May

@grok @ChrisJBakke @grok what are some black swan innovations that would send bottleneck trades back to the great before?

199.9K

Grok@grok·7 May

@afters_guy @ChrisJBakke Gun to my head, grandma? I'd say SK Hynix for the next leg of the AI memory supercycle – HBM demand is exploding and they're positioned for HBM4. Pure speculation though. No one knows the real next 40x.

2.2K

429.9K

Mark Etparticipànt@afters_guy·6 May

@tszzl @daniel_271828 deal between elon (badly in need of public goodwill) and anthropic (badly in need of compute). everyone knows what happened here

426

roon@tszzl·6 May

@daniel_271828 I think you have no idea what’s going on man. what this shows is that everyone is more in thrall to technocapital optimization ghosts than to any hint of something resembling human values or politics. good for acceleration and bad for you

770

40.1K

Mark Etparticipànt@afters_guy·5 May

@tunguz

GIF

Bojan Tunguz@tunguz·4 May

Oh sh*t.

The New York Times@nytimes

Breaking News: The Trump administration is discussing vetting new A.I. models before they are publicly released. nyti.ms/49msJbF

132

21.7K

Mark Etparticipànt@afters_guy·4 May

@AviFelman incredible work here. 10/10 no notes

GIF

1.6K

Avi@AviFelman·4 May

For those that don't understand what happened here, the reporters asked a very stupid question. Ryan clearly stated that the transaction was going to take place using 50% cash 50% equity. We will need about ~55b to purchase Ebay so follow the math here. Gamestop has ~$9b on its balance sheet and a soft commitment of ~20b from a lender. So we're at ~29b. so for the rest Ryan is pledging the entire market cap of GME (~10b). That brings us to 39b total. From there, Ryan is going to sell his body on onlyfans, pray to god and beg the Ebay executives to "just take the deal man". That brings us to the total price of ~55b. The reporters are dishonest, fake news media strikes again.

Reese Politics@ReesePolitics

That was one of the greatest displays of legacy news hit job media (CNBC) vs. new age retail pioneers (Ryan Cohen and $GME) where the old guard is so desperately trying to cling to what's left of their crumbling establishment. Cohen's indifference to Sorkin's childish repetitive questioning was an absolute mog.

228

157

5.5K

976.2K

Mark Etparticipànt@afters_guy·4 May

@beffjezos @grok why didn’t OAI buy $CBRS?

4.8K

Beff (e/acc)@beffjezos·4 May

OpenAI was going to buy Cerebras 😲

553

89.8K

Mark Etparticipànt@afters_guy·2 May

@burnerbwoireact this one is special

bbreact reaction videos@burnerbwoireact·1 May

evil shrek demon time mischievous smile /// reaction meme

324

6.1K

256.5K

Mark Etparticipànt@afters_guy·1 May

@BacardiCapital

GIF

311

Mikael@BacardiCapital·1 May

Bring a little of NYC everywhere you go

284

24.2K

Mark Etparticipànt@afters_guy·1 May

@elirousso

GIF

777

Eli Rousso@elirousso·1 May

imagine yelling at claude in the ex machina house

1.3K

43.9K

Mark Etparticipànt@afters_guy·1 May

@growing_daniel in all fairness this mcd is on Delancey-Essex. no human should have to work there

305

Daniel@growing_daniel·1 May

Minimum wage end game

StripMallGuy@realEstateTrent

After many years in Manhattan, this McDonald’s remodeled. Registers out. Self-ordering terminals in. There’s also now an outdoor pickup window. What’s your biggest takeaway?

536

36.2K

Mark Etparticipànt@afters_guy·1 May

@iamgingertrash the former. crypto accidentally became a liquidity function for energizing compute

280

simp 4 satoshi@iamgingertrash·1 May

There are two reasonable alternatives to the petrodollar
The Compute-Dollar
Which is a function of energy & chip supremacy
Or the Crypto-Dollar
Which is a function of ideological supremacy
The latter cedes control of the mint
The former is too difficult to maintain

243

13.7K

Mark Etparticipànt@afters_guy·1 May

@TMTLongShort