dharmafi

8.1K posts

@dharmatrade

https://t.co/WfpZiwMtM5

Joined December 2021
1.1K Following · 9.6K Followers
Pinned Tweet
dharmafi
dharmafi@dharmatrade·
There were 12 apostles of Jesus Christ. 1 betrayed him. 1 denied him. Thus the first church suffered corruption, as churches do to this day. However, his blessed holy mother Mary never abandoned him. She knows the pain of every mother, more than you can ever imagine.
English
9
4
66
20.5K
Eric S. Raymond
Eric S. Raymond@esrtweet·
A few days ago I wrote that disruption from below is the fate that awaits the big AI inference providers. This analysis, which I think is sound, strongly reinforces that projection. The upward pressure on costs is inexorable. But the harder the big providers push to recover a viable revenue stream, the stronger the incentive gets for their customers to bail out. By "bail out" I mean buying on-premises AI engines that sacrifice some of the speed and capability of frontier models in return for giving customers better control over their future inference costs. This is an absolutely classic setup for disruption from below. Because both the expensive frontier models and the cheap low-end models are going to keep getting better. As I pointed out before, at some point the cheap low-end models will get good enough and the bottom will fall out of the cloud inference market. With the cloud inference providers desperately raising their prices rather than lowering them, the jaws of that trap will close faster.
Owen Gregorian@OwenGregorian

The current AI pricing was always going to go away | Arnon Shimoni

The current AI pricing was always going to go away. It just doesn’t make sense.

Microsoft canceled internal Claude Code licenses this week (for whatever reason, even if it’s because they integrated it), Uber blew its entire 2026 AI budget in four months, and GitHub is dropping flat-rate plans across its products.

You’ll see the framing “the AI subsidy era is ending”, which is a polite way of describing what everyone’s been doing when they slap AI features into every tier of their product on a bet that inference costs would keep falling. They didn’t, the cost curve is bending the wrong way, and the labs have no choice except to pass that along.

Did we collectively forget second order thinking?

Each model generation, costs per token did fall in theory, sometimes by 10x, but that was for comparable quality… Lots of people extrapolated and built business models on the extrapolation, which… isn’t how you think about it.

Second-order thinking anyone?

Everyone who deals with road planning knows about induced demand. Each new capability invents new demand. Highways are the textbook case. Add a lane, you get new commutes. The commutes weren’t there before the lane. AI is the same shape. Cheaper inference doesn’t reduce the bill, it expands what people ask the model to do. Now my reasoning queries take >4 minutes, where the old ones took 2m… Agentic workflows make 50 calls where the old workflow made one. Unit cost falls, units explode, and the total spend still goes up. Anyone selling a flat-rate “AI assistant” assumed user behavior wouldn’t change. It did. It always does.

The second problem is that the supply side stopped cooperating: memory and GPU economics are moving against you. Memory got 4x more expensive. GPUs got >95% more expensive. Frontier training and inference run on Nvidia accelerators paired with high-bandwidth memory. The ceiling isn’t transistors anymore, it’s HBM and the advanced packaging that bonds it to the compute die. That ceiling is one factory deep. TSMC’s CoWoS packaging line is the bottleneck for accelerator supply. SK Hynix dominates HBM, with Samsung lagging and Micron behind that. None of them can add capacity overnight. These are 18-to-36 month commitments, minimum, and they were planned for a world that under-forecast demand by an order of magnitude.

So GPU pricing is what scarcity pricing looks like. Top-end accelerators today are roughly 2x more expensive than the previous generation at comparable cluster scale. HBM prices have 4x’d in 18 months. Power and cooling are now real constraints in places nobody used to model power for, which is why every hyperscaler now has a “we’re building a gigawatt campus” story and a nuclear-PPA press release.

Anthropic’s CFO testified under oath this March that the company spent $10 billion on compute and made $5 billion in revenue (Ed Zitron has the math). The labs are underwater on inference. They’re raising prices to keep the lights on. Companies that sold flat-rate AI-everywhere products are now sitting on a margin problem they architected themselves into. The bet was that one of these curves would bend in their favor. None of them did, probably none of them will, certainly not on the timeline their pricing assumed.

What changes from here

The product question shifts. It stops being “where can we add AI?” and starts being “which use cases earn the inference cost they burn?” That’s a harder roadmap to write. It also changes the pricing surface, which is the part most product teams haven’t internalized.

Three architectures handle a moving cost. None of them are new. All of them are uncomfortable for sales teams that grew up selling seats.

Per-action. Every API call, every generation, every agent step has a price. Revenue scales with cost because they’re indexed to the same underlying event. Twilio has run this since 2008. AWS has run a version of it since 2006. The downside is that transparency cuts both ways. Customers see the meter, and they negotiate. The upside is your gross margin doesn’t depend on guessing how hard your power users will hammer the system.

Credits. Prepaid buckets. Customer buys 100,000 credits, burns them down on whatever, refills. Credits smooth cash flow and let you mix model costs behind a single unit, which is the only sane way to handle a product that routes between five different inference providers. The trap is breakage. Snowflake credits are infrastructure, customers understand what they’re buying. Gift-card credits are stranded assets, and customers can tell which one they bought. You only get to do the second one once.

Hybrid. Base seat with included credits and metered overage. Most enterprise sales motions accept this without flinching, because the seat number still anchors the contract and the meter is the safety valve. It’s the design most AI-native products converge to within their first repricing cycle. Not my favourite, but whatever, it tends to work.

The shape isn’t the point by itself; what matters is whether the revenue line moves when the cost line moves. Per-seat is the one architecture that pretends costs are fixed. Everything else is some flavor of indexing revenue to the underlying event.

The impossible choice

If your pricing can move with cost, you get to keep building. You can ship the agentic workflow, the heavier reasoning model, the slow expensive feature for power users, and you have a way to be paid for them.

If you’re locked into per-seat (or flat, or whatever), you pick between two losing options. Eat the margin and watch it compress every quarter your customers’ usage grows. Or strip AI out of your cheaper tiers and watch your activation rate fall off the lower-priced cohorts that used to be your funnel.

Both options are visible on the next board deck. Neither one of them looks fun.

arnon.dk/the-current-ai…
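The margin argument in the quoted article can be checked with a few lines of arithmetic. The following is a minimal sketch, not anything from the article itself: the `Usage` type, the function names, and every price and volume are hypothetical, chosen only to show how gross margin behaves under per-seat, per-action, and hybrid pricing when usage grows 5x.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    seats: int              # paying seats on the contract
    actions: int            # billable events (API calls, agent steps) in the period
    cost_per_action: float  # hypothetical inference cost the vendor pays per event


def per_seat_revenue(u: Usage, seat_price: float) -> float:
    # Flat pricing: revenue ignores how many actions were run.
    return u.seats * seat_price


def per_action_revenue(u: Usage, price_per_action: float) -> float:
    # Metered pricing: revenue is indexed to the same event that drives cost.
    return u.actions * price_per_action


def hybrid_revenue(u: Usage, seat_price: float,
                   included_per_seat: int, overage_price: float) -> float:
    # Base seat fee, a bundle of included actions, and metered overage beyond it.
    included = u.seats * included_per_seat
    overage = max(0, u.actions - included)
    return u.seats * seat_price + overage * overage_price


def gross_margin(revenue: float, u: Usage) -> float:
    # (revenue - inference cost) / revenue
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    cost = u.actions * u.cost_per_action
    return (revenue - cost) / revenue


if __name__ == "__main__":
    # Assumed scenario: same 10 seats, usage grows from 10k to 50k actions.
    light = Usage(seats=10, actions=10_000, cost_per_action=0.01)
    heavy = Usage(seats=10, actions=50_000, cost_per_action=0.01)
    for name, u in (("light", light), ("heavy", heavy)):
        seat = gross_margin(per_seat_revenue(u, 30.0), u)
        action = gross_margin(per_action_revenue(u, 0.03), u)
        hybrid = gross_margin(hybrid_revenue(u, 30.0, 1_000, 0.03), u)
        print(f"{name}: per-seat {seat:.0%}, per-action {action:.0%}, hybrid {hybrid:.0%}")
```

With these assumed numbers the per-seat margin turns negative once usage grows, while the per-action and hybrid margins hold steady, which is the article's point about indexing revenue to the underlying event. The credits model is left out because its margin depends on breakage, which the article does not quantify.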

English
29
22
253
19.9K
dharmafi
dharmafi@dharmatrade·
@filpizlo 1) What 😅 So that dude was out there jamming by himself with a portable speaker? If so, amazing.
English
1
0
1
154
dharmafi
dharmafi@dharmatrade·
@trq212 Older codebases that were ahead of their time can also be explored. Plan 9 is around 200,000 lines of code. That's kernel, compilers, network, windowing system. Who cares if an LLM wasn't trained on Plan 9? It can learn the entire system on the fly. x.com/dharmatrade/st…
dharmafi@dharmatrade

@code Lines of code:
40M Linux kernel
15M GCC compiler
200K Entire plan9 operating system (kernel, compilers, networking, windowing system, commands)

English
0
0
0
313
Thariq
Thariq@trq212·
my main takeaway from the Bun rewrite is that legacy codebases will be incredibly valuable as a source for "distilling" code into new forms

every game should be crossplatform, all legacy software should work on the web, we don't need COBOL anymore
English
115
58
1.7K
148K
dharmafi
dharmafi@dharmatrade·
Here's a walkthrough of how I use the 9front distribution of Plan 9 on Windows.

Topics covered:
QEMU
drawterm
Running arbitrary Plan 9 commands from Ubuntu
Using exportfs to mount a Plan 9 directory on Ubuntu
Using @OpenAI Codex with Plan 9
Editing files with vscode (@code)
English
1
1
13
9.5K
dharmafi
dharmafi@dharmatrade·
@dhh The Linux kernel is 40 million lines of code. The Plan 9 kernel, compilers, networking, window system, and commands together are 200,000 lines of code. Agents have a field day with this. And the complexity is within reach of what a single human can understand. x.com/dharmatrade/st…
dharmafi@dharmatrade

@code Lines of code:
40M Linux kernel
15M GCC compiler
200K Entire plan9 operating system (kernel, compilers, networking, windowing system, commands)

English
0
0
0
380
DHH
DHH@dhh·
The reason agents are so good at Linux is that all 40 million lines of kernel code were part of the pre-training. Along with every other open source dependency. This really does make every obscure error message shallow, and the system completely malleable.
English
108
154
3.2K
178.7K
dharmafi
dharmafi@dharmatrade·
@code Lines of code:
40M Linux kernel
15M GCC compiler
200K Entire plan9 operating system (kernel, compilers, networking, windowing system, commands)
English
1
0
1
1K
Filip Jerzy Pizło
Filip Jerzy Pizło@filpizlo·
Yeah, exactly. Folks add stuff to websites (ads, trackers, cool frameworks that do nifty hamburger animations and responsive design, etc.) right up to the limit of what will work on any device they can test on. Simultaneously, folks like us make the browsers more efficient, with fancy object representations and whatnot, to make larger pages render on smaller devices. These two forces lead to bloated content. It’s a perverse form of induced demand.
English
4
0
35
2.2K
The Lunduke Journal
The Lunduke Journal@LundukeJournal·
Over HALF A GIG to render a single webpage. And that's being celebrated as a win because it's LESS THAN A GIG. I think @ladybirdbrowser is an important project, and nothing but love for @awesomekling. But HALF A GIG? I mean... I get it. We've all been conditioned to think that a single webpage (or a "web app") should use a tremendous amount of RAM. But it SHOULDN'T. If a webpage is taking 500MB+ just to render? That is a tremendous failure. And, yes, that means that Chrome, Firefox, etc... all of them are ridiculous, absurd failures of software engineering. Bloat to the extreme. Make Software Efficient Again.
Andreas Kling@awesomekling

Been grinding away at memory usage in @ladybirdbrowser over the last two weeks. Making some good progress! Here's how we do on my X profile page 1 month ago vs today:

English
65
36
638
243.9K
geoff
geoff@GeoffreyHuntley·
@zex_exe haskell but not good enough to build what's in my head. if i had one last language with a gun to my head. haskell.
English
3
0
2
152
geoff
geoff@GeoffreyHuntley·
home sweet home. time to do a brain dump into a blog post before SF starts speed running the unhinged concept i have been pitching and cryptically tweeting.

alright, if we zoom time back far enough, you’ll end up with the attached: the family tree of operating systems. it’s important to consider this. there was once a time when solaris, irix, hpux and aix existed and ruled the world. eventually operating systems converged. now we only have linux, macos and windows.

now you might be wondering why i’m hammering on about operating systems. it’s because i’m actually talking about PROGRAMMING LANGUAGES. programming languages are due for a similar convergence event: a programming language (which also might double as a unikernel distributed operating system) that is designed for agents first.

programming languages are specifications for machines. markdown is not a programming language. my hot take is that a product factory, loom, will not be possible to solve properly until a convergence event. it’s also why i think a software factory will not be possible via gas town or many of the offerings coming to market this year.

face it - programming languages have been specifications of machines but designed for humans. this can be falsified. the adoption issues that developers and programming language authors hold dear can be systematically falsified. we need a programming language designed for agents, for machines. it doesn’t need to be understandable by humans but it does need to be explainable by a machine to a human (via a prompt).

so i’ve been enumerating through eso langs and exploring what this could perhaps look like. the training data problem is solvable and my hypo is the convergence event could happen within a couple years. not 10 years.

if this seems bat shit insane. well, it is, but within my head it’s all logical and the steps needed to make it happen are pretty clear.

not all languages that exist today will exist tomorrow. for example we no longer use solaris and in the distant future i deeply believe dynamically typed languages are solaris and we’ll cease using them.
English
25
9
125
12.4K
dharmafi
dharmafi@dharmatrade·
@waozixyz I like macros. Operator overloading. The weird Perl 6 stuff.
English
0
0
1
13
Waozi (哇子)
Waozi (哇子)@waozixyz·
Should syntax be customizable in a playful implementation?
English
2
0
3
390
Waozi (哇子)
Waozi (哇子)@waozixyz·
What do you like more aesthetically?
English
1
0
4
275