
◢
@joemccann
“Only a fool trips on what’s behind them.” — Iceberg Slim ◢ Founder, CEO, Asymmetric, @NodeSource the @nodejs company (acq: i2 Holdings)







Magical OpenClaw experiences that use frontier models cost $300-1,000/day today, heading to $10,000/day and more. The future shape of the entire technology industry will be how to drive that to $20/month.

You of all people have seen this movie before. Inference demand will explode, but current pricing models aren’t sustainable; the tech *itself* needs to drive costs lower. Right now engineers and users are hacking their way to lower token costs, for example by editing the original prompt instead of continuing a conversation, or by routing each task to a cheaper model when one will do. Normies won’t do this, so it must be part of the product by default. The physical tech (chips, compute, energy) must be the major unlock for the massive drop in costs. Atomic energy companies like @AaloAtomics will drive energy costs down dramatically, and optical AI accelerator chips like @neurophos enhance inference by 10,000% (vs. NVIDIA) while using 99% less energy. There are of course other companies with similar visions, but the path is clear - inference demand is wildly underestimated, and improvements in the core technology itself will bring price stability and accessibility to all. There is no other way. But you already know this.
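A rough sketch of why those two hacks save money, for anyone who wants to see the arithmetic. It assumes the API bills input tokens for the full history on every turn. The function names, task labels, and model tiers are hypothetical and not taken from any real provider.

```python
def continued_input_tokens(prompt_tokens, reply_tokens, turns):
    """Input tokens billed when each turn resends the whole prior conversation."""
    total = 0
    history = 0
    for _ in range(turns):
        history += prompt_tokens   # new user message joins the history
        total += history           # the full history is sent as input
        history += reply_tokens    # the model's reply is carried into the next turn
    return total


def edited_input_tokens(prompt_tokens, turns):
    """Input tokens billed when the original prompt is edited and resent fresh."""
    # Assumption: each edited prompt is about the same size and carries no history.
    return prompt_tokens * turns


# Hypothetical task labels that a cheaper model is assumed to handle well.
CHEAP_TASKS = {"classify", "extract", "summarize"}


def pick_model(task):
    """Route a task to a hypothetical 'small' or 'large' model tier."""
    return "small" if task in CHEAP_TASKS else "large"


if __name__ == "__main__":
    print(continued_input_tokens(500, 300, 4))
    print(edited_input_tokens(500, 4))
    print(pick_model("summarize"), pick_model("plan_refactor"))
```

Under these assumptions the continued conversation grows roughly quadratically with the number of turns, while the edited prompt grows linearly, which is why the hack works and why a product could apply it by default.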


What is AI inference engineering, why is it such an in-demand skill, and how do you break into the field?

With author of Inference Engineering @philipkiely and head of training at Baseten @oneill_c

0:00: What is inference?
2:47: History of inference
4:59: Downstream effects of AI research on inference
13:54: What you'll learn from Inference Engineering
16:14: Advice for engineers transitioning into AI
19:00: Open source models driving inference growth
20:55: Specialization vs. frontier closed models
23:51: "Big Token" and the importance of open source AI
27:18: Where to get Inference Engineering






