Jacob Kieser
30 posts

Jacob Kieser
@JacobKieser
Founder @embrasureai (sr007) - autonomous data warehouses. Studied cs @uw
San Francisco, CA · Joined December 2014
194 Following · 185 Followers

How to keep AI spend flat while token usage grows exponentially: Not with friction and spend alerts. With better defaults, routing, and caching.
Better Defaults (not Usage Caps) – Engineers can choose any model they want, but defaults matter. We’re experimenting with defaulting to open-weight models like GLM 5.2 and Kimi 2.7 through our LLM gateway, while still encouraging engineers to choose the right model for the task. 91% of our employees never hit their usage caps, so instead of lowering caps and driving up alerts, we're moving to cheaper defaults. Note that code reviews use a mix of models, so they can check each other's work.
Better Routing – In our custom harnesses, we preprocess prompts and route to the best model for the job, considering cache hits and model pricing. For instance, you may want a frontier model for planning, but not for execution, where it can be overkill. Ultimately, humans shouldn't be choosing models - AI can automate this task.
Better Caching – Cache misses are the easiest way to drive your cost up. All of our requests are cache-aware, so we’re reusing a warm cache wherever possible. For example, our cache hit rate in LibreChat went from 5% → 60% once caching was properly implemented.
Keep Context Lean – Start fresh sessions when switching tasks. Scope file context narrowly. Disconnect unused tools. Don't just compact. The goal isn't fewer tokens used, it's fewer tokens wasted.
Better Visibility – Our engineers can use as many tokens as they want, from whatever model they want, but we’ve made usage visible – and the more you spend on AI, the more impact we expect.
The goal isn't to suppress usage. It's to build the infrastructure that makes exponential growth sustainable.
Putting this into practice has cut our AI spend nearly in half, while our token usage continues to grow.
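A minimal sketch of what cache-aware, price-aware routing could look like, not our actual gateway code. The model names, prices, tiers, and the `route` / `estimate_cost` helpers are all hypothetical, made up for illustration. The idea: restrict to models capable enough for the task (frontier for planning, anything balanced or better for execution), then pick whichever is cheapest once warm-cache tokens are priced at the cached rate.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model name
    input_price: float         # USD per 1M uncached input tokens (made-up)
    cached_input_price: float  # USD per 1M cached input tokens (made-up)
    output_price: float        # USD per 1M output tokens (made-up)
    tier: int                  # 2 = balanced, 3 = frontier (assumed scale)


# Hypothetical catalog; real prices and names will differ.
MODELS = [
    Model("balanced-a", 1.0, 0.10, 4.0, tier=2),
    Model("balanced-b", 0.8, 0.08, 3.2, tier=2),
    Model("frontier", 5.0, 0.50, 20.0, tier=3),
]

# Assumed policy: planning needs a frontier model, execution does not.
MIN_TIER = {"planning": 3, "execution": 2}


def estimate_cost(model: Model, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request, billing cached input at the cached rate."""
    cached = min(cached_tokens, input_tokens)
    uncached = input_tokens - cached
    total = (uncached * model.input_price
             + cached * model.cached_input_price
             + output_tokens * model.output_price)
    return total / 1_000_000


def route(task: str, input_tokens: int, output_tokens: int,
          warm_cache: Optional[Dict[str, int]] = None) -> Model:
    """Pick the cheapest eligible model for the task.

    warm_cache maps model name -> number of prompt tokens already cached there.
    """
    warm_cache = warm_cache or {}
    eligible = [m for m in MODELS if m.tier >= MIN_TIER[task]]
    if not eligible:
        raise ValueError(f"no model is capable enough for task {task!r}")
    return min(
        eligible,
        key=lambda m: estimate_cost(m, input_tokens, output_tokens,
                                    warm_cache.get(m.name, 0)),
    )
```

With these made-up numbers, `route("execution", 100_000, 2_000)` picks `balanced-b` on a cold cache, but passing `warm_cache={"balanced-a": 90_000}` flips the choice to `balanced-a`, because a mostly cached prompt on the pricier model costs less than a cold prompt on the cheaper one. That is the reason to route on cache state and not on list price alone.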


@AdamHoltererer I guess we will find out if they ever get it released 🤷‍♂️

@SnowyLake9 @OpenAI Haiku is dead imo; self-hosted small open-source models for workloads like classification, etc.

Introducing a limited preview of GPT-5.6 Sol, our next generation frontier model, as well as GPT-5.6 Terra, a balanced model for efficient, everyday work, and GPT-5.6 Luna, a fast and affordable model for high-volume work.
openai.com/index/previewi…

We’re excited to announce that @meconemarkets will be joining @a16z @speedrun as part of the SR007 cohort. I’m also happy (and sad) to say that I’ll be leaving Stanford Law School to pursue this dream full time.
Mecone started from a core thesis: the financial system of the future will be built on blockchain rails with Perpetual Futures (“Perps”) as one of the primary tools for speculation and hedging. This future will lead to markets that are more open and global, markets where participants of every kind, retail and institutional alike, trade more of the world’s assets with less friction than ever before.
With this future in mind, we’re building the financial infrastructure to make everything tradeable.
Today, Perps only exist on a handful of assets — the biggest commodities, FX, public equities, crypto. But there are a myriad of highly attractive long-tail assets that Perps have failed to reach.
Mecone exists to change that.
We build continuously updating benchmarks for long-tail underlyings such as fine art, real estate, pre-IPO companies, macroeconomic indicators, and many more.
When @hugo_stack, @jaffarkeikei, and I first met, none of us imagined we’d start a company. But the prospect of making some of the global economy’s most valuable and illiquid assets tradeable by anyone, anywhere, anytime was too interesting a challenge to pass up.
We’re working to list our indices on exchanges now! Join the waitlist and we’ll let you know the moment they go live. mecone.trade
Big shoutout to Halim Labi, Aristotle Mannan, and Reuben Youngblum for supporting us from the very beginning. Also, thank you to @JoshLu, @Chen, @justmazer, @tmhammer, @kenanhsaleh, @_CallMeMacy, @emilybenn12, @tkexpress11 and the whole a16z speedrun team. Thank you @luca_skarlo and the Skarlo® team for whipping up an incredible website on such short notice. And, most importantly, we couldn't have done any of this without our families. Love you guys!




@JacobKieser the easier a product is to understand, the easier it is to recommend. followed.

OpenAI just made a huge decision, one I think will be extremely positive for them:
Anthropic has been killing it with mass appeal with names that people can actually understand. Nobody understands the difference between 5.5/5.4/5.3 Spark/etc besides people deep in the ecosystem.
Talking to some of my friends not in tech, everyone knows Mythos/Opus/Sonnet, nobody knows what the latest GPT model is. If they have a bad experience with 5.3, then GPTs are bad across the board.
OpenAI @OpenAI
Introducing a limited preview of GPT-5.6 Sol, our next generation frontier model, as well as GPT-5.6 Terra, a balanced model for efficient, everyday work, and GPT-5.6 Luna, a fast and affordable model for high-volume work. openai.com/index/previewi…

@OpenAI I wonder where they got this naming inspiration from 🤔

@garrytan Dropbox and other consumer file stores are getting replaced IMO, if I want a file I’ll ask my agent

@JacobKieser Nope, Palantir and Ramp don’t qualify here, very different

@jackprice x.com/JacobKieser/st… just wrote this tweet and saw yours, everyone needs to think like this all the time
Jacob Kieser @JacobKieser
Every interesting use case of AI comes with the premise that you are not worried about token usage, like openclaw for example - infinite loop of prompts. Anyone innovating in the AI space should think about new use cases with the frame of token abundance, it's why AI labs keep innovating new use cases.

Every interesting use case of AI comes with the premise that you are not worried about token usage, like openclaw for example - infinite loop of prompts.
Anyone innovating in the AI space should think about new use cases with the frame of token abundance, it's why AI labs keep innovating new use cases.

New to tech X. Will follow and talk to anyone building cool stuff. Comment/DM and drop me a follow if you want to talk AI, VC, getting into speedrun and YC, or anything else.
dropping keywords to get on feeds:
founder, anthropic, a16z, building, gpt, 5.5, 996, codex, corgi cafe, roy lee, fable, openai
am i doing this right?

@JacobKieser @modaicdev @a16z @speedrun Appreciate it.🙏🏾 this shit cost too much to not be at least a little lit.

Today we’re launching @modaicdev @a16z @speedrun 006, the fastest way to render your judgement into reliable decision automation.

Fable 5 is cool, but can it rip through SOC 2 compliance autonomously in 24 hours under rate limits?
actually maybe, but s/o @TrustVanta + @OpenAI Codex, cooking for @EmbrasureAI






