Jonas Templestein

4.9K posts

Jonas Templestein
@jonas

CEO https://t.co/7dJOmc0va5, prev. cofounder/CTO Monzo, dad of three

Joined October 2009
2.7K Following · 8.3K Followers

Pinned Tweet
Jonas Templestein @jonas ·
2025 will be the year we see the first self-driving startups.

Level 0: No AI
People do everything. They come up with ideas, build products, and run operations. Many legacy businesses still work this way.

Level 1: People use AI tools ⬅︎ we are here
People might use ChatGPT to help write copy or Cursor to help write code. This is where most startups are today.

Level 2: AI agents complete tasks based on human instructions
People might ask AI agents to write software from a plain-English spec or tell them to execute well-defined customer service processes. At this point entire departments (like support or QA) get largely replaced by AI. No startups I know of operate at this level yet, but if yours does, let me know.

Level 3: AI agents propose changes to their own instructions
They might propose new customer service processes and product changes in response to customer feedback. Humans would still approve each of those changes. Just a few people could run a large company this way.

Level 4: AI agents autonomously change their instructions
At this point startups become self-improving. Humans would only be involved as an escalation point or where required by the real world (e.g. to raise capital or to incorporate). Many startups would have only one human.

Level 5: No humans
AI agents decide which businesses to start, raise capital (through crypto tokens or other means), and build and run them. No humans required. This would require major reforms in the legal and financial system.
19
21
216
75.4K
Mario Zechner @badlogicgames ·
people of pi.dev. i'm removing all tools from pi without replacement. get creative.
max@maxjendrall

@YoniBraslaver @badlogicgames oh god, please don't switch out all read, write, edit, bash tools for code mode. Would be in the spirit of extensibility tho hahaha "pi has 1 tool. deal with it"

58
5
459
140.2K
Jonas Templestein
I just realised you can use capnweb to build a lightning fast poor-man's version of cloudflare tunnels 🤯 You just need to write a tiny durable object class that hosts a capnweb session, and then write a tiny client side utility that connects to it. We use it to e2e test deployed workers: our vitest test runner tunnels into the deployed worker and can then receive normal fetch(Request) -> Promise<Response> requests from the worker.
5
2
38
3.7K
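The tunnel pattern described above can be sketched without any of the real moving parts. All names below are hypothetical: capnweb, Durable Objects and WebSockets are replaced by a plain in-process object so the example runs anywhere. The key inversion survives, though: the client registers a handler over a long-lived session, and the deployed side pushes fetch-style requests back through it.

```javascript
// Minimal sketch of a "reverse tunnel". In the real setup a Durable Object
// hosts a capnweb session over a WebSocket; here an in-process object stands
// in so the shape of the pattern is visible.
class TunnelHub {
  constructor() {
    this.handler = null;
  }
  // test-runner side: register a fetch(Request) -> Response style handler
  connect(handler) {
    this.handler = handler;
  }
  // deployed-worker side: forward a request out to the connected client
  forward(request) {
    if (!this.handler) throw new Error("no client connected");
    return this.handler(request); // async in the real, WebSocket-backed version
  }
}

const hub = new TunnelHub();
hub.connect((request) => ({ status: 200, body: `echo:${request.url}` }));

const response = hub.forward({ url: "/health" });
console.log(response.body); // "echo:/health"
```

The e2e-testing use falls out of this shape: the worker under test calls `forward`, and the assertion logic lives in the handler the test runner connected.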
Rhys @RhysSullivan ·
@viemccoy i'll bite, what would you name executor.sh? mission is to enable all of your agents to call all of your tools. a place where your agents go to get work done, collaborate with you, shareable with your team. along with one-off scripts, also generative ui & workflows
5
0
13
4.3K
𝚟𝚒𝚎 ⟢ @viemccoy ·
GDP would double if all of you decided to let me name your startups. If you can't find the True Name of your endeavor, it may succeed, but only partway to what could have been.
36
3
228
14.5K
Jonas Templestein
I guess it doesn't matter. If you have a training set of code tokens, it's easy to convert that to a training set of escaped code. But all else being equal I still prefer not to use tool calls at all, because you don't have to deal with the arcane rules of the tool-calling APIs (e.g. ordering of input/output items in the OpenAI Responses API)
1
0
0
13
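The conversion mentioned here is just JSON string escaping, which is mechanical and lossless; a quick sketch:

```javascript
// Turning "plain code" text into the escaped form a tool-call argument uses
// is just JSON string escaping, and it round-trips losslessly.
const code = 'const s = "hi";\nconsole.log(s);';

const escaped = JSON.stringify({ code }); // newline becomes the two characters \n
const roundTripped = JSON.parse(escaped).code;

console.log(escaped.includes("\\n"));   // true
console.log(roundTripped === code);     // true
```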
Jonas Templestein
Am having pretty good success asking agents to "Respond with a single triple-backtick block of javascript code".

No "tool calling" (in the LLM post-training sense) involved. This means the LLM doesn't have to produce javascript code escaped inside a json object, which seems like a good thing.

Is anyone else doing this? Has anyone done formal benchmarks to see whether the benefit of not having to write escaped code is outweighed by the post-training bias towards tool calling (which is really just outputting special marker tokens with json in between)?

Tool calling, just like the formalised assistant/user message framing, feels like it may be a bit of a local maximum. But we might never find out, because of the v large investment in the current post-training format.
[image]
1
1
1
501
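The client side of this prompt format is a small parser. A sketch, with names of my own choosing (the regex assumes the model was asked for exactly one block, so the first match wins):

```javascript
// Extract the single triple-backtick javascript block from a model response.
function extractCodeBlock(response) {
  const match = response.match(/```(?:javascript|js)?\s*\n([\s\S]*?)```/);
  if (!match) throw new Error("no triple-backtick block in response");
  return match[1].trimEnd();
}

const reply = "Here you go:\n```javascript\nconsole.log(1 + 1);\n```\nDone.";
console.log(extractCodeBlock(reply)); // "console.log(1 + 1);"
```

Note there is no JSON layer anywhere: the extracted string is the code, with no unescaping step.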
Jonas Templestein
@gingerhendrixai How does claude do tool calling without escaping the tool arguments? Are you saying the raw output tokens are no longer json shaped? Would love to read more about it
1
0
0
59
Gareth Andrew @gingerhendrixai ·
Yes! Pure code-act. I spent a while getting this to work on dynamic workers when they were new. You used to see more of this, e.g. oai's swelancer harness. I've concluded it's probably not worth the effort though. Function calling is already optimized for major providers (e.g. no double escaping in Claude; there's a hidden XML transformation in the middle), so no reason to believe backticks are better.
1
0
2
51
Jonas Templestein
@mmkalmmkal There's this joke about how the amount of leverage workers have in a tech company is inversely proportional to the size of their primary screen. Devs have 5 screens, their managers use the MacBook screen, and the execs use an iPhone.
1
0
2
94
Jonas Templestein reposted
sam @samgoodwin89 ·
Generating SDKs from APIs is better done by coding agents now than with tools like Stainless.

In the real world, every spec is wrong, incomplete and inconsistent. Someone has to go and patch the spec before you can get good results with a rigid code generator.

And Stainless APIs still don't give you the errors! They produce nice looking SDKs, but lack the most critical aspect of APIs: the "unhappy paths", which are usually far more numerous than happy paths, and are what makes the difference between a great and a terrible UX.

Stainless supports many targets, but ask anyone who's used Cloudflare's terraform provider and you'll quickly realize that it's not magic. If the spec sucks, the provider sucks. And most specs suck.

Distilled and Alchemy address this with AI. We use coding agents for 100%, so each new SDK we onboard is effectively "hand crafted". AI adapts "manually" to the nuances and weirdness of the spec and API. We share some code, but we don't try and squeeze specs into one code generator. Every time we make one, it becomes useful context for the next one and drives the flywheel.

Since we are targeting Effect, we value errors more than the happy path. None of the APIs we've worked with except for AWS have documented their errors in the spec. And AWS still hasn't documented 100% (maybe 80-90% at best). AI patches these missing errors (and categorizes them as retryable, etc.) by interacting with the service and observing its actual behavior.

This then feeds into Alchemy, which uses AI to generate hand-crafted IaC resources and our Effect abstraction on top. This generation process reverse engineers the API's actual behavior and produces: 1) Effect-native SDKs for every cloud, 2) IaC Resources for every cloud, and 3) Alchemy Bindings for every cloud API.
Techmeme@Techmeme

Source: Anthropic is in advanced talks to acquire New York-based Stainless, which helps developers generate SDKs from APIs, for at least $300M (The Information) (Visit Techmeme dot com for the link and full context!)

8
5
104
9.3K
Jonas Templestein
@threepointone Why not just execute(“search(…)”)? And why tool calling (i.e. asking the LLM to produce code in json) and not just “respond with code”?
1
0
1
283
sunil pai @threepointone ·
starting to think now that every agent should have just 2 tools: search and execute. we _want_ agents to have access to 100s, if not 1000s of capabilities, that can contextually change during their lifetimes, even per message. saying stuff like "just use bash" doesn't encompass 3rd party apis, and you don't want to keep switching up the base prompt all the time. you gotta generalise that. I also guess search has to be semantic, so probably something with a vector db type thing. does it run on every message? probably...
sunil pai@threepointone

maybe every mcp server can be 3 tools:
- describe(filter) => schema: get all capabilities
- search(input) => toolcalls[]: for a (maybe unstructured) language query (+ opt. metadata) get a sequence/tree of toolcalls
- execute(tools) => result: take that list above and run it

44
11
247
44.6K
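The two-tool surface above can be sketched with a tiny capability registry. Everything here is illustrative: the tool names are invented, and `search` uses naive keyword matching where the tweet suggests semantic/vector search.

```javascript
// A registry the agent can `search` for capabilities and `execute` to run
// what it found. Capabilities can be added or removed at any time without
// touching the base prompt.
const registry = new Map([
  ["getWeather", { description: "current weather for a city", fn: (city) => `sunny in ${city}` }],
  ["addNumbers", { description: "add two numbers", fn: (a, b) => a + b }],
]);

function search(query) {
  // real version: embed the query and rank capabilities by similarity
  return [...registry.entries()]
    .filter(([name, t]) => t.description.includes(query) || name.toLowerCase().includes(query))
    .map(([name, t]) => ({ name, description: t.description }));
}

function execute(name, args) {
  const tool = registry.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.fn(...args);
}

console.log(search("weather")); // one hit: getWeather
console.log(execute("addNumbers", [2, 3])); // 5
```

The point of the split is that only `search` and `execute` ever appear in the model's tool schema; the hundreds of underlying capabilities live behind them.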
Jonas Templestein @jonas ·
@badlogicgames I built one with the kids. it’s v fun! the company is French I think so no problem to get one in Europe
0
0
1
182
Jonas Templestein @jonas ·
@badlogicgames But at least they put screws on every battery compartment now for safety! So when the time comes to throw away the toy, it gets thrown away with batteries inside, because parents can't be bothered to unscrew four tiny screws
0
0
1
182
Mario Zechner @badlogicgames ·
i swear they make these extra unrepairable. this would be a 2 minute soldering job if they didn't hide those damn screws in those narrow shafts. planned obsolescence in kids' toys is the worst.
[image] [image]
7
1
64
12.5K
Jonas Templestein reposted
Paul Graham @paulg ·
Sure you can earn a billion dollars. I've been teaching people how to do it for 20 years. The way you do it is to start a company that grows fast. You don't have to do anything bad to make a company grow fast. You just have to make something people want. paulgraham.com/ace.html
Marco Foster@MarcoFoster_

AOC: “There’s a certain level of wealth and accumulation that is unearned. You can’t earn a billion dollars. You just can’t earn that. You can get market power, you can break rules, you can abuse labor laws, you can pay people less than what they’re worth, but you can’t earn that”

555
762
11.3K
3.1M
Misha Kaletsky @mmkalmmkal ·
And sorry, to clarify, you don't need to generate json schema. It IS json schema!
[image]
1
0
3
745
Misha Kaletsky @mmkalmmkal ·
I think I've been sleeping on typebox. This is pretty magical. Plain typescript input, produces valid json-schema, and parses it. In one library that's smaller than zod, valibot or arktype. Feels very codemode-friendly...
[image]
12
4
110
11.6K
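The "it IS json schema" observation above can be shown with a hand-rolled sketch; typebox itself is not assumed installed here, and this mini `Type` object only mimics its core idea (real typebox also infers a static TypeScript type from the same value).

```javascript
// typebox's trick, hand-rolled: the schema you build with the fluent API
// *is already* a JSON Schema object, so no separate generation step exists.
const Type = {
  Object: (properties) => ({ type: "object", properties, required: Object.keys(properties) }),
  String: () => ({ type: "string" }),
  Number: () => ({ type: "number" }),
};

const User = Type.Object({ name: Type.String(), age: Type.Number() });

// the value is directly usable anywhere JSON Schema is expected,
// e.g. as a tool-call parameter schema
console.log(JSON.stringify(User));
console.log(User.properties.age.type); // "number"
```

That is what makes it "codemode-friendly": the same object drives validation and can be handed to a model as a tool schema without translation.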
sunil pai @threepointone ·
@jonas been using pen and paper to write down ideas like a fkin troglodyte. more yapping when it becomes concrete!
1
0
1
160