Apple’s recent innovation bets — from a car to a VR headset to an AI-native product — have all flopped. Where is Apple headed?
juli1.substack.com/p/where-is-app…
In a world where model performance changes week over week, you have to build evals very early and run them constantly to see whether a change actually improves your tool's accuracy or performance.
Improve prompt -> run evals -> accept or reject changes
Rinse and repeat
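A minimal sketch of that loop, assuming a tiny hand-written eval set and exact-match scoring. The names here (`EVAL_CASES`, `run_evals`, `accept_change`, and the two stand-in tools) are hypothetical; in practice each tool would wrap an LLM call with a specific prompt.

```python
from typing import Callable, List, Tuple

# Hypothetical eval set: (input, expected output) pairs.
EVAL_CASES: List[Tuple[str, str]] = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]


def run_evals(tool: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases where the tool's output matches exactly."""
    passed = sum(1 for question, expected in cases if tool(question).strip() == expected)
    return passed / len(cases)


def accept_change(
    baseline: Callable[[str], str],
    candidate: Callable[[str], str],
    cases: List[Tuple[str, str]],
    min_gain: float = 0.0,
) -> bool:
    """Accept the candidate only if it beats the baseline by more than min_gain."""
    return run_evals(candidate, cases) - run_evals(baseline, cases) > min_gain


# Stand-ins for "tool with old prompt" and "tool with new prompt" (assumptions;
# real versions would call a model).
def baseline_tool(question: str) -> str:
    return {"2+2": "4"}.get(question, "?")


def candidate_tool(question: str) -> str:
    return {"2+2": "4", "3*3": "9"}.get(question, "?")


if __name__ == "__main__":
    print("baseline :", run_evals(baseline_tool, EVAL_CASES))
    print("candidate:", run_evals(candidate_tool, EVAL_CASES))
    print("accept?  :", accept_change(baseline_tool, candidate_tool, EVAL_CASES))
```

Exact match is the simplest possible scorer; real evals often need fuzzier checks, and model outputs vary from run to run, so a `min_gain` above zero helps avoid accepting noise.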
I have been digging deeper into coding agents and spent some time with Aider. My biggest takeaway so far: all of these tools do essentially one thing, which is improving the prompt and the underlying tools fed to the LLM.
juli1.substack.com/p/anatomy-of-a…