
plannotator
@plannotator
580 posts

Annotate agent plans and review code visually. Share with your team, iterate, and send feedback to your agent with one click • https://t.co/3Gg5SohuqE

Translation: "Congratulations, you get 25X less usage" Is my translation misleading? Yes Is their announcement misleading. YES



1/3: The compound planning skill can now introduce a hook that injects instructions based on your insights (on EnterPlanMode). This is continuous improvement in action.
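As a rough illustration only: the sketch below is hypothetical, not the skill's actual implementation. It assumes a hook mechanism that runs a script when plan mode is entered and injects whatever the script prints into the agent's context, and it invents an insights file path for the example.

```python
# Hypothetical sketch of an instruction-injecting hook. Assumes the runtime
# invokes this script on plan-mode entry and treats anything printed to stdout
# as extra instructions for the agent; the insights path is an invented example.
from pathlib import Path

INSIGHTS_FILE = Path(".agent/insights.md")  # accumulated notes from past sessions

def main() -> None:
    if not INSIGHTS_FILE.exists():
        return  # nothing to inject yet
    insights = INSIGHTS_FILE.read_text(encoding="utf-8").strip()
    if insights:
        # In this sketch, printed text becomes additional planning instructions.
        print("Apply these accumulated insights while planning:\n" + insights)

if __name__ == "__main__":
    main()
```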


Excited for better review surfaces. However, you can still get a lot out of markdown, whether it comes from agent messages, plans, or prose-heavy docs/specs. The right rendering engine can put markdown to work on top of your codebase and integrate directly with agents, without cluttering the markdown or adding token cost. In under a minute, see how plannotator.ai does it in a planning session with pi.dev.



This works really well, btw: at the end of your query, ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.

More generally, imo audio is the human-preferred input to AIs, but vision (images/animations/video) is the preferred output from them. Around a third of our brains is a massively parallel processor dedicated to vision; it is the 10-lane superhighway of information into the brain. As AI improves, I think we'll see a progression that takes advantage of this:

1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility in graphics, layout, even interactivity) <-- early, but forming a new good default
4, 5, 6, ...
n) interactive neural videos/simulations

Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. There are many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral x.com/zan2434/status…

There are also improvements necessary and pending at the input. Neither audio nor text nor video alone is enough; e.g. I feel a need to point/gesture at things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.

TLDR: The input/output mind meld between humans and AIs is ongoing, and there is a lot of work to do and significant progress to be made, well before jumping all the way to Neuralink-esque BCIs and all that. As for what's worth exploring at the current stage: hot tip, try asking for HTML.
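A minimal sketch of the tip in practice: append the "structure your response as HTML" instruction to a prompt, save the reply as an .html file, and open it in the browser. It assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment; the model name and prompt are placeholders, not anything from the original post.

```python
# Sketch of the "ask for HTML" tip: request HTML, write the reply to disk,
# and view it as a rendered page instead of raw text or markdown.
import webbrowser
from pathlib import Path

from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

prompt = (
    "Compare three database indexing strategies for a read-heavy workload. "
    "Structure your response as HTML."  # the tip: ask for HTML at the end
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)

html = response.choices[0].message.content

out = Path("response.html")
out.write_text(html, encoding="utf-8")
webbrowser.open(out.resolve().as_uri())  # open the rendered page in the browser
```

The same idea works with any chat interface: add the HTML instruction, paste the output into a file, and open it; the API wrapper just automates the save-and-open step.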




