chansung@algo_diver
What Are We Going to Prove?
They say this year is the year of proving Agent Engineering. But what exactly do we need to prove? The ability to build agents well? The proficiency to handle AI models? Or the capability to build products through "vibe coding"? Here are my thoughts.
1. Highly Reliable Outcomes
Many tasks in the past were slow and manual, but their outcomes were largely predictable. Introducing agents makes these processes intelligent and fast, but the probabilistic nature of AI also puts their reliability at risk. The critical skill will therefore be designing systems that grant agents autonomy while keeping their results evaluable and verifiable.
Such design is how we guarantee the robustness of agents and of the systems they control. Without it, the many layers of technology we have built up (and that support our present) could collapse, unless, of course, AI completely rebuilds everything in the world from scratch.
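One way to picture "autonomy with verifiable results" is a gate that accepts an agent's output only after a deterministic check passes. The following is a minimal sketch, not a definitive implementation: `run_with_verification`, `is_integer`, and the scripted stand-in agent are all hypothetical names invented for illustration, and a real verifier would be domain-specific (tests, schemas, invariants).

```python
from typing import Callable, Optional


def run_with_verification(
    propose: Callable[[], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first proposal that passes `verify`, or None if all attempts fail."""
    for _ in range(max_attempts):
        candidate = propose()  # the agent acts freely here
        if verify(candidate):  # a deterministic check decides acceptance
            return candidate
    return None  # nothing verifiable was produced; the caller must escalate


def is_integer(text: str) -> bool:
    """Hypothetical verifier: accept only text that parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


# Hypothetical stand-in for a probabilistic agent: scripted outputs,
# two unusable answers followed by a usable one.
scripted_outputs = iter(["", "forty-two", "42"])
result = run_with_verification(lambda: next(scripted_outputs), is_integer)
print(result)
```

The design choice is that the agent never certifies its own work: acceptance depends on a check whose behavior is predictable even when the agent's is not.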
2. High-Quality Outcomes
With the advancement of AI, anyone can now produce "plausible" results. In other words, the baseline for quality has been raised. At this juncture, what agent engineering must prove is the "craftsmanship" that fills in the last 1% of detail that AI often misses. We need the finesse to grasp the micro-context of user experience (UX) and reduce minor frictions in workflows, rather than settling for superficially polished outputs. In a world where anyone can generate 80-point code or an 80-point plan in a minute, the true value lies in the ability to persistently carve and polish the remaining 20 points to ensure near-perfect quality.
You can fill in the rest with direct coding, through conversations with AI, or via automated agent workflows. Regardless of the means, you simply have to deliver overwhelmingly superior quality even above the already heightened baseline.
3. Input Cost vs. Optimal Result Among Countless Options
Even if AI is smart enough to quickly process various tasks in the background, practical constraints such as computing power, token costs, and network latency still act as bottlenecks.
An agent might suggest 100 different parameter-tuning methods, but the time we have available is strictly limited. Even if computing resources were practically free, the experienced, astute judgment needed to pick the single best option out of dozens or hundreds would still be crucial. The ability to embed this judgment into the agent will be just as important. If everyone across an enterprise starts using agents, the problem of resource depletion will only escalate.
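As a rough illustration of embedding that judgment, the selection step can be written as "filter by budget, then take the highest expected gain." This is only a sketch under stated assumptions: the `Option` fields, the cost unit, and the numbers are hypothetical, and in practice estimating `expected_gain` is exactly where the human or domain expertise lives.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    name: str
    estimated_cost: float  # hypothetical unit, e.g. GPU-hours or token spend
    expected_gain: float   # hypothetical estimate of improvement


def pick_best(options: Sequence[Option], budget: float) -> Optional[Option]:
    """Return the affordable option with the highest expected gain, or None."""
    affordable = [o for o in options if o.estimated_cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda o: o.expected_gain)


# Made-up candidates standing in for an agent's many suggestions.
candidates = [
    Option("full grid search", estimated_cost=40.0, expected_gain=0.9),
    Option("learning-rate sweep", estimated_cost=4.0, expected_gain=0.6),
    Option("batch-size tweak", estimated_cost=1.0, expected_gain=0.2),
]
choice = pick_best(candidates, budget=5.0)
print(choice.name if choice else "nothing fits the budget")
```

With a budget of 5.0 the most promising option overall is excluded on cost, so the sweep is chosen instead; a tighter budget changes the answer again.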
4. Differentiation
Because everyone is using the same foundation models, the form of the outputs and even the "vibe" of the products are starting to look identical. To survive here, you need differentiation that others cannot easily replicate. This goes beyond just writing slightly better prompts; it means embedding a deep understanding of a specific domain into the agent's workflow, or designing a highly original multi-agent collaboration structure to create completely new user experiences. When everyone is producing similar "factory-made" results, imbuing your work with a unique color and irreplaceable utility is key.
In conclusion, agent engineering this year is not simply about proving "feature implementation." In a world where anyone can easily generate results with AI, it will be a fierce process of proving that we can integrate these four elements (Reliability, Craftsmanship, Efficiency, and Originality) into a system that delivers "truly useful value."
This is also the area I personally intend to focus on.