
Gad Benram
@gadbenram
316 posts
✡️ • Founder of @tensoropsai • Your partner for AI 🤖 🧠 • @GoogleDevExpert

New for financial services: ready-to-run Claude agent templates for building pitches, conducting valuation reviews, closing the books at month-end, and more. Install them as plugins in Cowork and Claude Code, or use our cookbooks to run them in production as Managed Agents.




What better way to test which model is better, o1 or o3, than to put them in a shootout? To evaluate o3 against o1, I decided to build a dynamic test environment for them. First, using o3, I created a basic template for a shooting game. The template defined the rules of the game but set no specific strategy; each model had to implement its own strategy independently.

What did the experiment reveal? A clear difference in approach: o3, known to be the “smarter” model, tried to aim its shots at o1 in a strategic, calculated manner. In contrast, o1’s shots looked completely random, with no clear strategic direction.

But here comes the surprise: despite o3’s apparent intelligence and deep thinking, its initial solution approach simply did not work. In most cases, o1’s randomness led to surprising, and sometimes even advantageous, results in the game.

What can we learn from this? First, to truly assess which of the two models is “smarter,” you can pit them against each other in a complex environment with clear rules and run a broad statistical experiment. In my opinion, this is currently the most effective way to get the full performance picture: both the initial outcome and the capacity to correct and improve.

Second, the experiment shows that even if a model demonstrates high intelligence in theory, that does not guarantee success if we rely on the output of its first run. This is where learning and long-term adaptation in AI systems come in: the ability to improve performance using Reinforcement Learning, which emphasizes identifying mistakes, learning from them, and adapting the strategy to a changing environment.

Ultimately, what emerges from the experiment is not just the need for initial intelligence, but also the need for a continuous ability to learn and improve. Techniques such as Reinforcement Learning show how correction and adaptation can lead to better long-term performance. What does this mean for the future of AI systems?

I would love to hear your thoughts: do you agree that the ability to learn and adapt one’s strategy is the key to success in the world of artificial intelligence?
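The post does not include the actual game template, but for anyone who wants to run a similar shootout, here is a minimal sketch of what such a harness could look like. Everything in it is an assumption for illustration, not the code from the experiment: the one-dimensional arena, the hit rule, the random_strategy / calculated_strategy functions, and the 10,000-round sample size. In a real run, the two strategy functions would be replaced by calls to o1 and o3, each deciding where to shoot.

# Hypothetical sketch of a shootout harness; all names and rules are assumptions.
import random
from collections import Counter

GRID = 10  # assumed 1-D arena: each bot occupies a cell in [0, GRID)

def random_strategy(my_pos, opponent_history):
    """o1-style behaviour from the post: fire at an arbitrary cell."""
    return random.randrange(GRID)

def calculated_strategy(my_pos, opponent_history):
    """o3-style behaviour: aim at the opponent's last observed position."""
    if opponent_history:
        return opponent_history[-1]   # assumes the opponent stays put
    return random.randrange(GRID)

def play_round(strat_a, strat_b):
    pos_a, pos_b = random.sample(range(GRID), 2)
    hist_a, hist_b = [], []           # positions each side has revealed so far
    for _ in range(50):               # cap the round length
        shot_a = strat_a(pos_a, hist_b)
        shot_b = strat_b(pos_b, hist_a)
        if shot_a == pos_b:
            return "A"
        if shot_b == pos_a:
            return "B"
        hist_a.append(pos_a)
        hist_b.append(pos_b)
        # both bots relocate randomly each turn, which is what makes
        # "aim at the last seen position" a weak first strategy
        pos_a, pos_b = random.randrange(GRID), random.randrange(GRID)
    return "draw"

# Broad statistical experiment: many rounds, then compare win rates.
results = Counter(play_round(calculated_strategy, random_strategy)
                  for _ in range(10_000))
print(results)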


If you forget about the Harveys and Lovables and Anthropics for a second … getting to $100M in revenue is very rare. According to this dataset, only about 1.5% of VC-funded startups get there. If you’ve achieved that as a founder, you can be super proud.
