
Lawrence
4.4K posts

Lawrence
@CPoweredLion
💙. Opinions my own.






🚨 NEW: We made Claude, Gemini, o3 battle each other for world domination. We taught them Diplomacy—the strategy game where winning requires alliances, negotiation, and betrayal. Here's what happened: DeepSeek turned warmongering tyrant. Claude couldn't lie—everyone exploited it ruthlessly. Gemini 2.5 Pro nearly conquered Europe with brilliant tactics. Then o3 orchestrated a secret coalition, backstabbed every ally, and won. Why did we do this? The most popular AI benchmarks don't test deception. But as these models get deployed everywhere—from your email to your workplace—we need to know: Will they lie to get what they want? So @every we built the ultimate test: AI Diplomacy, a dynamic benchmark that measures AI's ability to form alliances, negotiate, and betray each other. Watch them live below! Created from the ground up by @alxai_ and @Tyler_Marques.











