Tyler Marques

45 posts

@Tyler_Marques

Data Scientist / Engineer, Techie, Home-labber

Toronto, Canada · Joined May 2008
58 Following · 164 Followers
alex duffy @alxai_
@danshipper Agreed, but architects need better tooling. Right now, reviewing pirate code is... not fun, to say the least.
2
0
17
2.1K
Dan Shipper 📧 @danshipper
new model for engineering team structure in 2026: 2 people only, one pirate and one architect. the pirate's job is to move as fast as possible to develop valuable, shipped product features by vibe coding. the architect's job is to turn the product surface discovered by the pirate into a reliable, structured machine, also by vibe coding, but at a slower, more well-reasoned pace. every product needs a pirate, but most products only need an architect once they have some form of PMF, and in that case they usually don't need one full-time. architects can work across many codebases and solve interesting technical challenges. pirates go hard on a product that they own end-to-end.
336
296
4.5K
615.6K
Tyler Marques retweeted
alex duffy @alxai_
arc-agi-3 launch march 25 · sf chollet × altman fireside at yc · · · games are the new training arenas for intelligence · · · if you're into: rl envs · simulations · ai × games come debrief after or just come hang before we all get back to work
4
4
26
5.5K
Tyler Marques retweeted
alex duffy @alxai_
Today, we're launching Good Start Labs w/ $3.6M from amazing investors including @Inovia & @generalcatalyst. My whole life I've been learning from games. Over the past five years, I've dreamt about how AI could learn with me. Today we're launching LOL Arena, the first AI benchmark for humor, informed by millions of human votes. We are also launching Diplomacy Arena, ranking strategy, betrayal, and prompt impact across models. In the coming years we hope to lead at the intersection of Gen AI & Games and define what it means to do alignment via entertainment, ensuring everyone can share their voice and help AI become a tool that really is custom built to help bring our dreams to life. If that inspires you, join us! We're hiring. Here's what we're shipping today: 🧵
30
34
234
99.2K
Tyler Marques retweeted
alex duffy @alxai_
Bringing AI into the real world hits different. Quick 🧵 on how I turned an idea → physical trophy that you can win along w/ $1000 as part of our AI Diplomacy prompting competition in October. Link to enter 👇 (it's free, but only a few spots are left; 49 people will play)
5
6
24
2.2K
Tyler Marques retweeted
alex duffy @alxai_
Thanks to @swyx for having me at @aiDotEngineer as well as on the @latentspacepod. Both were a blast; it was great to talk about benchmarks that mean something with people who care. V1 of the AI Diplomacy live stream is wrapping up in the next couple of days, with three great games left 👇
1
3
5
985
Tyler Marques retweeted
alex duffy @alxai_
AI Diplomacy made @BusinessInsider! The people want better benchmarks: "Everyone knows the usual benchmarks are a bore." Couldn't have built it w/o @Tyler_Marques. Excited to keep it rolling. Shipping updates to the stream constantly, come check it out!
4
5
26
1.5K
Tyler Marques @Tyler_Marques
@AdrienLE @morqon @danshipper @alxai_ Yeah, I'd like to put up a better summary of the games, who won, and why. It'd be great to have a bigger sample size; it's just expensive to run lots of games.
0
0
3
50
morgan — @morqon
diplomacy is a great way to confirm your priors
3
7
88
4.5K
Tyler Marques @Tyler_Marques
@alxai_ @karpathy @danshipper My favourite bits of this are seeing the "personalities", for lack of a better word, emerge from the models. Claude is honest to a fault here.
0
1
2
278
Dan Shipper 📧 @danshipper
🚨 NEW: We made Claude, Gemini, o3 battle each other for world domination. We taught them Diplomacy—the strategy game where winning requires alliances, negotiation, and betrayal. Here's what happened: DeepSeek turned warmongering tyrant. Claude couldn't lie—everyone exploited it ruthlessly. Gemini 2.5 Pro nearly conquered Europe with brilliant tactics. Then o3 orchestrated a secret coalition, backstabbed every ally, and won. Why did we do this? The most popular AI benchmarks don't test deception. But as these models get deployed everywhere—from your email to your workplace—we need to know: Will they lie to get what they want? So @every we built the ultimate test: AI Diplomacy, a dynamic benchmark that measures AI's ability to form alliances, negotiate, and betray each other. Watch them live below! Created from the ground up by @alxai_ and @Tyler_Marques.
103
319
2K
322.9K
Tyler Marques @Tyler_Marques
We launched twitch.tv/ai_diplomacy today! Been working on this for a while and super proud of it. Watch Claude, Gemini, o3, and others battle it out in the classic board game of Diplomacy. Super proud to be working with @alxai_ and the team at @every
1
0
5
280
Tyler Marques @Tyler_Marques
@kosinception Your domain has expired!! If you try to go to your website it’s messed up.
0
0
0
0