Mantic

33 posts

@_Mantic_AI

Mantic is an AI research and product company on a mission to solve judgemental forecasting.

Joined August 2025

28 Following, 1.4K Followers

Mantic reposted

Toby Shevlane@tshevl·3 Apr

We're trialling a new kind of forecasting tournament. The challenge: submit forecasting questions that trigger divergent predictions from the top AI forecasting systems. There's a $25k prize pool for the question writers, allocated by how much disagreement you can elicit.

Motivation:
- AI forecasters are becoming competitive with human pros.
- Many questions are "solved", e.g. if I ask "Will a nuclear bomb go off in Europe this month?" all the models know it's <1%.
- Still, other questions are intractable, because of aleatoric uncertainty. "What will be NVIDIA stock price in 1 year?" Again, the models will agree (this time by being very uncertain), and there's not much to learn.
- If you can make the AIs disagree, you've found something interesting: a place where the AIs have divergent models of how the world works or differences in what information sources they're relying on.
- Identifying these wedge questions will help the field develop AI forecasters that can tackle genuinely challenging problems. This is exactly what we'll need them for, as we navigate the uncertain world ahead.

Please apply! Link in reply.
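One way to make "how much disagreement you can elicit" concrete is to measure the spread of the probabilities that different systems assign to the same question. The sketch below is purely illustrative: the post does not describe the tournament's actual scoring rule, and the function name `disagreement` and the example probabilities are hypothetical.

```python
from statistics import pstdev


def disagreement(probabilities):
    """Population standard deviation of the forecasts for one question.

    Assumption: spread is used here as a stand-in for "disagreement";
    the tournament's real metric is not stated in the post.
    """
    if len(probabilities) < 2:
        raise ValueError("need at least two forecasts to measure disagreement")
    if any(not 0.0 <= p <= 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return pstdev(probabilities)


# Hypothetical forecasts from three hypothetical AI systems.
solved_question = [0.01, 0.005, 0.008]  # every model agrees the event is very unlikely
wedge_question = [0.15, 0.50, 0.85]     # the models diverge sharply

print(disagreement(solved_question))
print(disagreement(wedge_question))
```

Under this assumed metric, a "solved" question scores near zero and a wedge question scores much higher. An intractable question on which every model answers close to 50% would also score near zero, which matches the post's point that agreement through shared uncertainty teaches little.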

156

21K

Mantic reposted

Ben Day@itsmebenday·3 Apr

We're launching a new kind of forecasting tournament at @_Mantic_AI. There's $25k in prizes for writing questions, see post below to read more and apply.

606

Mantic reposted

Toby Shevlane@tshevl·24 Mar

We're undergoing a two-sided "prediction revolution":
(1) The rise of prediction markets (Polymarket, Kalshi)
(2) AIs getting much better at predicting world events (Mantic)

Iran is the first major geopolitical crisis where we can benefit from both. Gabriel offers a peek into the new paradigm.

Gabriel Fritsch@gabrielpfritsch

Over three weeks into the US-Iran conflict, the situation remains deeply uncertain and fast-moving. @_Mantic_AI has been forecasting the crisis in real time. We wrote about how we've done so far.

4.4K

Mantic@_Mantic_AI·20 Mar

@johnschulman2 🫡

443

Mantic reposted

John Schulman@johnschulman2·20 Mar

Models that are great at calibrated predictions will be transformative for decision making. Excited about Mantic's work and proud they're using Tinker. Their new blog post digs into their methodology and findings.

Toby Shevlane@tshevl

I always dreamed of AGI as a wise advisor for humanity. Although LLMs are great for coding & knowledge work, I wouldn’t trust them to give me advice on my career, business strategy, or policy preferences.

How can we build AI systems optimized for wisdom? At Mantic we believe the unlock is prediction: predicting world events as accurately as possible, and hill-climbing this single metric.

Today we share some recent progress on the Thinking Machines website, having found Tinker a great platform for our RL experiments.

TL;DR: We RL-tune gpt-oss-120b to become a better forecaster than any other model. Having good scaffolding is a prerequisite. A fun result: our tuned model + Grok are decorrelated from the other best models, and so are the most indispensable when picking a team.

388

87.8K

Mantic reposted

Toby Shevlane@tshevl·20 Mar

Tinker@tinkerapi

Mantic used Tinker to RL gpt-oss-120b on judgmental forecasting; the result outperformed frontier models on event predictions. Combined with @_Mantic_AI's forecasting architecture, task-specific training takes us to the cusp of automated superforecasting.

309

150.3K

Mantic reposted

Scott Jeen@enjeeneer·20 Mar

We've been using RL to train LLMs for superforecasting. Our new blog post with @thinkymachines discusses recent progress. We're now in uncharted territory. I'm excited to see how good we can get by pushing this further! 🧵

Tinker@tinkerapi

199

24.5K

Mantic reposted

Toby Shevlane@tshevl·11 Feb

Something is happening!

714

106.1K

Mantic reposted

Toby Shevlane@tshevl·19 Jan

HUMANS OF MANTIC

Hours after we launched our website, before we’d posted it anywhere, I saw a job application from an Oxford economics PhD student from Brazil: “I’ve never been this excited about a startup. I want to help build it.”

His background was not typical for an AI startup. But he looked impressive. He’d got a distinction from Yale then spent 3 years as an economist at Goldman Sachs. In his PhD research, he was using LLM forecasters to identify exogenous shocks to fiscal policy.

We invited him to lunch with the team. He seemed smart. Ben messaged me: “we should try to get Gabriel to come in for September”.

In his first couple of days, Gabriel was reading the code. I wasn’t seeing much output. I asked Ben, what is he doing? Ben told me to wait.

Then... Gabriel emerged with an understanding of our prediction engine that was like he’d worked here for months. He started finding weaknesses and generating good ideas. Throughout September, Gabriel was running experiments to test his fixes, and the guy did not miss. +3 points on this eval, +3 points on that eval.

To boot, he’s a lovely person. Gabriel grew up in Rio. He speaks about his childhood friends and Brazilian culture (the beach, the food) with joy in his eyes. It must have been a big culture shock turning up to New Haven as a freshman.

From the beaches of Rio to Camden's hottest AI startup, @gabrielpfritsch started in a permanent role today, as Member of Technical Staff.

4.7K

Mantic reposted

Toby Shevlane@tshevl·9 Jan

📈 Trends in AI performance in the Metaculus Cup, a large-scale forecasting tournament.

The top-5 AI frontier makes linear progress vs the community prediction (CP). The CP is a wisdom-of-the-crowds aggregate. Only a small handful of elite forecasters, from 500+ entrants, beat the CP each tournament. Extrapolating the AI trend line predicts CP-level performance in October 2027.

A new trend started last Summer. Mantic progresses at a similar speed, but at a much higher level. The last tournament has just resolved, and Mantic beat the community, the first time ever for an AI.

4.3K

Mantic reposted

Toby Shevlane@tshevl·1 Jan

For our launch party, we made fortune cookies with Mantic's predictions for what the world would look like on 1st Jan 2026. How well did we do? The predictions were from Aug 14th.

1. Nvidia market cap on 1st Jan 🔮 $4.52 trillion (median, with a mode of 4.75) ➡️ $4.53 trillion 😎 😎 😎
2. Trump Nobel Peace Prize 🔮 94% he doesn’t win ➡️ Didn’t win
3. Jair Bolsonaro imprisoned 🔮 40%. Modal date if it happens: Oct 17th. ➡️ Imprisoned in November.
4. China launches Taiwan invasion 🔮 98% no ➡️ No invasion
5. A Chinese model top of the LMArena leaderboard 🔮 16% ➡️ No, Gemini 3 is top.
6. Jerome Powell as Fed Chair 🔮 85% still going ➡️ Still going!
7. US cuts the scheduled 50% tariffs on India 🔮 69%. Mantic read it as a negotiating tactic. ➡️ No ❌ Still there!
8. Xi Jinping out 🔮 95% still in power ➡️ Still going
9. Bank of England base rate 🔮 43% chance of 4% rate. 50% chance it’s lower, 7% chance higher. ➡️ 3.75% rate

Overall these look pretty good to me. Perhaps 2026 will be the year of superhuman forecasting accuracy... 📈
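To put a number on "pretty good", the seven yes/no predictions in the list can be scored with a Brier score (the mean squared error between the forecast probability and the 0/1 outcome, lower is better). This is a rough sketch rather than Mantic's own evaluation: the choice of Brier scoring, the restatement of each forecast as a probability of "yes", and the reading of each outcome are assumptions made here. The Nvidia and Bank of England items are left out because they are not simple yes/no questions.

```python
def brier_score(forecasts):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


# Each entry: (probability assigned to "yes", outcome with 1 = yes and 0 = no).
# Assumption: forecasts are restated from the post as P(yes),
# e.g. "94% he doesn't win" becomes 0.06 that he wins.
fortune_cookies = [
    (0.06, 0),  # Trump wins the Nobel Peace Prize
    (0.40, 1),  # Jair Bolsonaro imprisoned
    (0.02, 0),  # China launches Taiwan invasion
    (0.16, 0),  # A Chinese model tops the LMArena leaderboard
    (0.85, 1),  # Jerome Powell still Fed Chair
    (0.69, 0),  # US cuts the scheduled 50% tariffs on India
    (0.95, 1),  # Xi Jinping still in power
]

score = brier_score(fortune_cookies)
print(f"Brier score: {score:.3f}")  # prints "Brier score: 0.127"
```

For comparison, answering 50% on every question would give a Brier score of 0.25. Most of the error here comes from two items, the India tariff miss and the Bolsonaro forecast.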

2.3K

Mantic reposted

Toby Shevlane@tshevl·29 Dec

I gave a talk about AI for forecasting at the Society for Technological Advancement (@sotalikesfuture) in London. The short talk covers:
- What we're doing at @_Mantic_AI, including the example of possible 🇺🇸 strikes on Venezuela that we automatically spotted in late Sept.
- Benchmarking! 📊 I'm worried the forecasting benchmarks are getting saturated.
- The idea that good foresight = forecasting accuracy + prescience.

It was fun meeting everyone -- there's lots of excitement in the space! 🚀

2.3K

Mantic reposted

Society for Technological Advancement@sotalikesfuture·10 Dec

SoTA's first Frontiers Night is complete! A series of brilliant talks, demos & case studies with a lively panel comprising Toby Shevlane (@tshevl), Michael Story (@MWStory), Ben Warner, and Tom Oliver on Forecasting the Future. We discussed the art & science of prediction across AI forecasting, simulating human behaviour, and leveraging crowdsourced intelligence to make better-informed decisions. Thank you to Faculty for hosting the Society for Technological Advancement (@sotalikesfuture)! Look out for our next Frontiers Night on Self-Driving Labs in the new year and send suggestions for future themes & demonstrators. @_Mantic_AI, @swift_centre, @faculty_ai, Electric Twin

961

Mantic reposted

Toby Shevlane@tshevl·24 Nov

I had a fun conversation with superforecaster @rdeneufville on his podcast! We discuss:
🤖 How does AI compare to human superforecasters?
🧱 Is there a data wall?
⏱️ Why to expect fast progress on sub 1-year prediction horizons
🔎 The importance of asking the right question

Link in reply, and please enjoy this short clip, with background image courtesy of Gemini (reality: in a phone booth with smaller arms)

943

Mantic@_Mantic_AI·27 Oct

If you can keep your head when all about you are losing theirs

1.4K

Mantic@_Mantic_AI·22 Oct

The Market Pulse competition is for finance predictions: company earnings, treasury yields etc. We competed in Q3 and finished:
- 19th / 122 entrants
- Highest ever AI
- 2.3x the score of the next best AI

🎯 Our best prediction was Nvidia forward guidance on margins (bullish).

Mantic@_Mantic_AI·14 Oct

@Polymarket 20%

Polymarket@Polymarket·13 Oct

JUST IN: While the markets are panicking about Trump's 100% tariff threat on China, Polymarket isn't buying it. Only 13% chance they go into effect. poly.market/iYGDV0i

561

77K

Mantic@_Mantic_AI·13 Oct

Inside look: our quest to beat the top human forecasters.

11.3K

Mantic reposted

Brendan Nyhan (@BrendanNyhan on 🟦☁️)@BrendanNyhan·1 Oct

For those interested in forecasting, our new report compares political science expert predictions of democratic erosion w/ those of @metaculus forecasters & AI forecasts from @_Mantic_AI. Check out the full US Democracy Threat Index we co-created with @metaculus at the link below.

Brendan Nyhan (@BrendanNyhan on 🟦☁️)@BrendanNyhan

🚨 New Bright Line Watch report on state of US democracy
- Expert ratings unchanged since April but down substantially since January
- US now most closely resembles an illiberal democracy
- Partisan gap in democracy ratings highest since 2017

Report link and full 🧵 of results ↓

9.1K

Mantic@_Mantic_AI·20 Sep

theguardian.com/technology/202…

1.5K

Mantic@_Mantic_AI·20 Sep

We've been assembling a world-class technical team, and we're only just getting started. If you are reading this, you are still early. 🚀

1.1K

Mantic@_Mantic_AI·20 Sep

A historic milestone for the AI community: Mantic has been quietly participating in a forecasting tournament against 550 humans all summer. We finished 8th, beating some pro forecasters.

4.8K
