Anson Ho

63 posts

@ansonwhho

Researcher @EpochAIResearch

Toronto, Canada · Joined January 2022

965 Following · 837 Followers

Anson Ho @ansonwhho · Feb 26

@timfduffy @EpochAIResearch Hmm actually it seems like the link already uses "copy link to highlight", not sure why it's not working...

Anson Ho @ansonwhho · Feb 26

@timfduffy @EpochAIResearch Thanks for flagging this! That part's correct — it's based on the edge case in the "results table" at that link (see screenshot). OTOH the 512x is a typo, it should be 5^12. I'll update the link w/ "copy text to highlight" and the second number.

Epoch AI @EpochAIResearch · Feb 26

Developing more powerful AI isn’t just about scaling compute. It’s also about improving algorithms and data quality, which let you build better models with the same compute. We call this “AI software progress” — here’s everything you need to know about it: 🧵

Anson Ho @ansonwhho · Feb 10

@SamuelAlbanie I haven't personally, but one of my colleagues tried using GPT-5.3 codex and made a lot more progress on the "port a post" task. So I think I was way too bearish for task 3, and maybe a bit too bearish for task 1 too...

Samuel Albanie 🇬🇧 @SamuelAlbanie · Feb 9

@ansonwhho curious whether you've retried with more recent models & whether they seem on trend with your forecasts?

Samuel Albanie 🇬🇧 @SamuelAlbanie · Feb 9

interesting post by @ansonwhho

tldr:
- takes some representative tasks from his job & tries to automate them
- when AI fails, he guesses when each task will be completed on the first attempt (with an output that is good enough to directly use)

example: his median estimate for "porting an article from google docs to substack and epoch's website" is mid-2028

epoch.ai/gradient-updat…

Anson Ho @ansonwhho · Nov 19

@idavidrein Looks like @GregHBurnham was right! epoch.ai/gradient-updat…

david rein @idavidrein · Nov 19

Huh! I had assumed it saturated at like 85% since it's been a few months since anyone had improved on that rough score range—but this is much higher! Would be curious to see if the questions Gemini 3 Pro w/ deep think gets right are valid/correct

Miles Brundage @Miles_Brundage

93.8% is a crazy GPQA score btw

Anson Ho @ansonwhho · Sep 17

@dwarkesh_sp @EgeErdil2 Ege knows so much that when I first met him I thought he might have photographic memory. Then I discovered he doesn't, which made him seem even more impressive!

Dwarkesh Patel @dwarkesh_sp · Sep 17

.@EgeErdil2 may be the most cracked mf I know. Extremely insightful, regardless of whether the conversation is about info/FLOP in RL vs pretraining, or why Japan won the 1905 Russo-Japanese War, or the weird distortions in many millennia long trends of population growth.

Anson Ho @ansonwhho · Sep 14

@datagenproc salut d'amour, schindler's list theme, sir duke

jsd @datagenproc · Sep 14

I find it much easier to point to what makes a harmony click for me than a melody. Or even to point to melodies I particularly like. Not sure if it's just that i'm less sensitive to melody, or if it's because melody is less legible. What are your favorite *melodies*?

Anson Ho retweeted

Epoch AI @EpochAIResearch · Jul 11

Introducing FrontierMath Tier 4: a benchmark of extremely challenging research-level math problems, designed to test the limits of AI’s reasoning capabilities.

Anson Ho @ansonwhho · Jul 10

But I think that research on explosive growth has already provided strong counterarguments to the points that @lugaricano raises

Anson Ho @ansonwhho · Jul 10

I do agree that often "progress works itself out of a job", and that Baumol effects and human bottlenecks are important and can make explosive growth less likely (especially in the next few years)

Anson Ho @ansonwhho · Jul 10

I strongly disagree with @lugaricano’s thread on explosive growth. While the thread raises important points, I think it fails to get to the core of the reasons to believe that explosive growth is plausible – allow me to explain!

Luis Garicano 🇪🇺🇺🇦 @lugaricano

1/@EpochAIResearch doubles down on prediction AI will drive 20%+ annual GDP growth. Economists remain skeptical. This is the defining debate of today: AI builders see infinite prosperity ahead. Economists see the same limits that constrained every technological revolution.🧵 1/13

Anson Ho @ansonwhho · Jun 25

@aidanogara_ @ben_j_todd @EpochAIResearch So for GATE specifically I might make updates of the form "I broadly over/underestimated how strong effect X might be". I definitely wouldn't trust GATE's near-term quantitative predictions (e.g. GWP growth rates in 2027)

Anson Ho @ansonwhho · Jun 25

@aidanogara_ @ben_j_todd @EpochAIResearch Depends on the question IMO. GATE is based on endogenous growth models, which are ok at capturing the dynamics of long-run growth, but I doubt you'd use something like this for near-term growth predictions, for example

Benjamin Todd @ben_j_todd · Jun 24

Economists say remaining human bottlenecks will prevent explosive growth from AI (Baumol effects). @EpochAIResearch actually made an economic model, and find those bottlenecks would need to be implausibly huge to do the job. Even with extreme bottlenecks (ρ = -2), partial automation leads to 10% GDP growth soon after. (The graph below is based on a scenario with 10% automation in 2026 – the exact date is not the point here.)

Anson Ho retweeted

Epoch AI @EpochAIResearch · Jun 8

How do reasoning models solve hard math problems? We asked 14 mathematicians to review o3-mini-high’s raw, unsummarized reasoning traces on 29 FrontierMath problems. Here’s what they found:
