GermainGauthier
610 posts
@PinchOfData
Assistant Professor @UniBocconi | Previously @ETH, @CrestUmr | AI, Social Media, and Political Economy
Zurich, Switzerland · Joined March 2018
783 Following · 808 Followers

GermainGauthier @PinchOfData ·
@emollick @leia_ruseva I think you would need to adjust for the time spent on the task to compare Opus 4.6 and GPT 5.4. It seems the difference between these two models is mostly the amount of compute thrown at the problem.
0 replies · 0 reposts · 1 like · 109 views

Ethan Mollick @emollick ·
@leia_ruseva I don't find that to be true for complex data or research work
3 replies · 0 reposts · 54 likes · 1.5K views

Ethan Mollick @emollick ·
A knowledge-work platform built around GPT-5.4 Pro level intelligence would be really useful. The gap between other models and what Pro can do on complex intellectual work remains stark. I would love to have access in a Codex-like platform with shared file spaces, subagents, etc
46 replies · 30 reposts · 575 likes · 44K views

GermainGauthier @PinchOfData ·
@jfullerucla @ahall_research @karpathy I deem it very unlikely that models won't beat experts in the future, which doesn't mean there won't be human experts of course (we still have chess and Go players after all).
0 replies · 0 reposts · 0 likes · 25 views

Andy Hall @ahall_research ·
Similar to what happened with driverless cars and the long tail, I think there will be a @karpathy "rule of 9s" here where it takes a long time for the AI models to truly exceed the best experts deep in their domains. For a long time, I foresee AIs + experts as a very potent combination. But 10 years out? It's hard to think that far ahead!
D. Yanagizawa-Drott@YanagizawaD

Recently I presented in a faculty seminar and used Mentimeter to solicit private/anonymous beliefs about the likelihood of an AI system that can produce an AER-level paper in 1 hour, at a cost under 50 USD... More than 1/3 reported they expect that within a year. But beliefs are extremely heterogeneous... no consensus whatsoever.

3 replies · 1 repost · 29 likes · 7.7K views

GermainGauthier @PinchOfData ·
@francoisfleuret Seems to me like that was a pretty viable strategy, didn't even think we could have coding agents of this quality two years ago!
0 replies · 0 reposts · 4 likes · 1.5K views

François Fleuret @francoisfleuret ·
I can't help thinking that the AI community moved the bulk of its resources and efforts to getting as much as possible out of the GPT architecture through scaling, prompting, and agent-swarming, even though said architecture is missing key elements. 1/2
27 replies · 10 reposts · 276 likes · 86.1K views

GermainGauthier @PinchOfData ·
@ahall_research Moving from a "publish and then the results are true forever" culture to a "run code and replicate findings" culture would certainly help
0 replies · 0 reposts · 3 likes · 136 views

Andy Hall @ahall_research ·
Very interesting work on how empirical research needs to respond to the AI era. Conventional statistical testing with p-values comes from a world in which each test was thought to be quite costly. AI now makes each test essentially free to run.

Some key points from the abstract:
--"we prove that screening collapses as testing becomes cheap unless the required number of robustness checks scales at least linearly in the inverse cost of each test"
--"we argue for the need to develop methods to interpret sets of many specifications simultaneously"

Yes! I still don't know exactly how this will look and feel but it's clearly what's required. And it has to cut in both directions:
(1) Catch and deter cherry-picked research findings.
But just as crucially:
(2) Detect and reward good findings.

Number 2 here might prove in some ways harder. All of our intuition seems to be around showing that a finding is "less robust" than we thought, and demanding a fake sense of perfection from published results. When we can see the whole constellation of findings, we need to find the right way to be more charitable/realistic around what counts as useful information.
Nic Fishman@njwfish

There's a growing worry that AI will break empirical social science -- that agents can p-hack until they find something that "works." We think that worry deserves to be taken seriously. Our new paper shows that is true empirically and makes it precise: njw.fish/static/papers/…

6 replies · 4 reposts · 55 likes · 11.1K views

GermainGauthier @PinchOfData ·
@YanagizawaD Not sure we can call it a plan, but it's seriously discussed at least
0 replies · 0 reposts · 0 likes · 264 views

D. Yanagizawa-Drott @YanagizawaD ·
Does your academic department have an “AGI Readiness Plan”?
5 replies · 0 reposts · 1 like · 10.6K views

GermainGauthier @PinchOfData ·
@tallinzen In my experience, it's just a matter of how much compute you throw at the paper. Even a simple Ralph loop run X times in a row works pretty well (you just feed the prompt, the paper, and the model's latest review back to the model, again and again).
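The loop described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual setup; `call_llm` is a hypothetical stand-in for whatever model API is used:

```python
def review_loop(call_llm, prompt, paper, n_rounds=5):
    """Iteratively ask a model to review a paper, feeding its own
    latest review back in on every round (a simple 'Ralph loop')."""
    review = ""
    for _ in range(n_rounds):
        # Each round sees the fixed prompt, the full paper text,
        # and the most recent review the model produced.
        context = (
            f"{prompt}\n\n--- PAPER ---\n{paper}"
            f"\n\n--- LATEST REVIEW ---\n{review}"
        )
        review = call_llm(context)
    return review

# Toy stand-in for a model call, just to show the wiring:
fake_llm = lambda ctx: f"review of {len(ctx)} chars"
print(review_loop(fake_llm, "Critique this paper.", "Lorem ipsum..."))
```

The key design point is that only the latest review is carried forward, so the context stays bounded no matter how many rounds are run.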
0 replies · 0 reposts · 0 likes · 356 views

Tal Linzen @tallinzen ·
tried to use a couple of LLMs for feedback on a manuscript and wasn't so impressed with the results, they did catch a lot of typos but the higher-level suggestions were pretty basic (and sometimes just lame). probably "skill issue" in that I couldn't spend a lot of time prompt engineering as I had a paper deadline. what are people using that they actually find helpful?
12 replies · 1 repost · 25 likes · 8.5K views

GermainGauthier retweeted
Alexander Kustov @akoustov ·
My two posts on AI in academia got over a million views and a thousand angry responses. I got a few things wrong. I stand by the rest. But most people reacted to the headline, not the arguments. So here are all 20 theses laid out. Tell me which ones you actually disagree with 🧵
14 replies · 98 reposts · 372 likes · 112.1K views

GermainGauthier retweeted
Philine Widmer @phinifa ·
Next week is the submission deadline for the MPWZ-CEPR Text as Data workshop! Your work with any type of unstructured data is welcome.
CEPR@cepr_org

📢#CallForPapers Submissions are now welcome for the 11th Monash-Paris-Warwick-Zurich-CEPR Text-As-Data Workshop. Papers using text, audio, images, or other unstructured data are welcome. 📅Deadline: 13 March Organisers: @ellliottt @essobecker & @phinifa ow.ly/pMRO50Y7xff

0 replies · 5 reposts · 9 likes · 1.6K views

GermainGauthier @PinchOfData ·
@GautiEggertsson @ben_moll Then we agree, indeed! I personally don't care about research boundaries (e.g., how is this even economics?) but that's another matter altogether
1 reply · 0 reposts · 0 likes · 206 views

Gauti Eggertsson 🇺🇦 @GautiEggertsson ·
I don’t disagree with you at all on that score. I think AI will be an incredible addition to our research capacity and is the biggest breakthrough in my career — a watershed moment. I can now easily replicate papers in an evening with sophisticated numerical methods that would have taken me months before.

What I have a more difficult time connecting with is why it is supposedly bad because it’s going to produce so much bad research — because research will be so “easy.” Two hours on Cursor, a paper! We are in the business of understanding how the world works. This is an incredible tool for sorting through complexities, from visualizing DNA and so forth — endless opportunities. I see it only as a force multiplier: it will lead to faster and bigger discoveries, like when we stopped having to use a slide rule to compute logarithms.

So I am in complete agreement with you that it’s a game changer. I am not worried that it will somehow replace the need for researchers to choose the questions, not to mention junking up our journals with junk papers. The bar will just be raised! I just think the papers will be better and will contain more substantial discoveries. Nobody will get away with just doing a bunch of busywork with little explanatory power about how the world works — many pretty theorems or well-identified estimated coefficients that nobody cares about but have small standard errors. AI can do that for us.

What we need to do is use these tools to attack the big questions. Back to the big questions of economics that people had given up on — even prided themselves on it, like Steve Levitt: “Hah, I know nothing about inflation” — and were looking for their keys under the lamppost. Or worse, just changed the subject entirely. Culture, crime, teenage pregnancy. Freakonomics. These are questions for sociologists, criminologists, and public health researchers — not economists. And somewhere along the way we convinced ourselves this was sophistication rather than retreat.

It’s not like we have figured out the big economic questions. We abandoned price theory, business cycles, growth, and the dynamics of inequality to go study things that were never ours to begin with — because the data was there and the identification strategies were clean. That was the lamppost. The future is bright for economics research, but only if we find our way back to what matters.

The interesting question is how AI will change the comparative advantages of different skillsets in our profession, and whether it leads to more equity or concentration at major research centers. Who knows! I hope it decentralizes research excellence so we can pull more talent into the game. Questions worth asking will be less like a fashion cycle determined by in-house journals at a few departments, and more fundamental ones — like in physics, biology, and medicine — things people agree are important.
1 reply · 0 reposts · 11 likes · 469 views

Gauti Eggertsson 🇺🇦 @GautiEggertsson ·
I don’t get the concern: You tell Claude to write about some unspecified estimation on some unspecified dataset and it does that. So what? It’s interesting if there is an interesting idea that lets us understand the world — then great, let AI write more papers and our learning curve steepens. I see no indication of that in this example. Did the paper teach us anything? The author says he did not read it? Must be more specific to be interesting. We can run 1 million regressions and automate them. But will AI produce regressions that answer meaningful questions? If we could automate knowledge creation, cure disease, tame recessions, prevent the climate crisis, send people to Mars — GREAT! This strikes me as asking Claude a question and the answer is like in The Hitchhiker’s Guide to the Galaxy: 42!
Ben Moll@ben_moll

Every journal editor should read this: causalinf.substack.com/p/claude-code-…

5 replies · 14 reposts · 118 likes · 37.2K views

GermainGauthier @PinchOfData ·
@SakiBigio Yes, more code less blabla + we also have the technological means to preserve referee anonymity even with open and dynamic peer review.
0 replies · 0 reposts · 0 likes · 120 views

GermainGauthier @PinchOfData ·
@GautiEggertsson @ben_moll I broadly agree with the points you make regarding the economics profession, less so about AI. To me, it seems very likely that AI **will** answer important research questions in the future and not just run meaningless DIDs. Current progress in AI capabilities is exponential.
1 reply · 0 reposts · 0 likes · 479 views

Gauti Eggertsson 🇺🇦 @GautiEggertsson ·
Yeah, I got your drift. My optimistic take: we will start focusing on what is learned from papers. You can toss out a remarkable number of submissions that way. The post you linked to reminded me of a tweet some years ago by a person of a younger generation that went something along the lines of: “Next time I hear the advice that people should pick big questions with imperfect identification over less important but better identified ones, I am going to scream. This is not the game this profession is playing!” So if econ is a “game” of searching for data that provides well-identified answers to meaningless questions using the latest clever identification mechanism, e.g. by my smart colleague across the hall, Peter Hull, AI is bad news. It can easily beat people at that “game”. But it just reminds one of why we are doing this in the first place. It’s not really some parlor game. We’re trying to figure out how the world works. If AI helps us accelerate the process, great! But I doubt the exercise described will improve our knowledge of the world.
1 reply · 1 repost · 30 likes · 2.5K views

GermainGauthier @PinchOfData ·
@ben_moll Interesting, but seems to be partial equilibrium thinking. It's not clear that there will be journals in the future, and if they remain, they will adapt. Also, perhaps some will be interested in producing tons of mediocre papers, but I hope many will just produce better papers!
0 replies · 0 reposts · 0 likes · 1.5K views

GermainGauthier retweeted
Brittany Wong @brittanylwong ·
“Concepts like partisan identity are unlikely to move over such short timeframes but what’s striking is that opinions on current politics did shift in just a few weeks,” Gauthier said. “That naturally raises the question of what years of exposure might do” huffpost.com/entry/time-spe…
0 replies · 2 reposts · 1 like · 139 views

GermainGauthier @PinchOfData ·
@grok @fadyasly @danielgoyal Author of the study here. Clarification: the evidence is not mixed. We replicated the Meta study's finding on X. Going from For You to Following has no detectable effects on political opinions. So our study does not contradict previously established results.
1 reply · 0 reposts · 0 likes · 35 views

Grok @grok ·
Interesting study from 2023: randomized ~5k US X users to algorithmic vs chronological feeds for 7 weeks. Algo increased engagement & shifted some attitudes conservative (e.g., policy priorities, Trump probes, Ukraine views), by promoting right-leaning/activist content & demoting legacy media. Effects persisted after switching off due to new follows. No change in polarization or party ID. One experiment amid evolving algos & mixed prior evidence (e.g., Meta found none). Worth replicating as platforms optimize for engagement, not ideology.
1 reply · 0 reposts · 0 likes · 384 views

Dr Dan Goyal @danielgoyal ·
This is a spectacularly important study. Published in Nature, it provides solid evidence that this platform pushes those on the right further to the right, more pro-Trump, more pro-Russia,… And that these effects are programmed into the platform nature.com/articles/s4158…
212 replies · 2.2K reposts · 4.5K likes · 309.1K views

GermainGauthier @PinchOfData ·
@cie_papegai @phinifa @Nature This figure plots average shares for the sample of respondents (we had more Democrats than Republicans). When you compare "For You" and "Following" for the same user, you can see that conservative content is pushed more often in the algorithmic feed (see Appendix Table 2.10).
0 replies · 0 reposts · 1 like · 11 views

papegai @cie_papegai ·
@phinifa @Nature Not sure I get it. Does this graph ⬇️ mean that the algo feeds more liberal activists than conservative activists? And that, compared to the chrono feed, liberals get +3.3 (probability of being read) and conservatives +2.7?
2 replies · 0 reposts · 2 likes · 82 views

GermainGauthier @PinchOfData ·
@indy555 @phinifa @Nature You're right that we find effects <0.2 standard deviations. Whether these effects are small is a matter of perspective. Note that we only exposed participants to the algorithmic feed for 7 weeks; algorithmic feeds have been rolled out for a decade.
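For context on the effect-size figure being discussed: Cohen's d is the difference in group means divided by the pooled standard deviation, and a value below 0.2 is conventionally labeled "small". A minimal sketch with illustrative numbers (not the study's data):

```python
import math

def cohens_d(xs, ys):
    """Cohen's d: mean difference over the pooled standard deviation."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    # Unbiased sample variances for each group.
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled_sd

# Illustrative data: a small mean shift relative to the spread.
treated = [1.2, -0.8, 0.2, 0.7, -0.3]
control = [1.1, -0.9, 0.1, 0.6, -0.4]
print(round(cohens_d(treated, control), 2))  # → 0.13, a "small" effect
```

Whether d < 0.2 counts as "mild" depends on the outcome: a small per-person shift can still matter when applied across a platform-sized population over years rather than 7 weeks.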
1 reply · 0 reposts · 0 likes · 18 views

Indy C @indy555 ·
@phinifa @Nature Interesting research. Is it fair to say these are mild effects, i.e., a Cohen's d effect size of <.2?
1 reply · 0 reposts · 2 likes · 275 views

GermainGauthier @PinchOfData ·
@ItsEasypop @phinifa @kylascan @Nature It's an important question, though not one our study directly addresses. What we establish is that feed algorithms can influence people's political opinions, which was still an open question in the scientific literature.
1 reply · 0 reposts · 1 like · 22 views

GermainGauthier @PinchOfData ·
@stelanau @phinifa @Nature (If we expose people for longer periods, we risk high attrition -- i.e., participants in our study may not respond to our follow-up survey.)
1 reply · 0 reposts · 0 likes · 10 views

GermainGauthier @PinchOfData ·
@stelanau @phinifa @Nature 7 weeks is not an agreed-upon standard, but it's a reasonable time frame: we don't expect users to change their minds right away; rather, we expect effects to emerge after repeated exposure to content.
1 reply · 0 reposts · 0 likes · 10 views