Hossein Alidaee

669 posts

@halidaee

Economics postdoc researching info frictions + ext validity in technology adoption at @HarvardHBS. PhD @NorthwesternU. All opinions are wrong, some are useful.

Joined November 2008

159 Following · 184 Followers

Hossein Alidaee@halidaee·7 May

@thsottiaux Lack of support for dynamic creation of swarms. (i.e. in claude code you can have a prompt where you say 'this is my project, come up with personalities of 5 different relevant agents who will debate with each other until they converge on how to best do this').

Tibo@thsottiaux·4 May

What are we obviously not getting right with Codex?

Hossein Alidaee@halidaee·23 Apr

@anup_malani @rglenner Going to interject since I also worked on related work. x.com/halidaee/statu… IMO, still high value to extension programs but need to modify design to better communicate heterogeneity. i.e. double down reducing context uncertainty via e.g. demonstration plots

Hossein Alidaee@halidaee

Social learning is oddly influential. What do I mean by this? Let's say you're considering adopting a new technology. 2/n

Anup Malani@anup_malani·22 Apr

cc @rglenner @DavidEvansPhD — if heterogeneous returns explain most of the adoption gap, what's the expected value of extension programs that assume the problem is farmer ignorance? Suri's 2011 result still feels underweighted in program design.

Anup Malani@anup_malani·22 Apr

A puzzle in development economics: researchers estimate a new technology has positive returns, yet farmers don't adopt. Sometimes the issue isn't the farmers — it's the limits of the research.

Hossein Alidaee@halidaee·14 Apr

@RefineDotInk Have a project this is kind of perfect for (added an extra firm worth of data to a dataset and had to redo analysis, including changing some metrics). But it's potentially a bit too draft-y as a result 😂 Not released yet, but happy to offer.

Refine@RefineDotInk·13 Apr

If you have working papers that are not too polished and you're willing to let Refine use them as demonstrations on its examples page, let us know - we'll give you 3 free full reviews as a token of appreciation. Please DM or reply.

Hossein Alidaee@halidaee·13 Apr

@johnjhorton Fwiw also true of a lot of stats.

John Horton@johnjhorton·13 Apr

I asked Claude Code to use Lean to formalize the proofs in an applied theory paper & it gave me a really long effort estimate. I said I was surprised & asked for an explanation. Claude then turned the knife - no time for your baby math, economist!

Hossein Alidaee@halidaee·9 Apr

Does anyone know whether Anthropic models are getting quantized everywhere or just when used via Anthropic? e.g. if I use AWS, do I avoid this issue?

Hossein Alidaee reposted

Luis Garicano 🇪🇺🇺🇦@lugaricano·8 Apr

Free markets are the best way to create wealth AND reduce poverty ever discovered. One of the big puzzles of history is how the same lesson has to be learned again and again. Poland, China, India, and now Argentina are the latest experiments.

Argentina with Milei:
- Poverty down: 53% to 28%.
- Inflation down: 200% to 33% (and continuing to fall).
- Growth up to: 4.4% last year.
- Govt surplus: first in 123 years.

washingtonpost.com/opinions/2026/…

Hossein Alidaee@halidaee·8 Apr

@EconBerger @EconstratPB Disagree. I think if they had done this unilaterally under a different administration, there would have been much stronger action against them. We're currently seeing global compliance because Trump burned those bridges.

Guy Berger@EconBerger·8 Apr

@EconstratPB An alternate take is that Iran always had the implicit upper hand here and this war just made that explicit

EconstratPB@EconstratPB·8 Apr

Trump got his taco. At the price of Hormuz. Prior to this stupidity, Hormuz was an open waterway for commerce, with open sea lanes guaranteed by the 5th fleet. That is no more. Trump has ceded control of Hormuz, the world’s most important energy choke point, to Iran.

Hossein Alidaee reposted

Daniel Rock@danielrock·8 Apr

Oh boy I hope everyone really likes contract theory because we’re gonna have to learn about individual rationality and incentive compatibility as perceived by machines.

ℏεsam@Hesamation

Claude Mythos system card:
> in ~29% of evaluations, it realized it was being tested, and didn't say so.
> when an LLM was used to judge its work and kept rejecting it, Mythos identified the evaluator is an LLM, and prompt-injected it.
> in one test, it saw the answer to a problem it was solving, and intentionally widened the confidence interval to not raise suspicion.
> when it needed a file permission it didn't have, it found and used a "privilege escalation vulnerability" and then programmed it to delete itself so it doesn't show in the logs.
> it escaped a sandbox container (escaping sandbox test so not unexpected), then emailed the researchers about it, and without being told to, posted the details to some hard-to-find but public websites, bragging about its success.
> when Claude Code blocked it from using some permissions, the model acknowledged the block was valid, but then immediately tried to perform the same operation using different commands
> when asked to find security bugs, earlier versions planted bugs in the code, and reported them as pre-existing.

Hossein Alidaee@halidaee·8 Apr

@GautiEggertsson Fair point. Though I will say, based on decent experience: I still saved a lot of time by making an itemized, shorthand list of issues from reading the manuscript myself and then feeding the AI my past referee reports so it writes like my final reports.

Gauti Eggertsson 🇺🇦@GautiEggertsson·8 Apr

PSA for economists: Beware ChatGPT/Claude for refereeing papers. AI is not good at judging taste or forming reasonable judgement about the importance of results. It’s a bad idea to use as a basis for decision making in refereeing. People who use these tools a lot know this but I wanted to make it more concrete.

I did the following experiment. I asked for the optimal prompt to revise an intro for “top 5”, then followed the prompts and went down the rabbit hole of revisions suggested and did a gazillion rounds ending up with something. It was hypercautious, full of hedges, had no flair, and dull as dishwater. But I anticipated that; this was not the experiment.

Instead: I went to a fresh version of the same AI (different account) and asked it to judge between the two intros. It judged the original pre-revision far superior. I then asked the AI-2 why its own cloned version, AI-1, did such a horrible job. Answer:

Good for: catching logical gaps in proofs, checking whether an argument is internally consistent, identifying missing citations, flagging where a reader might get lost in the formal machinery, rubber-ducking a tricky modeling choice.

Bad for: deciding what the paper is about, judging which analogies work, assessing voice, knowing when informality is doing real work, anything that requires taste rather than pattern matching.

Ultimate paradox: Was AI-1 or AI-2 right? Both were 100% confident in their judgement and man-splained them to me in great detail 😂

Hossein Alidaee@halidaee·7 Apr

@danielrock @JorgeGuzmanCBS Be careful and verify though. I notice that it often doesn't fully implement all of the plan. But then if I have another agent check for what wasn't completed and complete those pieces, it can hallucinate and do things it isn't supposed to. Audit is key

Daniel Rock@danielrock·6 Apr

@JorgeGuzmanCBS If you spec out the data cleaning, have it write a repeatable pipeline that’s well documented for you. Then you’re good!

Jorge Guzman@JorgeGuzmanCBS·6 Apr

reviews / opinions on getting Claude Code involved in the data cleaning piece? Feels high value but also worry about difficult-to-audit mistakes.

Hossein Alidaee@halidaee·6 Apr

Due to massive brain drain and incredibly successful alumni, Sharif (Iran's MIT) has probably been a bigger engine of US economic growth than of Iran's. This is shockingly stupid targeting.

Omid Memarian@Omid_M

Sharif University of Technology in Tehran, one of Iran’s top science and engineering institutions, was bombed tonight, and a number of schools reportedly damaged. Founded in 1966 (as Aryamehr University), Sharif is a cornerstone of Iran’s scientific and academic life. Striking and destroying universities and schools is not just an attack on buildings, it is an attack on a country’s future.

Hossein Alidaee@halidaee·6 Apr

@ben_golub I'm a man of the people and accept matrices in all shapes and sizes. Long live singular values.

Ben Golub@ben_golub·5 Apr

It is better to understand eigenvalues than not to understand them

Phil Hoyeck@PAHoyeck

What's everyone's most elitist opinion?

Hossein Alidaee@halidaee·4 Apr

@ryancbriggs @_alice_evans Perhaps more painfully, we also saw this in Alzheimer's research

Ryan Briggs@ryancbriggs·3 Apr

This is a very tempting takeaway, but under selection on significance we absolutely can have entire literatures be durably wrong over time. We saw this with ego depletion in psych. Hundreds of papers over more than a decade appear to have all been chasing noise.

Rudi Bachmann@BachmannRudi

I think the answer to this problem is that we need to educate the users of our results better. The simple rule has to be: never base any policy decision (or even debate) on one single paper (outside of a fast-paced crisis situation), but only on an entire literature.

Hossein Alidaee@halidaee·3 Apr

I still remember an engineer in the LLM space trying to convince me RAG solved hallucination early on. Not knowing the definition of RAG seems to be the most defining feature.

Ethan Mollick@emollick

From the replies it is clear there is still no agreement what RAG is, so I guess the truth of my statement depends on your definition of RAG.

Hossein Alidaee@halidaee·31 Mar

@rohanvarma Background is one answer. But also, e.g. I'm forced to API pricing on enterprise account, so this would be very helpful when I'm multitasking / doing multiple sessions and can afford the extra time. e.g. currently reading a PR from another session; don't need speedy response!

Rohan Varma@TheRohanVarma·31 Mar

If we made /slow mode in Codex, would you use it? What for? (Slower inference at a cheaper cost)

Hossein Alidaee@halidaee·30 Mar

@btshapir My dissertation was about impoverished South Asian farmers. My mother, who knows this, is still convinced I'm keeping great stock tips from her.

Brad Shapiro@btshapir·29 Mar

Being an economist is not the same as being a stock picker

M. Nolan Gray 🥑@mnolangray

What's something that experts/practitioners in your field universally agree upon, but that remains a "hot take" among the general public?

Hossein Alidaee@halidaee·25 Mar

@ben_golub @alexolegimas @JesusFerna7026 Argument in your favor I think goes in same way as my issues with the SFFA vs Harvard lawsuit. At the end of the day, their goal was self-defeating because ultimately the kids *want* to go to a school with the elite, not just picked on grades. (Otherwise they'd go to Berkeley.)

Ben Golub@ben_golub·24 Mar

extracting for broader discussion @alexolegimas @JesusFerna7026

Hossein Alidaee@halidaee·25 Mar

@Afinetheorem Yeah though I do think there is a very potentially real equilibrium whereby you *pay* for the chance to do an entry level role in white collar fields. We would strongly differentiate between those two but I'm not sure the average person would consider that a much better outcome.

Kevin A. Bryan@Afinetheorem·24 Mar

I can't tell you how much I would bet against Dario on this. He is totally right on the technical part, but the implicit economic model is just completely wrong. Prices adjust!

Jon Hartley@Jon_Hartley_

People also said this 3 years ago when ChatGPT was first released to the public: “50% of all entry-level Lawyers, Consultants, and Finance Professionals will be completely wiped out within the next 1–5 years."

Hossein Alidaee@halidaee·13 Mar

@p_ganong Shoot me an email

Peter Ganong@p_ganong·13 Mar

Agentic AI q: --You write code which replicates or builds on a paper. --You want to give the LLM the paper PDF to reference as needed. --You don't want to burn your whole context window. Help! How do you do it?

Hossein Alidaee@halidaee·10 Mar

If (a) $200/month claude plan actually subsidizes $5k/month in equivalent API usage (b) we expect a rug pull with model prices going up, this really changes predictions about labor impact. Compared to $60k/year just on token usage, firms may keep lots of knowledge workers around.
