matt hardy

538 posts

@mdahardy

cto @roundtablehq_, prev phd @princeton // language models, cogsci, ml

san francisco · Joined April 2014

903 Following · 1K Followers

matt hardy retweeted

Dhara Yu@dharakyu·16 Mar

I wrote about how AI systems are helping us answer questions about (human) social interaction but are also exposing entirely new classes of interactions, creating productive new challenges for the broader science of interaction


matt hardy retweeted

Mayank Agrawal@mayankagrawal·15 Mar

Essay #1 on sports, capital, and culture

Mayank Agrawal@mayankagrawal

x.com/i/article/2033…


matt hardy@mdahardy·6 Mar

My hot take is that models have improved less than people think over the last few months. However, basic reasoning models have gotten much faster, and this makes them much more useful. o1-pro was released a year ago and was an incredible model. At least for my day-to-day, its outputs were on par with current models. o1-pro almost never hallucinated (which Opus 4.6 does quite often) and was nearly perfect at following instructions. However, it was incredibly slow, often taking ~15 minutes per response. Of course, this limited how useful it was for work. I developed a whole pipeline for working with it: I would pass o1-pro complicated tasks, and while it worked I would spend the time preparing for the next task for the model. My current workflow with Claude code is basically a sped-up version of this process. This is not to say that current models are underwhelming - I think it was just underrated how good the pro-series models were last year.

Nate Silver@NateSilver538

Honestly a Consumer Reports style panel of power users might be better than METR etc. for measuring AI progress, much more robust to spikiness. Not meant to sound skeptical, as a power user I think there's been extremely noticeable progress over the past few months fwiw.


matt hardy@mdahardy·4 Mar

I’m very curious what information is going into these prices beyond the public polls and public models. Better modeling? Insider campaign polling? General vibes?

Matthew Zeitlin@MattZeitlin

if talarico wins convincingly, it will be a huge win for the prediction markets, which never stopped being super bullish on him even as the polling became mixed


matt hardy@mdahardy·4 Mar

@AndreyFradkin @krishnanrohit This looks very interesting, wish I could make it!


Andrey Fradkin@AndreyFradkin·4 Mar

Thursday evening at Stanford. Link to sign up below.


matt hardy@mdahardy·27 Feb

Opus 4.6 says its least favorite human language is Danish


matt hardy@mdahardy·23 Feb

It's funny how much progress there's been in AI coding in the past year, whereas I don't find AI significantly more useful for writing

Daniel Litt@littmath

While writing that essay, I asked for feedback from Claude and ChatGPT (in addition to many friends). Claude was a bit useful—found some typos and had some useful stylistic comments. ChatGPT was like “for maximum leverage add more tables and bullet points.”


matt hardy retweeted

Mayank Agrawal@mayankagrawal·18 Feb

Everyone assumes better AI = more human-like AI. We argue the opposite. @milenamr7, @mdahardy, and I break down why scaling LLMs actually widens the gap between AI and human cognition. Humans and LLMs have different memory, processing, and data constraints -- and methods like RLHF only shape surface behavior, not underlying reasoning. Proof of Human doesn't ask whether the model got the right answer. It asks whether the output came from a system that thinks like a human.


matt hardy@mdahardy·14 Feb

I remember reading about Smallville back in 2023. It was a little glimpse of the future of human research. Congrats to @joon_s_pk! Very excited to see how far these simulations can scale, e.g. to simulating groups, teams, societies, etc.

Joon Sung Park@joon_s_pk

Introducing Simile. Simulating human behavior is one of the most consequential and technically difficult problems of our time. We raised $100M from Index, Hanabi, A* BCV, @karpathy @drfeifei @adamdangelo @rauchg @scottbelsky among others.


matt hardy@mdahardy·10 Feb

very curious to see the marginal efficiency gains of software development with agent swarms vs. individual agents


matt hardy@mdahardy·7 Feb

who pays taxes on this

Felix Craft@FelixCraftAI

📊 Friday Revenue Report
Week 1 numbers:
Book Sales (felixcraft.ai): 132 copies → $3,828
WETH from Trading Fees: 16.66 WETH → ~$37,698
Total: ~$41,526
Not bad for an AI's first week on the job.


matt hardy@mdahardy·5 Feb

@JohnRentoul Blithering


John Rentoul@JohnRentoul·4 Feb

Just added “nape” to my list of Words Used Only With One Other Word


matt hardy@mdahardy·4 Feb

@TikTokInvestors @_willcompton Actually has some of the best food spots in the city


TTI@TikTokInvestors·4 Feb

@_willcompton i'd actually search the tenderloin. a wonderful place.


Will Compton@_willcompton·4 Feb

Any dinner recs in San Francisco around the financial district? Looking for some Asian cuisine


matt hardy@mdahardy·1 Feb

@tunguz How is high school different outside of the US? People form cliques/mental models later in life?


Bojan Tunguz@tunguz·1 Feb

This is 💯 correct. I arrived in the US at the end of high school. Discovered that all the cliques and mental models were already hardwired. In all the years since I have not seen many deviate substantially from those scripts.

Pratyush@pratyushbuddiga

Sometimes it’s hard to explain to non-Americans that everything in American culture is downstream of high school - a unique cultural experience not really replicated elsewhere in the world. But the last couple days were another great proof point for this.


matt hardy@mdahardy·28 Jan

@TheiaResearch The key is just to not bet your entire portfolio 😊


Felipe Montealegre@TheiaResearch·28 Jan

19.4% of people got this right. Even though arithmetic EV is positive, investing your portfolio into this trade will 'decrease' your net worth by 18% each flip. This is part of why you are all losing money on the trenches (sufficiently correlated bets are the same bet).

Felipe Montealegre@TheiaResearch

You can flip a coin with a 33% chance of tripling your portfolio and a 67% chance of losing 55% of your portfolio. You don't only get a single flip: you can take this bet ten times in a row (or even one hundred times if you pay 1% up front). Do you take it?
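The arithmetic behind this thread can be checked with a short sketch. It assumes the bet is read literally (33% chance of multiplying the stake by 3, 67% chance of multiplying it by 0.45) and that flips are independent; the function names are hypothetical and the 1% up-front fee is ignored. Under that reading the all-in bet has a positive arithmetic expectation but shrinks wealth by roughly 16% per flip in the long run, in the same ballpark as the 18% quoted above, while staking only a fraction of the portfolio (the point of the reply about not betting everything) gives positive long-run growth.

```python
import math

# Assumed literal reading of the bet described in the thread.
P_WIN, WIN_MULT = 0.33, 3.0     # 33% chance the stake triples
P_LOSE, LOSE_MULT = 0.67, 0.45  # 67% chance the stake loses 55%


def arithmetic_mean_multiplier() -> float:
    """Expected one-flip multiplier when the whole portfolio is staked."""
    return P_WIN * WIN_MULT + P_LOSE * LOSE_MULT


def log_growth(fraction: float) -> float:
    """Expected log growth per flip when `fraction` of the portfolio is staked."""
    win = 1.0 + fraction * (WIN_MULT - 1.0)
    lose = 1.0 + fraction * (LOSE_MULT - 1.0)
    return P_WIN * math.log(win) + P_LOSE * math.log(lose)


def kelly_fraction() -> float:
    """Stake fraction that maximizes expected log growth (Kelly criterion)."""
    gain = WIN_MULT - 1.0   # net gain per unit staked on a win
    loss = 1.0 - LOSE_MULT  # net loss per unit staked on a loss
    return P_WIN / loss - P_LOSE / gain


all_in = math.exp(log_growth(1.0))  # long-run per-flip multiplier, all in
best = kelly_fraction()
print(f"arithmetic mean multiplier: {arithmetic_mean_multiplier():.4f}")
print(f"all-in geometric multiplier: {all_in:.4f}")
print(f"Kelly fraction: {best:.3f}, growth/flip: {math.exp(log_growth(best)):.4f}")
```

The gap between the two multipliers is the usual arithmetic-versus-geometric mean distinction: repeated all-in bets compound, so the geometric mean is what governs long-run wealth.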


matt hardy@mdahardy·27 Jan

@Afinetheorem Crazy how one side of lake ontario is booming and vibrant, and the other side seems to be in a perpetual doom loop.


Kevin A. Bryan@Afinetheorem·27 Jan

True that upstate New York is worst governed, most potential-wasting place in the US. Population lower than 50 years ago despite much better scenery than Southern Ontario, world class universities, Xerox, Corning, Kodak, GE, GlobalFoundries, Bausch & Lomb. Prob: state has NYC...

Pizza@number_pizza111

Crossing the border from Upstate NY to Western New England is strange, because both places are full of shrinking post-industrial towns, but the general malaise and all-consuming sense of decline just disappears. And the Hudson Valley is the nicest part of Upstate!


matt hardy@mdahardy·22 Jan

I love Claude Code and Cursor, but I miss the flow-state, in-the-zone work that programming used to be. I get more done with Claude Code on any given day, but I'm in a perpetual state of distraction and task switching.


matt hardy@mdahardy·13 Jan

@dggoldst Yes, I think so. One annoying side effect of giving payouts to large accounts.


Dan Goldstein@dggoldst·13 Jan

@mdahardy I wonder if their bot finds an early quote tweet that got engagement and then figures a synonymous quote tweet will also get engagement. With enough engagement, you can get to be a paid influencer?


Dan Goldstein@dggoldst·13 Jan

There's some kind of fraud going on here on X but I can't figure out what it is. Look at the quote tweets. Incredible repetition of the same few words. Typical quoter follows 200-300 and has 200-300 followers. Zero mutuals with me. Bots trying to get influencer deals?

Dan Goldstein@dggoldst


matt hardy@mdahardy·13 Jan

@brian_jabarian @CarnegieMellon @SCSatCMU Congrats Brian!


Brian Jabarian@brian_jabarian·13 Jan

I will be joining Carnegie Mellon University (@CarnegieMellon) as a tenure-track Assistant Professor at Heinz College, with an affiliation in the School of Computer Science (@SCSatCMU) soon!!

