matt hardy

538 posts

@mdahardy

cto @roundtablehq_, prev phd @princeton // language models, cogsci, ml

san francisco Joined April 2014
903 Following 1K Followers
matt hardy reposted
Dhara Yu @dharakyu
I wrote about how AI systems are helping us answer questions about (human) social interaction but are also exposing entirely new classes of interactions, creating productive new challenges for the broader science of interaction
1
3
23
1.5K
matt hardy @mdahardy
My hot take is that models have improved less than people think over the last few months. However, basic reasoning models have gotten much faster, and this makes them much more useful. o1-pro was released a year ago and was an incredible model. At least for my day-to-day, its outputs were on par with current models. o1-pro almost never hallucinated (which Opus 4.6 does quite often) and was nearly perfect at following instructions. However, it was incredibly slow, often taking ~15 minutes per response. Of course, this limited how useful it was for work. I developed a whole pipeline for working with it: I would pass o1-pro complicated tasks, and while it worked I would spend the time preparing the next task for the model. My current workflow with Claude Code is basically a sped-up version of this process. This is not to say that current models are underwhelming - I think it was just underrated how good the pro-series models were last year.
Nate Silver @NateSilver538

Honestly a Consumer Reports style panel of power users might be better than METR etc. for measuring AI progress, much more robust to spikiness. Not meant to sound skeptical, as a power user I think there's been extremely noticeable progress over the past few months fwiw.
0
0
4
366
Andrey Fradkin @AndreyFradkin
Thursday evening at Stanford. Link to sign up below.
6
5
50
4.9K
matt hardy @mdahardy
Opus 4.6 says its least favorite human language is Danish
0
0
1
165
matt hardy reposted
Mayank Agrawal @mayankagrawal
Everyone assumes better AI = more human-like AI. We argue the opposite. @milenamr7, @mdahardy, and I break down why scaling LLMs actually widens the gap between AI and human cognition. Humans and LLMs have different memory, processing, and data constraints -- and methods like RLHF only shape surface behavior, not underlying reasoning. Proof of Human doesn't ask whether the model got the right answer. It asks whether the output came from a system that thinks like a human.
1
1
3
329
matt hardy @mdahardy
I remember reading about Smallville back in 2023. It was a little glimpse of the future of human research. Congrats to @joon_s_pk! Very excited to see how far these simulations can scale, e.g. to simulating groups, teams, societies etc.
Joon Sung Park @joon_s_pk

Introducing Simile. Simulating human behavior is one of the most consequential and technically difficult problems of our time. We raised $100M from Index, Hanabi, A* BCV, @karpathy @drfeifei @adamdangelo @rauchg @scottbelsky among others.
0
0
3
498
matt hardy @mdahardy
very curious to see the marginal efficiency gains of software development with agent swarms vs. individual agents
1
0
1
150
John Rentoul @JohnRentoul
Just added “nape” to my list of Words Used Only With One Other Word
818
397
11K
1.5M
TTI @TikTokInvestors
@_willcompton i'd actually search the tenderloin. a wonderful place.
1
0
24
3.8K
Will Compton @_willcompton
Any dinner recs in San Francisco around the financial district? Looking for some Asian cuisine
302
5
402
172.8K
matt hardy @mdahardy
@tunguz How is high school different outside of the US? People form cliques/mental models later in life?
1
0
0
840
Bojan Tunguz @tunguz
This is 💯 correct. I arrived in the US at the end of high school. Discovered that all the cliques and mental models were already hardwired. In all the years since I have not seen many deviate substantially from those scripts.
Pratyush @pratyushbuddiga

Sometimes it’s hard to explain to non-Americans that everything in American culture is downstream of high school - a unique cultural experience not really replicated elsewhere in the world. But the last couple days were another great proof point for this.
8
5
202
57.9K
Felipe Montealegre @TheiaResearch
19.4% of people got this right. Even though arithmetic EV is positive, investing your portfolio into this trade will 'decrease' your net worth by 18% each flip. This is part of why you are all losing money in the trenches (sufficiently correlated bets are the same bet).
Felipe Montealegre @TheiaResearch

You can flip a coin with a 33% chance of tripling your portfolio and a 67% chance of losing 55% of your portfolio. You don't only get a single flip — you can take this bet ten times in a row (or even one hundred times if you pay 1% up front). Do you take it?
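
A minimal sketch of the arithmetic behind this bet, assuming the whole portfolio is staked on every flip and ignoring the optional 1% fee. The function names are illustrative and not from the posts. It contrasts the average (arithmetic) multiplier per flip with the compounded (geometric) multiplier, which is what governs repeated all-in bets. With the probabilities exactly as stated in the question, the average multiplier comes out to about 1.29 while the compounded multiplier comes out to about 0.84, so the typical outcome shrinks by roughly 16% per flip. That points the same way as the figure quoted in the reply above, though it is not identical to it.

```python
import math

# Inputs copied from the quoted question; everything else is an assumption.
P_WIN, WIN_MULT = 0.33, 3.0      # 33% chance the portfolio triples
P_LOSE, LOSE_MULT = 0.67, 0.45   # 67% chance of losing 55% (keeping 45%)


def arithmetic_mean_multiplier() -> float:
    """Expected portfolio multiplier for a single flip."""
    return P_WIN * WIN_MULT + P_LOSE * LOSE_MULT


def geometric_mean_multiplier() -> float:
    """Per-flip multiplier implied by the expected log return.

    This is the rate at which a fully reinvested portfolio typically
    compounds over many flips.
    """
    expected_log = P_WIN * math.log(WIN_MULT) + P_LOSE * math.log(LOSE_MULT)
    return math.exp(expected_log)


def typical_multiplier_after(n_flips: int) -> float:
    """Rough typical (median-like) multiplier after n all-in flips."""
    return geometric_mean_multiplier() ** n_flips


if __name__ == "__main__":
    print(f"arithmetic mean per flip: {arithmetic_mean_multiplier():.4f}")
    print(f"geometric mean per flip:  {geometric_mean_multiplier():.4f}")
    print(f"typical after 10 flips:   {typical_multiplier_after(10):.4f}")
```

The gap between the two means is why a bet can have positive expected value on a single flip and still be a poor choice when the full portfolio is rolled over each time.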

14
0
39
11.7K
matt hardy @mdahardy
@Afinetheorem Crazy how one side of Lake Ontario is booming and vibrant, and the other side seems to be in a perpetual doom loop.
1
0
1
146
Kevin A. Bryan @Afinetheorem
True that upstate New York is the worst-governed, most potential-wasting place in the US. Population lower than 50 years ago despite much better scenery than Southern Ontario, world class universities, Xerox, Corning, Kodak, GE, GlobalFoundries, Bausch & Lomb. Prob: state has NYC...
Pizza @number_pizza111

Crossing the border from Upstate NY to Western New England is strange, because both places are full of shrinking post-industrial towns, but the general malaise and all-consuming sense of decline just disappears. And the Hudson Valley is the nicest part of Upstate!
7
5
43
23.1K
matt hardy @mdahardy
I love Claude Code and Cursor, but I miss the flow-state, in-the-zone work that programming used to be. I get more done with Claude Code on any given day, but I'm in a perpetual state of distraction and task switching.
1
0
2
280
matt hardy @mdahardy
@dggoldst Yes, I think so. One annoying side effect of giving payouts to large accounts.
0
0
2
14
Dan Goldstein @dggoldst
@mdahardy I wonder if their bot finds an early quote tweet that got engagement and then figures a synonymous quote tweet will also get engagement. With enough engagement, you can get to be a paid influencer?
1
0
2
63
Dan Goldstein @dggoldst
There's some kind of fraud going on here on X but I can't figure out what it is. Look at the quote tweets. Incredible repetition of the same few words. Typical quoter follows 200-300 and has 200-300 followers. Zero mutuals with me. Bots trying to get influencer deals?
Dan Goldstein @dggoldst
4
0
6
2.7K
Brian Jabarian @brian_jabarian
I will be joining Carnegie Mellon University (@CarnegieMellon) as a tenure-track Assistant Professor at Heinz College, with an affiliation in the School of Computer Science (@SCSatCMU) soon!!
29
11
240
133.9K