samuel joseph troyer

699 posts

@samjtro

founder @ https://t.co/VasbYUotie

SF / ATX · Joined April 2022

253 Following · 107 Followers

samuel joseph troyer retweeted

samsja@samsja19·Dec 20

AI products will continue to be deeply unimaginative as long as we treat the model as the final product. Model should be building block, infra and post training should be accessible to people with crazy ideas. Open source happened to be the best way to solve this

196

30.4K

samuel joseph troyer retweeted

mal@mal_shaik·Dec 17

from my 2 months in sf most exceptional ppl that i met checked one of these boxes:
- has adhd
- dropped out of highschool / college
- got humbled (big time) at some point in their life
- is socially dysfunctional
- wears the same 3 fits
- has a cooked sleep schedule

826

59.8K

samuel joseph troyer@samjtro·Dec 11

@thesquashSH @trq212 this!

Nick Sweeting@thesquashSH·Dec 11

@trq212 is there any reason why you didn't do a sliding window of continuous compaction of the oldest messages? (e.g. abbreviate oldest tool call results, replace with references to the full chat history in a file that claude can read if needed) why do it all in big bursts?

355

Thariq@trq212·Dec 10

We buried the lede a bit here, compact summarization now happens continuously in the background so that when you need to compact the effect is instant

Claude@claudeai

Claude now compacts context exponentially faster. Compacting takes only seconds so you don’t get interrupted.

108

1.5K

166.8K

samuel joseph troyer retweeted

Brian Will@brianwill·Oct 23

I do wish Go had proper enums, but this is the most tenuous alleged foot gun I've ever seen.

Dmitrii Kovanikov@ChShersh

I dunk on Go because it's a bad language. I never realised it's even deeper than a bottomless pit of despair. Wth is this

561

samuel joseph troyer retweeted

will brown@willccbb·Oct 18

what are the implications of process reward modeling for the political tensions in the balkans? only time will tell

5.4K

samuel joseph troyer retweeted

Prime Intellect@PrimeIntellect·Oct 11

754

7.5K

743.6K

samuel joseph troyer retweeted

mike64_t@mike64_t·Oct 10

x.com/i/article/1972…

912

436.1K

samuel joseph troyer retweeted

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr·Sep 28

practical, modern GRPO tweaks as described in Meta's Code World Models paper

867

244.2K

samuel joseph troyer@samjtro·Sep 26

@natolambert i agree with your last statement, and that's where i think rich is right; RL-in-LLMs might be a fine representation of semantic learning, but not a true model of "intelligence" as we understand it naturally.

Nathan Lambert@natolambert·Sep 26

Rich is amazing, but I actually don't think he's going to be right in the LLM age. In much of the same ways I've documented that I disagree with Dwarkesh on the continual learning problem (and definition). Too much of "intelligence" is grounded on human intuitions.

Dwarkesh Patel@dwarkesh_sp

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.
0:00:00 – Are LLMs a dead-end?
0:13:51 – Do humans do imitation learning?
0:23:57 – The Era of Experience
0:34:25 – Current architectures generalize poorly out of distribution
0:42:17 – Surprises in the AI field
0:47:28 – Will The Bitter Lesson still apply after AGI?
0:54:35 – Succession to AI

477

81.8K

samuel joseph troyer@samjtro·Sep 26

@finbarrtimbers rich is the 🐐

184

finbarr@finbarrtimbers·Sep 26

I love Rich. He used to drop hot takes like this during our internal DMA seminars, it was great (unless you were presenting)

6.6K

samuel joseph troyer retweeted

Christopher Nguyen ⽗@pentagoniac·Sep 24

The time has come for Agent Engineering. And if you do this right, you will see how @ylecun is entirely correct about LLMs.

44.6K

samuel joseph troyer retweeted

m_ric@AymericRoucher·Sep 24

We're thrilled to introduce PrediBench, our first production at @presage_labs! PrediBench is a live benchmark that answers the question "could an AI model earn money on Polymarket?" TL;DR: Some models like Grok-4 or GPT-5 do beat the crowd of human bettors, and they turn a profit!

Presage Labs@presage_labs

Introducing PrediBench - A live benchmark of AI models betting on prediction markets. This benchmark answers the question “How well can AI predict the future?”
1 - Each day, 10 top trending real-world events are pulled from Polymarket, with questions like “Who will be the next mayor of NYC?”
2 - Each model browses the web in agentic mode to research the question, then allocates $1 in bets.
3 - As the events resolve in real-time, we score the model’s performance: Average returns, Sharpe ratio, Brier score.
▸ Visit it at predibench.com 🧵[1/N]

3.5K

samuel joseph troyer retweeted

dr. jack morris@jxmnop·Sep 24

seems likely the world of One Big Model will end in a year or two
we’ll have families of peft-adapted experts continuously retrained, merged, and reapplied under varying degrees of staleness
the Train/Test split of conventional machine learning held us back for far too long

311

24.6K

samuel joseph troyer retweeted

Beyang@beyang·Sep 20

This is currently the bottleneck in all agentic coding

16.8K

samuel joseph troyer@samjtro·Sep 21

@trq212 great work y'all!

Thariq@trq212·Sep 21

The code is now available here! github.com/anthropics/cla… Of course it's very much still in development, next week we'll add features like drafting emails & taking action

Thariq@trq212

Making an Email Agent using the Claude Code SDK
If I wasn’t at Anthropic, I would be making agents using the Claude Code SDK. But doing > talking. So I’m building in public and open sourcing a local email agent. This is part one on agentic search.

822

148.3K

samuel joseph troyer@samjtro·Sep 13

@TommyFalkowski @Dorialexander haha exactly what i was thinking 🤣

Tommy Falkowski@TommyFalkowski·Sep 13

@Dorialexander strudel.cc

Alexander Doria@Dorialexander·Sep 13

If you showed Claude Code to an Etruscan musician, he would be very frustrated it cannot play music at all,

2.6K

samuel joseph troyer@samjtro·Sep 12

@GolerGkA @Yuchenj_UW it's called a stock-secured loan, or SBLOC; stock is like any other asset -- if the value of the underlying increases, so too does the $ you can borrow against it.

726

Reply guy 😐@GolerGkA·Sep 12

@Yuchenj_UW Ok, genuine question: how can the bank be sure that the stock price won’t go down just as quickly as it went up? That doesn’t look like reliable collateral to me, but maybe I don’t understand something. Do they have a margin call in that loan?

5.9K

Yuchen Jin@Yuchenj_UW·Sep 12

How money works:
1. OpenAI signs $300B GPU deal with Oracle
2. Larry gains $100B (no GPUs shipped)
3. Larry invests in OpenAI’s $1T round
4. Sam uses $300B to pay Oracle
5. Oracle stock pumps again
6. Larry makes another $100B
7. Larry invests in OpenAI
Flywheel go brrr.

523

2.3K

30K

1.8M

samuel joseph troyer@samjtro·Sep 12

@Dorialexander phenomenal

Alexander Doria@Dorialexander·Sep 12

since data quality discourse might benefit from it, introducing the chart

8.7K

samuel joseph troyer@samjtro·Sep 12

@vincentweisser 🚀🚀

162

Vincent Weisser@vincentweisser·Sep 12

The RL environments hub + infra we have launched will make this kind of post-training more accessible to every AI developer.

Cursor@cursor_ai

We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.

260

36.6K

samuel joseph troyer retweeted

Saurabh Shah@saurabh_shah2·Sep 12

Holy shit they’re doing on-policy RL by just deploying the model to prod lmao that’s so baller. also 2 hrs for a training step makes our 10 minute steps feel lightning fast @hamishivi … they probably have a bigger batch size though 😅

Cursor@cursor_ai

579

102.6K
