samuel joseph troyer

699 posts

@samjtro

founder @ https://t.co/VasbYUotie

SF / ATX · Joined April 2022
253 Following · 107 Followers
samuel joseph troyer retweeted
samsja @samsja19
AI products will continue to be deeply unimaginative as long as we treat the model as the final product. The model should be a building block; infra and post-training should be accessible to people with crazy ideas. Open source happened to be the best way to solve this
12 replies · 16 reposts · 196 likes · 30.4K views
samuel joseph troyer retweeted
mal @mal_shaik
from my 2 months in sf most exceptional ppl that i met checked one of these boxes:
- has adhd
- dropped out of highschool / college
- got humbled (big time) at some point in their life
- is socially dysfunctional
- wears the same 3 fits
- has a cooked sleep schedule
50 replies · 32 reposts · 826 likes · 59.8K views
Nick Sweeting @thesquashSH
@trq212 is there any reason why you didn't do a sliding window of continuous compaction of the oldest messages? (e.g. abbreviate oldest tool call results, replace with references to the full chat history in a file that claude can read if needed) why do it all in big bursts?
3 replies · 0 reposts · 4 likes · 355 views
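The compaction scheme the question describes can be sketched in a few lines. This is a hypothetical illustration of "continuous compaction", not anyone's actual implementation: as the conversation grows, the oldest oversized tool-call results are abbreviated in place, with the full text persisted to a history file the agent can re-read on demand. All names (`HISTORY_FILE`, `compact_oldest`, the message dict shape) are assumptions for the sketch.

```python
import json

HISTORY_FILE = "full_chat_history.jsonl"  # full transcript the agent can re-read
MAX_RESULT_CHARS = 200                    # keep only a stub beyond this length


def compact_oldest(messages, keep_recent=10):
    """Abbreviate tool results older than the last `keep_recent` messages,
    leaving a reference to the persisted full result in their place."""
    compacted = []
    for i, msg in enumerate(messages):
        is_old = i < len(messages) - keep_recent
        if is_old and msg["role"] == "tool" and len(msg["content"]) > MAX_RESULT_CHARS:
            # Persist the full result, keep a short stub plus a pointer in context.
            with open(HISTORY_FILE, "a") as f:
                f.write(json.dumps({"index": i, "content": msg["content"]}) + "\n")
            msg = {
                "role": "tool",
                "content": msg["content"][:MAX_RESULT_CHARS]
                + f"... [truncated; full result in {HISTORY_FILE}, entry {i}]",
            }
        compacted.append(msg)
    return compacted
```

Run on every turn, this spreads the compaction cost over the whole session instead of doing it in big bursts, which is the contrast the question draws.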
samuel joseph troyer retweeted
will brown @willccbb
what are the implications of process reward modeling for the political tensions in the balkans? only time will tell
3 replies · 5 reposts · 79 likes · 5.4K views
samuel joseph troyer retweeted
Prime Intellect @PrimeIntellect
ZXX
91 replies · 754 reposts · 7.5K likes · 743.6K views
samuel joseph troyer retweeted
Tanishq Mathew Abraham, Ph.D. @iScienceLuvr
practical, modern GRPO tweaks as described in Meta's Code World Models paper
[tweet media]
13 replies · 81 reposts · 867 likes · 244.2K views
samuel joseph troyer @samjtro
@natolambert i agree with your last statement, and that's where i think rich is right; RL-in-LLMs might be a fine representation of semantic learning, but not a true model of "intelligence" as we understand it naturally.
0 replies · 0 reposts · 0 likes · 51 views
Nathan Lambert @natolambert
Rich is amazing, but I actually don't think he's going to be right in the LLM age. In much of the same ways I've documented that I disagree with Dwarkesh on the continual learning problem (and definition). Too much of "intelligence" is grounded on human intuitions.
Dwarkesh Patel @dwarkesh_sp

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.

0:00:00 – Are LLMs a dead-end?
0:13:51 – Do humans do imitation learning?
0:23:57 – The Era of Experience
0:34:25 – Current architectures generalize poorly out of distribution
0:42:17 – Surprises in the AI field
0:47:28 – Will The Bitter Lesson still apply after AGI?
0:54:35 – Succession to AI

30 replies · 29 reposts · 477 likes · 81.8K views
finbarr @finbarrtimbers
I love Rich. He used to drop hot takes like this during our internal DMA seminars; it was great (unless you were presenting)
[tweet media]
3 replies · 2 reposts · 97 likes · 6.6K views
samuel joseph troyer retweeted
Christopher Nguyen ⽗ @pentagoniac
The time has come for Agent Engineering. And if you do this right, you will see how @ylecun is entirely correct about LLMs.
7 replies · 6 reposts · 54 likes · 44.6K views
samuel joseph troyer retweeted
dr. jack morris @jxmnop
seems likely the world of One Big Model will end in a year or two. we’ll have families of peft-adapted experts continuously retrained, merged, and reapplied under varying degrees of staleness. the Train/Test split of conventional machine learning held us back for far too long
23 replies · 19 reposts · 311 likes · 24.6K views
samuel joseph troyer retweeted
Beyang @beyang
This is currently the bottleneck in all agentic coding
[tweet media]
16 replies · 6 reposts · 59 likes · 16.8K views
Alexander Doria @Dorialexander
If you showed Claude Code to an Etruscan musician, he would be very frustrated it cannot play music at all.
1 reply · 1 repost · 27 likes · 2.6K views
samuel joseph troyer @samjtro
@GolerGkA @Yuchenj_UW it's called a stock-secured loan, or SBLOC; stock is like any other asset -- if the value of the underlying increases, so too does the $ you can borrow against it.
0 replies · 0 reposts · 8 likes · 726 views
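The mechanics in this exchange come down to two numbers: a loan-to-value (LTV) ratio that sets how much you can borrow, and a maintenance threshold that triggers a margin call when the collateral falls. A minimal sketch, with entirely hypothetical ratios (real lenders set their own terms; this is illustration, not financial advice):

```python
# Illustrative SBLOC math. The 50% borrowing LTV and 65% maintenance LTV
# below are assumed round numbers, not any lender's actual terms.

def borrowing_limit(portfolio_value: float, ltv: float = 0.5) -> float:
    """Max credit line: a loan-to-value fraction of current market value."""
    return portfolio_value * ltv


def margin_call_triggered(portfolio_value: float, outstanding: float,
                          maintenance_ltv: float = 0.65) -> bool:
    """The lender demands repayment/collateral once the outstanding loan
    exceeds the maintenance fraction of the (now lower) portfolio value."""
    return outstanding > portfolio_value * maintenance_ltv


# If the stock doubles, the borrowable amount doubles with it.
assert borrowing_limit(1_000_000) == 500_000
assert borrowing_limit(2_000_000) == 1_000_000

# And the answer to the margin-call question: borrow 500k against 1M,
# stock drops to 700k, and the 500k loan exceeds 65% of 700k (455k).
assert margin_call_triggered(700_000, 500_000)
```

The margin call is exactly how the lender handles the "what if the stock drops as fast as it rose" risk raised in the reply.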
Reply guy 😐 @GolerGkA
@Yuchenj_UW Ok, genuine question: how can the bank be sure that the stock price won’t go down just as quickly as it went up? That doesn’t look like reliable collateral to me, but maybe I don’t understand something. Do they have a margin call in that loan?
14 replies · 1 repost · 35 likes · 5.9K views
Yuchen Jin @Yuchenj_UW
How money works:
1. OpenAI signs $300B GPU deal with Oracle
2. Larry gains $100B (no GPUs shipped)
3. Larry invests in OpenAI’s $1T round
4. Sam uses $300B to pay Oracle
5. Oracle stock pumps again
6. Larry makes another $100B
7. Larry invests in OpenAI

Flywheel go brrr.
523 replies · 2.3K reposts · 30K likes · 1.8M views
Alexander Doria @Dorialexander
since data quality discourse might benefit from it, introducing the chart
[tweet media]
6 replies · 6 reposts · 64 likes · 8.7K views
samuel joseph troyer retweeted
Saurabh Shah @saurabh_shah2
Holy shit they’re doing on-policy RL by just deploying the model to prod lmao that’s so baller. also 2 hrs for a training step makes our 10 minute steps feel lightning fast @hamishivi … they probably have a bigger batch size though 😅
[tweet media]
Cursor @cursor_ai

We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.

12 replies · 22 reposts · 579 likes · 102.6K views
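A quick back-of-envelope check on the two figures Cursor quotes (21% fewer suggestions, 28% higher accept rate) shows why those numbers are a genuine win: multiplied together, the volume of *accepted* suggestions stays roughly flat while the rejected, noisy ones drop sharply. The calculation below uses only the two ratios from the quoted tweet.

```python
# Relative change vs. the old Tab model, per the quoted Cursor tweet:
suggestions = 0.79       # 21% fewer suggestions shown
accept_rate_gain = 1.28  # 28% higher accept rate on those shown

# Accepted suggestions scale with the product of the two factors.
accepted = suggestions * accept_rate_gain
print(f"relative accepted suggestions: {accepted:.4f}")  # ≈ 1.0112, i.e. ~1% more
```

So users accept about as many completions as before while seeing roughly a fifth fewer interruptions, which is the point of training the model online against real accept/reject signals.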