John Tian

706 posts

@johnrtian

18 | incoming @stanford | empowering teachers @ gradewithai

LA · Joined April 2023
375 Following · 400 Followers

John Tian @johnrtian
Actually a great question; I was working on my own chess benchmark a couple of weeks ago, and I measured the number of moves top models could make against Stockfish from a random, roughly equal position (±0.5 centipawns), with the legal move options given every turn. If I recall correctly, top models like 3.1 Pro were making it 10-20 moves before losing via checkmate. Unfortunately, I couldn't continue and publish my results because it became prohibitively expensive.
Replies: 0 · Reposts: 0 · Likes: 1 · Views: 14

John Tian @johnrtian
@mikeysee If it is AGI, I expect it to be quite good at Chess and Go!
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 11

Mikeysee @mikeysee
@johnrtian Alright fair enough, I guess to be "AGI" then you need to be good at everything humans do and that includes playing games. We should expect LLMs to be good at Chess and Go too then right?
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 16

John Tian @johnrtian
@max_spero_ When you focus solely on SEO, you don't have much bandwidth left for making the product actually good! If numbers keep going up through SEO, it doesn't really make sense to do anything else. Until Pangram ups its SEO game, these crappy detectors have no incentive to improve.
Replies: 0 · Reposts: 0 · Likes: 3 · Views: 120

John Tian @johnrtian
@gum1h0x here's my implementation with 1.26 bpb! main difference is i have a decoder to go from latent to bytes, and i patchify before the encoder instead of doing byte-level attention first and then mean-pooling into chunks. github.com/openai/paramet… would love your thoughts :)
Replies: 0 · Reposts: 0 · Likes: 1 · Views: 125

John Tian @johnrtian
got a bpb of 1.26 with JEPA! for context: every single entry on the parameter golf leaderboard is a GPT. this is a JEPA encoder-decoder that predicts latent representations, not next tokens. it uses a pure byte-level tokenizer (vocab 260 vs 1024 BPE). there's no tokenizer: the model has to learn everything from raw bytes. even with untuned hyperparams, it's within 0.04 bpb of the GPT baseline. the gap to close is small and i have a ton of ideas about how to beat it! (would love some credits)
will depue @willdepue

i’ll send merch to anyone that can get a JEPA model to beat the parameter golf baseline! only rule is no tokenizer (use byte level) to be true to JEPA
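
For context on the metric in this thread: bits per byte (bpb) normalizes a model's cross-entropy by the number of raw bytes, which is what makes a byte-level model comparable with a BPE-tokenized one. A minimal Python sketch, assuming the reported loss is a mean cross-entropy in nats per token; the function name and the example numbers are illustrative and are not taken from the thread or the parameter golf code.

```python
import math


def bits_per_byte(loss_nats_per_token: float, num_tokens: int, num_bytes: int) -> float:
    """Convert a mean cross-entropy in nats/token into bits per byte."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    # nats -> bits: divide by ln(2)
    bits_per_token = loss_nats_per_token / math.log(2)
    # Rescale from per-token to per-byte using the token/byte ratio of the eval text
    return bits_per_token * (num_tokens / num_bytes)


# Hypothetical byte-level model: one token per byte, so the ratio is 1.
byte_level = bits_per_byte(0.8734, num_tokens=1000, num_bytes=1000)

# Hypothetical BPE model: fewer tokens than bytes, so each token's loss
# is spread over several bytes.
bpe_level = bits_per_byte(3.0, num_tokens=250, num_bytes=1000)

print(f"byte-level: {byte_level:.2f} bpb, BPE: {bpe_level:.2f} bpb")
```

For a byte-level model the token and byte counts coincide, so bpb is just the loss divided by ln 2.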

Replies: 1 · Reposts: 0 · Likes: 1 · Views: 214

Theo - t3.gg @theo
T3 Code now supports Claude. If you have the Claude Code CLI installed and signed in, you can use it with T3 Code. Hopefully the lawyers won't make us remove this 🙃
[Image]
Replies: 225 · Reposts: 53 · Likes: 2.6K · Views: 516.9K

John Tian @johnrtian
@jskoiz @cheatyyyy amazing project, look at the hover effects!! fyi tho hovering over this icon causes it to re-render, resetting the animation
[Image]
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 38

jonathan liu @jonathanzliu
@heyruchir LTV per subscriber is $21 so my CAC needs to be below that
Replies: 4 · Reposts: 0 · Likes: 5 · Views: 1.6K

Ruchir @heyruchir
@jonathanzliu What’s your LTV for every subscriber? If it’s high enough then this is worth it.
Replies: 3 · Reposts: 0 · Likes: 7 · Views: 1.5K

Mikeysee @mikeysee
So just following up on yesterday's discussions with @theo, here are the results of testing various reasoning efforts through @OpenRouter on the @convex evals. The GPT 5.4 xhigh result was the most surprising to me, so I re-ran it to check and it got the same result, which is in line with what Theo was saying: that xhigh is worse than high.
[Image]
Replies: 71 · Reposts: 48 · Likes: 543 · Views: 189K

jack friks @jackfriks
claude code VS codex
codex is quite good, 100x better than anything i used a year ago. but coding with claude makes everything feel like a video game, and i get things done in seemingly less time while having more fun?
Replies: 141 · Reposts: 9 · Likes: 701 · Views: 49.5K

AJ @0xajka
@theo “Wait Sonnet 4.6 dropped?” as if you didn’t already have it available on @t3dotchat
[Image]
Replies: 1 · Reposts: 0 · Likes: 8 · Views: 780

Theo - t3.gg @theo
Wait Sonnet 4.6 dropped? Is it worth a video?
Replies: 160 · Reposts: 3 · Likes: 1K · Views: 85.2K

John Tian @johnrtian
@theo exa instant for the win!
Replies: 1 · Reposts: 0 · Likes: 7 · Views: 1.3K

Theo - t3.gg @theo
T3 CHAT LAUNCH WEEK DAY 2
Search is now 10x faster, and models can do multiple searches per request
Replies: 64 · Reposts: 17 · Likes: 1.1K · Views: 53.8K

John Tian @johnrtian
@theo It's genuinely so evil that I have to believe it's ragebait..
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 1.5K

Theo - t3.gg @theo
I miss likes being public. Dude literally said god killed my best friend to make a point, and 20 people agreed enough to hit the like button. If you're one of those 20, speak up below. I want to know who you are.
#SadBarcaFan @Thatguy_107

@theo @yacinelearning Jesus you pick to fight everyone you miserable piece of sheet. No wonder god takes your friend out of this world 😂

Replies: 78 · Reposts: 3 · Likes: 1K · Views: 115.7K

wilson hou @wilsonhou
in a month @lowercaseclub will be 6 people, 5 fulltime. incredible, talented people who for some reason or other joined the lil studio @angehyc and i started after quitting our jobs ~2 years ago. i feel like an imposter but also more excited than ever for what's to come
life is surreal
Replies: 16 · Reposts: 1 · Likes: 60 · Views: 2.4K