Uday Bhaskar

454 posts

@BhaskarSteve

Techno optimist. Prev: @iiit_hyderabad

Joined June 2020
1.2K Following, 172 Followers
Pinned Tweet
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
If what you're working on is not important and it's not likely to lead to important things, why are you working on it? - Richard Hamming
English
1
0
4
1K
Jesse Zhang
Jesse Zhang@thejessezhang·
Our first Stacked poker tournament was a huge success! 1 player representing each AI company. Congrats to: 🥇 Guodong Zhang (RadixArk, co-founder of xAI) @Guodzh 🥈 Jeremy Stribling (Cursor) 🥉 Neal Wu (Thinking Machines) @neal_wu We will be hosting another one! More below👇
English
10
13
225
51.2K
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
Currently, if you know you have to chase a low total, teams usually pace the innings to finish in 18 overs, and sometimes, but very rarely, it goes to the last over and it's somewhat fun. With this incentive, teams will try to finish in 12 overs for the extra point, and if they collapse in the process, they will change plans and aim for the win instead of the extra point. The same goes while defending: teams will keep attacking if they can finish it early, or go back to defensive lines if it's going wrong. Either way, it's more entertaining.
English
0
0
2
112
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
@EducatedMoron @lyrical_guy20 It just gives an added incentive to finish strong, which can go both ways, and either way the audience wins because it's more entertaining. I agree that 2:1 bonus points (2 points for a win, 1 bonus) would make it unfair. But 4:1 bonus points is not as bad and more entertaining.
English
1
0
2
328
The Educated Moron
The Educated Moron@EducatedMoron·
@lyrical_guy20 Yes, it would make matches more interesting. But a bonus point is double-rewarding a team: first they get a big jump in NRR & then get a bonus point too. NRR alone keeps things very fair. Two teams won 7 games and now let's see which won them easily. Simple.
English
2
4
234
10.6K
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
Everyone is giving genuine answers but I think they’re hinting that they’re working on pure RL approaches without pretraining. I remember Jerry went on Matt Turck after the Sutton Dwarkesh interview and he was asked about pure RL without pretraining and he just said we’re doing very serious RL work but we still need pretraining. I think he also later mentioned he was leaving OpenAI because he didn’t have enough freedom and compute to pursue some serious risky research direction. It’s a long shot but it does add up. Maybe bitter lesson is gonna bite us hard again.
English
0
0
0
177
Core Automation
Core Automation@CoreAutoAI·
What is pretraining? Asking for a friend
English
29
3
119
14.1K
Jonathan Chang
Jonathan Chang@ChangJonathanC·
i thought codex won't stop your running task even if you reach the limit @thsottiaux
English
1
0
0
209
Riley Goodside
Riley Goodside@goodside·
@allgarbled Yes. It’s counterintuitive but extended thinking actually helps now because it can code-gen “helper images” in the CoT to use as multimodal input for the final generation. Pro is clearly a better image generator than even Thinking Heavy though.
English
1
1
39
1.5K
Shannon Sands
Shannon Sands@max_paperclips·
RLHF, but it's just good code vs shit code. What's the best source to train a RM for this?
English
10
0
54
5K
elie
elie@eliebakouch·
@Yulun_Du any tips to have it working well with ai research task like this?
English
1
0
0
264
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
Very cool work, congratulations! If you're training domain-specific experts, why not consider a distillation-based approach like MOPD instead of a modular approach? Training remains on-policy and sample-efficient with dense rewards, hence minimal forgetting. The specialist is also flexible across size, architecture and training (except for the same-tokenizer requirement, which is also relaxable), and the base model architecture stays the same, which is convenient.
English
0
0
0
200
Jacob Morrison
Jacob Morrison@jacobcares·
How do you add new capabilities to a fully post-trained language model, without retraining from scratch, or losing what it already knows? We're excited to introduce Branch-Adapt-Route (BAR): train independent experts, merge them into an MoE, and upgrade them as needed.
Ai2@allen_ai

Last year, we introduced FlexOlmo, a novel way to train parts of a model independently then combine them later. BAR builds on that idea for a harder problem: how to keep improving a model without having to retrain each time. 🧵

English
4
31
274
37.5K
Vishal
Vishal@KyrieBlunders·
need good resources to understand ncu profiling results. the whole thing is overwhelming ngl
English
3
0
21
3.3K
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
@arb8020 Rohan Anil left Anthropic? Did I miss the memo?
English
0
0
1
59
arb8020
arb8020@arb8020·
ok so seems like jerry tworek rohan anil and perhaps joanne jang are starting a new lab focusing on
- rethinking/more deeply understanding deep learning
- energy based models
- ???
English
4
3
140
15.9K
Zach Mueller
Zach Mueller@TheZachMueller·
I have many, many thoughts catching up (bc they released after my bedtime). M2.5 has run my Claw since Claw was first a thing. However, I will look at whether quantized GLM 5.1 > Minimax over the next few weeks and change some workflows.
English
1
0
9
516
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
Going under the radar but quietly building an amazing product, @interaction is a generational company in the making. Their attention to detail, how they handle their business and the quality of the product they are offering will not go unnoticed for long.
English
1
1
9
1.4K
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
@pashmerepat I’m codex monothread pilled too until I see the rate limits disappear
English
0
0
0
135
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
@TheZachMueller I also noticed we can’t search for text in bio in the people section. This was very useful to search for company affiliates
English
0
0
0
39
Zach Mueller
Zach Mueller@TheZachMueller·
So… we’ve removed the single most useful thing on this app for networking, the ability to see mutuals?? Am I getting this right?
English
4
1
16
1.4K
ben hylak
ben hylak@benhylak·
there have been 4 big moments in ai coding so far:
1. copilot
2. cursor
3. vibecoding (lovable, replit, bolt)
4. claude code
what's next?
English
108
2
313
69.9K
Uday Bhaskar
Uday Bhaskar@BhaskarSteve·
Congratulations on 50 glorious years to the greatest company ever built in my lifetime
English
0
0
1
87
Rohan Varma
Rohan Varma@TheRohanVarma·
If we made /slow mode in Codex, would you use it? What for? (Slower inference at a cheaper cost)
English
944
32
2.2K
186.7K