sankalp

44.8K posts

@dejavucoder

poasting about language models, coding agents and recent side quests

bangalore, india · Joined October 2021
662 Following · 25K Followers
Pinned Tweet
sankalp
sankalp@dejavucoder·
since sonnet 4.6 is here and claude code is rapidly improving day by day, i am once again reminding you that i wrote a guide to claude code and how to use coding agents better in general 2 months back
sankalp@dejavucoder

claude code is having its cursor moment after karpathy sensei's post. never been a better time to try it. my latest blog on how to get the most out of claude code 2.0 and other agents in general is up now. grab a chai and have fun reading! sankalp.bearblog.dev/my-experience-…

English
9
13
214
45.4K
sankalp
sankalp@dejavucoder·
when you finally understand how policy gradient works after going down the differentiation trenches and realising that the REINFORCE algorithm is literally the base form of policy gradient
sankalp tweet media
English
9
5
233
29.1K
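a rough sketch of the derivation the tweet above alludes to, written in standard textbook notation (the symbols $\theta$, $\tau$, $G_t$ and $N$ are assumptions here, not taken from the tweet or its image). the log-derivative trick turns the gradient of the expected return into an expectation, and REINFORCE is the plain monte carlo estimate of that expectation:

```latex
\begin{align}
J(\theta) &= \mathbb{E}_{\tau \sim \pi_\theta}\big[ R(\tau) \big]
           = \int p_\theta(\tau)\, R(\tau)\, d\tau \\
\nabla_\theta J(\theta)
          &= \int p_\theta(\tau)\, \nabla_\theta \log p_\theta(\tau)\, R(\tau)\, d\tau
          && \text{since } \nabla_\theta p_\theta = p_\theta \nabla_\theta \log p_\theta \\
          &= \mathbb{E}_{\tau \sim \pi_\theta}\Big[ \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau) \Big]
          && \text{dynamics terms do not depend on } \theta \\
\nabla_\theta J(\theta)
          &\approx \frac{1}{N} \sum_{i=1}^{N} \sum_{t} \nabla_\theta \log \pi_\theta\big(a_t^{(i)} \mid s_t^{(i)}\big)\, G_t^{(i)}
          && \text{REINFORCE, with } \tau^{(i)} \sim \pi_\theta
\end{align}
```

the last line swaps the full return $R(\tau)$ for the reward-to-go $G_t$, which leaves the expectation unchanged but lowers variance. the expectation is over trajectories sampled from the current $\pi_\theta$, which is where the on-policy requirement comes from.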
sankalp
sankalp@dejavucoder·
lots of creative things you can do if you have imagination, desire & agency to do it (& a high token budget). you can custom design everything at development time. you can then try to replace yourself by distilling your taste and automate it. further extend to generative ui.
vijay singh@dprophecyguy

people are underestimating existing llms for svg capabilities by a lot. llms excel at creating much more sophisticated svgs and animation with a little bit of care and refinement. all of the interactions / animations below are bespoke svg components i've created from scratch. all of them created with @claudeai opus 4.6, the best model ever.

English
2
2
26
1.9K
sankalp
sankalp@dejavucoder·
@henrytdowling tbh, what u see in the quoted tweet is one example. you can generate everything in a custom fashion with tokens. you need to be creative with it. you can further extend the idea to distill your taste into automatically generating these by agents. further extend to generative ui.
English
0
0
1
18
Henry
Henry@henrytdowling·
@dejavucoder what's an example of an unintuitive thing you can do with a high token budget?
English
1
0
1
10
sankalp
sankalp@dejavucoder·
something i wonder about - both claude and codex get a hell of a lot of feature requests and neither is bounded by cost, so they can effectively build anything. how do they figure out what to build next? i mean both features and integrations. don't say product taste lmao.
English
13
0
34
4.6K
sankalp
sankalp@dejavucoder·
@thewhiteboxAI i think i heard something along these lines in cat wu's podcast with lenny. will check notes
English
1
0
1
55
Ignacio de Gregorio
Ignacio de Gregorio@thewhiteboxAI·
@dejavucoder I'm pretty sure they iterate and dogfood like crazy and see what sticks internally. They can probably prototype in minutes and dogfood the feature for a few days to decide what to do.
English
1
0
1
69
sankalp
sankalp@dejavucoder·
@sama now i regret not applying lol
English
3
0
35
1.3K
Sam Altman
Sam Altman@sama·
we are gonna do something nice for everyone who applied for the GPT-5.5 party and that we didn't have space for. hope you enjoy!
English
1.2K
158
7.2K
541.4K
sankalp
sankalp@dejavucoder·
@amit05prakash to use reinforce in prod, you need to be on-policy though...
English
1
0
2
182
gtlovell
gtlovell@gtlovell·
@dejavucoder honestly think it's a mix of usage data (what people actually use vs request) and strategic bets on where the market is headed. like they probably see patterns in how people hack around missing features, then prioritize those
English
1
0
3
112
sankalp
sankalp@dejavucoder·
what's worth noticing is - once you update the weights of the policy/model being optimized, any rollouts you sample from the deployed model are coming from an older policy, so they are no longer on-policy. so cursor decided to deploy a new checkpoint every 1.5 - 2 hours (to stay on-policy)
English
1
0
5
801
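a minimal sketch of what "on-policy" means in the tweet above, using REINFORCE on a toy bandit. this is an illustration under stated assumptions, not cursor's setup: the function name `reinforce_bandit`, the fixed per-arm rewards, and all hyperparameters are hypothetical. the point is that every batch is sampled from the current policy and thrown away after the update.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=300, batch_size=32, lr=0.2, seed=0):
    """REINFORCE with a softmax policy on a toy multi-armed bandit.

    rewards: hypothetical fixed reward per arm (assumption for illustration).
    Returns the final action probabilities.
    """
    rng = random.Random(seed)
    n_arms = len(rewards)
    logits = [0.0] * n_arms
    for _ in range(steps):
        probs = softmax(logits)  # the current policy
        # On-policy: draw a fresh batch from the *current* policy.
        actions = rng.choices(range(n_arms), weights=probs, k=batch_size)
        batch_rewards = [rewards[a] for a in actions]
        # Batch-mean baseline to reduce variance.
        baseline = sum(batch_rewards) / batch_size
        grad = [0.0] * n_arms
        for a, r in zip(actions, batch_rewards):
            advantage = r - baseline
            # d/d logit_k of log softmax(a) = 1[k == a] - probs[k]
            for k in range(n_arms):
                grad[k] += advantage * ((1.0 if k == a else 0.0) - probs[k])
        # Gradient ascent step on the expected reward.
        logits = [l + lr * g / batch_size for l, g in zip(logits, grad)]
        # The batch is now stale (sampled from the old policy) and is discarded.
    return softmax(logits)


final_probs = reinforce_bandit([0.0, 1.0])
```

in a production loop the "fresh batch" step corresponds to serving the newest checkpoint and collecting new interactions, which is why a fast deploy cadence matters for staying on-policy.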
sankalp
sankalp@dejavucoder·
sep 12, 2025 blog where cursor first mentioned online RL (using REINFORCE): cursor.com/blog/tab-rl#th
English
1
0
8
1K