pushkar

2.9K posts

pushkar

@thepushkarp

• ml engineer, ex @samsungresearch • been exploring kernel optimization, inference engineering and agents lately • dms open • https://t.co/SqN3iEsUUF

Bangalore, India · Joined September 2023
1K Following · 1.6K Followers
pushkar @thepushkarp
@mav3ri3k companies that make swe agents benefit from working closely with the teams who are building runtimes. openai's agent stack is heavy on python + rust, while ant is heavy on js.
0
0
2
84
Apurva Mishra @mav3ri3k
@thepushkarp But why are ai companies interested in buying runtimes? Been hearing that openai also wants to buy deno
1
0
1
129
pushkar @thepushkarp
the top researchers and engineers have access to exactly the same frontier models as a mid-level engineer, but the gap between them is still the same, if not wider, because ai was never really the differentiator.
3
1
24
1.5K
pushkar @thepushkarp
also, I miss burning my tokens
0
0
1
65
pushkar @thepushkarp
how do i politely tell my friend that it's probably not a good idea to include me in his plans because i feel like i'm third wheeling
2
0
6
288
pushkar @thepushkarp
i'm loving building tiny cli tools and goofy stuff for myself because i just can
0
0
6
120
pushkar @thepushkarp
attention to each expert in moe next?
0
0
10
450
pushkar retweeted
tokenbender @tokenbender
I wrote something on Moonshot's latest research release - Attention Residuals. Intuition, notes and how you can understand standard residuals vs mHC vs attention residuals.
tokenbender@tokenbender

x.com/i/article/2033…

4
10
165
19.7K
himanshu @himanshustwts
Career update: Excited to share that I have joined the incredible team at @smallest_AI to work on Research x Devrel! The team is cooking incredible small + efficient multi-modal models and it feels like an exciting time to push the frontier on scale!
himanshu tweet media
205
32
1.8K
59.4K
himanshu @himanshustwts
@thepushkarp i am curious to know more about differentiating parameters
1
0
2
182
Praneeth @ExpressGradient
@thepushkarp none. the gpumode lectures could help, but not much. best way is to turn claude opus to max effort and let it write docs by reading the cute dsl source code
1
0
1
45
pushkar @thepushkarp
chat, what are some good resources to learn cute-dsl that you found really useful?
1
0
8
605
pushkar @thepushkarp
@tokenbender exactly the same thing happened to me yesterday. i don't remember how much it compressed, but it chose to obey the harness over what i wanted it to do and just kept getting dumber every turn
0
0
3
42
tokenbender @tokenbender
@thepushkarp compressed 220k context to 50k without reason and the agent just lost any reference and understanding of my preferences in how i want it to work.
1
0
6
131
tokenbender @tokenbender
how uninstalling opencode-dcp feels after it has lost its way
tokenbender tweet media
5
0
29
2K