Kevin Cho

197 posts

@chokevinjs

Engineer | @Microsoft

United States · Joined April 2026
23 Following · 11 Followers
Kevin Cho
Kevin Cho@chokevinjs·
@difficultyang feel like we have barely figured out how to inference max. if we start to focus on token efficiency we might be stunting our growth prematurely
English
0
0
0
7
difficultyang
difficultyang@difficultyang·
As of today, is it better to worry about token efficiency, or is it better to spend as many SOTA tokens as you can trying to figure out how to get as much out of inference scaling as you can today.
English
6
0
8
1.7K
Kevin Cho
Kevin Cho@chokevinjs·
if you are using the copilot app you might find it useful to run /agent-garden. Didn't know about it before but I know it now
English
0
0
0
3
Tibo
Tibo@thsottiaux·
Using computer use, you can ask codex to cancel subscriptions you don't need anymore. Very pleasant to watch. No particular one in mind, works on all of them. chatgpt.com/codex/
English
342
85
3K
246.9K
Kevin Cho
Kevin Cho@chokevinjs·
@dogacel0 what made NCU more confusing for the agent?
English
0
0
0
7
Doğaç
Doğaç@dogacel0·
For profiling I've deliberately prevented the agent from using NCU, as it caused more confusion than benefit. The agent profiled its own code using CUDA event markers, so it was able to reason more coherently.
English
2
0
5
390
Doğaç
Doğaç@dogacel0·
Excited to share I placed #1 (twice!) at the MLSys 2026 × NVIDIA FlashInfer AI Kernel Generation Contest, on the DeepSeek Sparse Attention track 🥇 The best part is my AI agent standalone beat every human competitor, showing the strength of self-improving agents.
Doğaç tweet media
Yixin Dong@yi_xin_dong

🚀 The wait is over! Today at #MLSys, we'll give a talk to reveal the final results and present the awards for the FlashInfer AI GPU Competition! 🏆 I'll also introduce FlashInfer-Bench: an agent-oriented Benchmark Engine designed for production kernels. Join us from 11:00 AM - 1:00 PM PT to see who takes the crown and learn more. Everyone is welcome to attend—see you there! ✨ 🌐 Competition & Results: mlsys26.flashinfer.ai 💻 FlashInfer-Bench Benchmark Engine: github.com/flashinfer-ai/… #FlashInfer #MLSys26 #AI #GPU

English
12
21
229
21K
Kevin Cho
Kevin Cho@chokevinjs·
how are the people using 5.5 none piloting? Feels like it just stops way before I would like it to.
English
0
0
0
3
Kevin Cho
Kevin Cho@chokevinjs·
double the h200s double the fun
English
0
0
0
3
Kevin Cho
Kevin Cho@chokevinjs·
I just wasted a bunch of time because what I thought I was communicating to the model was actually not even close to what I wanted.
English
0
0
0
15
Kevin Cho
Kevin Cho@chokevinjs·
I know model routing isn't there yet because how is it I can get stuck on a task using 5.3-codex and yet it just works on 5.5?
English
0
0
0
6
Kevin Cho
Kevin Cho@chokevinjs·
rl on k8s can't be so bad right?
English
0
0
0
7
Kevin Cho
Kevin Cho@chokevinjs·
Free market
Kevin Cho tweet media
English
0
0
0
15
Tibo
Tibo@thsottiaux·
tok tok goes the token silicon dreams wake in sand machines ask why now
English
92
17
770
62.3K
Kevin Cho
Kevin Cho@chokevinjs·
why is wandb the only solution out there? Do they just have everyone in a chokehold?
English
0
0
0
20
Kevin Cho
Kevin Cho@chokevinjs·
@thsottiaux This is like the gateway drug to AI Psychosis. usage limit resets
English
0
0
1
426
Kevin Cho
Kevin Cho@chokevinjs·
how are people generating their pretty graphs for loss
English
0
0
0
17
Kevin Cho
Kevin Cho@chokevinjs·
2x banger
Kevin Cho tweet media
Indonesia
0
0
0
14
Kevin Cho
Kevin Cho@chokevinjs·
@Leik0w0 ya love when it says it's working correctly but it's still broken
English
0
0
0
18
Léo
Léo@Leik0w0·
Love when opus does its happy dance once it got something to work 🎉 **feature works correctly !**
English
2
0
9
352
Kevin Cho
Kevin Cho@chokevinjs·
@blelbach Somehow the integrations with Gemini with Gmail and copilot with outlook can’t even do the simple things in this day and age. Maybe it’s time for agentic email 😂
English
0
0
2
261
Bryce, the CUDA Colonel
Bryce, the CUDA Colonel@blelbach·
Seriously Gemini... I have never successfully used Gemini to do anything involving a Google product.
Bryce, the CUDA Colonel tweet media
English
6
2
57
5K
Kevin Cho
Kevin Cho@chokevinjs·
I'm surprised but also not surprised that voice models are such a sought after feature. With vibecoding pushing more accessibility to non-dev roles the average WPM naturally will go down.
English
0
0
0
11
Jamon
Jamon@jamonholmgren·
Unpopular opinion: 1- or 2-letter variable names in focused, obvious contexts are totally fine.
Jamon tweet media
English
156
11
388
621.1K