Kaushik

1.2K posts


@kacppian

Building the best agents through @bentolabsai. 3x founder. YC batch P26.

San Francisco, CA · Joined January 2013
1.2K Following · 192 Followers
Kaushik retweeted
BentoLabs AI (YC P26) @BentoLabsAI
Why do agents that work in staging often degrade in production? It's usually a diagnostic failure. Users use your agents in ways you can't even imagine, which produces failures that are even harder to catch and fix. Our framework helps you spot which layer is actually breaking. Read here 👇🏻 bentolabs.ai/blog/nature-vs…
0 replies · 3 reposts · 4 likes · 112 views
Kaushik retweeted
BentoLabs AI (YC P26) @BentoLabsAI
We ran our recursive learning layer on Terminal-Bench 2.0. Same agent. Same model. Same harness. Same budget. The result: Claude Sonnet went from 42.2% → 52.4%. A +10.2 percentage-point lift, significant at p < 0.05, with a 13:3 task-level win/loss ratio (internal). The only variable was a learning layer. We wrote a full technical breakdown on what changed, why it worked, and what this means for production AI agents. Read it here 👇 bentolabs.ai/blog/tb2-recur…
0 replies · 4 reposts · 10 likes · 241 views
Kaushik @kacppian
Some people talk with so much confidence... I admire their delusion. Bruh, chill, you've been doing something for a couple of months. Calm down with your hot takes.
0 replies · 0 reposts · 1 like · 59 views
Kaushik @kacppian
@PrachiM05 Thank you so much for all the hard work!
0 replies · 0 reposts · 0 likes · 26 views
Kaushik @kacppian
@felixleezd Switched entirely from Claude Code to Codex.
0 replies · 0 reposts · 0 likes · 23 views
Felix Lee @felixleezd
Is it just me, or is Codex noticeably better than Claude Code?
524 replies · 40 reposts · 1.8K likes · 310K views
Kaushik @kacppian
@KaiXCreator Why do they hire you when they could hire Codex?
0 replies · 0 reposts · 0 likes · 17 views
Kaito @KaiXCreator
You’re in a tech interview and they ask you: “Why should we hire you when we can use Codex?” What would you say?
268 replies · 6 reposts · 157 likes · 50.7K views
Kaushik retweeted
Abhinav Soni @Abhinavv_soni
Almost didn't document the day, but my team made sure we did. Some days you just feel the shift. We booked a studio, brought in a team, and spent the day trying to capture what @BentoLabsAI actually is right now. Where we started, where we are, and where we're going. It's one thing to build in silence. It's another to finally show it in a much bigger way. We've been deep in production agent systems, working with some of the top teams running AI at scale today. The problem we set out to solve is more real than ever. Not ready to say more just yet. But it's close. Stay tuned.
2 replies · 6 reposts · 23 likes · 1.1K views
Kaushik retweeted
BentoLabs AI (YC P26) @BentoLabsAI
The model: “I can do it, I promise.”
The harness:
❌ wrong context
❌ broken retrieval
❌ timeout
❌ hallucinated tool response
Everyone: “wow, this model is really bad”
0 replies · 3 reposts · 7 likes · 338 views