Kritin Vongthongsri

137 posts

@kritinv07

building evals @deepeval @confident_ai

San Francisco · Joined June 2024

34 Following · 146 Followers

Kritin Vongthongsri reposted

Brian Neville-O'Neill@bnevilleoneill·Apr 1

I’m running LLM eval office hours today with @confident_ai 🧪 If you’re building anything with AI, drop a prompt + model output, and I’ll show where it breaks. I’ll look at: correctness, completeness, and where it might fail in real use. Just quick, specific feedback. #ai #LLM

139

Kritin Vongthongsri reposted

DeepEval@deepeval·Nov 12

My sister just got released, DeepTeam v1.0, 100% open-source, Apache 2.0 red teaming for LLMs. ⭐ Star on GitHub to stay on top of the latest developments in AI security and safety: github.com/confident-ai/d…

927

Kritin Vongthongsri@kritinv07·Sep 15

What makes platform UI enterprise ready?

180

Kritin Vongthongsri@kritinv07·Aug 27

@oleg_golev 😂

Oleg Golev@oleg_golev·Aug 25

Sentient hype train leaves soon, gonna ship shameless AI ads until that happens

Kritin Vongthongsri@kritinv07·Aug 25

Making it so easy to view and evaluate threads/conversations on @confident_ai.

636

Kritin Vongthongsri@kritinv07·Aug 19

Turning to clickpy.clickhouse.com/dashboard/deep… because pypistats.org is gone for good 😢.

202

Kritin Vongthongsri@kritinv07·Aug 18

We're cooking up 👨‍🍳 something for our @confident_ai users...

431

Kritin Vongthongsri@kritinv07·Aug 18

Most people run single-turn evals on chatbots. But that’s not enough. Conversations aren’t Q&A; they happen over multiple turns. This means your chatbot must stay context-aware across the dialogue, not just accurate in isolated responses. At @deepeval, we’ve seen too many teams evaluate chatbots the wrong way. So we wrote a comprehensive guide on how to evaluate all chatbots properly, end-to-end.👇 🔗 deepeval.com/docs/getting-s…

442

Kritin Vongthongsri reposted

Jeffrey 🐬 confident-ai.com@jeffr_yyy·Aug 14

who took down pypi downloads? pypistats.org

221

Kritin Vongthongsri@kritinv07·Aug 13

This is why you shouldn't vibe evals.

Santiago@svpino

Vibe-coding feels like magic. Until you're the one cleaning up the magic later.

245

Kritin Vongthongsri@kritinv07·Aug 13

@LTantichot 🪞

Land Tantichot@LTantichot·Aug 13

@kritinv07 put the fries in the bag bro

Kritin Vongthongsri@kritinv07·Aug 12

What’s more important in growing oss: building or writing docs?

265

Kritin Vongthongsri@kritinv07·Aug 13

@LTantichot @greptile @soohoonchoi @helicone_ai @coleywoleyyy @0xpunnk @confident_ai @autumnpricing @ay_ushr @helixdb @xav_db @tryrevyl @anamhira @sixtyfourai it's ok to tell your friends you love them

307

Land Tantichot@LTantichot·Aug 13

Top 8 companies I’m going to switch up on when I make it:
1.) @greptile (@soohoonchoi)
2.) @helicone_ai (@coleywoleyyy)
3.) Conduit (@0xpunnk)
4.) @confident_ai (@kritinv07)
5.) @autumnpricing (@ay_ushr)
6.) @helixdb (@xav_db)
7.) @tryrevyl (@anamhira)
8.) @sixtyfourai (@saarth_)

212

21.8K

Kritin Vongthongsri@kritinv07·Aug 12

@LTantichot @spicy_liu @RajitWrites I didnt know u could do this

136

Kritin Vongthongsri reposted

Jeffrey 🐬 confident-ai.com@jeffr_yyy·Aug 12

ten kay stars @deepeval

456

Kritin Vongthongsri@kritinv07·Aug 12

At @confident_ai, we’re focused on making evals great. But since we love our users very much, we’ve also just 5×’d the tracing analytics on our platform. Now you can:
🔍 Trace analytics: follow every request end-to-end
⏱️ Span analytics: see latency and cost per component
📊 Model analytics: compare performance, latency, and cost across models
👥 User analytics: understand usage patterns and behavior
⚠️ Error analytics: track and reduce failures over time

497

Kritin Vongthongsri reposted

Jeffrey 🐬 confident-ai.com@jeffr_yyy·Aug 10

We quietly released a new OS package a few months ago. It now has over 600 stars. github.com/confident-ai/d…

161

Kritin Vongthongsri@kritinv07·Aug 8

🚀 @deepeval just hit 10,000 stars on GitHub. Next stop: 100k ⭐

273

Kritin Vongthongsri@kritinv07·Aug 8

We've built a @langchain integration @confident_ai so you can evaluate your entire agent trace in one extra line of code. ... ok maybe 2 lines of code.

189
