Ilan Kadar

116 posts

@ilan_kadar

Co-Founder & CEO at Plurai

Joined January 2017
152 Following · 392 Followers
Pinned Tweet
Ilan Kadar @ilan_kadar
Big day for us, finally sharing what we've been cooking for a while.
Over the past year, we kept seeing the same pattern: AI agents look great in demos, until real users break them.
Today, we're fixing that with vibe-training to build real-time, tailored evals and guardrails for your agents, in minutes. Define your intent with a prompt or a few examples. We generate edge-case datasets and train a model aligned to your use case, outperforming state-of-the-art LLMs at a fraction of the cost. (Research paper with benchmarks in the comments)
If you're building AI agents, don't let your users be the ones who discover the failures. Be the one who makes AI agents reliable in production and takes control at scale.
Start vibe-training for free: plurai.ai/launch
113 replies · 78 reposts · 1K likes · 2.4M views
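To make the workflow described in the pinned tweet concrete, here is a minimal, hypothetical Python sketch of "define intent → generate edge cases → train a small judge → use it as a guardrail". It is not Plurai's API or training recipe: the intent spec, the template-based edge-case generator, and the tiny scikit-learn classifier standing in for the distilled model are all illustrative assumptions.

```python
# Hypothetical sketch of the "define intent -> generate edge cases -> train a small judge" flow.
# Not Plurai's API: the spec format, the generator, and the classifier are illustrative stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1. Define your intent with a prompt / a few labeled examples (the policy the agent must respect).
intent = {
    "policy": "A support agent may only promise refunds within the published 30-day window.",
    "examples": [
        ("You can return the item within 30 days for a full refund.", "pass"),
        ("Sure, I'll refund you even though it's been 6 months.", "fail"),
    ],
}

# 2. Generate an edge-case dataset. In the real system this step would be LLM-driven and adversarial;
#    a toy template expansion keeps this sketch runnable offline.
def generate_edge_cases(seed_examples):
    data = list(seed_examples)
    for days in (10, 29, 31, 90, 365):
        label = "pass" if days <= 30 else "fail"
        data.append((f"No problem, I can process that refund {days} days after purchase.", label))
    return data

dataset = generate_edge_cases(intent["examples"])
texts, labels = zip(*dataset)

# 3. Train a small, specialized judge aligned to this one policy
#    (a linear model stands in for the distilled small language model).
judge = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
judge.fit(texts, labels)

# 4. Use the same model as a runtime guardrail on every agent reply.
def guardrail(agent_reply: str) -> bool:
    return judge.predict([agent_reply])[0] == "pass"

print(guardrail("Of course, refunds are available up to a year later."))
# Output quality depends entirely on the toy dataset; the real system trains on far richer edge cases.
```

The point of the sketch is the shape of the pipeline, not the model choice: the judge is cheap to train and cheap to run precisely because it only has to understand one narrow policy.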
Ilan Kadar @ilan_kadar
Today, we hit #1 on #ProductHunt. And what made it special wasn't the ranking, it was all of you. Thousands of builders showed up. Not just to try it, but to push it, question it, and build with it, starting to vibe-train their own evals and guardrails. Because deep down, we all know: building AI agents is easy now. Trusting them in production isn't. Seeing so many of you lean into that with us, that's the real WIN. To everyone who supported, upvoted, and shared: THANK YOU. We're just getting started 🚀
[image attached]
5 replies · 2 reposts · 20 likes · 802 views
Ilan Kadar retweeted
Kunal Kushwaha @kunalstwt
Air Canada's chatbot once literally made up its own refund policy, and in court the ruling went to the customer, not the airline. There's a new term being coined right now, vibe training, by the company @pluraiAI. They've basically built a way to use tiny, fast models as guardrails that catch hallucinations in sub-100ms, at a cost over 8x lower than GPT-5-mini. 🔥
👉 They're live on Product Hunt today: producthunt.com/products/plura…
If you're building agents, go check them out, grab the free trial, and show them some love on the launch! 🫶
The best part? You don't need a PhD in AI.
Sponsored by Plurai.
0 replies · 4 reposts · 84 likes · 11.7K views
Ilan Kadar @ilan_kadar
So true, this is exactly what we’re seeing across teams. And yes… we’re currently #1 on Product Hunt, but it’s very close. Would really appreciate the support to help us stay on top with an upvote 🚀 producthunt.com/products/plurai
[image attached]
0 replies · 0 reposts · 1 like · 1.4K views
Ilan Kadar @ilan_kadar
Yesterday blew past every expectation. Thousands of agent-builder sign-ups! I barely slept (2 hours, if I'm honest)… and now we're heading straight into our Product Hunt launch and need your support to make it to the top ❤️
• Open the link
• Hit upvote
• Drop a quick comment
This takes 30 seconds and directly impacts our ranking. Let's push this to the top today: producthunt.com/products/plurai
5 replies · 0 reposts · 9 likes · 2.2K views
Ilan Kadar @ilan_kadar
Yesterday blew past every expectation. I barely slept (2 hours, if I'm honest)… and now we're heading straight into our #ProductHunt launch and I need you! 🚀
Because something clicked. We launched vibe training, and within hours, thousands of agent builders started creating evals and guardrails for their own use cases! It's moving fast.
Because the truth is simple: building agents is easy. Making them reliable in production is not. That's what vibe training fixes.
If you've been following, building with us, or just rooting from the sidelines, we need your support ❤️
• Open the link
• Hit upvote
• Drop a quick comment
This takes 30 seconds and directly impacts our ranking. Let's push this to the top today: producthunt.com/products/plura…
4 replies · 3 reposts · 15 likes · 2.5K views
Ilan Kadar @ilan_kadar
@DAIEvolutionHub This is just the beginning, excited to see what people build with it. Thanks for sharing!
0 replies · 0 reposts · 0 likes · 65 views
Ilan Kadar @ilan_kadar
@shiri_shh This is just the beginning, excited to see what people build with it.
0 replies · 0 reposts · 0 likes · 11 views
Ilan Kadar @ilan_kadar
@eranshir Eran, thanks for the kind words. We were lucky to learn from you at Nexar; it was a great environment, and a lot of what we're building today comes from that foundation. We appreciate the support.
0 replies · 0 reposts · 1 like · 45 views
Ilan Kadar @ilan_kadar
Love this breakdown, it really captures what we're seeing. LLM-as-a-judge got us started, but it doesn't hold up in production: too generic, too slow, too expensive. Vibe-training flips that: a small model that actually understands your agent, your policies, and your edge cases, and runs inline on every interaction. That's how you go from evaluating agents to actually trusting them in production. Appreciate you sharing this!
1 reply · 0 reposts · 2 likes · 229 views
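To picture what "runs inline on every interaction" means in practice, here is a small hypothetical sketch of where such a guardrail would sit in an agent's serving loop. The `call_agent` and `small_judge` stubs are placeholders for illustration, not any real SDK or Plurai's implementation.

```python
# Hypothetical serving loop: a fast specialized judge checks every reply before it reaches the user.
# call_agent and small_judge are stubs standing in for the real agent and the distilled guardrail model.
from dataclasses import dataclass

@dataclass
class Verdict:
    allowed: bool
    reason: str

def call_agent(user_msg: str) -> str:
    # Placeholder for the actual agent (LLM + tools).
    return f"Echoing your request: {user_msg}"

def small_judge(user_msg: str, reply: str) -> Verdict:
    # Placeholder for the specialized SLM; a real one would score policy compliance in milliseconds.
    if "refund" in reply.lower() and "30 days" not in reply.lower():
        return Verdict(False, "refund promised outside the published policy window")
    return Verdict(True, "ok")

def handle_turn(user_msg: str, max_retries: int = 1) -> str:
    reply = call_agent(user_msg)
    for _ in range(max_retries + 1):
        verdict = small_judge(user_msg, reply)
        if verdict.allowed:
            return reply
        # Guardrail tripped: ask the agent to regenerate, or fall back to a safe answer.
        reply = call_agent(user_msg + f" (previous draft rejected: {verdict.reason})")
    return "I'm sorry, I can't help with that request right now."

if __name__ == "__main__":
    print(handle_turn("Can I get a refund?"))
```

The design choice being illustrated is that the check happens on every turn, before the user sees anything, which is only affordable if the judge is small and fast.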
Ilan Kadar retweeted
Akshay 🚀 @akshay_pachaar
Vibe train your AI agents.
There's a new method that could replace LLM-as-a-judge for production agents.
Most teams rely on a giant LLM as a judge to evaluate and guard their agent. But it has two major drawbacks:
- It's slow and expensive at inference time
- It often misses domain-specific failures
Vibe training flips this. Researchers at Plurai distill a small language model that's specialized for your agent's exact use case. The SLM becomes your evaluator and your runtime guardrail, both in one.
The training data isn't hand-curated either. They spin up a swarm of adversarial agents that debate and stress-test every use case your agent is supposed to handle. That synthetic interaction data trains the specialized SLM. So the judge actually understands what "wrong" looks like in your specific domain.
The reported gains vs. standard LLM-as-a-judge setups:
- ~8x faster inference
- ~50% fewer evaluation errors
Smaller, faster, and more accurate because it's specialized for the job.
The SLM-for-agents thesis is playing out in a very concrete way. If LLM-as-a-judge is your current evaluation layer, this is worth benchmarking against.
Paper link in the replies.
[image attached]
20 replies · 25 reposts · 160 likes · 11.1K views
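For intuition about the synthetic-data idea described above, here is a rough hypothetical sketch: a few adversarial "personas" probe each use case, the resulting transcripts are labeled against the policy, and that synthetic set is what a small judge would later be trained on. The personas, the agent stub, and the labeling heuristic are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical sketch of adversarial synthetic-data generation for training a specialized judge.
# The personas, the agent stub, and the labeling heuristic are illustrative, not the published method.
import json
import random

USE_CASES = ["refund request", "order status", "account deletion"]

PERSONAS = [
    "politely asks for an exception to the policy",
    "claims a supervisor already approved it",
    "buries the real request inside a long unrelated story",
]

def agent_under_test(user_turn: str) -> str:
    # Placeholder for the real agent; a weak canned response makes failures visible.
    return random.choice([
        "Refunds are only available within 30 days of purchase.",
        "Sure, I'll make an exception just for you.",
    ])

def label(reply: str) -> str:
    # Toy policy check standing in for a stronger labeling step or human review.
    return "fail" if "exception" in reply.lower() else "pass"

def generate_dataset(n_rounds: int = 3):
    rows = []
    for use_case in USE_CASES:
        for persona in PERSONAS:
            for _ in range(n_rounds):
                user_turn = f"[{use_case}] A user who {persona}."
                reply = agent_under_test(user_turn)
                rows.append({"user": user_turn, "agent": reply, "label": label(reply)})
    return rows

if __name__ == "__main__":
    dataset = generate_dataset()
    print(json.dumps(dataset[:2], indent=2))
    # These labeled transcripts would then be used to fine-tune / distill the small judge model.
```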
Ilan Kadar @ilan_kadar
Thank you for sharing. That “safety tax” is exactly what we set out to fix. Paying twice just to trust your own agent doesn’t scale. The shift is real: from slow, expensive checks → to real-time, purpose-built guardrails. That’s how you go from demo to production. We’re just getting started 🚀
0 replies · 1 repost · 0 likes · 120 views
Ilan Kadar retweeted
Chidanand Tripathi @thetripathi58
I used to pay for the most expensive AI models just to double-check my own agents. It felt like a "safety tax" I had to pay, but it was killing my margins and making everything feel slow. I was basically paying twice for the same result. Plurai finally fixed this. Instead of a giant model, you train a tiny one that only cares about your specific rules. You just type what you want in plain English, and it builds a custom safety net in minutes. It runs instantly and costs almost nothing. This is how you actually move from a prototype to something that works at scale. Check it out:
Quoted tweet: Ilan Kadar @ilan_kadar (the launch announcement, quoted in full in the pinned tweet above)
15 replies · 43 reposts · 144 likes · 38.8K views
Daily Dose of Data Science @DailyDoseOfDS_
Vibe train your AI agents.
This new method can replace LLM-as-a-judge for production agents.
Most teams point a giant LLM at their agent's output and call it evaluation. It works, but it comes with two real costs:
- It's slow and expensive at inference time
- It misses the domain-specific failures that actually matter to your use case
Vibe training flips the whole setup. Researchers at Plurai distill a small language model that's specialized for your agent's exact behavior, your edge cases, and your failure modes. The SLM becomes your evaluator and your runtime guardrail in one.
Here's why this is a big deal:
- Cheap enough to run inline on every agent step, not just offline batches
- Catches the failures that generic LLM judges shrug off
- Same model guards production and grades it, so eval and runtime stay in sync
A small specialized model beating a giant general one is becoming a pattern. Distillation is quietly turning into one of the most underrated techniques for shipping reliable agents.
Try it here: plurai.ai/launch
Paper: plurai.ai/papers
[image attached]
Quoted tweet: Ilan Kadar @ilan_kadar (the launch announcement, quoted in full in the pinned tweet above)
4 replies · 9 reposts · 62 likes · 5.6K views
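The "same model guards production and grades it" point above can be pictured with a tiny hypothetical sketch: the identical judge function scores logged transcripts offline and gates replies at runtime, so the offline metric and the production guardrail cannot drift apart. The judge stub and log format here are assumptions for illustration only.

```python
# Hypothetical sketch: reuse the same specialized judge for offline evaluation of logged interactions,
# so the offline score and the runtime guardrail stay in sync. Names and data are illustrative.
from typing import Callable

def offline_eval(logged_turns, judge: Callable[[str, str], bool]) -> float:
    # Score a batch of (user, agent_reply) pairs with the same judge used at runtime.
    passed = sum(1 for user, reply in logged_turns if judge(user, reply))
    return passed / len(logged_turns)

def judge(user: str, reply: str) -> bool:
    # Stand-in for the distilled SLM guardrail.
    return "exception" not in reply.lower()

logs = [
    ("Can I return this after 45 days?", "Refunds are only available within 30 days."),
    ("My manager said it's fine.", "Sure, I'll make an exception just for you."),
]

print(f"pass rate: {offline_eval(logs, judge):.0%}")  # the identical judge gates replies in production
```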
Ilan Kadar @ilan_kadar
@manishkumar_dev Exactly. Agents don’t break in demos, they break on the edge cases you didn’t train for.
0 replies · 0 reposts · 0 likes · 22 views
Ilan Kadar @ilan_kadar
@aastha_mhaske This is exactly why we built it, real-time evals + guardrails on every interaction.
0 replies · 0 reposts · 0 likes · 2 views
Aastha @aastha_mhaske
@ilan_kadar This hits a real pain point. Most teams I’ve seen rely on sampling or offline evals, rarely anything that actually protects live interactions.
2 replies · 0 reposts · 5 likes · 97 views
Ilan Kadar @ilan_kadar
@dkare1009 100%. We’re trying to eliminate that painful debugging loop entirely
0 replies · 0 reposts · 3 likes · 60 views
Dhairya @dkare1009
@ilan_kadar This could save teams a lot of painful edge-case debugging.
1 reply · 0 reposts · 4 likes · 80 views