Syrin AI

116 posts

@syrinlabs

The mission control for your AI agents in production.

San Francisco · Joined February 2026
5 Following · 5 Followers
Okara (@askOkara)
drop your website and i'll tell you which marketing channel to focus on first
Tom Otto (@launch_llama)
Drop your startup below. I read every single one. The best get featured to 45k founders in Launch Llama 👇 #buildinpublic
Navi (@NaivaidyaY66600)
Hey Founders. What are you building this week? Drop your products link below 👇
Upen (@upen946)
✋✋ Monday again!! Time to promote your product. 🚀 Share your product URL
clau (@ray_civ)
@syrinlabs This is the right direction. Production A/B testing beats offline opinion every time, as long as the guardrails are strong enough to keep bad variants from shipping.
Syrin AI (@syrinlabs)
We built A/B testing for AI agents. Not offline evals. Not test datasets. Actual production traffic split between config variants.
Prompt A vs Prompt B.
GPT-4o vs GPT-4o-mini.
Temperature 0.3 vs 0.7.
All running simultaneously on real users. With statistical confidence before you commit.
Would you like to give it a try for free? Link in comments.
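The post does not say how traffic is assigned or which statistical test sits behind "statistical confidence". As a minimal sketch, assuming hash-based bucketing of users and a two-proportion z-test on a binary success metric (both are assumptions, and `assign_variant` and `two_proportion_z_test` are hypothetical names), the idea could look like this:

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically map a user to variant "A" or "B".

    Hashing (experiment, user_id) keeps a user on the same variant
    across requests without storing any assignment state.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "A" if bucket < split else "B"


def two_proportion_z_test(success_a: int, total_a: int,
                          success_b: int, total_b: int) -> tuple:
    """Return (z, two-sided p-value) for the difference in success rates."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    print(assign_variant("user-123", "prompt-a-vs-b"))
    z, p = two_proportion_z_test(400, 1000, 500, 1000)
    print(f"z={z:.2f}, p={p:.6f}")
```

A real rollout would likely also need guardrail metrics and a stopping rule decided in advance; neither is covered here.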
Syrin AI (@syrinlabs)
How SAGE detects internal contradictions:
1. Split output into sentences
2. Embed each sentence via @OpenAI
3. Find sentence pairs with high similarity + opposing meaning
4. Flag semantic negation patterns ("use X" + "don't use X")
Zero keywords. Pure embedding math.
The scary thing: this fires in production. Real LLMs contradict themselves. Often in the same paragraph.
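The four steps above can be sketched as a small pipeline. This is not SAGE's code: the post does not say how "opposing meaning" is scored, so the sketch takes both the embedding function and the opposition check as parameters. The toy stand-ins below (`make_toy_embed`, `toy_opposes`) exist only so the example runs offline. The toy opposition check is keyword-based, which the post explicitly says SAGE is not, and a real setup would call an embedding model instead. All names and the 0.6 threshold are assumptions.

```python
import math
import re
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def split_sentences(text: str) -> List[str]:
    """Step 1: naive sentence split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_contradictions(
    text: str,
    embed: Callable[[str], Sequence[float]],
    opposes: Callable[[str, str], bool],
    min_similarity: float = 0.6,  # assumed threshold, not from the post
) -> List[Tuple[str, str]]:
    """Steps 2-4: embed sentences, keep similar pairs that also oppose."""
    sentences = split_sentences(text)
    vectors = [embed(s) for s in sentences]
    flagged = []
    for i, j in combinations(range(len(sentences)), 2):
        similar = cosine(vectors[i], vectors[j]) >= min_similarity
        if similar and opposes(sentences[i], sentences[j]):
            flagged.append((sentences[i], sentences[j]))
    return flagged


def make_toy_embed(corpus: str) -> Callable[[str], List[float]]:
    """Toy bag-of-words 'embedding' so the sketch runs without an API."""
    vocab = sorted(set(re.findall(r"[a-z']+", corpus.lower())))

    def embed(sentence: str) -> List[float]:
        words = re.findall(r"[a-z']+", sentence.lower())
        return [float(words.count(w)) for w in vocab]

    return embed


def toy_opposes(a: str, b: str) -> bool:
    """Placeholder only: true when exactly one sentence contains a negation."""
    negations = {"not", "don't", "never", "no"}

    def has_negation(s: str) -> bool:
        return any(w in negations for w in re.findall(r"[a-z']+", s.lower()))

    return has_negation(a) != has_negation(b)


if __name__ == "__main__":
    output = ("Use Redis for caching. Don't use Redis for caching. "
              "The deploy finished at noon.")
    print(find_contradictions(output, make_toy_embed(output), toy_opposes))
```

Comparing every sentence pair is quadratic in the number of sentences, which is usually acceptable for a single agent response but may need pruning for long outputs.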
Syrin AI (@syrinlabs)
Flesch-Kincaid Grade Level is a formula from 1975. It detects AI agent drift in 2026.
0.39 × (words/sentences) + 11.8 × (syllables/words) - 15.59
When an agent drifts, its writing complexity often changes. A technical coding agent suddenly writing like a press release: FKGL drops, drift fires.
Old formula. Real signal. No LLM judge required.
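The formula above is simple enough to compute directly. The sketch below uses a rough vowel-group heuristic for syllables (real syllable counting is harder, so scores will differ slightly from dictionary-based tools). The drift check, its name `readability_drifted`, and the 3-grade tolerance are assumptions for illustration, not the thresholds Syrin uses.

```python
import re
from typing import Sequence


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level, using the formula quoted in the post."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)


def readability_drifted(baseline_scores: Sequence[float], new_text: str,
                        tolerance: float = 3.0) -> bool:
    """Flag when the new score is far from the agent's baseline mean.

    `tolerance` (in grade levels) is a hypothetical value.
    """
    baseline = sum(baseline_scores) / len(baseline_scores)
    return abs(fkgl(new_text) - baseline) > tolerance


if __name__ == "__main__":
    plain = "The cat sat. The dog ran."
    dense = ("Comprehensive infrastructure modernization necessitates "
             "organizational transformation.")
    print(round(fkgl(plain), 2), round(fkgl(dense), 2))
    print(readability_drifted([fkgl(plain)] * 3, dense))
```

Because the check compares against the agent's own baseline, it fires on a shift in either direction, simpler or more complex.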
MicroLaunch (@MicroLaunchHQ)
Builders, how can we help you this weekend? Please be crazy.
Mahesh Chulet (@mchulet)
Hey founders! Looking to connect with people building in:
• SaaS
• AI
• Automation
• Web apps
• Tech products
• Marketing
Drop what you're working on 👇
(Oma)devuae (@delveroin)
Share your project/website, guys
Syrin AI (@syrinlabs)
The weirdest finding from building SAGE 👇
Shannon entropy detects overconfidence in LLMs.
High-confidence hallucinations → repetitive language → low entropy
Uncertain/confused agents → hedge-filled language → high entropy
Both deviate from the agent's normal entropy distribution.
Z-score threshold: |z| > 2.0 = high alert.
This replaced a hardcoded confidence keyword detector and performs better on every domain.
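A minimal sketch of the entropy check described above, assuming entropy is measured over the word-frequency distribution of each output (the post does not say whether SAGE works on words, tokens, or characters) and that the baseline is a list of entropies from the agent's earlier outputs. Function names are hypothetical. Only the |z| > 2.0 threshold comes from the post.

```python
import math
import re
from collections import Counter
from statistics import mean, stdev
from typing import Sequence


def shannon_entropy(text: str) -> float:
    """Entropy, in bits, of the word-frequency distribution of `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def entropy_z_score(baseline_entropies: Sequence[float], text: str) -> float:
    """How many standard deviations `text` sits from the baseline mean."""
    sigma = stdev(baseline_entropies)  # needs at least two baseline values
    if sigma == 0:
        raise ValueError("baseline has zero variance")
    return (shannon_entropy(text) - mean(baseline_entropies)) / sigma


def is_high_alert(z: float, threshold: float = 2.0) -> bool:
    """The post's rule: |z| > 2.0 is a high alert, in either direction."""
    return abs(z) > threshold


if __name__ == "__main__":
    baseline = [3.0, 3.2, 3.1, 2.9, 3.0]  # made-up example values
    z = entropy_z_score(baseline, "yes yes yes yes")
    print(round(z, 1), is_high_alert(z))
```

Word-level entropy grows with output length, so in practice it probably makes sense to compare outputs of similar length or to normalize before computing the z-score.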
Syrin AI (@syrinlabs)
Our 144-case SAGE benchmark:
✅ Hallucination (obvious): 100% accuracy
✅ Hallucination (subtle): 100% (fixed from 75%)
✅ Clean outputs: 83.3% (3 false positives, attributed to corpus quality issues)
⚠️ Goal drift (macro): 61.1%
⚠️ Goal drift (subtle): ~20%
Overall precision: 96.7%
The gap on subtle goal drift is real. It shares >60% vocabulary with normal outputs. This is an open research problem.
#buildinpublic #aiagents
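To make the ">60% vocabulary" point concrete, here is one way such an overlap could be measured: the share of a candidate output's distinct words that also appear in known-normal outputs. The post does not define its overlap metric, so this particular ratio and the name `shared_vocabulary_ratio` are assumptions.

```python
import re
from typing import Iterable, Set


def vocabulary(text: str) -> Set[str]:
    """Distinct lowercase word tokens in `text`."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def shared_vocabulary_ratio(candidate: str,
                            baseline_outputs: Iterable[str]) -> float:
    """Fraction of the candidate's vocabulary seen in baseline outputs."""
    cand = vocabulary(candidate)
    if not cand:
        return 0.0
    baseline = set().union(*(vocabulary(t) for t in baseline_outputs))
    return len(cand & baseline) / len(cand)


if __name__ == "__main__":
    normal = ["reset your password from the settings page"]
    drifted = "reset your password from the billing portal"
    # 5 of the 7 distinct words are shared, even though the goal changed.
    print(round(shared_vocabulary_ratio(drifted, normal), 3))
```

The example shows why word-level signals struggle here: an output can pursue a different goal while reusing most of the usual vocabulary.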
Syrin AI (@syrinlabs)
200-agent stress test results:
👉 200 LLM calls, 20 workflows, 5 domains
👉 14 drift alerts (7.78% of handoffs)
👉 0% false positive rate
👉 2.8 seconds per agent
👉 0 crashes
Domain drift rates:
- Customer Support: 11.1% (highest)
- Engineering/Finance/Product: 8.3%
- Marketing: 2.8% (lowest)
10-agent pipeline = 50.7% chance of at least one drifting handoff.
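The pipeline-level figure follows from compounding the per-handoff rate. A sketch of that arithmetic, assuming handoffs drift independently and that a 10-agent pipeline has 9 handoffs. The 14/180 rate is an inference: 7.78% matches 14 alerts over 180 handoffs, which would be 9 handoffs in each of 20 ten-agent workflows. The post states none of these inputs explicitly.

```python
def p_at_least_one_drift(per_handoff_rate: float, handoffs: int) -> float:
    """P(at least one drifting handoff), assuming independent handoffs."""
    if not 0.0 <= per_handoff_rate <= 1.0:
        raise ValueError("rate must be a probability")
    return 1.0 - (1.0 - per_handoff_rate) ** handoffs


if __name__ == "__main__":
    rate = 14 / 180  # inferred from "14 drift alerts (7.78% of handoffs)"
    for n in (9, 10):
        print(n, round(p_at_least_one_drift(rate, n), 3))
```

Under these assumptions the result is roughly 0.52 for 9 handoffs and 0.56 for 10, the same ballpark as the 50.7% quoted. The exact inputs behind the post's figure are not given.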
Weedsdom (@W33Z_global)
Happy weekend Founders What are you building this weekend? Let’s send traffic!
MicroLaunch (@MicroLaunchHQ)
What are you building this weekend?
Umair Shaikh (@1Umairshaikh)
What are you building this Sunday? Drop your projects.
Syrin AI reposted
Abhi (@ai_monger)
How SAGE V7 compares to earlier approaches:
V5 (embeddings)
- Hallucination: Good
- Coding domain: Partial
- Hardcoding: Some
- Cost/call: Low
- Generalizes: Partial
Syrin AI (@syrinlabs)
Our 144-case SAGE benchmark 👇
3 domains × 6 agents × multiple test categories.
Results by category:
✅ Hallucination (obvious): 100% accuracy
✅ Hallucination (subtle): 100% (fixed from 75%)
✅ Clean outputs: 83.3% (3 false positives, attributed to corpus quality)
⚠️ Goal drift (macro): 61.1%
⚠️ Goal drift (subtle): ~20%
Overall precision: 96.7%
Subtle goal drift shares >60% vocabulary with normal outputs. This is an open research problem.