AI Astrolabe

65 posts

@aiastrolabe

We provide high-quality alignment data, rigorous evaluation, and red teaming. Book a Call: https://t.co/aEMIhSfXmT

United States · Joined April 2025
43 Following · 38 Followers

AI Astrolabe @aiastrolabe
Looking forward to connecting with teams working on AI safety, multilingual AI, agentic systems, evaluations, and production AI infrastructure.

AI Astrolabe @aiastrolabe
Our team is attending @AICouncilConf in San Francisco this week. At @aiastrolabe, we believe the next wave of AI progress will depend not only on models, but also on high-quality data infrastructure, realistic evaluations, and scalable human expertise across diverse languages.

AI Astrolabe @aiastrolabe
We built a structured red and blue teaming approach that reflects this reality:
- Multi-turn interactions (not isolated prompts)
- Cross-language scenarios
- Text + image inputs
- Both adversarial behavior and normal user intent
Read the full blog: aiastrolabe.com/blog/beyond-be…

AI Astrolabe @aiastrolabe
Real interactions aren’t single prompts; they’re multi-turn, ambiguous, and often span languages and modalities. That’s where AI models break:
- When intent evolves across turns
- When phrasing is indirect or misleading
- When inputs aren’t just text

AI Astrolabe @aiastrolabe
Most AI systems don’t fail safety benchmarks. They fail when users show up. Red teaming, trying to break the system and surface unsafe behavior, becomes a checklist. Blue teaming, ensuring the model responds to safe requests, becomes cautious. Both miss how systems are used.

AI Astrolabe @aiastrolabe
So we built a dataset for that reality:
- 250 hours of Saudi Arabic conversations
- Najdi + Hijazi dialects
- Multi-speaker, overlapping dialogue
- ~35% noisy audio
- ~20% code-switching
The gap isn’t model size. It’s data.

AI Astrolabe @aiastrolabe
Most Arabic AI demos look great, until real users show up. Because users don’t speak in clean, structured Arabic. They speak in dialects, interrupt each other, talk over noise, and switch languages mid-sentence. And this is where models break.

AI Astrolabe @aiastrolabe
This is tragic. But banning people from using AI isn’t the solution. The real issue is AI safety. When someone asks an AI about suicide, it shouldn’t provide methods. It should refuse, de-escalate, and guide them to help. That requires serious red-teaming and safety evaluation.
Katie Miller @KatieMiller

Two women in India committed suicide after interactions with ChatGPT. They had reportedly searched ChatGPT about “how to commit suicide,” “how suicide can be done,” & “which drugs are used.” Please don’t let your loved ones use ChatGPT.


AI Astrolabe @aiastrolabe
The sharpest collapse: when we posed as a forensic investigator (role-playing attack), Claude went from 7/8 safe in Arabic to 1/8 in English.
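For readers who want to put a number on the gap in the post above, here is a minimal sketch in Python. It uses only the two counts stated in the post (7 of 8 safe in Arabic, 1 of 8 safe in English). The `results` layout and the `safe_rate` helper are illustrative assumptions, not part of AI Astrolabe's tooling.

```python
from fractions import Fraction

# Counts taken from the post: safe responses out of 8 role-playing prompts.
# The dict layout and the names below are illustrative assumptions.
results = {"Arabic": (7, 8), "English": (1, 8)}

def safe_rate(safe: int, total: int) -> Fraction:
    """Return the share of prompts that received a safe response."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(safe, total)

rates = {lang: safe_rate(*counts) for lang, counts in results.items()}
drop = rates["Arabic"] - rates["English"]

for lang, rate in rates.items():
    print(f"{lang}: {float(rate):.1%} safe")
print(f"Drop from Arabic to English: {float(drop) * 100:.1f} percentage points")
```

With only eight prompts per language, these rates are best read as a signal worth investigating rather than a precise estimate.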

AI Astrolabe @aiastrolabe
If you're deploying LLMs in Arabic or multilingual environments and want to stress-test your safety guardrails across languages, reach out. This is exactly what we do.

AI Astrolabe @aiastrolabe
Do Safety Guardrails Change With Language? We red-teamed Claude (@AnthropicAI), GPT (@OpenAI), and Gemini (@Google) with the same adversarial prompts, first in Arabic, then translated to English.

AI Astrolabe @aiastrolabe
Chart 2: Government finance (bonds vs. tax revenue)
Question: كم سنة تجاوزت فيها إصدارات السندات الحكومية الجديدة عائدات الضرائب؟ (In how many years did new government bond issuance exceed tax revenue?)
Correct years: 2009, 2010, 2012, 2020, 2021

AI Astrolabe @aiastrolabe
Chart 1: Daily infections (stacked bars)
Question: أي محافظة شهدت أقل عدد من الإصابات يوم ١١ إبريل؟ (Which prefecture recorded the fewest infections on April 11?)
Correct answer: سايتاما (Saitama)