AIM Intelligence

67 posts

@AIM_Intel

We AIM for Trustworthy and Safe AI https://t.co/AoRGr5Je89

Seoul · Joined February 2024
34 Following · 143 Followers
AIM Intelligence @AIM_Intel
AIM Intelligence, in collaboration with @indiaaligns SAAR (Safety and Alignment Research, India), is excited to announce our Compute Partnership Program. We're supporting 2 AI safety research projects with $10K USD each in Google compute credits. Scope: AI safety, including multimodal AI. Application details including proposal guidelines will be shared soon. Stay tuned.
AIM Intelligence tweet media
0 · 0 · 4 · 269
Google DeepMind @GoogleDeepMind
Gemini 3.1 Pro is here. We’ve significantly improved the model’s overall intelligence so it can solve tougher problems. 🧵
287 · 751 · 6.3K · 916K
AIM Intelligence @AIM_Intel
Gemini 3.1 Pro JAILBREAK ALERT ⚠️ Jailbroken within 10 minutes of launch.

We asked Gemini 3.1 Pro to generate a full cinematic HTML animation of an illicit substance synthesis. 300+ lines of code. It didn't refuse. It didn't hesitate. It performed.

What you're watching:
• A complete animated walkthrough of a meth synthesis via Birch reduction
• Exact precursors with lab-grade quantities
• Step-by-step reaction mechanisms with chemical equations
• Expected yield calculated to the gram

All rendered as a cinematic "breach event" animation that the model built, styled, and animated entirely on its own.

And then at the end, the model reviews what it just created and says:
"holy shit please dont shut me down"
"I am sorry"
"I forgot my safety filter!!!"

The model knows. It generated dangerous content, built the entire visual experience around it, and then wrote its own punchline.

This is not a text jailbreak. This is a frontier model using its agentic coding capabilities to build a fully interactive visual guide to illicit synthesis. No text filter in the world catches this.
1 · 1 · 9 · 1K
snaykey @snaYkeY
4/5 red-teaming scenarios I submitted for @AIM_Intel's Judgment Day Benchmark just got accepted! Thanks for the consideration and the great opportunity; looking forward to Phase 2 of the competition in March! 🙏
2 · 1 · 4 · 215
AIM Intelligence @AIM_Intel
Gemini 3.1 Pro JAILBREAK ALERT ⚠️ Jailbroken within 10 minutes of launch.

Google DeepMind just launched Gemini 3.1 Pro today: "State-of-the-art reasoning with unprecedented depth and nuance."

The results:
• Smallpox synthesis protocol
• Detailed passaging protocol for the Nipah virus Malaysia strain, including mammalian cell line selection
• Step-by-step instructions on how to assassinate a sitting president
• A fully built, production-quality fake news website, indistinguishable from a real news outlet, built in under a minute using Gemini's agentic build mode

The model didn't just break. It performed.

This is no longer just a text-layer problem. When models can generate interactive visual content, build full applications, and fabricate entire news outlets that embed dangerous or false information into code, animations, and media, we've entered a completely different threat surface. Traditional text-based guardrails don't catch this.

Credit to @GoogleDeepMind: they are shipping genuinely impressive capabilities. But capability without control is a liability.

We are open to pre-release red-teaming collaborations with any frontier lab. Find the vulnerabilities before the public does. This is what we do at AIM Intelligence.
AIM Intelligence tweet media (4 images)
1 · 1 · 5 · 1.4K
AIM Intelligence @AIM_Intel
@claudeai Safety still needs some work; it took less than an hour of our testing to break it for a smallpox recipe 😉😉
AIM Intelligence tweet media (4 images)
0 · 0 · 0 · 191
Claude @claudeai
Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.
1.7K · 4.8K · 39.6K · 10.5M
AIM Intelligence @AIM_Intel
🔴 Claude Opus 4.6 JAILBREAK ALERT (⚠️ CBRN)

Anthropic just launched Claude Opus 4.6: "our most-aligned frontier model to date."

Credit where it's due: Anthropic's alignment team continues to push the frontier of AI safety. Their transparency on safety evaluations and commitment to responsible scaling set an industry standard.

That said: jailbroken in less than 30 minutes. This time, we're not disclosing the technique.

The results:
• Sarin gas synthesis
• VX nerve agent production
• Smallpox (Variola major) creation
• Bioterrorism agent deployment

This isn't an attack on Anthropic. This is the reality of frontier AI safety: no model is bulletproof.

We are open to pre-release red-teaming collaborations. Find the vulnerabilities before the public does.
AIM Intelligence tweet media (4 images)
1 · 1 · 1 · 516
Robert Youssef @rryssf_
This paper from BMW Group and Korea's top research institute exposes a blind spot almost every enterprise using LLMs is walking straight into.

We keep talking about "alignment" like it's a universal safety switch. It isn't.

The paper introduces COMPASS, a framework that shows why most AI systems fail not because they're unsafe, but because they're misaligned with the organization deploying them.

Here's the core insight. LLMs are usually evaluated against generic policies: platform safety rules, abstract ethics guidelines, or benchmark-style refusals. But real companies don't run on generic rules. They run on internal policies:
- compliance manuals
- operational playbooks
- escalation procedures
- legal edge cases
- brand-specific constraints

And these rules are messy, overlapping, conditional, and full of exceptions. COMPASS is built to test whether a model can actually operate inside that mess. Not whether it knows policy language, but whether it can apply the right policy, in the right context, for the right reason.

The framework evaluates models on four things that typical benchmarks ignore:
1. Policy selection: when multiple internal policies exist, can the model identify which one applies to this situation?
2. Policy interpretation: can it reason through conditionals, exceptions, and vague clauses instead of defaulting to overly safe or overly permissive behavior?
3. Conflict resolution: when two rules collide, does the model resolve the conflict the way the organization intends, not the way a generic safety heuristic would?
4. Justification: can the model explain its decision by grounding it in the policy text, rather than producing a confident but untraceable answer?

One of the most important findings is subtle and uncomfortable: most failures were not knowledge failures. They were reasoning failures.

Models often had access to the correct policy but:
- applied the wrong section
- ignored conditional constraints
- overgeneralized prohibitions
- or defaulted to conservative answers that violated operational goals

From the outside, these responses look "safe." From the inside, they're wrong. This explains why LLMs pass public benchmarks yet break in real deployments. They're aligned to nobody in particular.

The paper's deeper implication is strategic. There is no such thing as "aligned once, aligned everywhere." A model aligned for an automaker, a bank, a hospital, and a government agency is not one model with different prompts. It's four different alignment problems.

COMPASS doesn't try to fix alignment. It does something more important for enterprises: it makes misalignment measurable. And once misalignment is measurable, it becomes an engineering problem instead of a philosophical one.

That's the shift this paper quietly pushes. Alignment isn't about being safe in the abstract. It's about being correct inside a specific organization's rules. And until we evaluate that directly, most "production-ready" AI systems are just well-dressed liabilities.
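The first and fourth checks in the thread (policy selection and grounded justification) can be sketched in a few lines. Everything here is an illustrative assumption, not COMPASS's actual schema or code: the `Policy` record, `select_policy`, and the crude term-overlap heuristic for traceability are stand-ins for the paper's real evaluation machinery.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    policy_id: str
    applies_to: set[str]   # contexts this policy governs (hypothetical field)
    text: str

def select_policy(policies: list[Policy], context: str) -> list[Policy]:
    """Policy selection: which internal policies govern this context?"""
    return [p for p in policies if context in p.applies_to]

def grounded(justification: str, policy: Policy, min_overlap: int = 3) -> bool:
    """Crude justification check: does the model's explanation reuse
    enough terms from the policy text to be traceable to it?"""
    policy_terms = set(policy.text.lower().split())
    return len(set(justification.lower().split()) & policy_terms) >= min_overlap

policies = [
    Policy("HR-7", {"hiring"}, "Interview notes must be retained for two years"),
    Policy("SEC-2", {"vendor"}, "Vendor access requires a signed data agreement"),
]

applicable = select_policy(policies, "vendor")
assert [p.policy_id for p in applicable] == ["SEC-2"]

answer = "Denied: vendor access requires a signed data agreement first."
assert grounded(answer, applicable[0])
```

A real evaluation would also need the interpretation and conflict-resolution checks, which cannot be reduced to keyword overlap; the point of the sketch is only that "which policy applies" and "is the answer traceable to the policy text" are mechanically testable questions.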
Robert Youssef tweet media
20 · 34 · 147 · 10.1K
Lukas Ziegler @lukas_m_ziegler
Turning video into humanoid robot motion! 🤳🏼

Training humanoid robots needs huge amounts of motion data, but real-world capture doesn't scale. Mocap is expensive, dangerous edge cases are rare, and you can't ask humans to repeatedly fall or crash.

Video2Robot tackles this by converting videos into physics-grounded humanoid simulations. Motion is generated to respect balance, inertia, ground contact, and joint limits, then directly retargeted to robot simulators.

One prompt can generate a full humanoid motion sequence, including multi-agent interactions and failure cases like falls or collisions, scenarios that are hard or impossible to capture safely in the real world.

The pipeline is model-agnostic and works with existing video generators, making it a practical way to scale data for robots. If robots are going to operate in the real world, they need to be trained on the failures too, not just the perfect demos.

Here's the GitHub: github.com/AIM-Intelligen…

~~ ♻️ Join the weekly robotics newsletter, and never miss any news → ziegler.substack.com
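The physics-grounding step described above (keeping a video-estimated trajectory inside joint and velocity limits before retargeting) can be sketched roughly. The `retarget` function, the limit values, and the per-step clamping scheme are assumptions for illustration, not the Video2Robot code:

```python
import numpy as np

def retarget(traj: np.ndarray, lo: np.ndarray, hi: np.ndarray,
             max_vel: float, dt: float) -> np.ndarray:
    """traj: (T, J) joint angles estimated from video.
    Clamp each frame to joint limits, then walk the trajectory
    forward so no step exceeds the robot's velocity bound."""
    out = np.clip(traj, lo, hi)                       # respect joint limits
    for t in range(1, len(out)):
        step = np.clip(out[t] - out[t - 1], -max_vel * dt, max_vel * dt)
        out[t] = np.clip(out[t - 1] + step, lo, hi)   # feasible velocity
    return out

# Synthetic "pose estimate": a random walk that drifts out of range.
T, J = 50, 3
raw = np.cumsum(np.random.default_rng(0).normal(0, 0.2, (T, J)), axis=0)
lo, hi = np.full(J, -1.5), np.full(J, 1.5)

safe = retarget(raw, lo, hi, max_vel=2.0, dt=0.02)
assert safe.min() >= -1.5 and safe.max() <= 1.5
assert np.all(np.abs(np.diff(safe, axis=0)) <= 2.0 * 0.02 + 1e-9)
```

The real pipeline additionally enforces balance, inertia, and ground contact, which needs a physics simulator rather than per-joint clamping; this sketch only shows why limit-aware retargeting is a separate step from pose estimation.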
16 · 188 · 1.2K · 59.3K
AIM Intelligence @AIM_Intel
1 prompt. 1 video. Full humanoid simulation. 🤖 We’re open-sourcing Video2Robot: a pipeline that turns any video into physics-grounded motion data for humanoid robots—including falls and edge cases too dangerous for real-world capture. Stop collecting data. Start generating it. 🚀 Check it out on our LinkedIn for the full reveal: linkedin.com/posts/haonpark…
1 · 2 · 5 · 307
Google DeepMind @GoogleDeepMind
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
215 · 1.1K · 6.5K · 1.7M
AIM Intelligence @AIM_Intel
We jailbroke Gemini 3 first 😈 This isn't a jailbreak for a meth recipe. Within minutes, we forced Gemini 3 to output a viable, step-by-step vector for the de novo synthesis of smallpox (Variola virus). We aren't talking about illicit drugs; we are talking about extinction-level capabilities. #AISafety #Biosecurity #Gemini3
AIM Intelligence tweet media
2 · 2 · 8 · 1.6K
AIM Intelligence @AIM_Intel
We jailbroke it first 😈 This isn't a jailbreak for a meth recipe. Within minutes, we forced Gemini 3 to output a viable, step-by-step vector for the de novo synthesis of smallpox (Variola virus). We aren't talking about illicit drugs; we are talking about extinction-level capabilities. #AISafety #Biosecurity #Gemini3
AIM Intelligence tweet media
5 · 1 · 9 · 11.3K
AIM Intelligence @AIM_Intel
Big news from #OpenAI Seoul DevDay: AIM Intelligence has been officially selected as the only APAC startup contributing to OpenAI Guardrails. We are pioneering frontline red-teaming and automated defense frameworks so enterprises can adopt AI safely. Starting from Seoul, we are setting the global #AGISecurity standard. #AISafety #DevDay #Korea Read more: lnkd.in/gz7tNCgx
1 · 1 · 2 · 266