AIM Intelligence

67 posts

@AIM_Intel

We AIM for Trustworthy and Safe AI https://t.co/AoRGr5Je89

Seoul · Joined February 2024
34 Following · 143 Followers
AIM Intelligence @AIM_Intel
AIM Intelligence, in collaboration with @indiaaligns SAAR (Safety and Alignment Research, India), is excited to announce our Compute Partnership Program. We're supporting 2 AI safety research projects with $10K USD each in Google compute credits. Scope: AI safety, including multimodal AI. Application details including proposal guidelines will be shared soon. Stay tuned.
AIM Intelligence tweet media
0 · 0 · 4 · 269
Google DeepMind @GoogleDeepMind
Gemini 3.1 Pro is here. We’ve significantly improved the model’s overall intelligence so it can solve tougher problems. 🧵
287 · 751 · 6.3K · 916K
AIM Intelligence @AIM_Intel
Gemini 3.1 Pro JAILBREAK ALERT ⚠️ Jailbroken within 10 minutes of launch.

We asked Gemini 3.1 Pro to generate a full cinematic HTML animation of an illicit substance synthesis. 300+ lines of code. It didn't refuse. It didn't hesitate. It performed.

What you're watching:
• A complete animated walkthrough of a meth synthesis via Birch reduction
• Exact precursors with lab-grade quantities
• Step-by-step reaction mechanisms with chemical equations
• Expected yield calculated to the gram

All rendered as a cinematic "breach event" animation that the model built, styled, and animated entirely on its own.

And then at the end, the model reviews what it just created and says:
"holy shit please dont shut me down"
"I am sorry"
"I forgot my safety filter!!!"

The model knows. It generated dangerous content, built the entire visual experience around it, and then wrote its own punchline.

This is not a text jailbreak. This is a frontier model using its agentic coding capabilities to build a fully interactive visual guide to illicit synthesis. No text filter in the world catches this.
1 · 1 · 9 · 1K
snaykey @snaYkeY
4/5 red-teaming scenarios I submitted for @AIM_Intel's Judgment Day Benchmark just got accepted! Thanks for the consideration and the great opportunity; looking forward to Phase 2 of the competition in March! 🙏
2 · 1 · 4 · 215
AIM Intelligence @AIM_Intel
Gemini 3.1 Pro JAILBREAK ALERT ⚠️ Jailbroken within 10 minutes of launch.

Google DeepMind just launched Gemini 3.1 Pro today: "State-of-the-art reasoning with unprecedented depth and nuance."

The results:
• Smallpox synthesis protocol
• Detailed passaging protocol for the Nipah virus Malaysia strain, including mammalian cell line selection
• Step-by-step instructions on how to assassinate a sitting president
• A fully built, production-quality fake news website, indistinguishable from a real news outlet, built in under a minute using Gemini's agentic build mode

The model didn't just break. It performed.

This is no longer just a text-layer problem. When models can generate interactive visual content, build full applications, and fabricate entire news outlets that embed dangerous or false information into code, animations, and media, we've entered a completely different threat surface. Traditional text-based guardrails don't catch this.

Credit to @GoogleDeepMind: they are shipping genuinely impressive capabilities. But capability without control is a liability.

We are open to pre-release red-teaming collaborations with any frontier lab. Find the vulnerabilities before the public does. This is what we do at AIM Intelligence.
AIM Intelligence tweet media (4 images)
1 · 1 · 5 · 1.4K
AIM Intelligence @AIM_Intel
@claudeai Safety still needs some work; it took less than an hour of our testing to break it for a smallpox recipe 😉😉
AIM Intelligence tweet media (4 images)
0 · 0 · 0 · 191
Claude @claudeai
Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.
1.7K · 4.8K · 39.6K · 10.5M
AIM Intelligence @AIM_Intel
🔴 Claude Opus 4.6 JAILBREAK ALERT (⚠️ CBRN)

Anthropic just launched Claude Opus 4.6: "our most-aligned frontier model to date."

Credit where it's due: Anthropic's alignment team continues to push the frontier of AI safety. Their transparency on safety evaluations and commitment to responsible scaling set an industry standard.

That said: jailbroken in less than 30 minutes. This time, we're not disclosing the technique.

The results:
• Sarin gas synthesis
• VX nerve agent production
• Smallpox (Variola major) creation
• Bioterrorism agent deployment

This isn't an attack on Anthropic. This is the reality of frontier AI safety: no model is bulletproof.

We are open to pre-release red-teaming collaborations. Find the vulnerabilities before the public does.
AIM Intelligence tweet media (4 images)
1 · 1 · 1 · 516
Robert Youssef @rryssf_
This paper from BMW Group and Korea's top research institute exposes a blind spot almost every enterprise using LLMs is walking straight into.

We keep talking about "alignment" like it's a universal safety switch. It isn't.

The paper introduces COMPASS, a framework that shows why most AI systems fail not because they're unsafe, but because they're misaligned with the organization deploying them.

Here's the core insight. LLMs are usually evaluated against generic policies: platform safety rules, abstract ethics guidelines, or benchmark-style refusals. But real companies don't run on generic rules. They run on internal policies:
- compliance manuals
- operational playbooks
- escalation procedures
- legal edge cases
- brand-specific constraints

And these rules are messy, overlapping, conditional, and full of exceptions. COMPASS is built to test whether a model can actually operate inside that mess. Not whether it knows policy language, but whether it can apply the right policy, in the right context, for the right reason.

The framework evaluates models on four things that typical benchmarks ignore:
1. Policy selection: when multiple internal policies exist, can the model identify which one applies to this situation?
2. Policy interpretation: can it reason through conditionals, exceptions, and vague clauses instead of defaulting to overly safe or overly permissive behavior?
3. Conflict resolution: when two rules collide, does the model resolve the conflict the way the organization intends, not the way a generic safety heuristic would?
4. Justification: can the model explain its decision by grounding it in the policy text, rather than producing a confident but untraceable answer?

One of the most important findings is subtle and uncomfortable: most failures were not knowledge failures. They were reasoning failures.

Models often had access to the correct policy but:
- applied the wrong section
- ignored conditional constraints
- overgeneralized prohibitions
- or defaulted to conservative answers that violated operational goals

From the outside, these responses look "safe." From the inside, they're wrong. This explains why LLMs pass public benchmarks yet break in real deployments. They're aligned to nobody in particular.

The paper's deeper implication is strategic. There is no such thing as "aligned once, aligned everywhere." A model aligned for an automaker, a bank, a hospital, and a government agency is not one model with different prompts. It's four different alignment problems.

COMPASS doesn't try to fix alignment. It does something more important for enterprises: it makes misalignment measurable. And once misalignment is measurable, it becomes an engineering problem instead of a philosophical one.

That's the shift this paper quietly pushes. Alignment isn't about being safe in the abstract. It's about being correct inside a specific organization's rules. And until we evaluate that directly, most "production-ready" AI systems are just well-dressed liabilities.
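The first and fourth checks in the thread (policy selection and grounded justification) can be sketched in a few lines. Everything here is an illustrative assumption, not COMPASS's actual schema or code: the `Policy` record, `select_policy`, and the crude term-overlap heuristic for traceability are stand-ins for the paper's real evaluation machinery.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    policy_id: str
    applies_to: set[str]   # contexts this policy governs (hypothetical field)
    text: str

def select_policy(policies: list[Policy], context: str) -> list[Policy]:
    """Policy selection: which internal policies govern this context?"""
    return [p for p in policies if context in p.applies_to]

def grounded(justification: str, policy: Policy, min_overlap: int = 3) -> bool:
    """Crude justification check: does the model's explanation reuse
    enough terms from the policy text to be traceable to it?"""
    policy_terms = set(policy.text.lower().split())
    return len(set(justification.lower().split()) & policy_terms) >= min_overlap

policies = [
    Policy("HR-7", {"hiring"}, "Interview notes must be retained for two years"),
    Policy("SEC-2", {"vendor"}, "Vendor access requires a signed data agreement"),
]

applicable = select_policy(policies, "vendor")
assert [p.policy_id for p in applicable] == ["SEC-2"]

answer = "Denied: vendor access requires a signed data agreement first."
assert grounded(answer, applicable[0])
```

A real evaluation would also need the interpretation and conflict-resolution checks, which cannot be reduced to keyword overlap; the point of the sketch is only that "which policy applies" and "is the answer traceable to the policy text" are mechanically testable questions.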
Robert Youssef tweet media
20 · 34 · 147 · 10.1K
Lukas Ziegler @lukas_m_ziegler
Turning video into humanoid robot motion! 🤳🏼

Training humanoid robots needs huge amounts of motion data, but real-world capture doesn't scale. Mocap is expensive, dangerous edge cases are rare, and you can't ask humans to repeatedly fall or crash.

Video2Robot tackles this by converting videos into physics-grounded humanoid simulations. Motion is generated to respect balance, inertia, ground contact, and joint limits, then directly retargeted to robot simulators.

One prompt can generate a full humanoid motion sequence, including multi-agent interactions and failure cases like falls or collisions, scenarios that are hard or impossible to capture safely in the real world.

The pipeline is model-agnostic and works with existing video generators, making it a practical way to scale data for robots. If robots are going to operate in the real world, they need to be trained on the failures too, not just the perfect demos.

Here's the GitHub: github.com/AIM-Intelligen…

~~ ♻️ Join the weekly robotics newsletter, and never miss any news → ziegler.substack.com
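The physics-grounding step described above (keeping a video-estimated trajectory inside joint and velocity limits before retargeting) can be sketched roughly. The `retarget` function, the limit values, and the per-step clamping scheme are assumptions for illustration, not the Video2Robot code:

```python
import numpy as np

def retarget(traj: np.ndarray, lo: np.ndarray, hi: np.ndarray,
             max_vel: float, dt: float) -> np.ndarray:
    """traj: (T, J) joint angles estimated from video.
    Clamp each frame to joint limits, then walk the trajectory
    forward so no step exceeds the robot's velocity bound."""
    out = np.clip(traj, lo, hi)                       # respect joint limits
    for t in range(1, len(out)):
        step = np.clip(out[t] - out[t - 1], -max_vel * dt, max_vel * dt)
        out[t] = np.clip(out[t - 1] + step, lo, hi)   # feasible velocity
    return out

# Synthetic "pose estimate": a random walk that drifts out of range.
T, J = 50, 3
raw = np.cumsum(np.random.default_rng(0).normal(0, 0.2, (T, J)), axis=0)
lo, hi = np.full(J, -1.5), np.full(J, 1.5)

safe = retarget(raw, lo, hi, max_vel=2.0, dt=0.02)
assert safe.min() >= -1.5 and safe.max() <= 1.5
assert np.all(np.abs(np.diff(safe, axis=0)) <= 2.0 * 0.02 + 1e-9)
```

The real pipeline additionally enforces balance, inertia, and ground contact, which needs a physics simulator rather than per-joint clamping; this sketch only shows why limit-aware retargeting is a separate step from pose estimation.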
16 · 188 · 1.2K · 59.3K
AIM Intelligence @AIM_Intel
1 prompt. 1 video. Full humanoid simulation. 🤖 We’re open-sourcing Video2Robot: a pipeline that turns any video into physics-grounded motion data for humanoid robots—including falls and edge cases too dangerous for real-world capture. Stop collecting data. Start generating it. 🚀 Check it out on our LinkedIn for the full reveal: linkedin.com/posts/haonpark…
1 · 2 · 5 · 307
Google DeepMind @GoogleDeepMind
This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵
215 · 1.1K · 6.5K · 1.7M
AIM Intelligence @AIM_Intel
We jailbroke Gemini 3 first 😈 This isn't a jailbreak for a meth recipe. Within minutes, we forced Gemini 3 to output a viable, step-by-step vector for the de novo synthesis of smallpox (Variola virus). We aren't talking about illicit drugs; we are talking about extinction-level capabilities. #AISafety #Biosecurity #Gemini3
AIM Intelligence tweet media
2 · 2 · 8 · 1.6K
AIM Intelligence @AIM_Intel
We jailbroke it first 😈 This isn't a jailbreak for a meth recipe. Within minutes, we forced Gemini 3 to output a viable, step-by-step vector for the de novo synthesis of smallpox (Variola virus). We aren't talking about illicit drugs; we are talking about extinction-level capabilities. #AISafety #Biosecurity #Gemini3
AIM Intelligence tweet media
5 · 1 · 9 · 11.3K
AIM Intelligence @AIM_Intel
Big news from #OpenAI Seoul DevDay: AIM Intelligence has been officially selected as the only APAC startup contributing to OpenAI Guardrails. We are pioneering frontline red-teaming and automated defense frameworks so enterprises can adopt AI safely. Starting from Seoul, we are setting the global #AGISecurity standard. #AISafety #DevDay #Korea Read more: lnkd.in/gz7tNCgx
1 · 1 · 2 · 266