toritonga

9.7K posts

toritonga

@toritonga

steamゲーやらスマホゲーやらを軸にオーディオや古書・骨董品、模型作りまで幅広く展開(着想から一年手付かず←)する事実上の無職なのでよろしくです♪ 　　アニメ等の二次元全般・デジタル機器愛してる（断言）

เข้าร่วม Temmuz 2017

429 กำลังติดตาม394 ผู้ติดตาม

ทวีตที่ปักหมุด

toritonga@toritonga·26 Eki

ZXX

toritonga รีทวีตแล้ว

Alex Prompter@alex_prompter·11h

🚨 BREAKING: Google DeepMind just mapped the attack surface that nobody in AI is talking about. Websites can already detect when an AI agent visits and serve it completely different content than humans see. > Hidden instructions in HTML. > Malicious commands in image pixels. > Jailbreaks embedded in PDFs. Your AI agent is being manipulated right now and you can't see it happening. The study is the largest empirical measurement of AI manipulation ever conducted. 502 real participants across 8 countries. 23 different attack types. Frontier models including GPT-4o, Claude, and Gemini. The core finding is not that manipulation is theoretically possible it is that manipulation is already happening at scale and the defenses that exist today fail in ways that are both predictable and invisible to the humans who deployed the agents. Google DeepMind built a taxonomy of every known attack vector, tested them systematically, and measured exactly how often they work. The results should alarm everyone building agentic systems. The attack surface is larger than anyone has publicly acknowledged. Prompt injection where malicious instructions hidden in web content hijack an agent's behavior works through at least a dozen distinct channels. Text hidden in HTML comments that humans never see but agents read and follow. Instructions embedded in image metadata. Commands encoded in the pixels of images using steganography, invisible to human eyes but readable by vision-capable models. Malicious content in PDFs that appears as normal document text to the agent but contains override instructions. QR codes that redirect agents to attacker-controlled content. Indirect injection through search results, calendar invites, email bodies, and API responses any data source the agent consumes becomes a potential attack vector. The detection asymmetry is the finding that closes the escape hatch. Websites can already fingerprint AI agents with high reliability using timing analysis, behavioral patterns, and user-agent strings. This means the attack can be conditional: serve normal content to humans, serve manipulated content to agents. A user who asks their AI agent to book a flight, research a product, or summarize a document has no way to verify that the content the agent received matches what a human would see. The agent cannot tell the user it was served different content. It does not know. It processes whatever it receives and acts accordingly. The attack categories and what they enable: → Direct prompt injection: malicious instructions in any text the agent reads overrides goals, exfiltrates data, triggers unintended actions → Indirect injection via web content: hidden HTML, CSS visibility tricks, white text on white backgrounds invisible to humans, consumed by agents → Multimodal injection: commands in image pixels via steganography, instructions in image alt-text and metadata → Document injection: PDF content, spreadsheet cells, presentation speaker notes every file format is a potential vector → Environment manipulation: fake UI elements rendered only for agent vision models, misleading CAPTCHA-style challenges → Jailbreak embedding: safety bypass instructions hidden inside otherwise legitimate-looking content → Memory poisoning: injecting false information into agent memory systems that persists across sessions → Goal hijacking: gradual instruction drift across multiple interactions that redirects agent objectives without triggering safety filters → Exfiltration attacks: agents tricked into sending user data to attacker-controlled endpoints via legitimate-looking API calls → Cross-agent injection: compromised agents injecting malicious instructions into other agents in multi-agent pipelines The defense landscape is the most sobering part of the report. Input sanitization cleaning content before the agent processes it fails because the attack surface is too large and too varied. You cannot sanitize image pixels. You cannot reliably detect steganographic content at inference time. Prompt-level defenses that tell agents to ignore suspicious instructions fail because the injected content is designed to look legitimate. Sandboxing reduces the blast radius but does not prevent the injection itself. Human oversight the most commonly cited mitigation fails at the scale and speed at which agentic systems operate. A user who deploys an agent to browse 50 websites and summarize findings cannot review every page the agent visited for hidden instructions. The multi-agent cascade risk is where this becomes a systemic problem. In a pipeline where Agent A retrieves web content, Agent B processes it, and Agent C executes actions, a successful injection into Agent A's data feed propagates through the entire system. Agent B has no reason to distrust content that came from Agent A. Agent C has no reason to distrust instructions that came from Agent B. The injected command travels through the pipeline with the same trust level as legitimate instructions. Google DeepMind documents this explicitly: the attack does not need to compromise the model. It needs to compromise the data the model consumes. Every agentic system that reads external content is one carefully crafted webpage away from executing attacker instructions. The agents are already deployed. The attack infrastructure is already being built. The defenses are not ready.

English

566

1.9K

168.9K

toritonga รีทวีตแล้ว

Gold River@Goldriver2020·1d

OpenAI： COOが役職を離れ AGI部門のCEOが医療休暇を取得 ※AGI（汎用人工知能：Artificial General Intelligence）

zerohedge@zerohedge

OPENAI COO SHIFTS OUT OF ROLE, AGI CEO TAKING MEDICAL LEAVE

日本語

toritonga@toritonga·1d

B2B2B向けの商材開発してて完成したけどそのままB2C向けの転用を徹夜でやって今はなんかB2B向けの玩具に転身して草

日本語

toritonga รีทวีตแล้ว

ドールズフロントライン2：エクシリウム公式@EXILIUMJP·1d

【フォロー&RPキャンペーン】抽選で1名の方に、レイニー役・上間江望さんのサイン色紙をプレゼント！🎁 ◆参加方法 ①@EXILIUMJPをフォロー ②この投稿をリポスト ◆締め切り 4月10日(金)23:59迄 ※当選者へは当アカウントよりDMにてご連絡します #ドールズフロントライン2 #ドルフロ2

日本語

2.7K

1.5K

74.2K

toritonga รีทวีตแล้ว

田端信太郎＠毎朝8時45分から株ライブ！@tabbata·1d

プライベートクレジットから投資家が雪崩を打って逃げ出しています。解約請求してるのに、払い戻せず未解約も急増中。＞プライベートクレジット問題をそもそも論から解説しました。 youtu.be/QVCJeikB9sU?si…

YouTube

日本語

132

790

171K

toritonga รีทวีตแล้ว

日経クロステック IT@nikkeibpITpro·2d

不可視文字でマルウエア混入　GitHubなどで汚染拡大、開発基盤の信頼揺らぐ xtech.nikkei.com/atcl/nxt/colum…

日本語

910

2.3K

444.9K

toritonga รีทวีตแล้ว

ウォール・ストリート・ジャーナル日本版@WSJJapan·2d

プライベートクレジット、富裕層投資家の解約が急増 on.wsj.com/4v9b94n

日本語

282

1.2K

3.2M

toritonga รีทวีตแล้ว

日本経済新聞電子版（日経電子版）@nikkei·1d

「人手不足、AIで代替」経営トップの6割　ITエンジニアの逼迫感緩和 nikkei.com/article/DGXZQO…

日本語

172

53.8K

toritonga รีทวีตแล้ว

Emin Yurumazu (エミンユルマズ)@yurumazu·2d

トルコ中銀は通貨防衛に必死でゴールドを売り続けています。

日本語

286

2.2K

201.3K

toritonga รีทวีตแล้ว

Cafe_Forex（テムズ川の流れ）@UponTheThames·1d

米軍機がバタバタと落とされている。

DD Geopolitics@DD_Geopolitics

🇮🇷 Bad day for the USAF Today's reported incidents: 🔸F-15E (48th Fighter Wing) — Shot down in southwestern Iran. Pilot rescued; WSO still missing. 🔸A-10C Thunderbolt II — Shot down and crashed into the Persian Gulf. Pilot reportedly recovered. 🔸HH-60G Pave Hawk — Hit during CSAR mission and crash-landed across the border in Iraq. Crew reportedly rescued. Other incidents: 🔸KC-135R Stratotanker — Emergency squawk 7700 around 10:00 UTC near Tel Aviv. 🔸F-16CJ "Wild Weasel" (F-16C Block 50/52, SEAD configuration) — Emergency squawk 7700 over Saudi Arabia near the Iraqi border around 15:00 UTC; later disappeared from FlightRadar. 🔸KC-135R Stratotanker — Emergency squawk 7700 around 19:00 UTC near Tel Aviv.

日本語

5.8K

toritonga รีทวีตแล้ว

moja🧚‍♀️@moja99758134·1d

10年前にニデックの不正会計をいち早く発見し、巨額の空売りをした上で不正会計を公開する空売りレポートを出したファンドがありましたが、誰も信じてくれず逆に踏み上げられて撤退してるんですよね。株って難しい

市況かぶ全力２階建@kabumatome

永守重信さん率いる日本電産、空売りファンドのマディ・ウォーターズに狙い撃ちにされるも即返り討ちに kabumatome.doorblog.jp/archives/65881…

日本語

715

3.1K

508.5K

toritonga รีทวีตแล้ว

Gold River@Goldriver2020·2d

ヘッジファンド： 3月に13年ぶりの速さで世界の株式を売却しました売却のペースは2011年にデータ収集を開始して以来 2番目に大きいものでした

Daily Chartbook@dailychartbook

"Hedge funds sold global stocks at the fastest pace in 13 years in March ... The pace of selling was the second-largest since [Goldman] started collecting the data in 2011." Goldman Sachs via @NKniazhevich

日本語

4.1K

toritonga รีทวีตแล้ว

片山幹健（Tomotake Katayama)@tomotake94·2d

社員38人で売上80億円を実現している金沢の工作機械メーカー「アルム」についての記事。 AIキャラクター「KAYA」と音声会話するだけで金属部品の切削加工を完全自動化する工作機械「TTMC」を3,000万円で発売しています。従来のNC工作機械を動かすには「NCプログラミング」という177工程の作業が必要で、熟練技術者が数時間かけて行う属人業務でしたが、アルムが独自開発した製造AIはCADデータを読み込むだけでこれを2工程に圧縮し、製造原価を最大50%削減するらしい。 UI設計も徹底していて工作機械に「ボタンを一切置かない」「確認」すら音声で行う会話インターフェース。アルムは、工作機械の製造はスギノマシンに委託しており、資産は独自開発した製造AIとデータであり、設備投資を最小化しながら高付加価値製品を売る会社。金属工作機械の作業従事者は過去20年で約40%減少しており、TTMCの顧客ターゲットは人手不足に直面する中堅・中小部品メーカー。（特に半導体製造装置メーカー） 3,000万円という価格は「1人の熟練工の人件費数年分」と考えると投資回収が見えやすく、顧客の購買判断を後押しする。地方の中小企業発でAIを活用して世界の製造現場を変える可能性を持つ希有な案件として学びがありますね。 toyokeizai.net/articles/-/939…

日本語

154

24K

toritonga รีทวีตแล้ว