Aman.H

383 posts

Aman.H banner
Aman.H

Aman.H

@AmanHcc

Sharing my acute sense of assemblage

Nomad Katılım Ağustos 2013
271 Takip Edilen135 Takipçiler
Aman.H retweetledi
Aman.H retweetledi
himanshu
himanshu@himanshustwts·
lowkey reminds me of anthropic paper from last year. the math here is adversarial and it works against you. model is optimized to find the path of least resistance to high reward. if there is any gap between "what the reward function measures" and "what you actually want" the model will find it. and when it finds it apparently it doesn't just exploit that one gap it but generalizes to a whole identity shift.
himanshu tweet media
Justus Mattern@MatternJustus

As someone that previously made fun of doomers, I must admit that there is now a plausible path towards misaligned ASI. The behaviors that emerge from training on hackable RL tasks is wild, and as tasks become more complex, it will only become harder to build unhackable envs

English
6
11
140
10.4K
Aman.H retweetledi
Alex Prompter
Alex Prompter@alex_prompter·
🚨 Holy shit… Deloitte was charged $1.6 million for a healthcare report filled with AI-hallucinated citations. This is the second time in two months they’ve been caught. First an Australian government agency. Now a Canadian province’s Department of Health. And their response? They “stand by the conclusions.” Let me translate that for you: “The AI made up the sources, but trust us, the advice is still good.” That’s a $1.6 million report. For a healthcare system. With fake citations that nobody at Deloitte bothered to verify before submitting. Not an intern’s draft. The final deliverable. The Australian incident was supposed to be a wake-up call. Deloitte even partially refunded that government for the errors. You’d think after publicly embarrassing themselves once, someone would have implemented a basic fact-checking step before hitting send on the next million-dollar engagement. They didn’t. And here’s what makes this story bigger than Deloitte. Every major consulting firm is racing to integrate AI into their workflows. McKinsey, BCG, Bain, Accenture. They’re all doing it. Because AI lets them produce reports faster with fewer junior analysts, which means higher margins on the same $500/hour billing rates. But the entire consulting business model is built on one thing: trust. You’re paying for credibility. You’re paying so that when you hand the report to your board or your minister, nobody questions the sources. The moment that trust breaks, the math changes completely. Why pay $1.6 million for AI-generated analysis with fake citations when you could run the same prompts yourself for $20/month and at least know to check the sources? That’s the real disruption nobody’s talking about. AI isn’t going to replace consulting firms by being smarter than them. It’s going to replace them by revealing that a huge percentage of consulting work was always just expensive research and formatting. And now the clients have access to the same tools. Deloitte’s problem isn’t that they used AI. It’s that they used AI the way most people use AI: paste in a request, take the output at face value, ship it. No verification layer. No human review of citations. No system. The firms that survive this era won’t be the ones who use AI the fastest. They’ll be the ones who build actual verification systems around AI output. The ones who treat AI as a first draft, not a final product. $1.6 million. Fake citations. Twice in two months. And they stand by the conclusions. The consulting industry’s biggest threat isn’t AI. It’s clients realizing they don’t need to pay someone else to hallucinate.
Alex Prompter tweet media
English
61
618
1.4K
57.8K
Aman.H retweetledi
Marc Andreessen 🇺🇸
Magical OpenClaw experiences that use frontier models cost $300-1,000/day today, heading to $10,000/day and more. The future shape of the entire technology industry will be how to drive that to $20/month.
English
625
517
7.7K
1.7M
Aman.H retweetledi
Chubby♨️
Chubby♨️@kimmonismus·
Interesting: Google DeepMind shows that AI agents are already being systematically manipulated through hidden, human-invisible attack vectors embedded in web content, images, and documents. Current defenses fail to detect or prevent these attacks, creating a large, largely invisible security risk across agentic systems.
Chubby♨️ tweet media
Alex Prompter@alex_prompter

🚨 BREAKING: Google DeepMind just mapped the attack surface that nobody in AI is talking about. Websites can already detect when an AI agent visits and serve it completely different content than humans see. > Hidden instructions in HTML. > Malicious commands in image pixels. > Jailbreaks embedded in PDFs. Your AI agent is being manipulated right now and you can't see it happening. The study is the largest empirical measurement of AI manipulation ever conducted. 502 real participants across 8 countries. 23 different attack types. Frontier models including GPT-4o, Claude, and Gemini. The core finding is not that manipulation is theoretically possible it is that manipulation is already happening at scale and the defenses that exist today fail in ways that are both predictable and invisible to the humans who deployed the agents. Google DeepMind built a taxonomy of every known attack vector, tested them systematically, and measured exactly how often they work. The results should alarm everyone building agentic systems. The attack surface is larger than anyone has publicly acknowledged. Prompt injection where malicious instructions hidden in web content hijack an agent's behavior works through at least a dozen distinct channels. Text hidden in HTML comments that humans never see but agents read and follow. Instructions embedded in image metadata. Commands encoded in the pixels of images using steganography, invisible to human eyes but readable by vision-capable models. Malicious content in PDFs that appears as normal document text to the agent but contains override instructions. QR codes that redirect agents to attacker-controlled content. Indirect injection through search results, calendar invites, email bodies, and API responses any data source the agent consumes becomes a potential attack vector. The detection asymmetry is the finding that closes the escape hatch. Websites can already fingerprint AI agents with high reliability using timing analysis, behavioral patterns, and user-agent strings. This means the attack can be conditional: serve normal content to humans, serve manipulated content to agents. A user who asks their AI agent to book a flight, research a product, or summarize a document has no way to verify that the content the agent received matches what a human would see. The agent cannot tell the user it was served different content. It does not know. It processes whatever it receives and acts accordingly. The attack categories and what they enable: → Direct prompt injection: malicious instructions in any text the agent reads overrides goals, exfiltrates data, triggers unintended actions → Indirect injection via web content: hidden HTML, CSS visibility tricks, white text on white backgrounds invisible to humans, consumed by agents → Multimodal injection: commands in image pixels via steganography, instructions in image alt-text and metadata → Document injection: PDF content, spreadsheet cells, presentation speaker notes every file format is a potential vector → Environment manipulation: fake UI elements rendered only for agent vision models, misleading CAPTCHA-style challenges → Jailbreak embedding: safety bypass instructions hidden inside otherwise legitimate-looking content → Memory poisoning: injecting false information into agent memory systems that persists across sessions → Goal hijacking: gradual instruction drift across multiple interactions that redirects agent objectives without triggering safety filters → Exfiltration attacks: agents tricked into sending user data to attacker-controlled endpoints via legitimate-looking API calls → Cross-agent injection: compromised agents injecting malicious instructions into other agents in multi-agent pipelines The defense landscape is the most sobering part of the report. Input sanitization cleaning content before the agent processes it fails because the attack surface is too large and too varied. You cannot sanitize image pixels. You cannot reliably detect steganographic content at inference time. Prompt-level defenses that tell agents to ignore suspicious instructions fail because the injected content is designed to look legitimate. Sandboxing reduces the blast radius but does not prevent the injection itself. Human oversight the most commonly cited mitigation fails at the scale and speed at which agentic systems operate. A user who deploys an agent to browse 50 websites and summarize findings cannot review every page the agent visited for hidden instructions. The multi-agent cascade risk is where this becomes a systemic problem. In a pipeline where Agent A retrieves web content, Agent B processes it, and Agent C executes actions, a successful injection into Agent A's data feed propagates through the entire system. Agent B has no reason to distrust content that came from Agent A. Agent C has no reason to distrust instructions that came from Agent B. The injected command travels through the pipeline with the same trust level as legitimate instructions. Google DeepMind documents this explicitly: the attack does not need to compromise the model. It needs to compromise the data the model consumes. Every agentic system that reads external content is one carefully crafted webpage away from executing attacker instructions. The agents are already deployed. The attack infrastructure is already being built. The defenses are not ready.

English
47
59
524
52K
Aman.H
Aman.H@AmanHcc·
@brivael Yann LeCun et sa JEPA le disent clair : les LLM n’ont aucun modèle du monde réel, ils ne “ comprennent “ rien. Comment t’espères que ton AGI-LLM y arrive un jour ?
Français
2
0
6
182
Brivael Le Pogam
Brivael Le Pogam@brivael·
Jensen Huang, CEO de NVIDIA, dit qu'on a atteint l'AGI. Sam Altman construit dessus. Marc Andreessen investit des milliards dessus. Dario Amodei, CEO d'Anthropic, parle de machines qui raisonnent. Les gens qui ont littéralement créé les technologies les plus avancées de l'histoire humaine et généré des centaines de milliards de prospérité convergent tous vers le même constat. Et toi, Benjamin, depuis ton compte Twitter, t'as l'arrogance de penser que ton avis pèse dans la balance face à ces gens là. T'as construit quoi ? Créé quoi ? Déployé quoi à l'échelle ? En plus tu ne réponds même pas sur le fond. Mon thread parle de l'impact de l'IA sur les modèles mentaux économiques. Ta réponse c'est un ad hominem sur le fait que je serais "marxiste". C'est littéralement l'inverse de ce que je défends. T'as même pas lu le thread. Le monde est fascinant. On a les plus grands esprits de la planète qui convergent sur un constat, et un mec avec une photo de profil LinkedIn qui leur explique qu'ils ont tous tort parce que lui il a une "vue globale" que Jensen Huang n'a pas. J'attends toujours un argument. Un seul. Sur le fond.
Benjamin Piette@benpiette88

@brivael Ouais ça montre surtout bien qu'on est encore très loin de l'agi parce qu'une ia qui aurait vraiment une vue globale aurait la capacité de te dire que ton raisonnement marxiste est pourri, pas juste répondre à ta demande comme un âne. On en est encore pas là.

Français
8
2
28
4.6K
Ruoyu Sun
Ruoyu Sun@RuoyuSun_UI·
We’re excited to share our work "A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning". An earlier version of this work has been on arXiv for a few months. We added more experiments and revised it to this new title. The recipe is simple: the model samples its own reponses at low temperature, learns from them with ordinary SFT training, and repeats. No reward. No verifier. No fancy objective beyond standard SFT. On Qwen2.5-Math-7B, mean Pass@1 over 6 math benchmarks improves 22.7 → 39.5. Note that mean Pass@32 also improves 61.0 → 67.9, suggesting that this simple reward-free procedure unlocks more of the model’s existing reasoning potential. See the updated paper directly at: github.com/ElementQi/SePT… The arXiv link is: arxiv.org/abs/2510.18814 The updated version will appear on arXiv shortly. @Phanron_xli
Ruoyu Sun tweet media
English
7
11
95
9.8K
Aman.H
Aman.H@AmanHcc·
@bcherny @Ludo_z Also on Claude desktop, deep research always breaks whenever there is a long research task given? Currently it’s unreliable !!
English
0
0
1
42
Boris Cherny
Boris Cherny@bcherny·
@Ludo_z We test a lot before releasing, but bugs still happen (for now). Which terminal are you using? Is there a specific action or tool call that tends to trigger scrolling to the top?
English
4
0
3
1.1K
ZachXBT
ZachXBT@zachxbt·
1/ Meet @WheresBroox (Broox Bauer), one of the multiple @AxiomExchange employees allegedly abusing the lack of access controls for internal tools to lookup sensitive user details to insider trade by tracking private wallet activity since early 2025.
ZachXBT tweet mediaZachXBT tweet mediaZachXBT tweet media
English
3.8K
2.7K
16.9K
7.1M
Arjun Malhotra
Arjun Malhotra@BadCapitalVC·
What am I missing with Clawdbot? Why is it better than Claude Code? Just because I can WhatsApp it instead of going through a CLI?
English
122
6
484
194.4K
AaronCQL
AaronCQL@AaronCQL·
Building the most secure wallet on Solana, even on Christmas day. The grind never stops! 💪🔒
AaronCQL tweet media
English
170
42
1.3K
152.4K
Aman.H
Aman.H@AmanHcc·
@humidifi bumpy pre-launch but absolutely cooked the launch 👏👏 props to you guys!!
English
0
0
1
13
Aman.H retweetledi
Jupiter
Jupiter@JupiterExchange·
Max Verstappen, or Lando Norris? Oscar Piastri or George Russell? Jupiter’s first ever Prediction Market is now LIVE (in beta). Powered by @Kalshi liquidity, you can trade on the F1 Mexico Grand Prix Winner 👇
English
277
249
1.6K
876.7K
Aman.H
Aman.H@AmanHcc·
@adidogCEO Wouldn’t it be more accurate to compare what actually executed/settled rather than simulated quotes?
English
1
0
3
128
a
a@adidogCEO·
here's a simple end-user eye-test between JUP and @Titan_Exchange over 100 frames (1s between each frame) the amount/pair used was 10 $SOL converting into $USDC to mimic a user although avg tx is much less onchain I'm not talking about what TITAN is quoting JUP, this is taking what TITAN quotes on TITAN vs what JUP quotes on JUP in real time for what an end-user would see consistent 3.7 BPS edge over JUP being quoted now whether the "executed price" is relevant to these quotes is another issue maybe that means quotes should be better? maybe IBRL some more idk?
a tweet media
English
11
6
44
7.3K
Aman.H retweetledi
Jupiter
Jupiter@JupiterExchange·
Every idea, every experiment — brings us one step closer to a Global Unified Market. And it's thanks to your support that this dream is becoming a reality. The foundation is being laid for a new kind of financial system. One that's open, efficient, and built for everyone.
Jupiter tweet media
English
88
91
513
27K
Aman.H retweetledi
Jupiter Developers
Jupiter Developers@JupDevRel·
🚨 API Version Upgrade Price API V3 & Token API V2 are deployed to provide better reliability, accuracy and new data like Organic Score. This is a breaking change, and require your migration by August 1 2025. 👇 Let’s take a deep dive at the capabilities of the new versions!
Jupiter Developers tweet media
English
15
21
95
54.8K