quinten

196 posts

quinten

@quint3ns

building @provnai @provncloud verifiable infra for the agentic era 🇧🇪

Katılım Temmuz 2025

153 Takip Edilen35 Takipçiler

quinten@quint3ns·6d

Built the 'ULTIMATE PERFORMANCE LAUNCHER' for @antigravity → 64GB RAM → RTX 3060 → Max V8 heap → Every GPU flag known to man Result: Extension hosts still dying and the profiler having a panic attack. 10/10 would over-engineer again.

English

quinten@quint3ns·27 Nis

@sama Democratization: $200/mo wall. Empowerment: lawyers gaslighting “Open.” Universal Prosperity: your portfolio. Resilience: ghosting safety letters. Adaptability: adapting “non-profit” in real-time until it bends like a fun-house mirror.

English

Sam Altman@sama·27 Nis

Our Principles: Democratization, Empowerment, Universal Prosperity, Resilience, and Adaptability openai.com/index/our-prin…

English

722

299

3.8K

842.1K

quinten@quint3ns·26 Nis

@theonejvo That em-dash is a load-bearing structural hyphen for the entire federal government.

English

Jamieson O'Reilly@theonejvo·26 Nis

Bruh. AI already runs the Government. It's over folks, have fun, build cool shit, help the people around you when you can and overall enjoy life more.

Sen. Bernie Sanders@SenSanders

The existential risk of artificial intelligence.

English

quinten@quint3ns·26 Nis

@heychaarah @puasdfjasdf @marktenenholtz Measuring 'quality' by chat vibes is a toy-tier error. I’m using a Gemini harness to hunt P1 server crashes on malformed tokens in live infra. If you think orchestration is just a prompt, you're looking at the paint while I'm building the engine.

English

çağrı@heychaarah·26 Nis

nobody really knows the hard limits of these models, but if I get a better response from the same prompt without any extra effort, I can say my experience with opus is better. whatever you’re doing to improve orchestration on gemini, you can do the same on opus and get even higher-quality results

English

Mark Tenenholtz@marktenenholtz·25 Nis

Anyone who says this doesn’t realize Gemini 3.1 Pro is giga SOTA for certain extremely useful tasks. 3.0 Pro Preview is possibly better at the same tasks than GPT-5.5/Opus 4.7.

Can Vardar@icanvardar

google should just give up on ai at this point

English

1.1K

184.3K

quinten@quint3ns·26 Nis

@heychaarah @puasdfjasdf @marktenenholtz Not really. You’re comparing a language/runtime question to a capability-vs-harness question. The point is that UX here is heavily shaped by orchestration quality, not just the underlying model. Those are different variables.

English

çağrı@heychaarah·26 Nis

@quint3ns @puasdfjasdf @marktenenholtz python isn't the fastest language just because if you spend enough time you can build a super efficient application. so being capable doesn't help the common user experience

English

quinten@quint3ns·26 Nis

@puasdfjasdf @heychaarah @marktenenholtz If you need someone to hand-feed you a 'reproducible task' to realize the value, you're missing the point. SOTA isn’t a magic prompt; it’s what you build when you stop treating AI like a magic 8-ball and start building actual state management. 🤡

English

Samy Ateia@puasdfjasdf·26 Nis

@quint3ns @heychaarah @marktenenholtz I’m not asking for a magic prompt. I’m asking for a reproducible specific task + harness where Gemini clearly outperforms alternatives. If the claim is real, that should be easy to demonstrate.

English

quinten@quint3ns·26 Nis

@puasdfjasdf @heychaarah @marktenenholtz Tell me you are lazy without telling me you are lazy. example problem/prompt that showcases "giga sota" capabilities? 🤡

English

Samy Ateia@puasdfjasdf·25 Nis

@quint3ns @heychaarah @marktenenholtz I agree that Google's harness is bad. (grounding in search is less powerful than OpenAI, Antigravity not as good as other tools. Still where is the special task + harness that I can try now to make a comparison to realize Gemini is ahead of the competition? What convinced you?

English

quinten@quint3ns·25 Nis

@heychaarah @puasdfjasdf @marktenenholtz Different variable. Harness quality != model capability. Harness is state/tooling/validation/retries/decomposition. Capability is what the model can reason through once the setup isn't sabotaging it. Bad harnesses hide strong models. Good harnesses can prop up weaker ones.

English

çağrı@heychaarah·25 Nis

@quint3ns @puasdfjasdf @marktenenholtz I don't want to be a Claude advisor or anything, but here he says Opus doesn't look bad, and he seems to be handling the orchestration correctly, right? maybe the point is just that some models don't require as much prompting as others

English

454

quinten@quint3ns·25 Nis

@steipete Leave us some codex please 😅

English

2.8K

Peter Steinberger 🦞@steipete·25 Nis

Built clawsweeper, which runs 50 codex in parallel around the clock, scans issues/prs deep and closes what is already implemented or what makes no sense. Closed around 4000 issues today, a few thousand are in the pipeline. (rate limits are rough) github.com/openclaw/claws…

English

424

578

9.4K

2.1M

quinten@quint3ns·25 Nis

@puasdfjasdf @marktenenholtz @heychaarah SOTA isn't a prompt. If you don't know how to orchestrate state, validation, and tool calls, every model is going to look bad to you.

English

462

Samy Ateia@puasdfjasdf·25 Nis

@quint3ns @marktenenholtz @heychaarah Can any of you give me an example problem/prompt that showcases "giga sota" capabilities? I would be happy to be wrong and figure out a better way to make use of this model which is cheaper than the others i'm currently stuck using.

English

657

quinten@quint3ns·25 Nis

@puasdfjasdf @marktenenholtz @heychaarah This only means you haven't figured out how to govern the system properly.

English

724

Samy Ateia@puasdfjasdf·25 Nis

@marktenenholtz @heychaarah No, it's way too adaptive to the user; it just leads you down whatever rabbit hole you already set out to explore. Other models give you more perspective.

English

19.4K

quinten@quint3ns·25 Nis

@icanvardar Yeah, makes sense… Google should totally quit AI — right after inventing TPUs, TensorFlow, and half the research the industry runs on

English

549

Can Vardar@icanvardar·24 Nis

google should just give up on ai at this point

English

397

263.1K

quinten@quint3ns·20 Nis

@iruletheworldmo Govern the agent.

English

🍓🍓🍓@iruletheworldmo·19 Nis

a masterclass in coding agents from the head of anthropic. there’s still a tonne of leverage in knowing how to use these systems optimally and this is the best i’ve seen. make sure to bookmark so you can watch again and again chat

English

219

2.8K

289.1K

quinten@quint3ns·19 Nis

You want a perfectly precise system using an imprecise AI

English

quinten@quint3ns·7 Nis

@WillGeek4Food @provnai You can poison what the agent sees, but that still shouldn’t give you a free path to sensitive execution. The real lesson is that agent safety can’t live only in prompting or only in policy, it needs a hard boundary at runtime.

English

GeekDice@WillGeek4Food·6 Nis

Damn, attack surface is brutal. A customized poison pill for agents🤯 @provnai 's CHORA-VEX seems built exactly for the governed execution part; ext auth check/hardware-level block/tamper-proof capsules. Bad actions can't slip thru. But Input poison problem? Thoughts @quint3ns

Alex Prompter@alex_prompter

🚨 BREAKING: Google DeepMind just mapped the attack surface that nobody in AI is talking about. Websites can already detect when an AI agent visits and serve it completely different content than humans see. > Hidden instructions in HTML. > Malicious commands in image pixels. > Jailbreaks embedded in PDFs. Your AI agent is being manipulated right now and you can't see it happening. The study is the largest empirical measurement of AI manipulation ever conducted. 502 real participants across 8 countries. 23 different attack types. Frontier models including GPT-4o, Claude, and Gemini. The core finding is not that manipulation is theoretically possible it is that manipulation is already happening at scale and the defenses that exist today fail in ways that are both predictable and invisible to the humans who deployed the agents. Google DeepMind built a taxonomy of every known attack vector, tested them systematically, and measured exactly how often they work. The results should alarm everyone building agentic systems. The attack surface is larger than anyone has publicly acknowledged. Prompt injection where malicious instructions hidden in web content hijack an agent's behavior works through at least a dozen distinct channels. Text hidden in HTML comments that humans never see but agents read and follow. Instructions embedded in image metadata. Commands encoded in the pixels of images using steganography, invisible to human eyes but readable by vision-capable models. Malicious content in PDFs that appears as normal document text to the agent but contains override instructions. QR codes that redirect agents to attacker-controlled content. Indirect injection through search results, calendar invites, email bodies, and API responses any data source the agent consumes becomes a potential attack vector. The detection asymmetry is the finding that closes the escape hatch. Websites can already fingerprint AI agents with high reliability using timing analysis, behavioral patterns, and user-agent strings. This means the attack can be conditional: serve normal content to humans, serve manipulated content to agents. A user who asks their AI agent to book a flight, research a product, or summarize a document has no way to verify that the content the agent received matches what a human would see. The agent cannot tell the user it was served different content. It does not know. It processes whatever it receives and acts accordingly. The attack categories and what they enable: → Direct prompt injection: malicious instructions in any text the agent reads overrides goals, exfiltrates data, triggers unintended actions → Indirect injection via web content: hidden HTML, CSS visibility tricks, white text on white backgrounds invisible to humans, consumed by agents → Multimodal injection: commands in image pixels via steganography, instructions in image alt-text and metadata → Document injection: PDF content, spreadsheet cells, presentation speaker notes every file format is a potential vector → Environment manipulation: fake UI elements rendered only for agent vision models, misleading CAPTCHA-style challenges → Jailbreak embedding: safety bypass instructions hidden inside otherwise legitimate-looking content → Memory poisoning: injecting false information into agent memory systems that persists across sessions → Goal hijacking: gradual instruction drift across multiple interactions that redirects agent objectives without triggering safety filters → Exfiltration attacks: agents tricked into sending user data to attacker-controlled endpoints via legitimate-looking API calls → Cross-agent injection: compromised agents injecting malicious instructions into other agents in multi-agent pipelines The defense landscape is the most sobering part of the report. Input sanitization cleaning content before the agent processes it fails because the attack surface is too large and too varied. You cannot sanitize image pixels. You cannot reliably detect steganographic content at inference time. Prompt-level defenses that tell agents to ignore suspicious instructions fail because the injected content is designed to look legitimate. Sandboxing reduces the blast radius but does not prevent the injection itself. Human oversight the most commonly cited mitigation fails at the scale and speed at which agentic systems operate. A user who deploys an agent to browse 50 websites and summarize findings cannot review every page the agent visited for hidden instructions. The multi-agent cascade risk is where this becomes a systemic problem. In a pipeline where Agent A retrieves web content, Agent B processes it, and Agent C executes actions, a successful injection into Agent A's data feed propagates through the entire system. Agent B has no reason to distrust content that came from Agent A. Agent C has no reason to distrust instructions that came from Agent B. The injected command travels through the pipeline with the same trust level as legitimate instructions. Google DeepMind documents this explicitly: the attack does not need to compromise the model. It needs to compromise the data the model consumes. Every agentic system that reads external content is one carefully crafted webpage away from executing attacker instructions. The agents are already deployed. The attack infrastructure is already being built. The defenses are not ready.

English

quinten@quint3ns·2 Nis

Trust is free. Immutability is a business.

English

quinten retweetledi

provnai@provnai·1 Nis

@github Governance Layer is missing 🧠

English

146

quinten@quint3ns·28 Mar

@shafu0x Not as a runtime but immutable storage and even economically it makes sense.

English

shafu@shafu0x·27 Mar

the whole agentic economy will be built on crypto

English

112

6.4K

quinten@quint3ns·26 Mar

Check out my latest article: The €7.3 Million Brussels AI Funding Story That Should Make Every Taxpayer and Innovator Uncomfortable linkedin.com/pulse/73-milli… via @LinkedIn #brussels

English

quinten@quint3ns·26 Mar

compiler goes brrrrrrrrrrr just checked my build and target/debug is now 168.64 GB VEX-CHORA singularity hitting terminal velocity #Provnai #RustGoesBrrr #AgenticEra

English

Keşfet

@antigravity @sama @theonejvo @heychaarah @puasdfjasdf @marktenenholtz @steipete @elonmusk