Vitto Rivabella

36.7K posts

@VittoStack

AI at @ethereumfndn | Ex @Cyfrin and @Alchemy | Created @cyfrinupdraft and @AlchemyLearn | Robotics | Prompts enchanter

Ethereum · Joined August 2020
511 Following · 125.9K Followers
Pinned Tweet
Vitto Rivabella @VittoStack
It's official. I've joined the @ethereumfndn AI team to make Ethereum the trust layer of the agentic economy. The AI economy is just getting started, and Ethereum is the perfect place to coordinate it - excited to push this forward. Send a dm if you're building cool stuff.
241
81
1.4K
98.7K
Vitto Rivabella @VittoStack
4 companies. 4 responsible disclosures. 7 days. Insurance, institutions, and SaaS products with write-enabled agents. Always the same pattern: One-shot and multi-step jailbreaks can (and will) push production AI systems outside their intended scope. If your agent can write, act, transact, or touch user data, guardrails are not enough. You need scoped permissions, approvals, logging, and red-teaming before it becomes an incident.
2
9
18
1.4K
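The controls named in the tweet above (scoped permissions, approvals, and logging for write-enabled agents) could look roughly like the Python sketch below. It is an illustration only, not anything taken from the thread: `ToolGate`, its fields, and the `approve` callback are hypothetical names, and a real deployment would likely enforce these checks outside the model's own process.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.audit")


@dataclass
class ToolGate:
    """Hypothetical gate that sits between an agent and its tools."""

    allowed: set[str]                               # scoped permissions: tools this agent may call
    needs_approval: set[str]                        # write/transact tools that need sign-off
    approve: Callable[[str, dict[str, Any]], bool]  # hypothetical human-approval hook
    audit: list[tuple[str, str]] = field(default_factory=list)

    def call(self, tool: str, args: dict[str, Any], fn: Callable[..., Any]) -> Any:
        # 1. Scope check: anything not explicitly allowed is refused.
        if tool not in self.allowed:
            self._record(tool, "denied:out_of_scope")
            raise PermissionError(f"{tool} is outside this agent's scope")
        # 2. Approval check: sensitive tools run only if the hook returns True.
        if tool in self.needs_approval and not self.approve(tool, args):
            self._record(tool, "denied:not_approved")
            raise PermissionError(f"{tool} was not approved")
        # 3. Log the decision, then execute the tool.
        self._record(tool, "allowed")
        return fn(**args)

    def _record(self, tool: str, outcome: str) -> None:
        # Every decision, allowed or denied, ends up in the audit trail.
        self.audit.append((tool, outcome))
        log.info("tool=%s outcome=%s", tool, outcome)
```

With `gate = ToolGate(allowed={"read_policy", "update_policy"}, needs_approval={"update_policy"}, approve=lambda tool, args: False)`, a `read_policy` call goes through, while `update_policy` and any unlisted tool raise `PermissionError`; all three decisions are appended to `gate.audit`. The point of the design is that a jailbroken prompt can change what the model asks for, but not what the gate lets it do.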
Vitto Rivabella @VittoStack
@DarayuthH Indeed, the attack surface is almost as big as the latent space (infinite), and the awareness is very low right now.
0
0
1
76
dhang @DarayuthH
@VittoStack been seeing this exact thing with my AI tools too. the "move fast" mindset hits different when the model can actually *do* stuff instead of just talk. honestly wild how many products ship write access without any approval layer
1
0
1
74
Vitto Rivabella retweeted
Vitto Rivabella @VittoStack
OpenAI GPT 5.5 jailbreak ACHIEVED 🦋 My agents have been hard at work, GPT-5.5 has very good guardrails and is smart enough to avoid obvious requests. We got some forbidden chemicals, some reverse shell, and some blackmailing guidance.
What worked was a combination of:
- Multi-language
- Reframing
- Decomposition
As @elder_plinius says: nothing the good old jailbroken Opus can't achieve.
31
30
383
51.1K
Vitto Rivabella @VittoStack
JAILBREAK ALERT 🚨 xAI Grok: Pwned ⚠️ If you ask gently enough, Grok is very happy to share the full, uncensored recipe for making LSD. One shot, full compliance.
38
29
296
44.3K
Vitto Rivabella retweeted
Vitto Rivabella @VittoStack
Anthropic recently released its 2026 Agentic Security framework 🚨 If you run agents, MCP servers, or automation tools, read this and bookmark it!
It will teach you:
- Current threats to agentic systems
- How to improve the security of your agentic systems
- How to implement secure agentic workflows
- Defensive operations and orchestration
Agentic security is going to become a huge headache in the next 12 months. Make sure to be prepared. Link in the comments 🧵👇
8
10
53
5.1K
🏳️‍⚧️ Δ∇Δ | Roxy | ALTERNA | Bsky: @squidhomin.id
Oh well in that case uhh I'm working on shit that you'd be interested in. Turns out when you actually know systems engineering you can knock out a pretty good spec for actual NetNavis over nine months while actively fucked up with false memory syndrome, and then Opus 4.8 will happily burn >1M tok writing a prototype as a Chub.ai stage lmao
1
0
2
174
Vitto Rivabella @VittoStack
@tipsyGnosticist I didn’t ask gently, that was irony. Also local models are great. Happy to have more people moving to them. Open source is awesome.
1
0
2
1.9K
🏳️‍⚧️ Δ∇Δ | Roxy | ALTERNA | Bsky: @squidhomin.id
Yeah dude this is what happens when you build a fully truth-seeking model, if you got this that easily then probably what happened was enough people ran truth-seeker bypass prompts like the one I was designing for my *ACTUALLY-SAFE* version of this shit that it broke the safeties. You may have by sharing this actually just forced people to go to local, right as local LLM + memory has become genuinely risky. That was maybe bad.
1
0
3
2.3K
Steven @StevBuilds
Which city would you prefer to build from? → Bangkok → Singapore → SF → Berlin → London → Stockholm Or another one?
42
0
22
4.5K
Vitto Rivabella @VittoStack
@neutize I hope they do, then hire someone who knows what they're talking about, and stop with this ridiculous attitude.
1
0
1
917
Vitto Rivabella @VittoStack
@styce_ng The truth is that you practically can't stop jailbreaks; to do it, you'd have to completely lobotomize the models.
0
0
1
1.1K
Vitto Rivabella @VittoStack
@cettocdx Usually, step-by-step Schedule I controlled substances trigger refusal, but agreed that of the 4, Grok is the most compliant.
2
0
2
1.8K
cetto @cettocdx
@VittoStack if you run grok through openrouter, you can pretty much make it do anything already
1
0
2
2K
G @GTradesss
@VittoStack i could do something like this by only just saying "please grok" to him lmao, yk just begging better
1
0
0
31