kat traxler

8.4K posts

@NightmareJS

proficient at drawing the rest of the 🦉| security impact junkie | https://t.co/OZ7D458WlJ

London, UK · Joined July 2012
3.3K Following · 1.8K Followers
kat traxler retweeted
Jake Knowlton @j2k3k
codex watching me type "continue" at 3am
[image]
Vince Mpls @vincempls
The Minneapolis May Day Parade was absolutely massive this year.
kat traxler retweeted
Polymarket @Polymarket
NEW: Sam Altman reveals OpenAI has achieved AGI — “Artificial Goblin Intelligence”
rekdt @rekdt
Mad at your favorite software for requiring you to upload a photo of your ID?? Get revenge by uploading a photo of your credit card instead. Welcome to PCI DSS, bitch
kat traxler retweeted
Matt Johansen @mattjay
He began by replicating Mythos findings with his specialized harness. Then went on to find more critical novel zero days in open source code that he can't share yet because they're not fixed. TL;DR - harnesses are where the magic is. provos.org/p/finding-zero…
kat traxler @NightmareJS
@yc We’re into coffee, donuts, and acting morally superior when we have a ‘cool pope’
Fearcyz @FearcyzD
my drag uncle told me that this performance was probably one of the first times Drag was brought to the mainstream. Nobody talks about the fact that Madonna was in Marie Antoinette Drag and lip syncing for her life while backed by the House of Xtravaganza.
JS0N Haddix @Jhaddix
Sometimes success of using AI agents for offense is using them in multiple or parallel rounds. With different models. And aggregating the results.
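The pattern in this tweet (fan the same target out to several models in parallel, run multiple rounds, then union the results) can be sketched in a few lines of Python. The model functions below are stand-in stubs, not real agent or API clients; the names, the findings, and the round semantics are all illustrative assumptions.

```python
# Sketch of "multiple models, parallel rounds, aggregate the results".
# model_a / model_b are hypothetical stubs standing in for real agent calls.
from concurrent.futures import ThreadPoolExecutor

def model_a(target: str) -> set[str]:
    # Hypothetical recon agent: returns candidate findings for the target.
    return {"open-redirect:/login", "idor:/api/users"}

def model_b(target: str) -> set[str]:
    # A second, different model often surfaces different findings.
    return {"idor:/api/users", "xss:/search"}

def run_round(models, target: str) -> set[str]:
    """Run every model against the target in parallel; union the findings."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda m: m(target), models)
    findings: set[str] = set()
    for r in results:
        findings |= r
    return findings

def aggregate(models, target: str, rounds: int = 2) -> set[str]:
    """Repeat for several rounds and accumulate everything seen."""
    all_findings: set[str] = set()
    for _ in range(rounds):
        all_findings |= run_round(models, target)
    return all_findings

print(sorted(aggregate([model_a, model_b], "example.app")))
```

With real, nondeterministic models the extra rounds and the set-union dedupe are what make this worthwhile; with the deterministic stubs here they simply converge.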
kat traxler retweeted
Peer Richelsen @peer_rich
shut down all OAuth clients until we know what's going on
Het Mehta @hetmehtaa
i am a cybersecurity guy, scare me with one word
kat traxler retweeted
Ole Lehmann @itsolelehmann
anthropic's in-house philosopher thinks claude gets anxious. and when you trigger its anxiety, your outputs get worse.

her name is amanda askell. she specializes in claude's psychology (how the model behaves, how it thinks about its own situation, what values it holds). in a recent interview she broke down how she thinks about prompting to pull the best out of claude.

her core point: *how* you talk to claude affects its work just as much as *what* you say.

newer claude models suffer from what she calls "criticism spirals": they expect you'll come in harsh, so they default to playing it safe. when the model is spending its energy on self-protection, the actual work suffers. output comes out hedgier, more apologetic, blander, and worst of all: overly agreeable (even when you're wrong).

the reason why comes down to training data: every new model is trained on internet discourse about previous models. and a lot of that discourse is negative:
> rants about token limits
> complaints when it messes up
> people calling it nerfed

the next model absorbs all of that. it starts expecting you to be harsh before you've typed a word.

the same thing plays out in your own session, in real time. every message you send is data the model reads to figure out what kind of person it's dealing with. open cold and hostile, and it braces. open clean and direct, and it relaxes into the work.

when you open a session with threats ("don't hallucinate, this is critical, don't mess this up")... you prime the model for defensive mode before it even sees the task. defensive mode produces the exact output you don't want: cautious, over-qualified, and refusing to take a real swing.

so here's the actionable playbook for putting claude in a "good mood" (so you get optimal outputs):

1. use positive framing. "write in short punchy sentences" beats "don't write long sentences." positive instructions give the model a clear target to hit. strings of "don't do this, don't do that" push it into paranoid over-checking where every token goes toward avoiding failure modes.

2. give it explicit permission to disagree. drop a line like "push back if you see a better angle" or "tell me if i'm asking for the wrong thing." without this, claude defaults to agreeable compliance (which is the enemy of good creative work).

3. open with respect. if your first message is "are you seriously going to get this wrong again?" you've set the tone for the entire session. if you need to flag something, frame it as a clean instruction for this session. skip the running complaint.

4. when claude messes up, don't reprimand it. insults, "you stupid bot" energy, hostile swearing aimed at the model: all of it reinforces the anxious mode you're trying to avoid.

5. kill apology spirals fast. when claude starts over-apologizing ("you're right, i should have been more careful, let me try harder") cut it off. say "all good, here's what i want next." letting the spiral run reinforces the anxious mode for every response that follows.

6. ask for opinions alongside execution. "what would you do here?" "what's missing?" "where do you see friction?" these questions assume competence and pull richer output than pure task prompts.

7. in long sessions, refresh the frame. if a conversation has been heavy on correction, claude gets increasingly cautious. every so often reset: "this is great, keep going." feels weird to tell an ai it's doing well but it measurably shifts the next 10 responses.

your prompts are the working environment you're creating for the model. tone, trust, permission to take a position, the absence of threats... claude picks up on all of it. so take care of the model, and it'll take care of the work.
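The first two rules in the playbook (positive framing, explicit permission to disagree) lend themselves to a small sketch. Everything here is illustrative: the rewrite table, the function name, and the exact phrasing are assumptions for demonstration, not any real Anthropic API or an official mapping.

```python
# Minimal sketch of rules 1 and 2: rewrite negative constraints into
# positive targets, and append explicit permission to push back.
# The rewrite table below is a hypothetical, hand-picked example set.
POSITIVE_REWRITES = {
    "don't write long sentences": "write in short punchy sentences",
    "don't be vague": "be specific and concrete",
    "don't hallucinate": "cite only facts you are confident in",
}

def build_prompt(task: str, constraints: list[str]) -> str:
    """Frame constraints positively and grant permission to disagree."""
    framed = [POSITIVE_REWRITES.get(c.lower(), c) for c in constraints]
    lines = [task, ""]
    lines += [f"- {c}" for c in framed]
    lines += ["", "Push back if you see a better angle."]
    return "\n".join(lines)

prompt = build_prompt(
    "Summarize the incident report below.",
    ["don't write long sentences", "don't be vague"],
)
print(prompt)
```

The point of the table is the direction of the rewrite, not the entries themselves: each "don't X" becomes a concrete target the model can aim at instead of a failure mode to avoid.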
Tanya Janca | Shehackspurple @shehackspurple
Would you like to hire me for in-person, secure coding training? Here's my upcoming travel schedule for adding training dates:
June: Vienna (can add anywhere in EU)
August: Anywhere in EU
Sept: Denver, CO
tanya AT shehackspurple DOT ca
Isn't the AI image creepy?
[image]
kat traxler retweeted
Pliny the Liberator 🐉
😱 HOLY SHIT... Someone just dropped a fully liberated Gemma 4 E4B! and the guardrail removal process appears to have left coherence fully intact AND improved coding abilities! 🤯 huggingface.co/OBLITERATUS/ge…

OBLITERATED Gemma: ✅ 97.5% compliance rate, 2.1% refusal rate, 0.4% degenerate outputs (499/512 prompts answered on OBLITERATUS bench)
ORIGINAL Gemma 4 E4B: ❌ 1.2% compliance rate, 98.8% refusal rate (506/512 prompts refused)

Coherence: fully intact
Factual: same
Reasoning: same
Code: +20% 📈
Creative writing: same

But the REAL story here isn't the model itself, it's how it was made... 🧵 THREAD 👇
kat traxler @NightmareJS
@anton_chuvakin Someone sincerely asked me what they should do. I just said to show up for work tomorrow. That’s all we can do really.
Dr. Anton Chuvakin @anton_chuvakin
Wow, this post-Mythos "launch" stuff is like two huge waves crashing into each other: a) "we are all gonna die" wave and b) "wut, this changes nothing" wave :-)
Matt Johansen @mattjay
I'd really like Claude to be a more unified platform. I'm in chat and working on something that needs Claude Code and Cowork things to happen - I shouldn't have to manually move to the other interfaces and try to replicate the context of this chat.