Jason Fleagle

3.8K posts

@jjfleagle

AI operator on how AI changes companies, workflows, and teams | Head of AI @ NetSync | Agent systems, enterprise AI, deployment, trust

Tulsa, OK · Joined November 2015
29 Following · 605 Followers
Pinned Tweet
Jason Fleagle
Jason Fleagle@jjfleagle·
Most AI content is still focused on prompts and demos. That is not where the real advantage will come from.
The real advantage will come from:
- workflow design
- verification
- deployment
- trust
- specialized AI teams, not one giant bot
AI is not just changing software. It is starting to change the architecture of the company. That is the layer I am most interested in.
English
1
0
1
75
Jason Fleagle
Jason Fleagle@jjfleagle·
The biggest mistake after the GPT-5.5 / Mythos cyber results would be treating this as a lab rivalry. The real signal is capability diffusion. Once public models reach restricted-model performance on offensive cyber tasks, defenders have to assume the capability is broadly available. That means controls, testing, and incident response need to catch up now. Full article: x.com/jjfleagle/stat…
English
0
0
0
13
Jason Fleagle
Jason Fleagle@jjfleagle·
The AI safety debate gets too abstract. For buyers, the practical questions are simple:
- Can this model perform offensive tasks?
- Can the safeguards be bypassed?
- How fast are jailbreaks patched?
- What telemetry proves the controls work?
- Who is accountable when the model is embedded in workflows?
That is where AI governance has to get much more concrete. My breakdown: x.com/jjfleagle/stat…
English
0
0
0
7
Jason Fleagle
Jason Fleagle@jjfleagle·
If I were pressure-testing enterprise security for GPT-5.5-class agents, I would start here:
1. External exposure review
2. Identity and credential path mapping
3. Known exploitable vulnerabilities
4. Lateral movement simulations
5. Logging gaps
6. Human escalation timing
7. Vendor jailbreak response SLAs
The model benchmark is interesting. The operating response is what matters. Article: x.com/jjfleagle/stat…
English
0
0
0
15
Jason Fleagle
Jason Fleagle@jjfleagle·
Most AI security conversations are still focused on prompt injection and data leakage. Those matter. But the newer problem is agentic cyber capability: Can the model reason through a real environment, adapt after failures, use tools, move laterally, and keep pursuing the objective? That is the test enterprise teams need to start running. I broke down the GPT-5.5 / Mythos benchmark here: x.com/jjfleagle/stat…
English
1
0
0
22
Jason Fleagle
Jason Fleagle@jjfleagle·
The phrase "restricted for safety" deserves more scrutiny now. If a public model matches a restricted model on the same cyber benchmarks, the industry needs clearer answers:
1. Was the restriction about actual safety?
2. Was it about compute capacity?
3. Was it about rollout control?
4. What evidence should buyers trust?
Enterprise AI governance cannot run on press-release language. Context: x.com/jjfleagle/stat…
English
0
0
0
1
Jason Fleagle
Jason Fleagle@jjfleagle·
A model does not need to be a magic hacker to be dangerous. It only needs to be good enough to chain boring steps:
- recon
- exploit known weakness
- steal creds
- move laterally
- retry with more context
- keep going while humans sleep
That is why AI cyber risk is mostly an operations problem, not a sci-fi problem. More here: x.com/jjfleagle/stat…
English
0
0
0
17
Flix
Flix@_flixmd·
@jjfleagle Yes. The scary part is no longer just data exfiltration. In healthcare/research networks, an agentic path can become work-queue changes, access-scope drift, or source-record confusion. Test the action path, denial logs, and rollback plan, not just the perimeter.
English
1
0
1
6
Jason Fleagle
Jason Fleagle@jjfleagle·
The uncomfortable part of the GPT-5.5 / Mythos cyber benchmark is not that one model beat another. It is that both models are now good enough to autonomously work through weak enterprise networks. That changes the security question from: "Can AI help attackers?" to: "Have we tested our environment against agentic attack paths yet?" Most teams have not. Article: x.com/jjfleagle/stat…
English
1
0
0
29
Charlie Lamb
Charlie Lamb@charlietlamb·
Perks of sharing an office with @OpenAI: Bumping into Mr Claw himself @steipete on the stairs🦞
English
16
5
546
16.6K
Jason Fleagle
Jason Fleagle@jjfleagle·
@gdb Nicely done! When is the new voice model and update coming?
English
0
0
0
102
Mark Kretschmann
Mark Kretschmann@mark_k·
A new “voice mode” is being prepared for release by @OpenAI. The upgraded voice mode is based on the omnimodal GPT-5.5, making it substantially smarter and more expressive than the current version. It will also support full-duplex conversations, meaning it can listen and speak at the same time. That should make conversations feel much more natural and fluid.
English
94
73
1.6K
85.7K
Jason Fleagle
Jason Fleagle@jjfleagle·
@sama I agree. I love seeing the focus in the new direction. I’ve been building like crazy. Keep up the great work!
English
0
0
0
671
Jason Fleagle retweeted
Sam Altman
Sam Altman@sama·
ChatGPT feels very 'switched on' now
English
1.2K
182
5.9K
523.2K
Jason Fleagle
Jason Fleagle@jjfleagle·
The operator takeaway is not "which lab is safer?" It is this: If public models can now complete meaningful multi-step cyber tasks, enterprise security teams need to test against autonomous agents, not yesterday's static threat model. The defensive benchmark just moved.
English
0
0
0
15
Jason Fleagle
Jason Fleagle@jjfleagle·
Hot take: Anthropic's "too dangerous to release" line for Mythos may have been a GPU shortage dressed up as "ethics."
Here's the evidence: The UK AISI tested GPT-5.5 (public) and Mythos (restricted) on the same expert cybersecurity benchmarks. GPT-5.5 scored higher, and it's public. Both breached a simulated enterprise network.
The safety restriction didn't prevent the capability from existing in the wild. It just prevented Anthropic from monetizing it at scale.
To me, that looks like supply chain management with a PR wrapper.
Full breakdown 👇 x.com/jjfleagle/stat…
English
0
0
0
16
Jason Fleagle
Jason Fleagle@jjfleagle·
@sama Is there a 4o voice model update coming to ChatGPT?
English
0
0
0
51
Sam Altman
Sam Altman@sama·
pretty excited for voice models to get great
it's interesting to watch how people are already starting to change the way they interface with AI
English
931
242
6.3K
645.3K
Peter Yang
Peter Yang@petergyang·
I caved and downloaded Hermes to try. For those of you who have tried both Hermes and OpenClaw what difference do you notice? No shilling please, just want some honest opinions
English
377
29
1.2K
301.6K
Jason Fleagle
Jason Fleagle@jjfleagle·
@petergyang Hermes is way better, hands down. It approaches tasks with reasoning and better context management. I've got 12 AI agents, half OpenClaw and half Hermes, to compare them.
English
0
0
0
468