Jason Fleagle

3.8K posts

@jjfleagle

AI operator on how AI changes companies, workflows, and teams | Head of AI @ NetSync | Agent systems, enterprise AI, deployment, trust

Tulsa, OK · Joined November 2015

29 Following · 605 Followers

Pinned Tweet

Jason Fleagle@jjfleagle·Apr 16

Most AI content is still focused on prompts and demos. That is not where the real advantage will come from.
The real advantage will come from:
- workflow design
- verification
- deployment
- trust
- specialized AI teams, not one giant bot
AI is not just changing software. It is starting to change the architecture of the company. That is the layer I am most interested in.

Jason Fleagle@jjfleagle·40m

The biggest mistake after the GPT-5.5 / Mythos cyber results would be treating this as a lab rivalry. The real signal is capability diffusion. Once public models reach restricted-model performance on offensive cyber tasks, defenders have to assume the capability is broadly available. That means controls, testing, and incident response need to catch up now. Full article: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·4h

The AI safety debate gets too abstract. For buyers, the practical questions are simple:
- Can this model perform offensive tasks?
- Can the safeguards be bypassed?
- How fast are jailbreaks patched?
- What telemetry proves the controls work?
- Who is accountable when the model is embedded in workflows?
That is where AI governance has to get much more concrete. My breakdown: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·20h

If I were pressure-testing enterprise security for GPT-5.5-class agents, I would start here:
1. External exposure review
2. Identity and credential path mapping
3. Known exploitable vulnerabilities
4. Lateral movement simulations
5. Logging gaps
6. Human escalation timing
7. Vendor jailbreak response SLAs
The model benchmark is interesting. The operating response is what matters. Article: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·1d

Most AI security conversations are still focused on prompt injection and data leakage. Those matter. But the newer problem is agentic cyber capability: Can the model reason through a real environment, adapt after failures, use tools, move laterally, and keep pursuing the objective? That is the test enterprise teams need to start running. I broke down the GPT-5.5 / Mythos benchmark here: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·1d

The phrase "restricted for safety" deserves more scrutiny now. If a public model matches a restricted model on the same cyber benchmarks, the industry needs clearer answers:
1. Was the restriction about actual safety?
2. Was it about compute capacity?
3. Was it about rollout control?
4. What evidence should buyers trust?
Enterprise AI governance cannot run on press-release language. Context: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·1d

A model does not need to be a magic hacker to be dangerous. It only needs to be good enough to chain boring steps:
- recon
- exploit known weakness
- steal creds
- move laterally
- retry with more context
- keep going while humans sleep
That is why AI cyber risk is mostly an operations problem, not a sci-fi problem. More here: x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·1d

@_flixmd Yep, that’s right!

Flix@_flixmd·1d

@jjfleagle Yes. The scary part is no longer just data exfiltration. In healthcare/research networks, an agentic path can become work-queue changes, access-scope drift, or source-record confusion. Test the action path, denial logs, and rollback plan, not just the perimeter.

Jason Fleagle@jjfleagle·2d

The uncomfortable part of the GPT-5.5 / Mythos cyber benchmark is not that one model beat another. It is that both models are now good enough to autonomously work through weak enterprise networks. That changes the security question from: "Can AI help attackers?" to: "Have we tested our environment against agentic attack paths yet?" Most teams have not. Article: x.com/jjfleagle/stat…

Jason Fleagle retweeted

Aaron Edwards@uglyrobot·2d

I think Mythos was largely a marketing stunt. Just the normal Opus model with reasoning limits and RL guardrails taken off.

Jason Fleagle@jjfleagle

x.com/i/article/2052…

Jason Fleagle@jjfleagle·2d

@charlietlamb @OpenAI @steipete @steipete is the boss! That’s awesome.

Charlie Lamb@charlietlamb·3d

Perks of sharing an office with @OpenAI: Bumping into Mr Claw himself @steipete on the stairs🦞

Jason Fleagle@jjfleagle·2d

@gdb Nicely done! When is the new voice model and update coming?

Greg Brockman@gdb·2d

ChatGPT for Excel and Google Sheets:

ChatGPT@ChatGPTapp

ChatGPT is now available as an add-on in Excel and Google Sheets. It can help analyze messy data, write formulas, update spreadsheets, and explain what it’s doing along the way—without leaving your spreadsheet. Powered by GPT-5.5. chatgpt.com/apps/spreadshe…

Jason Fleagle@jjfleagle·2d

@mark_k @OpenAI When’s it coming?

Mark Kretschmann@mark_k·3d

A new “voice mode” is being prepared for release by @OpenAI. The upgraded voice mode is based on the omnimodal GPT-5.5, making it substantially smarter and more expressive than the current version. It will also support full-duplex conversations, meaning it can listen and speak at the same time. That should make conversations feel much more natural and fluid.

Jason Fleagle@jjfleagle·2d

@sama I agree. I love seeing the focus in the new direction. I’ve been building like crazy. Keep up the great work!

Jason Fleagle retweeted

Sam Altman@sama·2d

ChatGPT feels very 'switched on' now

Jason Fleagle@jjfleagle·2d

The operator takeaway is not "which lab is safer?" It is this: If public models can now complete meaningful multi-step cyber tasks, enterprise security teams need to test against autonomous agents, not yesterday's static threat model. The defensive benchmark just moved.

Jason Fleagle@jjfleagle·2d

x.com/i/article/2052…

Jason Fleagle@jjfleagle·2d

Hot take: Anthropic's "too dangerous to release" line for Mythos may have been a GPU shortage dressed up as "ethics."
Here's the evidence: The UK AISI tested GPT-5.5 (public) and Mythos (restricted) on the same expert cybersecurity benchmarks. GPT-5.5 scored higher, and it's public. Both breached a simulated enterprise network.
The safety restriction didn't prevent the capability from existing in the wild. It just prevented Anthropic from monetizing it at scale. To me, that looks like supply chain management with a PR wrapper.
Full breakdown 👇 x.com/jjfleagle/stat…

Jason Fleagle@jjfleagle·3d

@sama Is there a 4o voice model update coming to ChatGPT?

Sam Altman@sama·3d

pretty excited for voice models to get great; it's interesting to watch how people are already starting to change the way they interface with AI

Peter Yang@petergyang·4d

I caved and downloaded Hermes to try. For those of you who have tried both Hermes and OpenClaw, what differences do you notice? No shilling please, just want some honest opinions

Jason Fleagle@jjfleagle·3d

@petergyang Hermes is hands down way better; it approaches tasks with reasoning and better context management. I’m running 12 AI agents, half OpenClaw and half Hermes, to compare them.
